Message-ID: <97e79078-69e8-e387-9e77-a4d741eace4e@linux.alibaba.com>
Date: Thu, 20 Apr 2023 15:44:03 +0800
Subject: Re: [PATCH] mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
To: Huang Ying, Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel test robot, Nadav Amit, Mel Gorman, Hugh Dickins, Matthew Wilcox, David Hildenbrand
References: <20230410075224.827740-1-ying.huang@intel.com>
From: haoxin
In-Reply-To: <20230410075224.827740-1-ying.huang@intel.com>

On 2023/4/10 3:52 PM, Huang Ying wrote:
> 0Day/LKP reported a performance regression for commit
> 7e12beb8ca2a ("migrate_pages: batch flushing TLB"). In the commit, the
> TLB flushing during page migration is batched. So, in
> try_to_migrate_one(), ptep_clear_flush() is replaced with
> set_tlb_ubc_flush_pending(). In further investigation, it is found
> that the TLB flushing can be avoided in ptep_clear_flush() if the PTE
> is inaccessible.
> In fact, we can optimize in similar way for the
> batched TLB flushing too to improve the performance.
>
> So in this patch, we check pte_accessible() before
> set_tlb_ubc_flush_pending() in try_to_unmap/migrate_one(). Tests show
> that the benchmark score of the anon-cow-rand-mt test case of
> vm-scalability test suite can improve up to 2.1% with the patch on a
> Intel server machine. The TLB flushing IPI can reduce up to 44.3%.
>
> Link: https://lore.kernel.org/oe-lkp/202303192325.ecbaf968-yujie.liu@intel.com
> Link: https://lore.kernel.org/oe-lkp/ab92aaddf1b52ede15e2c608696c36765a2602c1.camel@intel.com/
> Fixes: 7e12beb8ca2a ("migrate_pages: batch flushing TLB")
> Reported-by: kernel test robot
> Signed-off-by: "Huang, Ying"
> Cc: Nadav Amit
> Cc: Mel Gorman
> Cc: Hugh Dickins
> Cc: Matthew Wilcox (Oracle)
> Cc: David Hildenbrand
> ---
>  mm/rmap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 8632e02661ac..3c7c43642d7c 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1582,7 +1582,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>  				 */
>  				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
>
> -				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
> +				if (pte_accessible(mm, pteval))
> +					set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
>  			} else {
>  				pteval = ptep_clear_flush(vma, address, pvmw.pte);
>  			}
> @@ -1963,7 +1964,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>  				 */
>  				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
>
> -				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
> +				if (pte_accessible(mm, pteval))
> +					set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));

Just a suggestion: could you move the pte_accessible() check into
set_tlb_ubc_flush_pending(), the same way ptep_clear_flush() does the
check internally? Then we would not need to add an if (pte_accessible())
at every place that calls set_tlb_ubc_flush_pending().

>  			} else {
>  				pteval = ptep_clear_flush(vma, address, pvmw.pte);
>  			}
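
To make the suggestion above a bit more concrete, here is a rough
userspace sketch. It is not kernel code: pte_t, struct mm_struct, the
MODEL_PTE_* bits and the two counters are simplified stand-ins made up
for illustration, and passing the cleared pteval to
set_tlb_ubc_flush_pending() instead of a bool is only my assumption
about how the check could be folded into the helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model only. Every type and bit below is a simplified stand-in
 * invented for this sketch, not the real kernel definition.
 */
typedef struct { uint64_t val; } pte_t;

#define MODEL_PTE_PRESENT (UINT64_C(1) << 0)	/* hypothetical bit layout */
#define MODEL_PTE_DIRTY   (UINT64_C(1) << 1)

struct mm_struct {
	int  flushes_queued;	/* times a batched flush was requested */
	bool flush_writable;	/* at least one of those PTEs was dirty */
};

/* Simplified: the real pte_accessible() is arch-specific and does more. */
static bool pte_accessible(struct mm_struct *mm, pte_t pte)
{
	(void)mm;
	return (pte.val & MODEL_PTE_PRESENT) != 0;
}

static bool pte_dirty(pte_t pte)
{
	return (pte.val & MODEL_PTE_DIRTY) != 0;
}

/*
 * Assumed new signature: the caller hands over the cleared PTE value and
 * the helper itself decides whether a flush has to be queued.
 */
static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval)
{
	assert(mm);

	/* Same idea as the patch: no flush is queued for an inaccessible PTE. */
	if (!pte_accessible(mm, pteval))
		return;

	mm->flushes_queued++;
	if (pte_dirty(pteval))
		mm->flush_writable = true;
}
```

With a helper shaped like this, each call site in try_to_unmap_one()
and try_to_migrate_one() would stay a single line,
set_tlb_ubc_flush_pending(mm, pteval), with no if around it.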