From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 25 Oct 2023 11:03:06 +0800
Subject: Re: [PATCH] arm64: mm: drop tlb flush operation when clearing the access bit
To: "Yin, Fengwei", Barry Song <21cnbao@gmail.com>
Cc: catalin.marinas@arm.com, will@kernel.org, akpm@linux-foundation.org, v-songbaohua@oppo.com, yuzhao@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
References: <44e32b0e-0e41-4055-bdb9-15bc7d47197c@intel.com>
From: Baolin Wang
In-Reply-To: <44e32b0e-0e41-4055-bdb9-15bc7d47197c@intel.com>

On 10/25/2023 9:39 AM, Yin, Fengwei wrote:
>
>>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>>> index 0bd18de9fd97..2979d796ba9d 100644
>>> --- a/arch/arm64/include/asm/pgtable.h
>>> +++ b/arch/arm64/include/asm/pgtable.h
>>> @@ -905,21 +905,22 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
>>>  static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
>>>  					 unsigned long address, pte_t *ptep)
>>>  {
>>> -	int young = ptep_test_and_clear_young(vma, address, ptep);
>>> -
>>> -	if (young) {
>>> -		/*
>>> -		 * We can elide the trailing DSB here since the worst that can
>>> -		 * happen is that a CPU continues to use the young entry in its
>>> -		 * TLB and we mistakenly reclaim the associated page. The
>>> -		 * window for such an event is bounded by the next
>>> -		 * context-switch, which provides a DSB to complete the TLB
>>> -		 * invalidation.
>>> -		 */
>>> -		flush_tlb_page_nosync(vma, address);
>>> -	}
>>> -
>>> -	return young;
>>> +	/*
>>> +	 * This comment is borrowed from x86, but applies equally to ARM64:
>>> +	 *
>>> +	 * Clearing the accessed bit without a TLB flush doesn't cause
>>> +	 * data corruption. [ It could cause incorrect page aging and
>>> +	 * the (mistaken) reclaim of hot pages, but the chance of that
>>> +	 * should be relatively low. ]
>>> +	 *
>>> +	 * So as a performance optimization don't flush the TLB when
>>> +	 * clearing the accessed bit, it will eventually be flushed by
>>> +	 * a context switch or a VM operation anyway. [ In the rare
>>> +	 * event of it not getting flushed for a long time the delay
>>> +	 * shouldn't really matter because there's no real memory
>>> +	 * pressure for swapout to react to. ]
>>> +	 */
>>> +	return ptep_test_and_clear_young(vma, address, ptep);
>>>  }
> From https://lore.kernel.org/lkml/20181029105515.GD14127@arm.com/:
>
> This is blindly copied from x86 and isn't true for us: we don't invalidate
> the TLB on context switch. That means our window for keeping the stale
> entries around is potentially much bigger and might not be a great idea.
>
>
> My understanding is that arm64 doesn't do invalidate the TLB during
> context switch. The flush_tlb_page_nosync() here + DSB during context

Yes, we only perform a TLB flush when the ASID is exhausted during
context switch, and I think this is the same as on x86, IIUC.

> switch make sure the TLB is invalidated during context switch.
> So we can't remove flush_tlb_page_nosync() here? Or something was changed
> for arm64 (I have zero knowledge to TLB on arm64. So some obvious thing
> may be missed)? Thanks.

IMHO, TLB entries are easily evicted or flushed when the system is under
memory pressure, so, as Barry said, the chance of reclaiming a hot page
is relatively low. At least on x86, we have not seen any heavy refault
issues.

MGLRU uses ptep_test_and_clear_young() instead of
ptep_clear_flush_young_notify(), and we have not found any problems so
far since deploying it on ARM servers.
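
To make the trade-off discussed above concrete, here is a rough
user-space sketch of the failure mode, not kernel code. It assumes a
deliberately simplified model in which the access flag is set only when
a page-table walk happens on a TLB miss, so a stale TLB entry hides
later accesses from the aging code. All toy_* names are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (assumption): one PTE with an access flag, one TLB slot. */
struct toy_pte { bool young; };
struct toy_tlb { bool valid; };

/*
 * CPU touches the page. In this simplified model the access flag is
 * only set by the page-table walk on a TLB miss; a TLB hit leaves the
 * in-memory PTE untouched.
 */
static void toy_access(struct toy_pte *pte, struct toy_tlb *tlb)
{
	assert(pte && tlb);
	if (!tlb->valid) {
		pte->young = true;	/* walk sets the access flag */
		tlb->valid = true;	/* translation is now cached */
	}
}

/* Analogue of clearing the access bit without any TLB maintenance. */
static bool toy_test_and_clear_young(struct toy_pte *pte)
{
	bool young = pte->young;

	pte->young = false;
	return young;
}

/* Analogue of clearing the access bit and invalidating the TLB entry. */
static bool toy_clear_flush_young(struct toy_pte *pte, struct toy_tlb *tlb)
{
	bool young = toy_test_and_clear_young(pte);

	if (young)
		tlb->valid = false;
	return young;
}

/* Stand-in for eviction under TLB pressure or an unrelated flush. */
static void toy_tlb_evict(struct toy_tlb *tlb)
{
	tlb->valid = false;
}
```

In this model, calling toy_access(), then toy_test_and_clear_young(),
then toy_access() again leaves young clear even though the page was
just touched, which is the "hot page looks cold" window. Once
toy_tlb_evict() runs, the next access sets young again, which is the
argument for why the window tends to be short under memory pressure.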