From: Yu Zhao
Date: Thu, 16 Jun 2022 17:29:19 -0600
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
To: Barry Song <21cnbao@gmail.com>
Cc: Linus Torvalds, Will Deacon, Andrew Morton, Linux-MM, Andi Kleen,
 Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe,
 Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman,
 Michael Larabel, Michal Hocko, Mike Rapoport, Peter Zijlstra, Tejun Heo,
 Vlastimil Babka, LAK, Linux Doc Mailing List, LKML, x86,
 Kernel Page Reclaim v2, Brian Geffon, Jan Alexander Steffens,
 Oleksandr Natalenko, Steven Barrett, Suleiman Souhlal, Daniel Byrne,
 Donald Carr, Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai,
 Sofia Trinh, Vaibhav Jain, huzhanyuan@oppo.com
References: <20220518014632.922072-1-yuzhao@google.com>
 <20220518014632.922072-8-yuzhao@google.com>
 <20220607102135.GA32448@willie-the-truck>
 <20220607104358.GA32583@willie-the-truck>

On Thu, Jun 16, 2022 at 4:33 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Fri, Jun 17, 2022 at 9:56 AM Yu Zhao wrote:
> >
> > On Wed, Jun 8, 2022 at 4:46 PM Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > On
Thu, Jun 9, 2022 at 3:52 AM Linus Torvalds wrote:
> > > >
> > > > On Tue, Jun 7, 2022 at 5:43 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > >
> > > > > Given we used to have a flush for clear pte young in LRU, right now we are
> > > > > moving to nop in almost all cases for the flush unless the address becomes
> > > > > young exactly after look_around and before ptep_clear_flush_young_notify.
> > > > > It means we are actually dropping flush. So the question is, were we
> > > > > overcautious? we actually don't need the flush at all even without mglru?
> > > >
> > > > We stopped flushing the TLB on A bit clears on x86 back in 2014.
> > > >
> > > > See commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case
> > > > clear the accessed bit instead of flushing the TLB").
> > >
> > > This is true for x86, RISC-V, powerpc and S390. but it is not true for
> > > most platforms.
> > >
> > > There was an attempt to do the same thing in arm64:
> > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793830.html
> > > but arm64 still sent a nosync tlbi and depent on a deferred to dsb :
> > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1794484.html
> >
> > Barry, you've already answered your own question.
> >
> > Without commit 07509e10dcc7 arm64: pgtable: Fix pte_accessible():
> >  #define pte_accessible(mm, pte) \
> > -        (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid_young(pte))
> > +        (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
> >
> > You missed all TLB flushes for PTEs that have gone through
> > ptep_test_and_clear_young() on the reclaim path. But most of the time,
> > you got away with it, only occasional app crashes:
> > https://lore.kernel.org/r/CAGsJ_4w6JjuG4rn2P=d974wBOUtXUUnaZKnx+-G6a8_mSROa+Q@mail.gmail.com/
> >
> > Why?
>
> Yes. On the arm64 platform, ptep_test_and_clear_young() without flush
> can cause random App to crash.
> ptep_test_and_clear_young() + flush won't have this kind of crashes though.
> But after applying commit 07509e10dcc7 arm64: pgtable: Fix
> pte_accessible(), on arm64,
> ptep_test_and_clear_young() without flush won't cause App to crash.
>
> ptep_test_and_clear_young(), with flush, without commit 07509e10dcc7: OK
> ptep_test_and_clear_young(), without flush, with commit 07509e10dcc7: OK
> ptep_test_and_clear_young(), without flush, without commit 07509e10dcc7: CRASH

I agree -- my question was rhetorical :)

I was trying to imply this logic:
1. We cleared the A-bit in PTEs with ptep_test_and_clear_young().
2. We missed the TLB flush for those PTEs on the reclaim path, i.e.,
   case 3 (cases 1 & 2 guarantee flushes).
3. We saw crashes, but only occasionally.

If the TLB had still cached those PTEs, we would have seen the crashes
more often, which contradicts our observation. So the conclusion is
that the TLB didn't cache them most of the time, meaning that flushing
the TLB just for the sake of the A-bit isn't necessary.

> do you think it is safe to totally remove the flush code even for
> the original LRU?

Affirmative, based not only on my words, but also on 3rd parties':
1. Your (indirect) observation
2. Alexander's benchmark:
   https://lore.kernel.org/r/BYAPR12MB271295B398729E07F31082A7CFAA0@BYAPR12MB2712.namprd12.prod.outlook.com/
3. The fundamental hardware limitation in terms of the TLB scalability
   (Fig. 1):
   https://www.usenix.org/legacy/events/osdi02/tech/full_papers/navarro/navarro.pdf
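
For anyone who wants to replay the three cases quoted above, here is a
minimal userspace sketch of the logic. It is a model only, not kernel
code: the MODEL_* bit values and all function names are made up for
illustration. It assumes that no TLB flush is pending (so only the
second arm of pte_accessible() matters), that the unmap path flushes
only when the accessibility check returns true, and, pessimistically,
that the TLB really does hold the entry when the A-bit is cleared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout for this model; not the arm64 definitions. */
#define MODEL_PTE_VALID (1u << 0)
#define MODEL_PTE_AF    (1u << 1) /* accessed ("young") bit */

/* Models the check used after commit 07509e10dcc7. */
static bool model_pte_valid(unsigned int pte)
{
	return (pte & MODEL_PTE_VALID) != 0;
}

/* Models the check used before commit 07509e10dcc7. */
static bool model_pte_valid_young(unsigned int pte)
{
	unsigned int mask = MODEL_PTE_VALID | MODEL_PTE_AF;

	return (pte & mask) == mask;
}

/*
 * Returns true if a stale TLB entry can survive the unmap: the A-bit
 * was cleared without a flush, and the unmap-time check then decided
 * the PTE was not accessible and skipped its flush as well.
 */
static bool stale_tlb_possible(bool flush_on_clear, bool has_fix)
{
	unsigned int pte = MODEL_PTE_VALID | MODEL_PTE_AF;
	bool tlb_has_entry = true; /* pessimistic assumption */
	bool accessible;

	/* Step 1: reclaim clears the A-bit, with or without a flush. */
	pte &= ~MODEL_PTE_AF;
	if (flush_on_clear)
		tlb_has_entry = false;

	/* Step 2: unmap flushes only if the PTE looks accessible. */
	accessible = has_fix ? model_pte_valid(pte)
			     : model_pte_valid_young(pte);
	if (accessible)
		tlb_has_entry = false;

	return tlb_has_entry;
}

/* Replays the three rows of the quoted table. */
static void check_quoted_cases(void)
{
	assert(!stale_tlb_possible(true, false));  /* case 1: OK */
	assert(!stale_tlb_possible(false, true));  /* case 2: OK */
	assert(stale_tlb_possible(false, false));  /* case 3: CRASH */
}
```

Calling check_quoted_cases() passes under these assumptions. The model
makes case 3 fail every time because it assumes the TLB always holds
the entry; the point of the argument above is that on real hardware it
held the entry only rarely.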