References: <20220518014632.922072-1-yuzhao@google.com>
 <20220518014632.922072-8-yuzhao@google.com>
 <20220607102135.GA32448@willie-the-truck>
 <20220607104358.GA32583@willie-the-truck>
In-Reply-To: <20220607104358.GA32583@willie-the-truck>
From: Yu Zhao
Date: Tue, 7 Jun 2022 15:06:57 -0600
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
To: Will Deacon
Cc: Barry Song <21cnbao@gmail.com>, Andrew Morton, Linux-MM, Andi Kleen,
 Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe,
 Johannes Weiner, Jonathan Corbet, Linus Torvalds, Matthew Wilcox,
 Mel Gorman, Michael Larabel, Michal Hocko, Mike Rapoport, Peter Zijlstra,
 Tejun Heo, Vlastimil Babka, LAK, Linux Doc Mailing List, LKML, x86,
 Kernel Page Reclaim v2, Brian Geffon, Jan Alexander Steffens,
 Oleksandr Natalenko, Steven Barrett, Suleiman Souhlal, Daniel Byrne,
 Donald Carr, Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai,
 Sofia Trinh, Vaibhav Jain,
 huzhanyuan@oppo.com

On Tue, Jun 7, 2022 at 4:44 AM Will Deacon wrote:
>
> On Tue, Jun 07, 2022 at 10:37:46AM +1200, Barry Song wrote:
> > On Tue, Jun 7, 2022 at 10:21 PM Will Deacon wrote:
> > > On Tue, Jun 07, 2022 at 07:37:10PM +1200, Barry Song wrote:
> > > > I can't really explain why we are getting a random app/java vm crash in monkey
> > > > test by using ptep_test_and_clear_young() only in lru_gen_look_around() on an
> > > > armv8-a machine without hardware PTE young support.
> > > >
> > > > Moving to ptep_clear_flush_young() in look_around can make the random
> > > > hang disappear according to zhanyuan(Cc-ed).
> > > >
> > > > On x86, ptep_clear_flush_young() is exactly ptep_test_and_clear_young() after
> > > > 'commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear
> > > > the accessed bit instead of flushing the TLB")'
> > > >
> > > > But on arm64, they are different. according to Will's comments in this
> > > > thread which tried to make arm64 same with x86,
> > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793881.html
> > > >
> > > > "
> > > > This is blindly copied from x86 and isn't true for us: we don't invalidate
> > > > the TLB on context switch. That means our window for keeping the stale
> > > > entries around is potentially much bigger and might not be a great idea.
> > > >
> > > > If we roll a TLB invalidation routine without the trailing DSB, what sort of
> > > > performance does that get you?
> > > > "
> > > > We shouldn't think ptep_clear_flush_young() is safe enough in LRU to
> > > > clear PTE young? Any comments from Will?
> > >
> > > Given that this issue is specific to the multi-gen LRU work, I think Yu is
> > > the best person to comment. However, looking quickly at your analysis above,
> > > I wonder if the code is relying on this sequence:
> > >
> > >
> > >	ptep_test_and_clear_young(vma, address, ptep);
> > >	ptep_clear_flush_young(vma, address, ptep);
> > >
> > >
> > > to invalidate the TLB. On arm64, that won't be the case, as the invalidation
> > > in ptep_clear_flush_young() is predicated on the pte being young (and this
> > > matches the generic implementation in mm/pgtable-generic.c). In fact, that
> > > second function call is always going to be a no-op unless the pte became
> > > young again in the middle.
> >
> > thanks for your reply, sorry for failing to let you understand my question.
> > my question is actually as below,
> > right now lru_gen_look_around() is using ptep_test_and_clear_young()
> > only without flush to clear pte for a couple of pages including the specific
> > address:
> > void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
> > {
> >	...
> >
> >	for (i = 0, addr = start; addr != end; i++, addr += PAGE_SIZE) {
> >		...
> >
> >		if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i))
> >			continue;
> >
> >		...
> >	}
> >
> > I wonder if it is safe to arm64. Do we need to move to ptep_clear_flush_young()
> > in the loop?
>
> I don't know what this code is doing, so Yu is the best person to answer
> that. There's nothing inherently dangerous about eliding the TLB
> maintenance; it really depends on the guarantees needed by the caller.

Ack.

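To make the sequence Will describes above concrete, here is a user-space
toy model. It is only a sketch: struct toy_pte, toy_tlb_flushes and the
toy_*() helpers are hypothetical stand-ins, not kernel code. The one
property it borrows from the discussion is that the flush in
ptep_clear_flush_young() is predicated on the pte being young.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PTE: only the young (accessed) bit. */
struct toy_pte {
	bool young;
};

/* Counts simulated TLB invalidations; not a kernel variable. */
static int toy_tlb_flushes;

/* Models ptep_test_and_clear_young(): clear the bit, report the old value. */
static bool toy_test_and_clear_young(struct toy_pte *pte)
{
	bool young;

	assert(pte != NULL);
	young = pte->young;
	pte->young = false;
	return young;
}

/* Models ptep_clear_flush_young(): the flush happens only if the pte was young. */
static bool toy_clear_flush_young(struct toy_pte *pte)
{
	bool young = toy_test_and_clear_young(pte);

	if (young)
		toy_tlb_flushes++;
	return young;
}

/* Runs the two calls back to back and returns how many flushes resulted. */
static int toy_clear_then_flush(struct toy_pte *pte)
{
	int before = toy_tlb_flushes;

	toy_test_and_clear_young(pte);
	toy_clear_flush_young(pte);
	return toy_tlb_flushes - before;
}
```

In this model the back-to-back sequence never flushes, because the first
call has already cleared the bit the second call tests. A flush only
happens when toy_clear_flush_young() itself sees a young pte.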
> However, the snippet you posted from folio_referenced_one():
>
> |  if (pvmw.pte) {
> | +	if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
> | +	    !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
> | +		lru_gen_look_around(&pvmw);
> | +		referenced++;
> | +	}
> | +
> |	if (ptep_clear_flush_young_notify(vma, address,
>
>
> Does seem to call lru_gen_look_around() *and*
> ptep_clear_flush_young_notify(), which is what prompted my question as it
> looks pretty suspicious to me.

The _notify variant reaches into the MMU notifier -- lru_gen_look_around()
doesn't do that because GPA space generally has no locality. I hope
this explains why both are called.

As to why the code is organized this way -- it depends on the point of
view. Mine is that lru_gen_look_around() is an add-on, since its logic
is independent/separable from ptep_clear_flush_young_notify(). We can
make lru_gen_look_around() include ptep_clear_flush_young_notify(),
but that would make the code functionally intertwined, which is not to
my taste.
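
A user-space toy model of the split described above, as a sketch only.
struct toy_mm, TOY_WINDOW and the toy_*() helpers are hypothetical and
not kernel code. It assumes, as Barry notes earlier in the thread, that
the look-around window includes the target address, and it assumes the
_notify variant is the plain clear-and-flush followed by a query of a
secondary MMU (for example a guest mapping).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WINDOW 8	/* hypothetical look-around window, in pages */

struct toy_mm {
	bool pte_young[TOY_WINDOW];		/* primary (CPU) page table */
	bool secondary_young[TOY_WINDOW];	/* secondary MMU state */
	int tlb_flushes;
	int notifier_calls;
};

/*
 * Models lru_gen_look_around(): clears the young bit on every PTE in the
 * window, with no TLB flush and no notifier call. Returns how many PTEs
 * were young.
 */
static int toy_look_around(struct toy_mm *mm)
{
	int found = 0;

	for (size_t i = 0; i < TOY_WINDOW; i++) {
		if (!mm->pte_young[i])
			continue;
		mm->pte_young[i] = false;
		found++;
	}
	return found;
}

/*
 * Models ptep_clear_flush_young_notify() for one address: clear and
 * conditionally flush the primary PTE, then always ask the secondary MMU.
 */
static bool toy_clear_flush_young_notify(struct toy_mm *mm, size_t idx)
{
	bool young;

	assert(idx < TOY_WINDOW);
	young = mm->pte_young[idx];
	mm->pte_young[idx] = false;
	if (young)
		mm->tlb_flushes++;

	mm->notifier_calls++;
	young |= mm->secondary_young[idx];
	mm->secondary_young[idx] = false;
	return young;
}

/* Models the two call sites in the folio_referenced_one() snippet. */
static int toy_referenced_one(struct toy_mm *mm, size_t idx)
{
	int referenced = 0;

	if (mm->pte_young[idx]) {
		toy_look_around(mm);
		referenced++;
	}
	if (toy_clear_flush_young_notify(mm, idx))
		referenced++;
	return referenced;
}
```

In this model the look-around pass handles the primary PTEs in the
window, so the later call finds the target PTE already old and does not
flush. Its remaining job is the notifier query, which the look-around
pass never makes.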