From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C925C83F0A for ; Tue, 8 Jul 2025 16:59:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BB8DF6B0095; Tue, 8 Jul 2025 12:59:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B91286B0096; Tue, 8 Jul 2025 12:59:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AA6DC6B0098; Tue, 8 Jul 2025 12:59:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 96DC76B0095 for ; Tue, 8 Jul 2025 12:59:23 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 5DAF012729C for ; Tue, 8 Jul 2025 16:59:23 +0000 (UTC) X-FDA: 83641708206.11.131B3C5 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf18.hostedemail.com (Postfix) with ESMTP id 0A0181C0015 for ; Tue, 8 Jul 2025 16:59:20 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=fGXCVzqi; dmarc=pass (policy=quarantine) header.from=redhat.com; spf=pass (imf18.hostedemail.com: domain of luizcap@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=luizcap@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1751993961; a=rsa-sha256; cv=none; b=oyeOodyzNukEJ/XtltrBuA84PWHdUy4jbHE9JaSldRLd0wzMfp0z+Et1AAcTx+204s4dO2 0xXo7q7tNOeOgj0235pxDw1m6hxqINUfYYspU+gZenYPXrJndivUgaScqDlmrHWePv3U5n L7FgYtjQgIvyH5FnIF7bx/mHFM73SVc= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=fGXCVzqi; dmarc=pass (policy=quarantine) header.from=redhat.com; spf=pass (imf18.hostedemail.com: domain of luizcap@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=luizcap@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1751993961; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wybB/fUy6mCrhP5jGn1a9ArjMgUrKLPn8Q0MHxBm0RE=; b=UajWWc78euhYEV6CaTOdRpPedy/4XXUIKs/cqWrN1QPh7ModAvBqRqCLmNXA0TK5ntBNVT LMm7Km6HRcGB/jqlcjFTznW+e02HLb5nz9avykm51jHKNj2x5ojK5N2t6m3+ICm3xCnzRX BkFGgVd0ksTfdfc+VEPAFedBJcgsMtQ= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1751993960; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wybB/fUy6mCrhP5jGn1a9ArjMgUrKLPn8Q0MHxBm0RE=; b=fGXCVzqiiDUjrIElxsCgLwZuFWgnFFJO7oCVEyX9QTba52zwThNTFd0GpEY/1s/Re5/dbc hofqu0PXY2fy2p7nMzzY8xtpwpUXHsmnv0v8fi1e19HgmQ1BeVx1g3g9nKToAltaVaVzs0 p38rVVJgJzYpmPOnUNxwU4WS8A74WGw= Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-204-gV6WAr3cMcuvrA2ZDhzq1Q-1; Tue, 08 Jul 2025 12:59:19 -0400 X-MC-Unique: gV6WAr3cMcuvrA2ZDhzq1Q-1 X-Mimecast-MFC-AGG-ID: gV6WAr3cMcuvrA2ZDhzq1Q_1751993959 Received: by mail-qv1-f72.google.com with SMTP id 6a1803df08f44-6fb5f71b363so74101206d6.2 for ; Tue, 08 Jul 2025 09:59:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1751993959; x=1752598759; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=wybB/fUy6mCrhP5jGn1a9ArjMgUrKLPn8Q0MHxBm0RE=; b=mlloWT+LkwfjLZlQ1Kyc/0PVW1pWZ/yN7AfzWNuebIbugIWe7vaWeNfLVtsnOGRqyb FYo3Vh2MXoT4xlWmt9z0/Yz07n9UvwfRuPaeEf8leFTSSlRjIL/lOHn+guJSYFcF++aE 97lomNueQAhEULFvEA8Q1BlZsqkjXD5ADHD+vrJcbQU0vrr+ArgMmTqEl4019o0DI2zy zNyvMg8S7rMrYXxPojBrnXV+fNrCvuuIhxwrQ6L1E4TQw5fDmgBiADi8iAJRn6UWUxY9 fInPrKJCL5WnDEzV5gyMPmt3jh2rJiqv9sMuicna1xV9daEKFeQlbqqDqViSGobROph7 66Jw== X-Forwarded-Encrypted: i=1; AJvYcCUWN4hURNbwIJbv0/n8OvhA3G6cVRxFozueuteYU3oqyjxI3XtyQlf3Uhq6CyykrL4O5vg0X4lsWQ==@kvack.org X-Gm-Message-State: AOJu0Yzm3Y/nPCcmID9ppNlcHXNCiZFjEtv74z9wzM//fetZECLd4iWK 4sSEOAZjcP25HqzKvT5480bPwQurVm2SVAZbQBHXgAvkfIMpOmLdLaejZXREhWXxpucFZCA21C2 X9S4R09RRNCi5PrnMmh3V78NJa2ofcNWU83KJQMQjtXsx0zj+eReL X-Gm-Gg: ASbGnctiQBAXx36MpCrXuZHZq0hwsqv1Z3e2BNsekGn2aupkl5WlzF+dpn2IBG5Iy4f UGZyN1XfC071d1kcNGY/5lOruISfGVjP2JCfW0j7yLXfsUgk3kQBYDJW0/aA+b1s8rgMMGyIO5t EIBJPtvCL3OH41Zdcrje7tCOJThFBPrCJcPIhoBiVk5fT2ooWtg/va9q7NeoKKaVl1XZc7IL34s v2kV4uQPEOlyLEXjUvq5tyL/5xjUwlVOA7c/dRLT80Wh7Mumo22kCmVGOyAov9vNtz7vbiPPR2Z j3NSZcnPHfNNQT0QGTMpjno= X-Received: by 2002:a0c:f113:0:b0:702:cea8:c598 with SMTP id 6a1803df08f44-702cea8c5cdmr200595546d6.3.1751993958683; Tue, 08 Jul 2025 09:59:18 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFXoR4uA6XBxL64aujeWrjmoayZW9DBpgTBBcSQum2HYEQkMOMEGFDRn0uDb4RXtMzn8yq9kw== X-Received: by 2002:a0c:f113:0:b0:702:cea8:c598 with SMTP id 6a1803df08f44-702cea8c5cdmr200595046d6.3.1751993958068; Tue, 08 Jul 2025 09:59:18 -0700 (PDT) Received: from [192.168.2.110] ([70.29.229.84]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-702c4d51040sm79052486d6.63.2025.07.08.09.59.17 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 08 Jul 2025 09:59:17 -0700 (PDT) Message-ID: <60a1a526-52dd-478a-9ed4-79e6428743ac@redhat.com> Date: Tue, 8 Jul 2025 12:59:07 -0400 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2 2/4] mm/util: introduce snapshot_page() To: Shivank Garg , david@redhat.com, willy@infradead.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, sj@kernel.org References: <88d956248f7528b7998ef00ca8742061839d1036.1751914235.git.luizcap@redhat.com> <3942336d-20e6-457d-8b53-e198eed5d9d9@amd.com> From: Luiz Capitulino In-Reply-To: <3942336d-20e6-457d-8b53-e198eed5d9d9@amd.com> X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: JMI44Hk9_BCa-L7wzgEXxmnJV3MvLwwVkRNmh4WEmsg_1751993959 X-Mimecast-Originator: redhat.com Content-Language: en-US, en-CA Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 0A0181C0015 X-Stat-Signature: f5ejsyazuja1z5kce3uaieut11xukuxx X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1751993960-39934 X-HE-Meta: U2FsdGVkX19/B2YciC28DljeOcweZEXZmyq4Q67ROM+O7evGGNZkv5S62ImBss4z2JDAVYkblupWeBiQeD36EmpLgZ4n7OuYO5SN6BXkSHIH/Tq5jn5hjFwBNiMi7Q9ls63j5ujvSzzVkHCU/dOa5z500Sl38NAj3U6eRglOT+dwLC3hzmlRL5ZczusTMB4CGHFvOGWF2LSNC59m5or73Sz5qfIVJREwvkf0e2JPFuYWERzP20/HieEtqgnJ9aXy6uGQuFES0jsVlUxln7CNcBNu/C1aBBnL4bv/SeQscx8ydj2/Q+6XGCBh0uYrSlvn8QPpoAbzBd0SYK6AxtMvzA+kw47vKKEYdQFHqgJ6UAahx3ixbpS1w4JfjVeY/U6mSyCq1l3ubOFpgEsWuEyX/+00SFcNdooSBJsHK4us5nR1SfBaQ9Cds9KTl1b68Vqs/xvlmmafzQguYJbw3ffMARs97wwHZks1oEmWZLcHP+3cXXHRg5X6i/W33WCLv43divzRr/G//uxL3IXraX42Pw1la9kU3pFQT4mXJdE9Z6YSkkQisOqzETtLv4WAMKMNyFrHPI348FZFhJ2Bar4SRlGMbdpr2z2ttqWtYvms50meTPktiDnvH9NX26jCk4twhUt6tAZcYICue2mirRX2f+MwWYsqHyaLROf01n5ef2aZujS2f8UKbP1B56Ls5SkTH4bWCb89ObVzq4PjZQE9sG3LpzynGpoFG6l9KtTJZ6iA6ZwJmJtOWomzU3CTqOjB5GvFN1c4/XRyBT8XiYQY6+NivT877fEcQDnUY1rjYg230cVJKu4bXixO71bwtfVHuPhPkwVJM+dB0JdF/gUgsyGEjI0xY65odjH904OEvE1j5HseX4Gj2p7crUHXu1Cd1MJs9m6xZpSxczsY3etUzkeTUEyPQq8aLZZI/Xzrh/oKbvfF+V1EkiLaMm/K1sQ1kQb42spnVn9O0RnQFzO xMymcPyZ guKRKUZDjyT5s9eALM7GNTyqt2p66P/tbEV6I6SqILaFzXfWwiO4yAxJBE2ONXiEbILn0JNEuyB2JuRbZsgT5V/v9FTRtI+ROV4uWF0cz5UCvWMR0iMilNTkIymSEqL1gizcJmBWH0sX+A0eAsTHO2Q+QoJkQIU0Xtv9q7FpCbKh9EHT0ES26rz6nheeO8OfGiySyib7eHvfjJXOQuptCqGDdBtbhj4l6d79Byn0jwBa6TNU4E+RzUXPIE2Dxxf0sMgnHeE6FZcgSvTBy64B19oNGrjWPH3+M7j++V9cTXoVBaRyBASo/aPEUvlxcHVpIMS0VEUY1dPaJMlMXb3eANzdCJydGC4Hjh0lViDVaB12r4kF7kOTYCjVAT9kMekYsb8pydcOmZzcLQHTo491e9wag2U2+4znfZ3z8dn8taHGHJHU7xbzuxIjD6MtRjJt+kWiLn51Ua5K8NhcaGiuLp24VS3XxnI5/IrIafxlmyHgJqCZVlJkLGGaoRFKgtfqpGmA17R2EXiGiqnPPLU6xemZBiB7LmwbJFLA6yGm/TL7kR/33pT8z28c+vQGttXG7wN5+3pxV6pAoOvI= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2025-07-08 01:49, Shivank Garg wrote: > > > On 7/8/2025 12:20 AM, Luiz Capitulino wrote: >> This commit refactors __dump_page() into snapshot_page(). >> >> snapshot_page() tries to take a faithful snapshot of a page and its >> folio representation. The snapshot is returned in the struct >> page_snapshot parameter along with additional flags that are best >> retrieved at snapshot creation time to reduce race windows. >> >> This function is intended to be used by callers that need a stable >> representation of a struct page and struct folio so that pointers >> or page information doesn't change while working on a page. >> >> The idea and original implementation of snapshot_page() comes from >> Matthew Wilcox with suggestions for improvements from David Hildenbrand. >> All bugs and misconceptions are mine. >> >> Signed-off-by: Luiz Capitulino >> --- >> include/linux/mm.h | 19 ++++++++++++ >> mm/debug.c | 42 +++---------------------- >> mm/util.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++ >> 3 files changed, 100 insertions(+), 38 deletions(-) >> >> diff --git a/include/linux/mm.h b/include/linux/mm.h >> index 0ef2ba0c667a..090968c6eebb 100644 >> --- a/include/linux/mm.h >> +++ b/include/linux/mm.h >> @@ -4184,4 +4184,23 @@ static inline bool page_pool_page_is_pp(struct page *page) >> } >> #endif >> >> +#define PAGE_SNAPSHOT_FAITHFUL (1 << 0) >> +#define PAGE_SNAPSHOT_PG_FREE (1 << 1) >> +#define PAGE_SNAPSHOT_PG_IDLE (1 << 2) >> + >> +struct page_snapshot { >> + struct folio folio_snapshot; >> + struct page page_snapshot; >> + unsigned long pfn; >> + unsigned long idx; >> + unsigned long flags; >> +}; >> + >> +static inline bool snapshot_page_is_faithful(const struct page_snapshot *ps) >> +{ >> + return ps->flags & PAGE_SNAPSHOT_FAITHFUL; >> +} >> + >> +void snapshot_page(struct page_snapshot *ps, const struct page *page); >> + >> #endif /* _LINUX_MM_H */ >> diff --git a/mm/debug.c b/mm/debug.c >> index 907382257062..7349330ea506 100644 >> --- a/mm/debug.c >> +++ b/mm/debug.c >> @@ -129,47 +129,13 @@ static void __dump_folio(struct folio *folio, struct page *page, >> >> static void __dump_page(const struct page *page) >> { >> - struct folio *foliop, folio; >> - struct page precise; >> - unsigned long head; >> - unsigned long pfn = page_to_pfn(page); >> - unsigned long idx, nr_pages = 1; >> - int loops = 5; >> - >> -again: >> - memcpy(&precise, page, sizeof(*page)); >> - head = precise.compound_head; >> - if ((head & 1) == 0) { >> - foliop = (struct folio *)&precise; >> - idx = 0; >> - if (!folio_test_large(foliop)) >> - goto dump; >> - foliop = (struct folio *)page; >> - } else { >> - foliop = (struct folio *)(head - 1); >> - idx = folio_page_idx(foliop, page); >> - } >> + struct page_snapshot ps; >> >> - if (idx < MAX_FOLIO_NR_PAGES) { >> - memcpy(&folio, foliop, 2 * sizeof(struct page)); >> - nr_pages = folio_nr_pages(&folio); >> - if (nr_pages > 1) >> - memcpy(&folio.__page_2, &foliop->__page_2, >> - sizeof(struct page)); >> - foliop = &folio; >> - } >> - >> - if (idx > nr_pages) { >> - if (loops-- > 0) >> - goto again; >> + snapshot_page(&ps, page); >> + if (!snapshot_page_is_faithful(&ps)) >> pr_warn("page does not match folio\n"); >> - precise.compound_head &= ~1UL; >> - foliop = (struct folio *)&precise; >> - idx = 0; >> - } >> >> -dump: >> - __dump_folio(foliop, &precise, pfn, idx); >> + __dump_folio(&ps.folio_snapshot, &ps.page_snapshot, ps.pfn, ps.idx); >> } >> >> void dump_page(const struct page *page, const char *reason) >> diff --git a/mm/util.c b/mm/util.c >> index 0b270c43d7d1..c38d213be83f 100644 >> --- a/mm/util.c >> +++ b/mm/util.c >> @@ -25,6 +25,7 @@ >> #include >> #include >> #include >> +#include >> >> #include >> >> @@ -1171,3 +1172,79 @@ int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma) >> return 0; >> } >> EXPORT_SYMBOL(compat_vma_mmap_prepare); >> + >> +static void set_flags(struct page_snapshot *ps, const struct folio *folio, >> + const struct page *page) >> +{ >> + /* >> + * Caveats on high order pages: PG_buddy and PG_slab will only be set >> + * on the head page. >> + */ >> + if (PageBuddy(page)) >> + ps->flags |= PAGE_SNAPSHOT_PG_FREE; >> + else if (page_count(page) == 0 && is_free_buddy_page(page)) >> + ps->flags |= PAGE_SNAPSHOT_PG_FREE; >> + >> + if (folio_test_idle(folio)) >> + ps->flags |= PAGE_SNAPSHOT_PG_IDLE; >> +} >> + >> +/* >> + * Create a snapshot of a page and store its struct page and struct folio >> + * representations in a struct page_snapshot. >> + * >> + * @ps: struct page_snapshot to store the page snapshot >> + * @page: the page we want to snapshot >> + * >> + * Note that creating a faithful snapshot of a page may fail if the page >> + * compound keeps changing (eg. due to folio split). In this case we set >> + * ps->faithful to false and the snapshot will assume that @page refers >> + * to a single page. >> + */ >> +void snapshot_page(struct page_snapshot *ps, const struct page *page) >> +{ >> + unsigned long head, nr_pages = 1; >> + struct folio *foliop, folio; >> + int loops = 5; >> + >> + ps->pfn = page_to_pfn(page); >> + ps->flags = PAGE_SNAPSHOT_FAITHFUL; >> + >> +again: >> + memcpy(&ps->page_snapshot, page, sizeof(*page)); >> + head = ps->page_snapshot.compound_head; >> + if ((head & 1) == 0) { >> + foliop = (struct folio *)&ps->page_snapshot; >> + ps->idx = 0; >> + if (!folio_test_large(foliop)) { >> + set_flags(ps, page_folio(page), page); >> + goto out; >> + } >> + foliop = (struct folio *)page; >> + } else { >> + foliop = (struct folio *)(page->compound_head - 1); > ^^^^ > should we use ps->page_snapshot here? > IIUC, page may change due to race. You're right the race exists, but we can't make foliop point to the snapshot because we do pointer arithmetic and other operations that need the real page memory address to work properly. >> + ps->idx = folio_page_idx(foliop, page); >> + } >> + >> + if (ps->idx < MAX_FOLIO_NR_PAGES) { >> + memcpy(&folio, foliop, 2 * sizeof(struct page)); >> + nr_pages = folio_nr_pages(&folio); >> + if (nr_pages > 1) >> + memcpy(&folio.__page_2, &foliop->__page_2, >> + sizeof(struct page)); >> + set_flags(ps, foliop, page); >> + foliop = &folio; >> + } >> + >> + if (ps->idx > nr_pages) { >> + if (loops-- > 0) >> + goto again; >> + ps->page_snapshot.compound_head &= ~1UL; > > Should we use clear_compound_head() for clarity? Yes, we can do that. Andrew, since you applied this series already, can you squash the change below into this patch? If not, I can send v3. diff --git a/mm/util.c b/mm/util.c index c38d213be83f..ce4200529772 100644 --- a/mm/util.c +++ b/mm/util.c @@ -1239,7 +1239,7 @@ void snapshot_page(struct page_snapshot *ps, const struct page *page) if (ps->idx > nr_pages) { if (loops-- > 0) goto again; - ps->page_snapshot.compound_head &= ~1UL; + clear_compound_head(&ps->page_snapshot); foliop = (struct folio *)&ps->page_snapshot; ps->flags = 0; ps->idx = 0;