From: Luiz Capitulino
To: David Hildenbrand, willy@infradead.org
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lcapitulino@gmail.com, shivankg@amd.com
Subject: Re: [PATCH 3/3] fs: stable_page_flags(): use snapshot_page()
Date: Wed, 2 Jul 2025 13:36:23 -0400
Message-ID: <0021e29e-381d-4f51-93ac-4f6faf0f2ff2@redhat.com>

On 2025-07-01 14:44, David Hildenbrand wrote:
> On 26.06.25 20:16, Luiz Capitulino wrote:
>> A race condition is possible in stable_page_flags() where user-space is
>> reading /proc/kpageflags concurrently to a folio split. This may lead to
>> oopses or BUG_ON()s being triggered.
>>
>> To fix this, this commit uses snapshot_page() in stable_page_flags() so
>> that stable_page_flags() works with a stable page and folio snapshots
>> instead.
>>
>> Note that stable_page_flags() makes use of some functions that require
>> the original page or folio pointer to work properly (eg.
>> is_free_buddy_page() and folio_test_idle()). Since those functions can't
>> be used on the page snapshot, we replace their usage with flags that
>> were set by snapshot_page() for this purpose.
>>
>> Signed-off-by: Luiz Capitulino
>> ---
>>   fs/proc/page.c | 25 ++++++++++++++-----------
>>   1 file changed, 14 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/proc/page.c b/fs/proc/page.c
>> index 936f8bbe5a6f..a2ee95f727f0 100644
>> --- a/fs/proc/page.c
>> +++ b/fs/proc/page.c
>> @@ -147,6 +147,7 @@ static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
>>   u64 stable_page_flags(const struct page *page)
>>   {
>>       const struct folio *folio;
>> +    struct page_snapshot ps;
>>       unsigned long k;
>>       unsigned long mapping;
>>       bool is_anon;
>> @@ -158,7 +159,9 @@ u64 stable_page_flags(const struct page *page)
>>        */
>>       if (!page)
>>           return 1 << KPF_NOPAGE;
>> -    folio = page_folio(page);
>> +
>> +    snapshot_page(&ps, page);
>> +    folio = &ps.folio_snapshot;
>>       k = folio->flags;
>>       mapping = (unsigned long)folio->mapping;
>> @@ -167,7 +170,7 @@ u64 stable_page_flags(const struct page *page)
>>       /*
>>        * pseudo flags for the well known (anonymous) memory mapped pages
>>        */
>> -    if (page_mapped(page))
>> +    if (folio_mapped(folio))
>>           u |= 1 << KPF_MMAP;
>>       if (is_anon) {
>>           u |= 1 << KPF_ANON;
>> @@ -179,7 +182,7 @@ u64 stable_page_flags(const struct page *page)
>>        * compound pages: export both head/tail info
>>        * they together define a compound page's start/end pos and order
>>        */
>> -    if (page == &folio->page)
>> +    if (ps.idx == 0)
>>           u |= kpf_copy_bit(k, KPF_COMPOUND_HEAD, PG_head);
>>       else
>>           u |= 1 << KPF_COMPOUND_TAIL;
>> @@ -189,10 +192,10 @@ u64 stable_page_flags(const struct page *page)
>>                folio_test_large_rmappable(folio)) {
>>           /* Note: we indicate any THPs here, not just PMD-sized ones */
>>           u |= 1 << KPF_THP;
>> -    } else if (is_huge_zero_folio(folio)) {
>> +    } else if (ps.flags & PAGE_SNAPSHOT_PG_HUGE_ZERO) {
>
> For that, we could use
>
> is_huge_zero_pfn(ps.pfn)
>
> from
>
> https://lkml.kernel.org/r/20250617154345.2494405-10-david@redhat.com
>
>
> You should be able to cherry pick that commit (only minor conflict in vm_normal_page_pmd()) and include it in this series.

OK, will do.

>
>>           u |= 1 << KPF_ZERO_PAGE;
>>           u |= 1 << KPF_THP;
>> -    } else if (is_zero_folio(folio)) {
>> +    } else if (is_zero_pfn(ps.pfn)) {
>>           u |= 1 << KPF_ZERO_PAGE;
>>       }
>> @@ -200,14 +203,14 @@ u64 stable_page_flags(const struct page *page)
>>        * Caveats on high order pages: PG_buddy and PG_slab will only be set
>>        * on the head page.
>>        */
>> -    if (PageBuddy(page))
>> +    if (PageBuddy(&ps.page_snapshot))
>>           u |= 1 << KPF_BUDDY;
>> -    else if (page_count(page) == 0 && is_free_buddy_page(page))
>> +    else if (ps.flags & PAGE_SNAPSHOT_PG_FREE)
>
> Yeah, that is nasty, and inherently racy. So detecting it at snapshot time might be best.
>
> Which makes me wonder if this whole block should simply be
>
> if (ps.flags & PAGE_SNAPSHOT_PG_BUDDY)
>     u |= 1 << KPF_BUDDY;
>
> and you move all buddy detection into the snapshotting function. That is, PAGE_SNAPSHOT_PG_BUDDY gets set for head and tail pages of buddy pages.
>
> Looks less special that way ;)

I can do this too.

>
>>           u |= 1 << KPF_BUDDY;
>> -    if (PageOffline(page))
>> +    if (folio_test_offline(folio))
>>           u |= 1 << KPF_OFFLINE;
>> -    if (PageTable(page))
>> +    if (folio_test_pgtable(folio))
>>           u |= 1 << KPF_PGTABLE;
>
> I assume, long-term none of these will actually be folios. But we can change that once we get to it.
>
> (likely, going back to pages ... just like for the slab case below)
>
>>       if (folio_test_slab(folio))
>>           u |= 1 << KPF_SLAB;
>> @@ -215,7 +218,7 @@ u64 stable_page_flags(const struct page *page)
>>   #if defined(CONFIG_PAGE_IDLE_FLAG) && defined(CONFIG_64BIT)
>>       u |= kpf_copy_bit(k, KPF_IDLE,          PG_idle);
>>   #else
>> -    if (folio_test_idle(folio))
>> +    if (ps.flags & PAGE_SNAPSHOT_PG_IDLE)
>>           u |= 1 << KPF_IDLE;
>
> Another nasty 32bit case. At least once we decouple pages from folios,
> the whole test-idle in page-ext will vanish and this will get cleaned up.

Thanks for the review!
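
For reference, the shape suggested in the review (all buddy detection done at snapshot time, with the consumer testing a single flag) could look roughly like the user-space model below. This is only a sketch under assumptions, not the kernel code: struct live_page_model, struct page_snapshot_model, snapshot_model() and kpf_from_snapshot() are hypothetical stand-ins, and the numeric values of the PAGE_SNAPSHOT_PG_* flags are made up for the example. Only the KPF_BUDDY and KPF_IDLE bit numbers are taken from the uapi kernel-page-flags.h header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers exported through /proc/kpageflags (uapi kernel-page-flags.h). */
#define KPF_BUDDY 10
#define KPF_IDLE  25

/* Flag names follow the thread; the values are assumptions for this model. */
#define PAGE_SNAPSHOT_PG_BUDDY (1u << 0)
#define PAGE_SNAPSHOT_PG_IDLE  (1u << 1)

/* Hypothetical stand-in for the live state that only the original page
 * pointer can answer (PageBuddy(), is_free_buddy_page(), folio_test_idle()). */
struct live_page_model {
	bool pg_buddy;            /* PG_buddy set on this (head) page */
	bool in_free_buddy_block; /* tail page inside a free buddy block */
	bool idle;                /* idle state, e.g. kept in page_ext on 32-bit */
};

/* Hypothetical, heavily reduced stand-in for struct page_snapshot. */
struct page_snapshot_model {
	unsigned long pfn;
	unsigned int flags;
};

/* Sample everything that needs the live page once, at snapshot time. */
static void snapshot_model(struct page_snapshot_model *ps,
			   const struct live_page_model *page,
			   unsigned long pfn)
{
	assert(ps && page);

	ps->pfn = pfn;
	ps->flags = 0;
	/* Head and tail pages of a free buddy block get the same flag. */
	if (page->pg_buddy || page->in_free_buddy_block)
		ps->flags |= PAGE_SNAPSHOT_PG_BUDDY;
	if (page->idle)
		ps->flags |= PAGE_SNAPSHOT_PG_IDLE;
}

/* The consumer only looks at the snapshot, never at the live page. */
static uint64_t kpf_from_snapshot(const struct page_snapshot_model *ps)
{
	uint64_t u = 0;

	if (ps->flags & PAGE_SNAPSHOT_PG_BUDDY)
		u |= UINT64_C(1) << KPF_BUDDY;
	if (ps->flags & PAGE_SNAPSHOT_PG_IDLE)
		u |= UINT64_C(1) << KPF_IDLE;
	return u;
}
```

In this model a tail page of a free buddy block and a head page with PG_buddy set both end up reporting KPF_BUDDY, which matches the "head and tail pages of buddy pages" behaviour described above. How the real snapshot_page() would detect the tail case without racing is not covered here.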