linux-mm.kvack.org archive mirror
From: Luiz Capitulino <luizcap@redhat.com>
To: David Hildenbrand <david@redhat.com>, willy@infradead.org
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, lcapitulino@gmail.com, shivankg@amd.com
Subject: Re: [PATCH 3/3] fs: stable_page_flags(): use snapshot_page()
Date: Wed, 2 Jul 2025 13:36:23 -0400
Message-ID: <0021e29e-381d-4f51-93ac-4f6faf0f2ff2@redhat.com>
In-Reply-To: <fa3a6258-5304-40b2-bb58-e6081ed845d1@redhat.com>

On 2025-07-01 14:44, David Hildenbrand wrote:
> On 26.06.25 20:16, Luiz Capitulino wrote:
>> A race condition is possible in stable_page_flags() where user-space is
>> reading /proc/kpageflags concurrently to a folio split. This may lead to
>> oopses or BUG_ON()s being triggered.
>>
>> To fix this, this commit uses snapshot_page() in stable_page_flags() so
>> that stable_page_flags() works with a stable page and folio snapshots
>> instead.
>>
>> Note that stable_page_flags() makes use of some functions that require
>> the original page or folio pointer to work properly (e.g.
>> is_free_buddy_page() and folio_test_idle()). Since those functions can't
>> be used on the page snapshot, we replace their usage with flags that
>> were set by snapshot_page() for this purpose.
>>
>> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
>> ---
>>   fs/proc/page.c | 25 ++++++++++++++-----------
>>   1 file changed, 14 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/proc/page.c b/fs/proc/page.c
>> index 936f8bbe5a6f..a2ee95f727f0 100644
>> --- a/fs/proc/page.c
>> +++ b/fs/proc/page.c
>> @@ -147,6 +147,7 @@ static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
>>   u64 stable_page_flags(const struct page *page)
>>   {
>>       const struct folio *folio;
>> +    struct page_snapshot ps;
>>       unsigned long k;
>>       unsigned long mapping;
>>       bool is_anon;
>> @@ -158,7 +159,9 @@ u64 stable_page_flags(const struct page *page)
>>        */
>>       if (!page)
>>           return 1 << KPF_NOPAGE;
>> -    folio = page_folio(page);
>> +
>> +    snapshot_page(&ps, page);
>> +    folio = &ps.folio_snapshot;
>>       k = folio->flags;
>>       mapping = (unsigned long)folio->mapping;
>> @@ -167,7 +170,7 @@ u64 stable_page_flags(const struct page *page)
>>       /*
>>        * pseudo flags for the well known (anonymous) memory mapped pages
>>        */
>> -    if (page_mapped(page))
>> +    if (folio_mapped(folio))
>>           u |= 1 << KPF_MMAP;
>>       if (is_anon) {
>>           u |= 1 << KPF_ANON;
>> @@ -179,7 +182,7 @@ u64 stable_page_flags(const struct page *page)
>>        * compound pages: export both head/tail info
>>        * they together define a compound page's start/end pos and order
>>        */
>> -    if (page == &folio->page)
>> +    if (ps.idx == 0)
>>           u |= kpf_copy_bit(k, KPF_COMPOUND_HEAD, PG_head);
>>       else
>>           u |= 1 << KPF_COMPOUND_TAIL;
>> @@ -189,10 +192,10 @@ u64 stable_page_flags(const struct page *page)
>>                folio_test_large_rmappable(folio)) {
>>           /* Note: we indicate any THPs here, not just PMD-sized ones */
>>           u |= 1 << KPF_THP;
>> -    } else if (is_huge_zero_folio(folio)) {
>> +    } else if (ps.flags & PAGE_SNAPSHOT_PG_HUGE_ZERO) {
> 
> For that, we could use
> 
> is_huge_zero_pfn(ps.pfn)
> 
> from
> 
> https://lkml.kernel.org/r/20250617154345.2494405-10-david@redhat.com
> 
> 
> You should be able to cherry pick that commit (only minor conflict in vm_normal_page_pmd()) and include it in this series.

OK, will do.

> 
>>           u |= 1 << KPF_ZERO_PAGE;
>>           u |= 1 << KPF_THP;
>> -    } else if (is_zero_folio(folio)) {
>> +    } else if (is_zero_pfn(ps.pfn)) {
>>           u |= 1 << KPF_ZERO_PAGE;
>>       }
>> @@ -200,14 +203,14 @@ u64 stable_page_flags(const struct page *page)
>>        * Caveats on high order pages: PG_buddy and PG_slab will only be set
>>        * on the head page.
>>        */
>> -    if (PageBuddy(page))
>> +    if (PageBuddy(&ps.page_snapshot))
>>           u |= 1 << KPF_BUDDY;
>> -    else if (page_count(page) == 0 && is_free_buddy_page(page))
>> +    else if (ps.flags & PAGE_SNAPSHOT_PG_FREE)
> 
> Yeah, that is nasty, and inherently racy. So detecting it at snapshot time might be best.
> 
> Which makes me wonder if this whole block should simply be
> 
> if (ps.flags & PAGE_SNAPSHOT_PG_BUDDY)
>      u |= 1 << KPF_BUDDY;
> 
> and you move all buddy detection into the snapshotting function. That is, PAGE_SNAPSHOT_PG_BUDDY gets set for head and tail pages of buddy pages.
> 
> Looks less special that way ;)

I can do this too.
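
To make sure I understand the suggestion, here is a rough user-space model of it, not kernel code. Everything in it is an assumption for illustration only: struct page_snapshot_model, snapshot_model(), buddy_flags_model() and the bit position chosen for PAGE_SNAPSHOT_PG_BUDDY are hypothetical, and KPF_BUDDY is given the value 10 that I believe the uapi header uses. The idea is that the snapshotting side decides "buddy" once, for head and tail pages alike, and stable_page_flags() only tests one flag.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions would live in the kernel. */
#define PAGE_SNAPSHOT_PG_BUDDY (1UL << 0) /* assumed bit position */
#define KPF_BUDDY 10                      /* assumed to match the uapi header */

/* Model of the part of the snapshot that carries pre-computed flags. */
struct page_snapshot_model {
	unsigned long flags; /* PAGE_SNAPSHOT_* bits filled at snapshot time */
};

/*
 * Snapshot side: all buddy detection happens here, once.
 * is_buddy_head models PageBuddy() on the page itself;
 * is_free_tail models a tail page that is part of a free buddy block.
 */
static void snapshot_model(struct page_snapshot_model *ps,
			   int is_buddy_head, int is_free_tail)
{
	ps->flags = 0;
	if (is_buddy_head || is_free_tail)
		ps->flags |= PAGE_SNAPSHOT_PG_BUDDY;
}

/* Consumer side: a single test, with no special case for tail pages. */
static uint64_t buddy_flags_model(const struct page_snapshot_model *ps)
{
	uint64_t u = 0;

	if (ps->flags & PAGE_SNAPSHOT_PG_BUDDY)
		u |= 1ULL << KPF_BUDDY;
	return u;
}
```

With this shape, the PageBuddy() check and the page_count()/is_free_buddy_page() fallback would both disappear from stable_page_flags(), and any racy detection would be confined to the snapshot function.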

> 
>>           u |= 1 << KPF_BUDDY;
>> -    if (PageOffline(page))
>> +    if (folio_test_offline(folio))
>>           u |= 1 << KPF_OFFLINE;
>> -    if (PageTable(page))
>> +    if (folio_test_pgtable(folio))
>>           u |= 1 << KPF_PGTABLE;
> 
> I assume, long-term none of these will actually be folios. But we can change that once we get to it.
> 
> (likely, going back to pages ... just like for the slab case below)
> 
>>       if (folio_test_slab(folio))
>>           u |= 1 << KPF_SLAB;
>> @@ -215,7 +218,7 @@ u64 stable_page_flags(const struct page *page)
>>   #if defined(CONFIG_PAGE_IDLE_FLAG) && defined(CONFIG_64BIT)
>>       u |= kpf_copy_bit(k, KPF_IDLE,          PG_idle);
>>   #else
>> -    if (folio_test_idle(folio))
>> +    if (ps.flags & PAGE_SNAPSHOT_PG_IDLE)
>>           u |= 1 << KPF_IDLE;
> 
> Another nasty 32-bit case. At least once we decouple pages from folios,
> the whole test-idle in page-ext will vanish and this will get cleaned up.

Thanks for the review!


