linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Gladyshev Ilya <gladyshev.ilya1@h-partners.com>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Zi Yan <ziy@nvidia.com>,
	Harry Yoo <harry.yoo@oracle.com>,
	Matthew Wilcox <willy@infradead.org>, Yu Zhao <yuzhao@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Alistair Popple <apopple@nvidia.com>,
	Gorbunov Ivan <gorbunov.ivan@h-partners.com>,
	Muchun Song <muchun.song@linux.dev>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>,
	Kiryl Shutsemau <kirill@shutemov.name>,
	Linus Torvalds <torvalds@linuxfoundation.org>
Subject: Re: [PATCH 1/1] mm: implement page refcount locking via dedicated bit
Date: Fri, 6 Mar 2026 17:29:15 +0300	[thread overview]
Message-ID: <75aa8f34-4d91-4758-847d-248a5be3db62@h-partners.com> (raw)
In-Reply-To: <54cc6e6d-bda7-4c5b-a998-df854ead1985@kernel.org>

>>> For example, all pages we add to the page allocator through
>>> __free_pages_core().
>>
>> You are right that refcount = 0 is tricky. However, for a bad outcome
>> you will need:
>> 1. Some external reference to this page, through which you try to
>> increment the refcount;
>> 2. set_page_count(0) somewhere between freeing and "it is safe to alloc"
>> state.
> 
> That is way, way, way too dodgy.
> 
> What you should likely do is
> 
> (a) Make set_page_count(0) set it to PAGEREF_LOCKED_BIT
> (b) Make any places that might skip set_page_count(0) to use it
> (c) Document this all extremely thoroughly.
Okay, I agree. My reasoning was more about "it's not _that_ bad if we 
miss something outside the allocator" than "everything is fine as it 
is", but maybe it is really that bad...

> Alternatively, disallow set_page_count(0) ccompletely and add something
> like "initialize_page_as_frozen()" or sth. like that.

(Below are my thoughts, not active proposal)

Actually, we can keep "zero is locked" property if we change scheme a 
little:

If we invert locked bit so that 0 means locked, then we don't care about 
memset() / GFP_ZERO / set_page(0). But then there is extra bit on each 
active refcount, and (un)masking will be required on each refcount 
read/set/etc operation.

We can place locked bit in the highest bit like now, or move it to the 
lowest. This doesn't make any difference except for how masking is 
performed (and or shift). I can't really tell which approach is faster 
(and more optimizable after inline).

This approach doesn't look appealing to me, mostly because of redundant 
masking. So for now I will try to add proper initialization / 
documentation in all risky places.
>> So adding new pages with zeroed refcount to allocator is safe, because
>> there are no external references. Zeroing tail page's refcount is safe,
>> unless someone actually tries to increment its refcount (and this is bug).
>>
>> Generally, the only unsafe set_page_count() (or any other zeroing) will
>> be in allocator itself between freeing and allocating. Or maybe I missed
>> something, and this approach is indeed incorrect
>>
>> Probably we can think of some debug checks to prevent bugs in "safe"
>> scenarious
> 
> Just take a look at do_migrate_range(), where we do a
> 
> page = pfn_to_page(pfn);
> folio = page_folio(page);
> 
> if (!folio_try_get(folio))
> 	continue;
> 
> If that is a page that was added to the buddy (set_page_count(0)) but
> either (a) not allocated yet or (b) allocated as frozen that's a problem.
> 
> We really rely on
> 
> (1) Frozen pages
> (2) Buddy pages
> (3) Tail pages (compound or non-compoud)
 >
> To have a refcount of 0 that *reliably* makes folio_try_get() etc. fail.
> 
> Anything else is just playing with fire.
>


  reply	other threads:[~2026-03-06 14:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-26 16:27 [PATCH 0/1] mm: improve folio refcount scalability Gladyshev Ilya
2026-02-26 16:27 ` [PATCH 1/1] mm: implement page refcount locking via dedicated bit Gladyshev Ilya
2026-03-04 19:16   ` David Hildenbrand (Arm)
2026-03-05  8:10     ` David Hildenbrand (Arm)
2026-03-06 11:50       ` Gladyshev Ilya
2026-03-06 13:10         ` David Hildenbrand (Arm)
2026-03-06 14:29           ` Gladyshev Ilya [this message]
2026-02-28 22:19 ` [PATCH 0/1] mm: improve folio refcount scalability Andrew Morton
2026-03-01  3:27   ` Linus Torvalds
2026-03-01 18:52     ` Linus Torvalds
2026-03-01 20:26       ` Pedro Falcato
2026-03-01 21:16         ` Linus Torvalds
2026-03-04 17:34           ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=75aa8f34-4d91-4758-847d-248a5be3db62@h-partners.com \
    --to=gladyshev.ilya1@h-partners.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=david@kernel.org \
    --cc=gorbunov.ivan@h-partners.com \
    --cc=harry.yoo@oracle.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=rppt@kernel.org \
    --cc=surenb@google.com \
    --cc=torvalds@linuxfoundation.org \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=yuzhao@google.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox