From: Zi Yan <ziy@nvidia.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Matthew Wilcox" <willy@infradead.org>,
"Balbir Singh" <balbirs@nvidia.com>,
"Francois Dugast" <francois.dugast@intel.com>,
intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
"Matthew Brost" <matthew.brost@intel.com>,
"Madhavan Srinivasan" <maddy@linux.ibm.com>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
"Felix Kuehling" <Felix.Kuehling@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"Lyude Paul" <lyude@redhat.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Logan Gunthorpe" <logang@deltatee.com>,
"David Hildenbrand" <david@kernel.org>,
"Oscar Salvador" <osalvador@suse.de>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Leon Romanovsky" <leon@kernel.org>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Mike Rapoport" <rppt@kernel.org>,
"Suren Baghdasaryan" <surenb@google.com>,
"Michal Hocko" <mhocko@suse.com>,
"Alistair Popple" <apopple@nvidia.com>,
linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org,
nouveau@lists.freedesktop.org, linux-pci@vger.kernel.org,
linux-mm@kvack.org, linux-cxl@vger.kernel.org
Subject: Re: [PATCH v4 1/7] mm/zone_device: Add order argument to folio_free callback
Date: Mon, 12 Jan 2026 13:55:18 -0500
Message-ID: <6AFCEB51-8EE1-4AC9-8F39-FCA561BE8CB5@nvidia.com>
In-Reply-To: <20260112182500.GI745888@ziepe.ca>
On 12 Jan 2026, at 13:25, Jason Gunthorpe wrote:
> On Mon, Jan 12, 2026 at 12:46:57PM -0500, Zi Yan wrote:
>> On 12 Jan 2026, at 11:50, Jason Gunthorpe wrote:
>>
>>> On Mon, Jan 12, 2026 at 11:31:04AM -0500, Zi Yan wrote:
>>>>> folio_free()
>>>>>
>>>>> 1) Allocator finds free memory
>>>>> 2) zone_device_page_init() allocates the memory and makes refcount=1
>>>>> 3) __folio_put() knows the refcount is 0.
>>>>> 4) free_zone_device_folio() calls folio_free(), but it doesn't
>>>>> actually need to undo prep_compound_page() because *NOTHING* can
>>>>> use the page pointer at this point.
>>>>> 5) Driver puts the memory back into the allocator and now #1 can
>>>>> happen. It knows how much memory to put back because folio->order
>>>>> is valid from #2
>>>>> 6) #1 happens again, then #2 happens again and the folio is in the
>>>>> right state for use. The successor #2 fully undoes the work of the
>>>>> predecessor #2.
>>>>
>>>> But how can a successor #2 undo the work if the second #1 only allocates
>>>> half of the original folio? For example, an order-9 at PFN 0 is
>>>> allocated and freed, then an order-8 at PFN 0 is allocated and another
>>>> order-8 at PFN 256 is allocated. How can two #2s undo the same order-9
>>>> without corrupting each other’s data?
>>>
>>> What do you mean? The fundamental rule is you can't read the folio or
>>> the order outside folio_free once its refcount reaches 0.
>>
>> There is no such rule. In core MM, folio_split(), which splits a
>> high-order folio into lower-order ones, freezes the folio (setting
>> its refcount to 0) and manipulates the folio order and every tail
>> page's compound_head to restructure the folio.
>
> That's different, I am talking about reaching 0 because it has been
> freed, meaning there are no external pointers to it.
>
> Further, when a page is frozen page_ref_freeze() takes in the number
> of references the caller has ownership over and it doesn't succeed if
> there are stray references elsewhere.
>
> This is very important because the entire operating model of split
> only works if it has exclusive locks over all the valid pointers into
> that page.
>
> Spurious refcount failures concurrent with split cannot be allowed.
>
> I don't see how pointing at __folio_freeze_and_split_unmapped() can
> justify this series.
>
But to anyone looking only at the folio state (refcount == 0,
compound_head set), a frozen folio and a freed folio are
indistinguishable.

If what you said is true, why is free_pages_prepare() needed? No one
should touch these free pages, so why bother resetting this state?
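
To make the ambiguity concrete, here is a minimal userspace sketch in
C11. It is not kernel code: struct toy_page and the toy_*() helpers are
hypothetical stand-ins, and toy_ref_freeze() only mimics the
compare-and-swap-to-zero behavior of page_ref_freeze(). It assumes a
free path that leaves compound_head set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel layout. */
struct toy_page {
	atomic_int refcount;
	struct toy_page *compound_head;	/* non-NULL: part of a compound page */
};

/* Modeled on page_ref_freeze(): succeeds only if refcount == expected. */
static bool toy_ref_freeze(struct toy_page *p, int expected)
{
	return atomic_compare_exchange_strong(&p->refcount, &expected, 0);
}

/* Drop one reference; returns true if it was the last one. */
static bool toy_put(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	return old == 1;
}

/* Everything an outside observer can compare in this toy model. */
static bool toy_same_visible_state(struct toy_page *a, struct toy_page *b)
{
	return atomic_load(&a->refcount) == atomic_load(&b->refcount) &&
	       (a->compound_head != NULL) == (b->compound_head != NULL);
}

/* Returns true when a frozen page and a freed page look the same. */
static bool toy_frozen_vs_freed_ambiguous(void)
{
	struct toy_page frozen = { .compound_head = &frozen };
	struct toy_page freed = { .compound_head = &freed };

	atomic_init(&frozen.refcount, 1);
	atomic_init(&freed.refcount, 1);

	/* Split path: the owner freezes its single reference. */
	bool froze = toy_ref_freeze(&frozen, 1);
	/* Free path: last reference dropped, compound_head NOT cleared. */
	bool last = toy_put(&freed);

	return froze && last && toy_same_visible_state(&frozen, &freed);
}
```

In this model toy_frozen_vs_freed_ambiguous() returns true: both pages
end with refcount 0 and compound_head set, so the state alone does not
say which path was taken.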
>> Your fundamental rule breaks this. Allowing compound information
>> to stay after a folio is freed means you cannot tell whether a folio
>> is under split or freed.
>
> You can't refcount a folio out of nothing. It has to come from a
> memory location that already is holding a refcount, and then you can
> incr it.
Right, but there is also no guarantee that all code is correct and
follows this rule.
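
One defensive pattern for code that cannot prove it already holds a
reference is to take one only while the count is non-zero, in the
spirit of get_page_unless_zero(). A minimal userspace sketch, assuming
C11 atomics; struct toy_ref and both helpers are hypothetical names,
not kernel APIs:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy refcount; stands in for a page's reference count. */
struct toy_ref {
	atomic_int count;
};

/* Take a reference only if the object is still live (count != 0). */
static bool toy_get_unless_zero(struct toy_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;	/* already at 0: do not resurrect it */
}

/* Drop a reference; returns true if this was the last one. */
static bool toy_put(struct toy_ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	return old == 1;
}
```

A caller that gets false back must treat the object as gone and must
not read any of its fields.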
My point here is that calling prep_compound_page() on a page that is
already compound does not follow core MM’s conventions.
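
As an illustration of the concern, here is a toy userspace model of the
order-9 then order-8 case from earlier in this thread. It is only a
sketch: struct toy_page, toy_prep_compound() and toy_reset() are
hypothetical and track just a head index and an order per page, which
is far less than the real prep_compound_page() sets up.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_PAGES 512	/* PFNs 0..511, one order-9 range */

/* Toy per-page metadata; NOT the kernel's struct page. */
struct toy_page {
	int head;	/* PFN of the head page, or -1 if not compound */
	int order;	/* meaningful on head pages only */
};

static struct toy_page toy_pages[TOY_NR_PAGES];

/* Clear compound state for 2^order pages starting at pfn. */
static void toy_reset(int pfn, int order)
{
	for (int i = 0; i < (1 << order); i++) {
		toy_pages[pfn + i].head = -1;
		toy_pages[pfn + i].order = 0;
	}
}

/* Set up compound state; touches only its own 2^order pages. */
static void toy_prep_compound(int pfn, int order)
{
	assert(pfn >= 0 && pfn + (1 << order) <= TOY_NR_PAGES);
	toy_pages[pfn].order = order;
	for (int i = 0; i < (1 << order); i++)
		toy_pages[pfn + i].head = pfn;
}

/* Count pages naming a head whose current order no longer covers them. */
static int toy_count_stale_tails(void)
{
	int stale = 0;

	for (int pfn = 0; pfn < TOY_NR_PAGES; pfn++) {
		int head = toy_pages[pfn].head;

		if (head >= 0 && pfn >= head + (1 << toy_pages[head].order))
			stale++;
	}
	return stale;
}

/* Order-9 at PFN 0 is freed, then only an order-8 at PFN 0 is prepared. */
static int toy_stale_after_partial_realloc(bool reset_on_free)
{
	toy_reset(0, 9);		/* start from clean metadata */
	toy_prep_compound(0, 9);	/* order-9 folio at PFN 0 */
	if (reset_on_free)
		toy_reset(0, 9);	/* free path clears compound state */
	toy_prep_compound(0, 8);	/* only the first half is reallocated */
	return toy_count_stale_tails();
}
```

With reset_on_free == false the function reports 256 stale tails: PFNs
256..511 still name PFN 0 as their head although that head now says
order 8. With reset_on_free == true it reports 0. In this model the
stale entries also disappear once a second order-8 is prepared at
PFN 256, so the inconsistency exists only while part of the old range
is not yet reallocated.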
Best Regards,
Yan, Zi