From: Matthew Brost <matthew.brost@intel.com>
To: Zi Yan <ziy@nvidia.com>
Cc: "Jason Gunthorpe" <jgg@nvidia.com>,
"Balbir Singh" <balbirs@nvidia.com>,
"Matthew Wilcox" <willy@infradead.org>,
"Alistair Popple" <apopple@nvidia.com>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Francois Dugast" <francois.dugast@intel.com>,
intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
"adhavan Srinivasan" <maddy@linux.ibm.com>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
"Felix Kuehling" <Felix.Kuehling@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"Lyude Paul" <lyude@redhat.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"David Hildenbrand" <david@kernel.org>,
"Oscar Salvador" <osalvador@suse.de>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Leon Romanovsky" <leon@kernel.org>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
"Mike Rapoport" <rppt@kernel.org>,
"Suren Baghdasaryan" <surenb@google.com>,
"Michal Hocko" <mhocko@suse.com>,
linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org,
nouveau@lists.freedesktop.org, linux-mm@kvack.org,
linux-cxl@vger.kernel.org
Subject: Re: [PATCH v6 1/5] mm/zone_device: Reinitialize large zone device private folios
Date: Wed, 21 Jan 2026 23:19:45 -0800 [thread overview]
Message-ID: <aXHPkQfwhMHU/oP6@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <F7E3DF24-A37B-40A0-A507-CEF4AB76C44D@nvidia.com>
On Tue, Jan 20, 2026 at 10:01:18PM -0500, Zi Yan wrote:
> On 20 Jan 2026, at 8:53, Jason Gunthorpe wrote:
>
This whole thread makes my head hurt, as does core MM.
IMO the TL;DR is:
- Why is Intel the only one proving this stuff works? We can debate all
day about what should or should not work — but someone else needs to
actually prove it.i, rather than type hypotheticals.
- Intel has demonstrated that this works and is still getting blocked.
- This entire thread is about a fixes patch for large device pages.
Changing prep_compound_page is completely out of scope for a fixes
patch, and honestly so is most of the rest of what’s being proposed.
- At a minimum, you must clear every page’s flags in the loop. So why not
conservatively clear anything else a folio might have set before calling
an existing core-MM function, ensuring the pages are in a known state?
This is a fixes patch.
- Given the current state of the discussion, I don’t think large device
pages should be in 6.19. And if so, why didn’t the entire device pages
series receive this level of scrutiny earlier? It’s my mistake for not
saying “no” until the reallocation at different sizes issue was resolved.
@Andrew. - I'd revert large device pages in 6.19 as it doesn't work and
we seemly cannot close on this.
Matt
> > On Mon, Jan 19, 2026 at 09:50:16PM -0500, Zi Yan wrote:
> >>>> I suppose we want some prep_single_page(page) and some reorg to share
> >>>> code with the other prep function.
> >>
> >> This is just an unnecessary need due to lack of knowledge of/do not want
> >> to investigate core MM page and folio initialization code.
> >
> > It will be better to keep this related code together, not spread all
> > around.
>
> Or clarify what code is for preparing pages, which would go away at memdesc
> time, and what code is for preparing folios, which would stay.
>
> >
> >>>> I don't think so. It should do the above job efficiently and iterate
> >>>> over the page list exactly once.
> >>
> >> folio initialization should not iterate over any page list, since folio is
> >> supposed to be treated as a whole instead of individual pages.
> >
> > The tail pages need to have the right data in them or compound_head
> > won't work.
>
> That is done by set_compound_head() in prep_compound_tail().
> prep_compound_page() take cares of it. As long as it is called, even if
> the pages in that compound page have random states before, the compound
> page should function correctly afterwards.
>
> >
> >> folio->mapping = NULL;
> >> folio->memcg_data = 0;
> >> folio->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
> >>
> >> should be enough.
> >
> > This seems believable to me for setting up an order 0 page.
>
> It works for any folio, regardless of its order. fields used in second
> or third subpages are all taken care of by prep_compound_page().
>
> >
> >> if (order)
> >> folio_set_large_rmappable(folio);
> >
> > That one is in zone_device_folio_init()
>
> Yes. And the code location looks right to me.
>
> >
> > And maybe the naming has got really confused if we have both functions
> > now :\
>
> Yes. One of the issues is that device private code used to only handles
> order-0 pages and was converted to use high order folio directly without
> using high order page (namely compound page) as an intermediate step.
> This two-step-in-one caused confusion. But the key thing to avoid the
> confusion is that to form a high order folio, a list of contiguous pages
> would become a compound page by calling prep_compound_page(), then
> the compound page becomes a folio by calling folio_set_large_rmappable().
>
> BTW, the code in prep_compound_head() after folio_set_order(folio, order)
> should belong to folio_set_large_rmappable() and they are causing confusion,
> since they are only applicable to rmappable large folios. I am going to
> send a patch to fix it.
>
>
> Best Regards,
> Yan, Zi
next prev parent reply other threads:[~2026-01-22 7:20 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-16 11:10 [PATCH v6 0/5] Enable THP support in drm_pagemap Francois Dugast
2026-01-16 11:10 ` [PATCH v6 1/5] mm/zone_device: Reinitialize large zone device private folios Francois Dugast
2026-01-16 13:10 ` Balbir Singh
2026-01-16 16:07 ` Vlastimil Babka
2026-01-16 17:20 ` Jason Gunthorpe
2026-01-16 17:27 ` Vlastimil Babka
2026-01-22 8:02 ` Vlastimil Babka
2026-01-16 17:49 ` Jason Gunthorpe
2026-01-16 19:17 ` Vlastimil Babka
2026-01-16 20:31 ` Matthew Brost
2026-01-17 0:51 ` Jason Gunthorpe
2026-01-17 3:55 ` Matthew Brost
2026-01-17 4:42 ` Balbir Singh
2026-01-17 5:27 ` Matthew Brost
2026-01-19 5:59 ` Alistair Popple
2026-01-19 14:20 ` Jason Gunthorpe
2026-01-19 20:09 ` Zi Yan
2026-01-19 20:35 ` Jason Gunthorpe
2026-01-19 22:15 ` Balbir Singh
2026-01-20 2:50 ` Zi Yan
2026-01-20 13:53 ` Jason Gunthorpe
2026-01-21 3:01 ` Zi Yan
2026-01-22 7:19 ` Matthew Brost [this message]
2026-01-22 8:00 ` Vlastimil Babka
2026-01-22 9:10 ` Balbir Singh
2026-01-22 21:41 ` Andrew Morton
2026-01-22 22:53 ` Alistair Popple
2026-01-23 6:45 ` Vlastimil Babka
2026-01-22 14:29 ` Jason Gunthorpe
2026-01-22 15:46 ` Jason Gunthorpe
2026-01-23 2:41 ` Zi Yan
2026-01-23 14:19 ` Jason Gunthorpe
2026-01-21 3:51 ` Balbir Singh
2026-01-17 0:19 ` Jason Gunthorpe
2026-01-19 5:41 ` Alistair Popple
2026-01-19 14:24 ` Jason Gunthorpe
2026-01-16 22:34 ` Andrew Morton
2026-01-16 22:36 ` Matthew Brost
2026-01-16 11:10 ` [PATCH v6 2/5] drm/pagemap: Unlock and put folios when possible Francois Dugast
2026-01-16 11:10 ` [PATCH v6 3/5] drm/pagemap: Add helper to access zone_device_data Francois Dugast
2026-01-16 11:10 ` [PATCH v6 4/5] drm/pagemap: Correct cpages calculation for migrate_vma_setup Francois Dugast
2026-01-16 11:37 ` Balbir Singh
2026-01-16 12:02 ` Francois Dugast
2026-01-16 11:10 ` [PATCH v6 5/5] drm/pagemap: Enable THP support for GPU memory migration Francois Dugast
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aXHPkQfwhMHU/oP6@lstrano-desk.jf.intel.com \
--to=matthew.brost@intel.com \
--cc=Felix.Kuehling@amd.com \
--cc=Liam.Howlett@oracle.com \
--cc=airlied@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=alexander.deucher@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=apopple@nvidia.com \
--cc=balbirs@nvidia.com \
--cc=chleroy@kernel.org \
--cc=christian.koenig@amd.com \
--cc=dakr@kernel.org \
--cc=david@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=francois.dugast@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=jgg@nvidia.com \
--cc=kvm@vger.kernel.org \
--cc=leon@kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=lyude@redhat.com \
--cc=maarten.lankhorst@linux.intel.com \
--cc=maddy@linux.ibm.com \
--cc=mhocko@suse.com \
--cc=mpe@ellerman.id.au \
--cc=mripard@kernel.org \
--cc=nouveau@lists.freedesktop.org \
--cc=npiggin@gmail.com \
--cc=osalvador@suse.de \
--cc=rppt@kernel.org \
--cc=simona@ffwll.ch \
--cc=surenb@google.com \
--cc=tzimmermann@suse.de \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox