From: "Michael S. Tsirkin" <mst@redhat.com>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Gregory Price <gourry@gourry.net>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@kernel.org>,
	Brendan Jackman <jackmanb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Jason Wang <jasowang@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	linux-mm@kvack.org, virtualization@lists.linux.dev,
	Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Lance Yang <lance.yang@linux.dev>,
	Matthew Brost <matthew.brost@intel.com>,
	Joshua Hahn <joshua.hahnjy@gmail.com>,
	Rakie Kim <rakie.kim@sk.com>, Byungchul Park <byungchul@sk.com>,
	Ying Huang <ying.huang@linux.alibaba.com>,
	Alistair Popple <apopple@nvidia.com>,
	Hugh Dickins <hughd@google.com>,
	Christoph Lameter <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Harry Yoo <harry.yoo@oracle.com>, Chris Li <chrisl@kernel.org>,
	Kairui Song <kasong@tencent.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH RFC v3 01/19] mm: thread user_addr through page allocator for cache-friendly zeroing
Date: Thu, 23 Apr 2026 07:57:01 -0400
Message-ID: <20260423074433-mutt-send-email-mst@kernel.org>
In-Reply-To: <c165d6af-4a35-4a41-9086-ee307640c044@kernel.org>

On Thu, Apr 23, 2026 at 11:46:56AM +0200, David Hildenbrand (Arm) wrote:
> On 4/23/26 06:31, Gregory Price wrote:
> > On Wed, Apr 22, 2026 at 05:20:27PM -0400, Michael S. Tsirkin wrote:
> >> On Wed, Apr 22, 2026 at 03:47:07PM -0400, Gregory Price wrote:
> >>>
> >>> __alloc_user_pages(..., gfp_t gfp, user_addr)
> >>
> >> With a wrapper approach, looks like we'd need something like
> >> __GFP_SKIP_ZERO so post_alloc_hook doesn't zero sequentially, then the
> >> wrapper re-zeros with folio_zero_user().  But then the wrapper needs to
> >> know whether the page was pre-zeroed (PG_zeroed), which is cleared by
> >> post_alloc_hook before return.  So the information doesn't survive to
> >> the wrapper.
> >>
> > 
> > I was thinking more that internally you already have that information
> > you need to know to skip the zeroing - and so the wrapper can just pass
> > __GFP_ZERO and post_alloc_hook() would do the right thing regardless
> > 
> > Then on the way out, the new wrapper would take care of cacheline piece.
> > 
> > However, i explored this a bit - and while it saves some churn on the
> > interface, it adds two paths into the buddy - and that increase in
> > surface might not be worth it.
> > 
> > So I see the tradeoff here.  The churn is probably worth it.
> 
> In v2 I commented [1]
> 
> "
> For example, instead of changing all callers of post_alloc_hook() to
> pass USER_ADDR_NONE, can we make post_alloc_hook() a simple wrapper
> around a variant that consumes an address.
> 
> So isn't there a way we can just keep the changes mostly to mm/page_alloc.c?
> "
> 
> That should avoid most of the churn outside of page_alloc, no?
> 
> [1] https://lore.kernel.org/r/4bdc66f2-1469-4b91-9935-74c3d3ca0ed9@kernel.org
> 
> -- 
> Cheers,
> 
> David

Yes, I'm sorry, I missed that one and didn't answer it. My bad.

To answer the question, no, definitely not *most*. Here are some numbers:

What we save: 11 one-line changes adding ", USER_ADDR_NONE" in callers
that don't care about user_addr (compaction, filemap, khugepaged,
migrate, page_frag_cache, shmem, slub, swap_state).

Changes we have to add: 8.
Rename 4 existing APIs by adding a _user suffix (__alloc_pages,
__folio_alloc, folio_alloc_mpol, __alloc_frozen_pages), and add
4 wrapper macros/inlines in gfp.h that forward to the _user variants with
USER_ADDR_NONE. Roughly 6-8 lines of boilerplate per API.
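
To make the shape of that boilerplate concrete, here is a rough userspace
sketch of the rename-plus-wrapper pattern. It is not taken from the series:
the mock_ names, struct mock_page, MOCK_MAX_ORDER and the value chosen for
USER_ADDR_NONE are all assumptions made up for illustration.

```c
#include <assert.h>

/* Hypothetical sentinel; the value used by the actual series may differ. */
#define USER_ADDR_NONE ((unsigned long)-1)
/* Hypothetical order limit, only here to keep the mock honest. */
#define MOCK_MAX_ORDER 11u

/* Mock result that records which address the zeroing would be centred on. */
struct mock_page {
	unsigned int order;
	unsigned long zero_hint;
};

/* Stand-in for a renamed "_user" variant that consumes the address. */
static struct mock_page mock_alloc_pages_user(unsigned int order,
					      unsigned long user_addr)
{
	struct mock_page page = { .order = order, .zero_hint = user_addr };

	assert(order < MOCK_MAX_ORDER);
	return page;
}

/*
 * Stand-in for the wrapper kept under the old name, for callers that
 * do not care about user_addr. One of these is needed per renamed API.
 */
static inline struct mock_page mock_alloc_pages(unsigned int order)
{
	return mock_alloc_pages_user(order, USER_ADDR_NONE);
}
```

Existing callers keep calling mock_alloc_pages(order) unchanged, while
user-page callers have to switch to the longer _user name by hand.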


What stays the same, but now has to call the new APIs: all internal
plumbing (page_alloc.c, mempolicy.c, internal.h, hugetlb.c, and the
actual user-page callers in memory.c/huge_memory.c/highmem.h).
That is 38 calls: 21 in page_alloc.c and 17 outside it.


The changes would be less trivial to review, too. And I would not call
the resulting very long names "elegant".

More importantly, mistakes become harder to catch: instead of the
compiler checking that we did not forget a parameter, the code compiles
fine, but we get a subtle slowdown or corruption on non-x86.
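
A minimal userspace sketch of that failure mode, under made-up names (the
mock_ functions and the USER_ADDR_NONE value are assumptions, not code from
the series). It only models "the address hint was lost"; what that costs on
a given architecture is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel; the value used by the actual series may differ. */
#define USER_ADDR_NONE ((unsigned long)-1)

/* Mandatory-parameter style: every caller has to state an address. */
static bool mock_alloc_has_hint(unsigned long user_addr)
{
	return user_addr != USER_ADDR_NONE;
}

/* Wrapper style: the old name silently supplies USER_ADDR_NONE. */
static bool mock_alloc_default(void)
{
	return mock_alloc_has_hint(USER_ADDR_NONE);
}

/* A user-page path that should forward the fault address but does not. */
static bool mock_fault_path_forgot(unsigned long fault_addr)
{
	(void)fault_addr;		/* available, never passed on */
	return mock_alloc_default();	/* still compiles without a warning */
}

/* The same path written against the mandatory-parameter API. */
static bool mock_fault_path_explicit(unsigned long fault_addr)
{
	/* Dropping the argument here would be a compile error. */
	return mock_alloc_has_hint(fault_addr);
}

static void mock_demo(void)
{
	assert(!mock_fault_path_forgot(0x1000UL));	/* hint lost */
	assert(mock_fault_path_explicit(0x1000UL));	/* hint kept */
}
```

With the wrapper in place nothing forces the first path to be converted;
with the extra parameter, the build itself flags every unconverted caller.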

And I thought you agreed with this sentiment when you said
"yea something that is harder to mess up would be nice":
https://lore.kernel.org/all/20260414062524-mutt-send-email-mst@kernel.org/




From where I stand, we would be adding technical debt by piling up new
wrapper APIs, for no benefit.

But then, I'm not the maintainer here. And adding extra wrappers
is easy for me. So, let me know. Thanks!


-- 
MST


