linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Fuad Tabba <tabba@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
	linux-mm@kvack.org, kvm@vger.kernel.org,
	 nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	 rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org,
	 muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com,
	 pbonzini@redhat.com, seanjc@google.com, willy@infradead.org,
	 jhubbard@nvidia.com, ackerleytng@google.com,
	vannapurve@google.com,  mail@maciej.szmigiero.name,
	kirill.shutemov@linux.intel.com,  quic_eberman@quicinc.com,
	maz@kernel.org, will@kernel.org, qperret@google.com,
	 keirf@google.com, roypat@amazon.co.uk
Subject: Re: [RFC PATCH v1 00/10] mm: Introduce and use folio_owner_ops
Date: Mon, 11 Nov 2024 08:26:54 +0000	[thread overview]
Message-ID: <CA+EHjTy3kNdg7pfN9HufgibE7qY1S+WdMZfRFRiF5sHtMzo64w@mail.gmail.com> (raw)
In-Reply-To: <9dc212ac-c4c3-40f2-9feb-a8bcf71a1246@redhat.com>

Hi Jason and David,

On Fri, 8 Nov 2024 at 19:33, David Hildenbrand <david@redhat.com> wrote:
>
> On 08.11.24 18:05, Jason Gunthorpe wrote:
> > On Fri, Nov 08, 2024 at 04:20:30PM +0000, Fuad Tabba wrote:
> >> Some folios, such as hugetlb folios and zone device folios,
> >> require special handling when the folio's reference count reaches
> >> 0, before being freed. Moreover, guest_memfd folios will likely
> >> require special handling to notify it once a folio's reference
> >> count reaches 0, to facilitate shared to private folio conversion
> >> [*]. Currently, each usecase has a dedicated callback when the
> >> folio refcount reaches 0 to that effect. Adding yet more
> >> callbacks is not ideal.
> >
>
> Thanks for having a look!
>
> Replying to clarify some things. Fuad, feel free to add additional
> information.

Thanks for your comments Jason, and for clarifying my cover letter
David. I think David has covered everything, and I'll make sure to
clarify this in the cover letter when I respin.

Cheers,
/fuad

>
> > Honestly, I question this thesis. How complex would it be to have 'yet
> > more callbacks'? Is the challenge really that the mm can't detect when
> > guestmemfd is the owner of the page because the page will be
> > ZONE_NORMAL?
>
> Fuad might have been a bit imprecise here: We don't want an ever growing
> list of checks+callbacks on the page freeing fast path.
>
> This series replaces the two cases we have by a single generic one,
> which is nice independent of guest_memfd I think.
>
> >
> > So the point of this is really to allow ZONE_NORMAL pages to have a
> > per-allocator callback?
>
> To intercept the refcount going to zero independent of any zones or
> magic page types, without as little overhead in the common page freeing
> path.
>
> It can be used to implement custom allocators, like factored out for
> hugetlb in this series. It's not necessarily limited to that, though. It
> can be used as a form of "asynchronous page ref freezing", where you get
> notified once all references are gone.
>
> (I might have another use case with PageOffline, where we want to
> prevent virtio-mem ones of them from getting accidentally leaked into
> the buddy during memory offlining with speculative references --
> virtio_mem_fake_offline_going_offline() contains the interesting bits.
> But I did not look into the dirty details yet, just some thought where
> we'd want to intercept the refcount going to 0.)
>
> >
> > But this is also why I suggested to shift them to ZONE_DEVICE for
> > guestmemfd, because then you get these things for free from the pgmap.
>
> With this series even hugetlb gets it for "free", and hugetlb is not
> quite the nail for the ZONE_DEVICE hammer IMHO :)
>
> For things we can statically set aside early during boot and never
> really want to return to the buddy/another allocator, I would agree that
> static ZONE_DEVICE would have possible.
>
> Whenever the buddy or other allocators are involved, and we might have
> granularity as a handful of pages (e.g., taken from the buddy), getting
> ZONE_DEVICE involved is not a good (or even feasible) approach.
>
> After all, all we want is intercept the refcount going to 0.
>
> --
> Cheers,
>
> David / dhildenb
>


  reply	other threads:[~2024-11-11  8:27 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-08 16:20 Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 01/10] mm/hugetlb: rename isolate_hugetlb() to folio_isolate_hugetlb() Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 02/10] mm/migrate: don't call folio_putback_active_hugetlb() on dst hugetlb folio Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 03/10] mm/hugetlb: rename "folio_putback_active_hugetlb()" to "folio_putback_hugetlb()" Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 04/10] mm/hugetlb-cgroup: convert hugetlb_cgroup_css_offline() to work on folios Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 05/10] mm/hugetlb: use folio->lru int demote_free_hugetlb_folios() Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 06/10] mm/hugetlb: use separate folio->_hugetlb_list for hugetlb-internals Fuad Tabba
2024-11-12 15:28   ` wang wei
2024-11-12 15:48     ` [RFC " David Hildenbrand
2024-11-08 16:20 ` [RFC PATCH v1 07/10] mm: Introduce struct folio_owner_ops Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 08/10] mm: Use getters and setters to access page pgmap Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 09/10] mm: Use owner_ops on folio_put for zone device pages Fuad Tabba
2024-11-08 16:20 ` [RFC PATCH v1 10/10] mm: hugetlb: Use owner_ops on folio_put for hugetlb Fuad Tabba
2024-11-08 17:05 ` [RFC PATCH v1 00/10] mm: Introduce and use folio_owner_ops Jason Gunthorpe
2024-11-08 19:33   ` David Hildenbrand
2024-11-11  8:26     ` Fuad Tabba [this message]
2024-11-12  5:26       ` Matthew Wilcox
2024-11-12  9:10         ` David Hildenbrand
2024-11-12 13:53           ` Jason Gunthorpe
2024-11-12 14:22             ` David Hildenbrand
2024-11-13  4:57               ` Matthew Wilcox
2024-11-13 11:27                 ` David Hildenbrand
2024-11-14  4:02                 ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+EHjTy3kNdg7pfN9HufgibE7qY1S+WdMZfRFRiF5sHtMzo64w@mail.gmail.com \
    --to=tabba@google.com \
    --cc=ackerleytng@google.com \
    --cc=airlied@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jgg@nvidia.com \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=keirf@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=maz@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=nouveau@lists.freedesktop.org \
    --cc=pbonzini@redhat.com \
    --cc=qperret@google.com \
    --cc=quic_eberman@quicinc.com \
    --cc=roypat@amazon.co.uk \
    --cc=rppt@kernel.org \
    --cc=seanjc@google.com \
    --cc=simona@ffwll.ch \
    --cc=vannapurve@google.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox