From: Matthew Wilcox <willy@infradead.org>
To: Steve Sistare <steven.sistare@oracle.com>
Cc: linux-mm@kvack.org, Vivek Kasireddy <vivek.kasireddy@intel.com>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Xu <peterx@redhat.com>,
David Hildenbrand <david@redhat.com>,
Jason Gunthorpe <jgg@nvidia.com>
Subject: Re: [PATCH] mm/hugetlb: simplify refs in memfd_alloc_folio
Date: Wed, 4 Sep 2024 21:11:26 +0100 [thread overview]
Message-ID: <Zti-7nPVMcGgpcbi@casper.infradead.org> (raw)
In-Reply-To: <Zti888Zzk1vwjw3-@casper.infradead.org>
On Wed, Sep 04, 2024 at 09:02:59PM +0100, Matthew Wilcox wrote:
> On Wed, Sep 04, 2024 at 12:41:08PM -0700, Steve Sistare wrote:
> > The folio_try_get in memfd_alloc_folio is not necessary. Delete it, and
> > delete the matching folio_put in memfd_pin_folios. This also avoids
> > leaking a ref if the memfd_alloc_folio call to hugetlb_add_to_page_cache
> > fails, which would otherwise need an additional folio_put. This is a
> > continuation of the fix
> > "mm/hugetlb: fix memfd_pin_folios free_huge_pages leak"
>
> I think you're right, but don't we also need to get rid of the
> folio_put() call in the 'if (err)' case after calling
> hugetlb_add_to_page_cache()?
After scratching my head about this a bit more, I was trying to preserve
the existing semantics of the code, but I think the code was always
buggy.
The correct code would be:
folio = alloc_hugetlb_folio_nodemask(...);
folio_put(folio);
The code as in tree today would trip an assertion:
VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
as alloc_hugetlb_folio_nodemask() returns a folio with a refcount 1,
folio_try_get() would increment it to 2, folio_put() would decrement
it to 1, and so we'd call free_huge_folio() with a refcount of 1.
But after your patch, the code _is_ still wrong because we'll
start with a refcount of 1, fail to add to the page cache, call
folio_put() which will decrement the refcount to 0 _and call
free_huge_folio() itself_. Then we'll call free_huge_folio()
on an already freed and possibly reallocated folio.
So every version suggested so far (current, yours, mine) is wrong,
and the right code looks like:
folio = alloc_hugetlb_folio_nodemask(...);
if (folio) {
err = hugetlb_add_to_page_cache(...);
if (err) {
folio_put(folio);
return ERR_PTR(err);
}
...
Or have I got something wrong in that analysis?
> > Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
> >
> > Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> > ---
> > mm/gup.c | 4 +---
> > mm/memfd.c | 2 +-
> > 2 files changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index bccabaa..947881ff 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -3618,7 +3618,7 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
> > pgoff_t start_idx, end_idx, next_idx;
> > struct folio *folio = NULL;
> > struct folio_batch fbatch;
> > - struct hstate *h = NULL;
> > + struct hstate *h;
> > long ret = -EINVAL;
> >
> > if (start < 0 || start > end || !max_folios)
> > @@ -3662,8 +3662,6 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
> > &fbatch);
> > if (folio) {
> > folio_put(folio);
> > - if (h)
> > - folio_put(folio);
> > folio = NULL;
> > }
> >
> > diff --git a/mm/memfd.c b/mm/memfd.c
> > index bcb131d..f715301 100644
> > --- a/mm/memfd.c
> > +++ b/mm/memfd.c
> > @@ -89,7 +89,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
> > numa_node_id(),
> > NULL,
> > gfp_mask);
> > - if (folio && folio_try_get(folio)) {
> > + if (folio) {
> > err = hugetlb_add_to_page_cache(folio,
> > memfd->f_mapping,
> > idx);
> > --
> > 1.8.3.1
> >
>
next prev parent reply other threads:[~2024-09-04 20:11 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-04 19:41 Steve Sistare
2024-09-04 20:02 ` Matthew Wilcox
2024-09-04 20:11 ` Matthew Wilcox [this message]
2024-09-04 20:20 ` Steven Sistare
2024-09-04 20:13 ` Steven Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zti-7nPVMcGgpcbi@casper.infradead.org \
--to=willy@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=jgg@nvidia.com \
--cc=linux-mm@kvack.org \
--cc=muchun.song@linux.dev \
--cc=peterx@redhat.com \
--cc=steven.sistare@oracle.com \
--cc=vivek.kasireddy@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox