From: Matthew Wilcox <willy@infradead.org>
To: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: John Hubbard <jhubbard@nvidia.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, songmuchun@bytedance.com,
	mike.kravetz@oracle.com, tsahu@linux.ibm.com, david@redhat.com
Subject: Re: [PATCH mm-unstable] mm: clarify folio_set_compound_order() zero support
Date: Thu, 8 Dec 2022 19:33:06 +0000
Message-ID: <Y5I78soNmAFv7pi8@casper.infradead.org>
In-Reply-To: <0187f9c2-e80a-9cde-68bc-c9bdbd96b6fe@oracle.com>

On Thu, Dec 08, 2022 at 10:06:07AM -0800, Sidhartha Kumar wrote:
> On 12/7/22 6:27 PM, John Hubbard wrote:
> > On 12/7/22 17:42, Sidhartha Kumar wrote:
> > > > Wouldn't it be better to instead just create a new function for that
> > > > case, such as:
> > > > 
> > > >      dissolve_large_folio()
> > > > 
> > > 
> > > Prior to the folio conversion, the helper function
> > > __destroy_compound_gigantic_page() did:
> > > 
> > >      set_compound_order(page, 0);
> > > #ifdef CONFIG_64BIT
> > >      page[1].compound_nr = 0;
> > > #endif
> > > 
> > > as part of dissolving the page. My goal for this patch was to create
> > > a function that would encapsulate that segment of code with a single
> > > call of folio_set_compound_order(folio, 0). set_compound_order()
> > > does not set compound_nr to 0 when 0 is passed in to the order
> > > argument so explicitly setting it is required. I don't think a
> > > separate dissolve_large_folio() function for the hugetlb case is
> > > needed as __destroy_compound_gigantic_folio() is pretty concise as
> > > it is.
> > > 
> > 
> > Instead of "this is abusing function X()" comments, we should prefer
> > well-named functions that do something understandable. And you can get
> > that by noticing that folio_set_compound_order() collapses down to
> > nearly nothing in the special "order 0" case. So just inline that code
> > directly into __destroy_compound_gigantic_folio(), taking a moment to
> > fill in and consolidate the CONFIG_64BIT missing parts in mm.h.
> > 
> > And now you can get rid of this cruft and "abuse" comment, and instead
> > just end up with two simple lines of code that are crystal clear--as
> > they should be, in a "__destroy" function. Like this:
> > 
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 105878936485..cf227ed00945 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1754,6 +1754,7 @@ static inline void set_page_links(struct page
> > *page, enum zone_type zone,
> >   #endif
> >   }
> > 
> > +#ifdef CONFIG_64BIT
> >   /**
> >    * folio_nr_pages - The number of pages in the folio.
> >    * @folio: The folio.
> > @@ -1764,13 +1765,32 @@ static inline long folio_nr_pages(struct folio
> > *folio)
> >   {
> >       if (!folio_test_large(folio))
> >           return 1;
> > -#ifdef CONFIG_64BIT
> >       return folio->_folio_nr_pages;
> > +}
> > +
> > +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> > +{
> > +    folio->_folio_nr_pages = nr_pages;
> > +}
> >   #else
> > +/**
> > + * folio_nr_pages - The number of pages in the folio.
> > + * @folio: The folio.
> > + *
> > + * Return: A positive power of two.
> > + */
> > +static inline long folio_nr_pages(struct folio *folio)
> > +{
> > +    if (!folio_test_large(folio))
> > +        return 1;
> >       return 1L << folio->_folio_order;
> > -#endif
> >   }
> > 
> > +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> > +{
> > +}
> > +#endif
> > +
> >   /**
> >    * folio_next - Move to the next physical folio.
> >    * @folio: The folio we're currently operating on.
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index e3500c087893..b507a98063e6 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1344,7 +1344,8 @@ static void
> > __destroy_compound_gigantic_folio(struct folio *folio,
> >               set_page_refcounted(p);
> >       }
> > 
> > -    folio_set_compound_order(folio, 0);
> > +    folio->_folio_order = 0;
> > +    folio_set_nr_pages(folio, 0);
> >       __folio_clear_head(folio);
> >   }
> > 
> > 
> > Yes?
> 
> This works for me, I will take this approach along with Muchun's feedback
> about a wrapper function so as not to touch _folio_order directly and send
> out a new version.
> 
> One question I have is if I should then get rid of
> folio_set_compound_order() as hugetlb is the only compound page user I've
> converted to folios so far and its use can be replaced by the suggested
> folio_set_nr_pages() and folio_set_order().
> 
> Hugetlb also has one call to folio_set_compound_order() with a
> non-zero order. Should I replace this with calls to folio_set_order() and
> folio_set_nr_pages() as well, or keep folio_set_compound_order() and remove
> zero-order support and the comment? Please let me know which approach you
> would prefer.

None of the above!

Whatever we're calling this function, *it does not belong* in mm.h.
Anything outside the MM calling it is going to be a disaster -- can you
imagine what will happen if a filesystem or device driver is handed a
folio and decides "Oh, I'll just change the size of this folio"?  It is
an attractive nuisance and should be confined to mm/internal.h *at best*.

Equally, we *must not have* separate folio_set_order() and
folio_set_nr_pages().  These are the same thing!  They must be kept
in sync!  If we are to have a folio_set_order() instead of open-coding
it, then it should also update nr_pages.

So, given that this is now an internal-to-mm, if not internal-to-hugetlb
function, I see no reason that it should not handle the case of 0.
I haven't studied what hugetlb_dissolve does, or why it can't use the
standard split_folio(), but I'm sure there's a good reason.
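
To make the two points above concrete (one setter that updates the order
and nr_pages together, and that also accepts 0), here is a minimal
userspace sketch. It is not kernel code: struct folio_sketch and both
sketch_* functions are hypothetical stand-ins invented for illustration,
and the rule that order 0 clears nr_pages to 0 is an assumption taken from
the quoted pre-folio code (page[1].compound_nr = 0).

```c
#include <assert.h>

/*
 * Hypothetical stand-in for struct folio. Only the two fields discussed
 * in this thread are modelled; the real structure has many more.
 */
struct folio_sketch {
	unsigned int order;     /* models folio->_folio_order */
	unsigned long nr_pages; /* models folio->_folio_nr_pages */
};

/*
 * Single setter: the order and the page count are always written
 * together, so a caller cannot leave them out of sync.
 *
 * Assumption: order 0 is the "dissolve" case and clears nr_pages to 0
 * (rather than setting it to 1), as the quoted pre-folio code did.
 */
static void sketch_folio_set_order(struct folio_sketch *folio,
				   unsigned int order)
{
	/* Keep the shift below well defined on every platform. */
	assert(order < 8 * sizeof(unsigned long));

	folio->order = order;
	folio->nr_pages = order ? 1UL << order : 0;
}

/* Reader that checks the invariant the setter is meant to maintain. */
static unsigned long sketch_folio_nr_pages(const struct folio_sketch *folio)
{
	if (folio->order == 0)
		assert(folio->nr_pages == 0);
	else
		assert(folio->nr_pages == 1UL << folio->order);
	return folio->nr_pages;
}
```

With this shape, setting up a gigantic folio and dissolving it are the
same call with different arguments, e.g. sketch_folio_set_order(&f, 9)
followed later by sketch_folio_set_order(&f, 0), and there is no separate
nr_pages setter for a caller to forget.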

