From: Mike Kravetz <mike.kravetz@oracle.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Tarun Sahu <tsahu@linux.ibm.com>,
linux-mm@kvack.org, akpm@linux-foundation.org,
muchun.song@linux.dev, aneesh.kumar@linux.ibm.com,
sidhartha.kumar@oracle.com, gerald.schaefer@linux.ibm.com,
linux-kernel@vger.kernel.org, jaypatel@linux.ibm.com
Subject: Re: [PATCH] mm/folio: Avoid special handling for order value 0 in folio_set_order
Date: Tue, 18 Apr 2023 11:56:08 -0700 [thread overview]
Message-ID: <20230418185608.GA4907@monkey> (raw)
In-Reply-To: <ZDmzyag88pO1Kdk8@casper.infradead.org>
On 04/14/23 21:12, Matthew Wilcox wrote:
> On Sat, Apr 15, 2023 at 01:18:32AM +0530, Tarun Sahu wrote:
> > folio_set_order(folio, 0); which is an abuse of folio_set_order as 0-order
> > folio does not have any tail page to set order.
>
> I think you're missing the point of how folio_set_order() is used.
> When splitting a large folio, we need to zero out the folio_nr_pages
> in the tail, so it does have a tail page, and that tail page needs to
> be zeroed. We even assert that there is a tail page:
>
> if (WARN_ON_ONCE(!folio_test_large(folio)))
> return;
>
> Or maybe you need to explain yourself better.
>
> > folio->_folio_nr_pages is
> > set to 0 for order 0 in folio_set_order. It is required because
> > _folio_nr_pages overlapped with page->mapping and leaving it non zero
> > caused "bad page" error while freeing gigantic hugepages. This was fixed in
> > Commit ba9c1201beaa ("mm/hugetlb: clear compound_nr before freeing gigantic
> > pages"). Also commit a01f43901cfb ("hugetlb: be sure to free demoted CMA
> > pages to CMA") now explicitly clear page->mapping and hence we won't see
> > the bad page error even if _folio_nr_pages remains unset. Also the order 0
> > folios are not supposed to call folio_set_order, So now we can get rid of
> > folio_set_order(folio, 0) from hugetlb code path to clear the confusion.
>
> ... this is all very confusing.
>
> > The patch also moves _folio_set_head and folio_set_order calls in
> > __prep_compound_gigantic_folio() such that we avoid clearing them in the
> > error path.
>
> But don't we need those bits set while we operate on the folio to set it
> up? It makes me nervous if we don't have those bits set because we can
> end up with speculative references that point to a head page while that
> page is not marked as a head page. It may not be a problem, but I want
> to see some air-tight analysis of that.
I am fairly certain we are 'safe'. Here is code before setting up the
pointer to the head page.
* In the case of demote, the ref count will be zero.
*/
if (!demote) {
if (!page_ref_freeze(p, 1)) {
pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
goto out_error;
}
} else {
VM_BUG_ON_PAGE(page_count(p), p);
}
if (i != 0)
set_compound_head(p, &folio->page);
So, before setting the pointer to head page ref count will be zero.
I 'think' it would actually be better to move the calls to _folio_set_head and
folio_set_order in __prep_compound_gigantic_folio() as suggested here. Why?
In the current code, the ref count on the 'head page' is still 1 (or more)
while those calls are made. So, someone could take a speculative ref on the
page BEFORE the tail pages are set up.
TBH, I do not have much of an opinion about potential confusion surrounding
folio_set_compound_order(folio, 0). IIUC, hugetlb gigantic page setup is the
only place outside the page allocation code that sets up compound pages/large
folios. So, it is going to be a bit 'special'. As mentioned, when this was
originally discussed I suggested folio_clear_order(). I would be happy with
either.
--
Mike Kravetz
next prev parent reply other threads:[~2023-04-18 18:56 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-14 19:48 Tarun Sahu
2023-04-14 20:12 ` Matthew Wilcox
2023-04-18 9:03 ` Tarun Sahu
2023-04-18 18:56 ` Mike Kravetz [this message]
2023-04-24 15:40 ` Tarun Sahu
2023-04-24 15:31 ` Tarun Sahu
2023-04-14 21:35 ` Sidhartha Kumar
2023-04-18 9:20 ` Tarun Sahu
2023-04-18 9:25 ` Tarun Sahu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230418185608.GA4907@monkey \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.ibm.com \
--cc=gerald.schaefer@linux.ibm.com \
--cc=jaypatel@linux.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=muchun.song@linux.dev \
--cc=sidhartha.kumar@oracle.com \
--cc=tsahu@linux.ibm.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox