From: Matthew Wilcox <willy@infradead.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, Kent Overstreet <kent.overstreet@gmail.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
Roman Gushchin <guro@fb.com>,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 00/11] PageSlab: eliminate unnecessary compound_head() calls
Date: Wed, 13 Oct 2021 04:19:18 +0100 [thread overview]
Message-ID: <YWZQNj+axlIQrD5C@casper.infradead.org> (raw)
In-Reply-To: <YWXwXINogE0Qb0Ip@cmpxchg.org>
On Tue, Oct 12, 2021 at 04:30:20PM -0400, Johannes Weiner wrote:
> On Tue, Oct 12, 2021 at 08:23:26PM +0100, Matthew Wilcox wrote:
> > On Tue, Oct 12, 2021 at 02:01:37PM -0400, Johannes Weiner wrote:
> > > PageSlab() currently imposes a compound_head() call on all callsites
> > > even though only very few situations encounter tailpages. This short
> > > series bubbles tailpage resolution up to the few sites that need it,
> > > and eliminates it everywhere else.
> > >
> > > This is a stand-alone improvement. However, it's inspired by Willy's
> > > patches to split struct slab from struct page. Those patches currently
> > > resolve a slab object pointer to its struct slab as follows:
> > >
> > > slab = virt_to_slab(p); /* tailpage resolution */
> > > if (slab_test_cache(slab)) { /* slab or page alloc bypass? */
> > > do_slab_stuff(slab);
> > > } else {
> > > page = (struct page *)slab;
> > > do_page_stuff(page);
> > > }
> > >
> > > which makes struct slab an ambiguous type that needs to self-identify
> > > with the slab_test_cache() test (which in turn relies on PG_slab in
> > > the flags field shared between page and slab).
> > >
> > > It would be preferable to do:
> > >
> > > page = virt_to_head_page(p); /* tailpage resolution */
> > > if (PageSlab(page)) { /* slab or page alloc bypass? */
> > > slab = page_slab(page);
> > > do_slab_stuff(slab);
> > > } else {
> > > do_page_stuff(page);
> > > }
> > >
> > > and leave the ambiguity and the need to self-identify with struct
> > > page, so that struct slab is a strong and unambiguous type, and a
> > > non-NULL struct slab encountered in the wild is always a valid object
> > > without the need to check another dedicated flag for validity first.
> > >
> > > However, because PageSlab() currently implies tailpage resolution,
> > > writing the virt->slab lookup in the preferred way would add yet more
> > > unnecessary compound_head() call to the hottest MM paths.
> > >
> > > The page flag helpers should eventually all be weaned off of those
> > > compound_head() calls for their unnecessary overhead alone. But this
> > > one in particular is now getting in the way of writing code in the
> > > preferred manner, and bleeding page ambiguity into the new types that
> > > are supposed to eliminate specifically that. It's ripe for a cleanup.
> >
> > So what I had in mind was more the patch at the end (which I now realise
> > is missing the corresponding changes to __ClearPageSlab()). There is,
> > however, some weirdness with kfence, which appears to be abusing PageSlab
> > by setting it on pages which are not slab pages???
> >
> > page = virt_to_page(p);
> > if (PageSlab(page)) { /* slab or page alloc bypass? */
> > slab = page_slab(page); /* tail page resolution */
> > do_slab_stuff(slab);
> > } else {
> > do_page_stuff(page); /* or possibly compound_head(page) */
> > }
>
> Can you elaborate why you think this would be better?
>
> If the object is sitting in a tailpage, the flag test itself could
> avoid the compound_head(), but at the end of the day it's the slab or
> the headpage that needs to be operated on in the fastpaths, and we
> need to do the compound_head() whether the flag is set or not.
>
> I suppose it could make some debugging checks marginally cheaper?
>
> But OTOH it comes at the cost of the flag setting and clearing loops
> in the slab allocation path, even when debugging checks are disabled.
>
> And it would further complicate the compound page model by introducing
> another distinct flag handling scheme (would there be other users for
> it?). The open-coded loops as a means to ensure flag integrity seem
> error prone; but creating Set and Clear variants that encapsulate the
> loops sounds like a move in the wrong direction, given the pain the
> compound_head() alone in them has already created.
I see this as a move towards the dynamically allocated future.
In that future, I think we'll do something like:
static inline struct slab *alloc_slab_pages(...)
{
struct page *page;
struct slab *slab = kmalloc(sizeof(struct slab), gfp);
if (!slab)
return -ENOMEM;
... init slab ...
page = palloc(slab, MEM_TYPE_SLAB, node, gfp, order);
if (!page) {
kfree(slab);
return -ENOMEM;
}
slab->virt = page_address(page);
return slab;
}
where palloc() does something like:
unsigned long page_desc = mem_type | (unsigned long)mem_desc;
... allocates the struct pages ...
for (i = 0; i < (1 << order); i++)
page[i] = page_desc;
return page;
}
For today, testing PageSlab on the tail page helps the test proceed
in parallel with the action. Looking at slub's kfree() for an example:
page = virt_to_head_page(x);
if (unlikely(!PageSlab(page))) {
free_nonslab_page(page, object);
return;
}
slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);
Your proposal is certainly an improvement (since gcc doesn't know
that compound_head(compound_head(x)) == compound_head(x)), but I
think checking on the tail page is even better:
page = virt_to_page(x);
if (unlikely(!PageSlab(page))) {
free_nonslab_page(compound_head(page), object);
return;
}
slab = page_slab(page);
slab_free(slab->slab_cache, slab, object, NULL, 1, _RET_IP_);
The compound_head() parts can proceed in parallel with the check of
PageSlab().
As far as the cost of setting PageSlab, those cachelines are already
dirty because we set compound_head on each of those pages already
(or in the case of freeing, we're about to clear compound_head on
each of those pages).
> > There could also be a PageTail check in there for some of the cases --
> > catch people doing something like:
> > kfree(kmalloc(65536, GFP_KERNEL) + 16384);
> > which happens to work today, but should probably not.
>
> I actually wondered about that when looking at the slob code. Its
> kfree does this:
>
> sp = virt_to_page(block);
> if (PageSlab(compound_head(sp))) {
> int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
> unsigned int *m = (unsigned int *)(block - align);
> slob_free(m, *m + align);
> } else {
> unsigned int order = compound_order(sp);
> mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
> -(PAGE_SIZE << order));
> __free_pages(sp, order);
>
> }
>
> Note the virt_to_page(), instead of virt_to_head_page(). It does test
> PG_slab correctly, but if this is a bypass page, it operates on
> whatever tail page the kfree() argument points into. If you did what
> you write above, it would leak the pages before the object.
slob doesn't have to be as careful because it falls back to the page
allocator for all allocations > PAGE_SIZE (see calls to
slob_new_pages())
next prev parent reply other threads:[~2021-10-13 3:20 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-12 18:01 Johannes Weiner
2021-10-12 18:01 ` [PATCH 01/11] PageSlab: bubble compound_head() into callsites Johannes Weiner
2021-10-12 18:01 ` [PATCH 02/11] PageSlab: eliminate unnecessary compound_head() calls in fs/proc/page Johannes Weiner
2021-10-12 18:01 ` [PATCH 03/11] PageSlab: eliminate unnecessary compound_head() calls in kernel/resource Johannes Weiner
2021-10-12 18:01 ` [PATCH 04/11] PageSlab: eliminate unnecessary compound_head() calls in mm/debug Johannes Weiner
2021-10-12 18:01 ` [PATCH 05/11] PageSlab: eliminate unnecessary compound_head() calls in mm/kasan Johannes Weiner
2021-10-12 18:01 ` [PATCH 06/11] PageSlab: eliminate unnecessary compound_head() call in mm/nommu Johannes Weiner
2021-10-12 18:01 ` [PATCH 07/11] PageSlab: eliminate unnecessary compound_head() call in mm/slab Johannes Weiner
2021-10-12 18:01 ` [PATCH 08/11] PageSlab: eliminate unnecessary compound_head() calls in mm/slab_common Johannes Weiner
2021-10-12 18:01 ` [PATCH 09/11] PageSlab: eliminate unnecessary compound_head() calls in mm/slub Johannes Weiner
2021-10-12 18:01 ` [PATCH 10/11] PageSlab: eliminate unnecessary compound_head() call in mm/usercopy Johannes Weiner
2021-10-12 18:01 ` [PATCH 11/11] PageSlab: eliminate unnecessary compound_head() calls in mm/memcontrol Johannes Weiner
2021-10-12 19:23 ` [PATCH 00/11] PageSlab: eliminate unnecessary compound_head() calls Matthew Wilcox
2021-10-12 20:30 ` Johannes Weiner
2021-10-13 3:19 ` Matthew Wilcox [this message]
2021-10-13 13:49 ` Johannes Weiner
2021-10-13 17:55 ` Matthew Wilcox
2021-10-13 19:32 ` Johannes Weiner
2021-10-13 19:57 ` Johannes Weiner
2021-10-14 7:35 ` David Hildenbrand
2021-10-25 13:49 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YWZQNj+axlIQrD5C@casper.infradead.org \
--to=willy@infradead.org \
--cc=guro@fb.com \
--cc=hannes@cmpxchg.org \
--cc=kent.overstreet@gmail.com \
--cc=kernel-team@fb.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox