From: David Hildenbrand <david@redhat.com>
To: lsf-pc@lists.linux-foundation.org
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: [LSF/MM/BPF TOPIC] Improving alloc_contig_range()
Date: Wed, 9 Jun 2021 15:39:51 +0200
Message-ID: <c8e21ac4-ace7-3176-8782-535bd6590583@redhat.com>

Hi,

our range allocator -- alloc_contig_range() -- already works fairly
reliably with MIGRATE_CMA, as used by the CMA allocator, and
ZONE_MOVABLE, as used by virtio-mem for memory hotunplug. However, there
are some things to improve, especially when allocating from one of the
kernel zones, such as ZONE_NORMAL, as used for allocating gigantic pages
and by virtio-mem for memory hotunplug.

a) MAX_ORDER (and pageblock_order) limitation

The current implementation is tightly coupled to pageblock_order and
MAX_ORDER. For example, alloc_contig_range() works fairly unreliably on
ZONE_NORMAL with granularity < MAX_ORDER - 1, because we isolate all
pageblocks in the MAX_ORDER - 1 range and any unmovable page in that
range will make us bail out. Further, when isolating a pageblock we lose
movability information, so isolating a (partially) unmovable pageblock
might be problematic; we would like to retain the original movability
information.

As one example, virtio-mem currently uses MAX_ORDER - 1 granularity
instead of a smaller granularity (like pageblock_order), so on x86-64 it
only supports (un)plug of 4 MiB chunks. We'd like to support 2 MiB here.
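
To make the 4 MiB vs. 2 MiB numbers concrete, here is a minimal
userspace sketch of the arithmetic. The constants are assumptions for
x86-64 with 4 KiB base pages (MAX_ORDER = 11, pageblock order = 9); they
are not stated in this mail, and the helper names are made up for
illustration.

```c
#include <stddef.h>

/* Assumed x86-64 values; other architectures and configs differ. */
#define PAGE_SHIFT       12	/* 4 KiB base pages */
#define MAX_ORDER        11	/* buddy orders 0..MAX_ORDER - 1 */
#define PAGEBLOCK_ORDER   9	/* assumed to match the 2 MiB huge page order */

/* Hypothetical helper: size in bytes of a chunk of the given order. */
static unsigned long order_to_bytes(unsigned int order)
{
	return 1UL << (order + PAGE_SHIFT);
}

/* Hypothetical helper: the same size expressed in MiB. */
static unsigned long order_to_mib(unsigned int order)
{
	return order_to_bytes(order) >> 20;
}
```

Under these assumptions, order_to_mib(MAX_ORDER - 1) evaluates to 4 and
order_to_mib(PAGEBLOCK_ORDER) evaluates to 2.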

As another example, a CMA area has to be aligned to MAX_ORDER - 1 due to
the current limitations. pageblock_order is still problematic on some
archs (arm64 with 64 KiB base pages), but getting rid of the MAX_ORDER
limitation feels like low-hanging fruit.

As there is interest in increasing MAX_ORDER, the problem will get worse
over time. The questions are 1) what it takes to only isolate a single
pageblock and not all pageblocks composing a MAX_ORDER - 1 range when
not required and 2) how to handle isolating partially unmovable pageblocks.
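
The effect of isolating at MAX_ORDER - 1 granularity can be illustrated
with a small userspace model. This is only a sketch, not kernel code:
the constants assume x86-64 (pageblock order 9, MAX_ORDER 11), all
function names are hypothetical, and "unmovable" is reduced to one flag
per pageblock.

```c
#include <stdbool.h>

/* Assumed x86-64 values, counted in base pages. */
#define PAGEBLOCK_NR_PAGES  (1UL << 9)
#define MAX_ORDER_NR_PAGES  (1UL << 10)	/* 1 << (MAX_ORDER - 1) */

/* Hypothetical helpers: round a pfn to a power-of-two boundary. */
static unsigned long align_down(unsigned long pfn, unsigned long nr)
{
	return pfn & ~(nr - 1);
}

static unsigned long align_up(unsigned long pfn, unsigned long nr)
{
	return (pfn + nr - 1) & ~(nr - 1);
}

/* Pageblocks that get isolated for [start, end) at granularity nr. */
static unsigned long pageblocks_isolated(unsigned long start,
					 unsigned long end, unsigned long nr)
{
	return (align_up(end, nr) - align_down(start, nr)) /
	       PAGEBLOCK_NR_PAGES;
}

/*
 * Model of the bail-out: isolation fails if any pageblock inside the
 * expanded range is flagged unmovable, even one the caller never
 * asked for.
 */
static bool isolation_would_fail(const bool *pb_unmovable,
				 unsigned long start, unsigned long end,
				 unsigned long nr)
{
	unsigned long pb = align_down(start, nr) / PAGEBLOCK_NR_PAGES;
	unsigned long last = align_up(end, nr) / PAGEBLOCK_NR_PAGES;

	for (; pb < last; pb++)
		if (pb_unmovable[pb])
			return true;
	return false;
}
```

In this model, a request for the single pageblock [512, 1024) isolates
two pageblocks at MAX_ORDER_NR_PAGES granularity and fails if pageblock
0 is unmovable, while at PAGEBLOCK_NR_PAGES granularity it isolates one
pageblock and is not affected by its neighbor.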

b) Shrinking the slab

set_migratetype_isolate() has a nice comment "FIXME: Now, memory hotplug
doesn't call shrink_slab() by itself". IIUC, we could significantly
improve alloc_contig_range() reliability on ZONE_NORMAL by shrinking
the slab in some environments. The questions are 1) who should shrink
the slab and 2) when, because it obviously can temporarily harm
performance. However, memory hotunplug already temporarily harms
performance.

Ideally, we'd want to shrink the slab only on the area of interest. How 
could something like that be realized?

c) PCP handling

While we disable the PCP right now when offlining memory to avoid races 
with concurrent freeing to the PCP, we don't do the same in 
alloc_contig_range(); instead, we only drain the PCP once.

Disabling the PCP will currently lock a mutex until re-enabled, which
would essentially serialize alloc_contig_range(), which is undesirable.
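
As a rough illustration of why this serializes callers, here is a toy
userspace model using a pthread mutex. It only mirrors the "lock held
from disable until enable" pattern described above; the model_* names
are hypothetical and nothing here is the kernel implementation.

```c
#include <pthread.h>
#include <stdbool.h>

/* One global lock, standing in for the mutex taken on PCP disable. */
static pthread_mutex_t model_pcp_lock = PTHREAD_MUTEX_INITIALIZER;

/* Model: disabling takes the lock and keeps it held. */
static void model_pcp_disable(void)
{
	pthread_mutex_lock(&model_pcp_lock);
}

/* Model: only re-enabling drops the lock again. */
static void model_pcp_enable(void)
{
	pthread_mutex_unlock(&model_pcp_lock);
}

/*
 * Would another caller have to wait right now? Uses trylock so that
 * the check itself never blocks.
 */
static bool model_pcp_disable_would_block(void)
{
	if (pthread_mutex_trylock(&model_pcp_lock) != 0)
		return true;
	pthread_mutex_unlock(&model_pcp_lock);
	return false;
}
```

Between model_pcp_disable() and model_pcp_enable(), every other caller
would block for the whole duration of the first caller's allocation
attempt, which is the serialization concern raised above.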

What would it take to make disabling the PCP scale? Do we care at all or 
can the races actually result in significant allocation failures, 
especially on ZONE_MOVABLE or MIGRATE_CMA?

d) Unification of alloc_contig_range() and memory offlining code.

Both do roughly the same thing, albeit with some notable differences
(dissolving huge pages, retry handling, ...). What does it take to unify
both, or are there compelling reasons to not unify them?


-- 
Thanks,

David / dhildenb


