Re: [RFC PATCH 0/4] Enable >0 order folio memory compaction

From: Luis Chamberlain <mcgrof@kernel.org>
To: Zi Yan <ziy@nvidia.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	"Yin, Fengwei" <fengwei.yin@intel.com>,
	Yu Zhao <yuzhao@google.com>, Vlastimil Babka <vbabka@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Rohan Puri <rohan.puri15@gmail.com>,
	Adam Manzanares <a.manzanares@samsung.com>,
	John Hubbard <jhubbard@nvidia.com>
Subject: Re: [RFC PATCH 0/4] Enable >0 order folio memory compaction
Date: Wed, 20 Sep 2023 17:55:51 -0700
Message-ID: <ZQuUl2DdwDlzKoeM@bombadil.infradead.org>
In-Reply-To: <20230912162815.440749-1-zi.yan@sent.com>

On Tue, Sep 12, 2023 at 12:28:11PM -0400, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> Feel free to give comments and ask questions.

How about testing? I'm looking at this with an eye towards creating a
pathological fragmentation scenario which can be automated, to see
how things go.

Mel Gorman's original artificial fragmentation test, taken from his first
patches to help with fragmentation avoidance from 2018, suggests he
tried [0]:

------ From 2018
a) Create an XFS filesystem

b) Start 4 fio threads that write a number of 64K files inefficiently.
Inefficiently means that files are created on first access and not
created in advance (fio parameter create_on_open=1) and fallocate is
not used (fallocate=none). With multiple IO issuers this creates a mix
of slab and page cache allocations over time. The total size of the
files is 150% physical memory so that the slabs and page cache pages get
mixed

c) Warm up a number of fio read-only threads accessing the same files
created in step b). This part runs for the same length of time it took to
create the files. It'll fault back in old data and further interleave
slab and page cache allocations. As it's now low on memory due to step
b), fragmentation occurs as pageblocks get stolen. While step c) is still
running, start a process that tries to allocate 75% of memory as huge
pages with a number of threads. The number of threads is based on a
(NR_CPUS_SOCKET - NR_FIO_THREADS)/4 to avoid THP threads contending with
fio, any other threads or forcing cross-NUMA scheduling. Note that the
test has not been used on a machine with less than 8 cores. The
benchmark records whether huge pages were allocated and what the fault
latency was in microseconds

d) Measure the number of events potentially causing external fragmentation,
the fault latency and the huge page allocation success rate.
------- end of extract
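
Steps b) and c) above could be sketched roughly as follows. This is a
minimal, untested sketch and not Mel's actual configuration: the helper
names (files_per_job, thp_threads, write_job), the psync ioengine and
the floor of one THP thread are assumptions of mine. Only
create_on_open=1, fallocate=none, the 64K file size, the 4 writers, the
150% sizing and the thread formula come from the extract.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are illustrative only.

# Number of 64K files each fio job must write so that all jobs together
# cover 150% of physical memory.
# $1 = MemTotal in KiB, $2 = number of fio jobs
files_per_job() {
    echo $(( $1 * 150 / 100 / 64 / $2 ))
}

# THP allocator thread count from the extract:
# (NR_CPUS_SOCKET - NR_FIO_THREADS) / 4
# The floor of one thread is an assumption, not part of the extract.
# $1 = CPUs per socket, $2 = number of fio threads
thp_threads() {
    n=$(( ($1 - $2) / 4 ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Emit a fio job file for step b) on stdout.
# $1 = directory on the XFS mount, $2 = files per job, $3 = number of jobs
write_job() {
    cat <<EOF
[global]
directory=$1
ioengine=psync
rw=write
bs=64k
filesize=64k
nrfiles=$2
create_on_open=1
fallocate=none

[frag-writers]
numjobs=$3
EOF
}
```

For example, with the XFS filesystem mounted at a hypothetical /mnt/xfs:

  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  write_job /mnt/xfs "$(files_per_job "$mem_kb" 4)" 4 > frag-write.fio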

These days we can probably do a bit more damage. There have been concerns
that LBS support (block size > page size) could make fragmentation worse.
One of the reasons is that any file created, regardless of its size, will
require at least the block size, and if using a 64k block size that means
a 64k allocation for each new file on that 64k block size filesystem, so
clearly you may run out of lower order allocations pretty quickly. You
can also create different large block filesystems, one for 64k and
another for 32k. Although LBS is new and we're still ironing out the
kinks, if you wanna give it a go we've rebased the patches onto Linus'
tree [1], and if you wanted to ramp up fast you could use kdevops [2], which
lets you pick that branch and also a series of NVMe drives (by enabling
CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_NVME) for large IO experimentation (by
enabling CONFIG_VAGRANT_ENABLE_LARGEIO). Creating different filesystems
with large block sizes (64k, 32k, 16k) on a 4k sector size drive
(mkfs.xfs -f -b size=64k -s size=4k) should let you easily do tons of
crazy pathological things.
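
To make the mkfs step concrete, here is a small dry-run sketch that only
prints the commands. The helper name mkfs_cmds and the device names in
the example are hypothetical; the mkfs.xfs options are the ones quoted
above. Block sizes larger than the page size are expected to need the
LBS branch [1] to be mountable.

```shell
#!/bin/sh
# Print (do not run) one mkfs.xfs command per "device:blocksize" pair.
# $1 = sector size, remaining args = "device:blocksize" pairs
mkfs_cmds() {
    sector=$1
    shift
    for pair in "$@"; do
        dev=${pair%%:*}   # text before the first ':'
        bs=${pair##*:}    # text after the last ':'
        echo "mkfs.xfs -f -b size=$bs -s size=$sector $dev"
    done
}
```

For example:

  mkfs_cmds 4k /dev/nvme0n1:64k /dev/nvme1n1:32k /dev/nvme2n1:16k

Review the printed commands before piping them to sh, since mkfs.xfs -f
overwrites whatever is on the device.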

Are there other known recipes to help test this stuff?
How do we measure success in your patches for fragmentation exactly?
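
As one crude signal, and not necessarily what the patches should be
judged by, the number of free blocks at or above a given order can be
sampled from /proc/buddyinfo before and after a run. A minimal sketch,
assuming the usual buddyinfo layout where order 0 is the fifth
whitespace-separated field; the helper name free_blocks_at_or_above is
hypothetical:

```shell
#!/bin/sh
# Sum free blocks at or above a given order across all nodes and zones.
# $1 = minimum order
# $2 = path to a buddyinfo-formatted file (defaults to /proc/buddyinfo)
free_blocks_at_or_above() {
    awk -v min="$1" '
        # Fields: "Node" "N," "zone" "<name>" then one count per order,
        # so order k lives in field 5 + k.
        { for (i = 5 + min; i <= NF; i++) total += $i }
        END { print total + 0 }
    ' "${2:-/proc/buddyinfo}"
}
```

For example, free_blocks_at_or_above 4 counts free blocks of order 4 and
up, which is 64k and larger with 4k pages.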

[0] https://lwn.net/Articles/770235/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=large-block-linus-nobdev
[2] https://github.com/linux-kdevops/kdevops

  Luis
