Re: [RESEND PATCH v7 00/10] Small-sized THP for anonymous memory

From: John Hubbard <jhubbard@nvidia.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	Yin Fengwei <fengwei.yin@intel.com>,
	David Hildenbrand <david@redhat.com>, Yu Zhao <yuzhao@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Yang Shi <shy828301@gmail.com>,
	"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Itaru Kitayama <itaru.kitayama@gmail.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	David Rientjes <rientjes@google.com>,
	Vlastimil Babka <vbabka@suse.cz>, Hugh Dickins <hughd@google.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH v7 00/10] Small-sized THP for anonymous memory
Date: Wed, 22 Nov 2023 22:28:04 -0800
Message-ID: <101e5ffa-acf3-459c-85f4-7f36a63b125a@nvidia.com>
In-Reply-To: <20231122162950.3854897-1-ryan.roberts@arm.com>

On 11/22/23 08:29, Ryan Roberts wrote:
...
> Prerequisites
> =============
> 
> Some work items identified as being prerequisites are listed on page 3 at [8].
> The summary is:
> 
> | item                          | status                  |
> |:------------------------------|:------------------------|
> | mlock                         | In mainline (v6.7)      |
> | madvise                       | In mainline (v6.6)      |
> | compaction                    | v1 posted [9]           |
> | numa balancing                | Investigated: see below |
> | user-triggered page migration | In mainline (v6.7)      |
> | khugepaged collapse           | In mainline (NOP)       |
> 
> On NUMA balancing, which currently ignores any PTE-mapped THPs it encounters,
> John Hubbard has investigated this and A) concluded that it is not clear at the
> moment what a better policy might be for PTE-mapped THP and B) questioned whether
> this should really be considered a prerequisite, given that no regression is
> caused for the default "small-sized THP disabled" case and there is no
> correctness issue when it is enabled; it's just a potential for non-optimal
> performance. (John, please do elaborate if I haven't captured this correctly!)

That's accurate. I actually want to continue looking into this (Mel
Gorman's recent replies to v6 provided helpful touchstones to the NUMA
reasoning leading up to the present day), and maybe at least bring
pte-thps into rough parity with THPs with respect to NUMA.

But that really doesn't seem like something that needs to happen first,
especially since the outcome might even be, "first, do no harm"--as in,
it's better as-is. We'll see.

> 
> If there are no disagreements about removing numa balancing from the list, then
> that just leaves compaction which is in review on list at the moment.
> 
> I really would like to get this series (and its remaining compaction
> prerequisite) in for v6.8. I accept that it may be a bit optimistic at this
> point, but let's see where we get to with review?
> 
> 
> Testing
> =======
> 
> The series includes patches for mm selftests to enlighten the cow and khugepaged
> tests to explicitly test with small-order THP, in the same way that PMD-order
> THP is tested. The new tests all pass, and no regressions are observed in the mm
> selftest suite. I've also run my usual kernel compilation and JavaScript
> benchmarks without any issues.
> 
> Refer to my performance numbers posted with v6 [6]. (These are for small-sized
> THP only - they do not include the arm64 contpte follow-on series).
> 
> John Hubbard at Nvidia has indicated dramatic 10x performance improvements for
> some workloads at [10]. (Observed using v6 of this series as well as the arm64
> contpte series).
> 

Testing continues. Some workloads do even much better than 10x;
it's quite remarkable and glorious to see. :)  I can send more perf data
perhaps in a few days or a week, if there is still doubt about the
benefits.

That was with the v6 series, though. I'm about to set up and run with
v7, and expect to provide a Tested-by tag for functionality sometime
soon (in the next few days), if machine availability works out as
expected.


thanks,
-- 
John Hubbard
NVIDIA

Thread overview: 63+ messages
2023-11-22 16:29 Ryan Roberts
2023-11-22 16:29 ` [RESEND PATCH v7 01/10] mm: Allow deferred splitting of arbitrary anon large folios Ryan Roberts
2023-11-27  8:27   ` Barry Song
2023-11-22 16:29 ` [RESEND PATCH v7 02/10] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap() Ryan Roberts
2023-11-24 17:40   ` David Hildenbrand
2023-11-27 10:34     ` Ryan Roberts
2023-11-27  4:36   ` Barry Song
2023-11-27 11:30     ` Ryan Roberts
2023-11-22 16:29 ` [RESEND PATCH v7 03/10] mm: thp: Introduce per-size thp sysfs interface Ryan Roberts
2023-11-29  3:42   ` John Hubbard
2023-11-29  8:05     ` David Hildenbrand
2023-11-29 11:05     ` Ryan Roberts
2023-11-29 19:40       ` John Hubbard
2023-11-30 12:14         ` Ryan Roberts
2023-11-22 16:29 ` [RESEND PATCH v7 04/10] mm: thp: Support allocation of anonymous small-sized THP Ryan Roberts
2023-11-27  3:41   ` Barry Song
2023-11-27 11:28     ` Ryan Roberts
2023-11-22 16:29 ` [RESEND PATCH v7 05/10] selftests/mm/kugepaged: Restore thp settings at exit Ryan Roberts
2023-11-23  5:54   ` Alistair Popple
2023-11-22 16:29 ` [RESEND PATCH v7 06/10] selftests/mm: Factor out thp settings management Ryan Roberts
2023-11-23  6:07   ` Alistair Popple
2023-11-27 12:22     ` Ryan Roberts
2023-11-22 16:29 ` [RESEND PATCH v7 07/10] selftests/mm: Support small-sized THP interface in thp_settings Ryan Roberts
2023-11-22 16:29 ` [RESEND PATCH v7 08/10] selftests/mm/khugepaged: Enlighten for small-sized THP Ryan Roberts
2023-11-22 16:29 ` [RESEND PATCH v7 09/10] selftests/mm/cow: Generalize do_run_with_thp() helper Ryan Roberts
2023-11-24 17:48   ` David Hildenbrand
2023-11-27 10:48     ` Ryan Roberts
2023-11-27 13:59       ` David Hildenbrand
2023-11-27 14:11         ` Ryan Roberts
2023-11-27 14:17           ` David Hildenbrand
2023-11-22 16:29 ` [RESEND PATCH v7 10/10] selftests/mm/cow: Add tests for anonymous small-sized THP Ryan Roberts
2023-11-27 14:02   ` Ryan Roberts
2023-11-27 14:50     ` David Hildenbrand
2023-11-27 14:54       ` Ryan Roberts
2023-11-22 16:32 ` [RESEND PATCH v7 00/10] Small-sized THP for anonymous memory David Hildenbrand
2023-11-23  6:28 ` John Hubbard [this message]
2023-11-23 15:59 ` Matthew Wilcox
2023-11-23 16:05   ` David Hildenbrand
2023-11-23 16:18     ` Matthew Wilcox
2023-11-23 16:50       ` David Hildenbrand
2023-11-24  1:14         ` John Hubbard
2023-11-24  1:34         ` Zi Yan
2023-11-24  9:02           ` David Hildenbrand
2023-11-24  9:56   ` Ryan Roberts
2023-11-24 15:13     ` Matthew Wilcox
2023-11-24 15:23       ` Ryan Roberts
2023-11-24 15:25       ` David Hildenbrand
2023-11-24 15:53         ` Matthew Wilcox
2023-11-24 17:34           ` David Hildenbrand
2023-11-27  8:20             ` Alistair Popple
2023-11-27 10:31               ` Ryan Roberts
2023-11-28  2:09                 ` John Hubbard
2023-11-28  8:48                   ` David Hildenbrand
2023-11-28 12:15                     ` Ryan Roberts
2023-11-28 14:09                       ` David Hildenbrand
2023-11-28 15:34                         ` Ryan Roberts
2023-11-28 16:40                           ` David Hildenbrand
2023-11-28 18:39                           ` John Hubbard
2023-11-29  9:59                             ` Ryan Roberts
2023-11-29 19:46                               ` John Hubbard
2023-11-28  4:10               ` Matthew Wilcox
2023-11-28  4:05             ` Matthew Wilcox
2023-11-28  8:47               ` David Hildenbrand
