From: David Hildenbrand <david@redhat.com>
To: Yang Shi <shy828301@gmail.com>, Yu Zhao <yuzhao@google.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	Jonathan Corbet <corbet@lwn.net>
Subject: Re: [Chapter Three] THP HVO: bring the hugeTLB feature to THP
Date: Fri, 1 Mar 2024 16:42:13 +0100
Message-ID: <4c748e61-7f0a-4681-a2c9-48347a700ba7@redhat.com>
In-Reply-To: <CAHbLzkrmT7=HYimU8f0BcvsjQ=GM2bQdLGRohNeXcnCJoNzrCQ@mail.gmail.com>

On 29.02.24 23:54, Yang Shi wrote:
> On Thu, Feb 29, 2024 at 10:34 AM Yu Zhao <yuzhao@google.com> wrote:
>>
>> HVO can be one of the perks for heavy THP users like it is for hugeTLB
>> users. For example, if such a user uses 60% of physical memory for 2MB
>> THPs, THP HVO can reduce the struct page overhead by half (60% * 7/8
>> ~= 50%).
>>
>> ZONE_NOMERGE considerably simplifies the implementation of HVO for
>> THPs, since THPs from it cannot be split or merged and thus do not
>> require any correctness-related operations on tail pages beyond the
>> second one.
>>
>> If a THP is mapped by PTEs, two optimization-related operations on its
>> tail pages, i.e., _mapcount and PG_anon_exclusive, can be binned to
>> track a group of pages, e.g., eight pages per group for 2MB THPs. The
>> estimation, as the copying cost incurred during shattering, is also by
>> design, since mapping by PTEs is another discouraged behavior.
> 
> I'm confused by this. Can you please elaborate a little bit about
> binning mapcount and PG_anon_exclusive?
> 
> For mapcount, IIUC, for example, when inc'ing a subpage's mapcount,
> you actually inc the (i % 64) page's mapcount (assuming THP size is 2M
> and base page size is 4K, so 8 strides and 64 pages in each stride),
> right? But how can you tell whether each of the 8 pages has mapcount 1
> or one page is mapped 8 times? Or this actually doesn't matter, we don't
> even care to distinguish the two cases?
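
To make the ambiguity in the question above concrete, here is a minimal
userspace sketch. It assumes the (i % 64) binning exactly as Yang describes
it; the struct and all function names are hypothetical and are not taken
from the TAO patches or from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_THP 512 /* 2MB THP / 4KB base pages */
#define NR_BINS        64 /* assumed: one counter shared by 8 subpages */

/* Hypothetical binned counters; not the kernel's struct page layout. */
struct binned_mapcount {
	int count[NR_BINS];
};

/* Subpages i, i + 64, i + 128, ... all share counter (i % 64). */
static size_t bin_of(size_t subpage)
{
	assert(subpage < PAGES_PER_THP);
	return subpage % NR_BINS;
}

static void map_subpage(struct binned_mapcount *m, size_t subpage)
{
	m->count[bin_of(subpage)]++;
}

/* Case A: the 8 subpages that share bin 0 are each mapped once. */
static int eight_pages_mapped_once(void)
{
	struct binned_mapcount m = { { 0 } };

	for (size_t i = 0; i < PAGES_PER_THP; i += NR_BINS)
		map_subpage(&m, i);
	return m.count[0];
}

/* Case B: subpage 0 alone is mapped 8 times. */
static int one_page_mapped_eight_times(void)
{
	struct binned_mapcount m = { { 0 } };

	for (int n = 0; n < 8; n++)
		map_subpage(&m, 0);
	return m.count[0];
}
```

Both helpers leave the same value in bin 0, so under this assumed scheme
the counter alone cannot tell the two cases apart.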

I'm hoping we won't need such elaborate approaches that make the 
mapcounts even more complicated in the future.

Just like for hugetlb HGM (if it ever becomes real), I'm hoping that we 
can just avoid subpage mapcounts completely, at least in some kernel 
configs initially.

I was looking into having only a single PAE bit this week, but
migration+swapout are (again) giving me a really hard time. In theory
it's simple, but the corner cases are killing me.

What I really dislike about the PAE bits right now is not necessarily the
space, but that they reside in multiple cachelines and that we have to use
atomic operations to set/clear them simply because other page flags
might be set concurrently. PAE can only be set/cleared while holding the
page table lock already, so I really want to avoid atomics.
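
A minimal userspace sketch of why a flag sharing a word with other flags
needs an atomic update. It uses C11 atomics rather than the kernel's bitops,
simulates the concurrent writer deterministically in a single thread, and
both flag names are stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

#define FLAG_OTHER (1UL << 0) /* stand-in for some unrelated page flag */
#define FLAG_PAE   (1UL << 1) /* stand-in for PG_anon_exclusive */

static_assert(FLAG_OTHER != FLAG_PAE, "flags must use distinct bits");

/*
 * Plain read-modify-write, with a concurrent writer simulated between
 * the load and the store. The stale store wipes out FLAG_OTHER.
 */
static unsigned long plain_rmw_with_race(void)
{
	unsigned long flags = 0;
	unsigned long snapshot = flags; /* "CPU 0" loads the word */

	flags |= FLAG_OTHER;            /* "CPU 1" sets another flag */
	flags = snapshot | FLAG_PAE;    /* "CPU 0" stores a stale value */
	return flags;
}

/* Atomic fetch-or updates only the requested bit; both bits survive. */
static unsigned long atomic_rmw(void)
{
	atomic_ulong flags = 0;

	atomic_fetch_or(&flags, FLAG_OTHER);
	atomic_fetch_or(&flags, FLAG_PAE);
	return atomic_load(&flags);
}
```

If the bit instead lived in a word that is only ever written under one
lock, the plain store would be safe, which is the motivation stated above.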

I have not given up on a single PAE bit per folio, but the alternative I 
was thinking about this week was simply allocating the space required 
for maintaining them and storing a pointer to that in the (anon) folio. 
Not perfect.
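
A rough userspace sketch of that alternative, assuming one bit per subpage
in a separately allocated bitmap. The demo_folio struct and every function
name are hypothetical, calloc() stands in for whatever allocator would be
used, and the caller is assumed to hold the page table lock, so plain
(non-atomic) bit operations are used.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for an anon folio; not the kernel's struct folio. */
struct demo_folio {
	size_t nr_pages;
	unsigned long *pae_bits; /* side allocation, one bit per subpage */
};

static bool demo_folio_init(struct demo_folio *f, size_t nr_pages)
{
	size_t words = (nr_pages + BITS_PER_WORD - 1) / BITS_PER_WORD;

	f->nr_pages = nr_pages;
	f->pae_bits = calloc(words, sizeof(unsigned long));
	return f->pae_bits != NULL;
}

/* Plain stores: the caller is assumed to serialize via a lock. */
static void demo_set_pae(struct demo_folio *f, size_t idx, bool on)
{
	unsigned long mask = 1UL << (idx % BITS_PER_WORD);

	assert(idx < f->nr_pages);
	if (on)
		f->pae_bits[idx / BITS_PER_WORD] |= mask;
	else
		f->pae_bits[idx / BITS_PER_WORD] &= ~mask;
}

static bool demo_test_pae(const struct demo_folio *f, size_t idx)
{
	assert(idx < f->nr_pages);
	return (f->pae_bits[idx / BITS_PER_WORD] >> (idx % BITS_PER_WORD)) & 1UL;
}

static void demo_folio_destroy(struct demo_folio *f)
{
	free(f->pae_bits);
	f->pae_bits = NULL;
}
```

For a 512-page folio this costs 64 bytes of bitmap plus the pointer, and
the bits no longer share a word with unrelated page flags.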

-- 
Cheers,

David / dhildenb


