linux-mm.kvack.org archive mirror
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: David Hildenbrand <david@redhat.com>,
	akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, wangkefeng.wang@huawei.com,
	ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com,
	shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com,
	da.gomez@samsung.com, p.raghav@samsung.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/6] add mTHP support for anonymous shmem
Date: Fri, 31 May 2024 18:13:03 +0800
Message-ID: <db3517d0-54b1-4d3a-b798-1c13572d07be@linux.alibaba.com>
In-Reply-To: <f1783ff0-65bd-4b2b-8952-52b6822a0835@redhat.com>



On 2024/5/31 17:35, David Hildenbrand wrote:
> On 30.05.24 04:04, Baolin Wang wrote:
>> Anonymous pages already support multi-size THP (mTHP) allocation
>> through commit 19eaf44954df, which allows THP to be configured through
>> the sysfs interface located at
>> '/sys/kernel/mm/transparent_hugepage/hugepage-XXkb/enabled'.
>>
>> However, anonymous shmem ignores the anonymous mTHP rule configured
>> through the sysfs interface and can only use PMD-mapped THP, which is
>> not reasonable. Many users implement anonymous page sharing through
>> mmap(MAP_SHARED | MAP_ANONYMOUS), especially in database usage
>> scenarios. Users therefore expect to apply a unified mTHP strategy to
>> anonymous pages, including anonymous shared pages, in order to enjoy
>> the benefits of mTHP: for example, lower latency than PMD-mapped THP,
>> smaller memory bloat than PMD-mapped THP, and contiguous PTEs on the
>> ARM architecture to reduce TLB misses.
>>
>> The primary strategy is similar to supporting anonymous mTHP. Introduce
>> a new interface
>> '/sys/kernel/mm/transparent_hugepage/hugepage-XXkb/shmem_enabled',
>> which can have all the same values as the top-level
>> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', plus an additional
>> "inherit" option. By default all sizes will be set to "never" except
>> PMD size, which is set to "inherit". This ensures backward compatibility
>> with the top-level anonymous shmem setting, while also allowing
>> independent control of anonymous shmem for each mTHP size.
>>
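As a rough illustration of the "inherit" semantics described in the cover letter above, here is a minimal Python sketch. It is a user-space model only, not the kernel implementation: the function names, the example size list, and the 2048 kB PMD size (typical with 4K base pages) are assumptions made for illustration, and any interaction between the global and per-size settings beyond plain inheritance is deliberately left out.

```python
# Values accepted by the top-level shmem_enabled file.
TOP_LEVEL_VALUES = {"always", "within_size", "advise", "never", "deny", "force"}


def default_per_size(sizes_kb, pmd_size_kb=2048):
    """Model the proposed defaults: "never" everywhere, "inherit" for PMD size.

    pmd_size_kb=2048 is an assumption (4K base pages); it differs per platform.
    """
    return {size: ("inherit" if size == pmd_size_kb else "never")
            for size in sizes_kb}


def effective_policy(per_size_value, top_level_value):
    """Resolve one per-size setting against the top-level setting."""
    if top_level_value not in TOP_LEVEL_VALUES:
        raise ValueError(f"unknown top-level value: {top_level_value!r}")
    if per_size_value == "inherit":
        # The per-size knob defers entirely to the top-level knob.
        return top_level_value
    if per_size_value not in TOP_LEVEL_VALUES:
        raise ValueError(f"unknown per-size value: {per_size_value!r}")
    return per_size_value


if __name__ == "__main__":
    defaults = default_per_size([64, 128, 2048])  # hypothetical size list
    for size_kb, value in defaults.items():
        print(size_kb, effective_policy(value, "always"))
```

With these defaults only the PMD-sized entry follows the top-level knob, which is how the existing behaviour is preserved until an administrator opts a smaller size in.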
>> Use the page fault latency tool to measure the performance of 1G
>> anonymous shmem with 32 threads on my machine (ARM64 architecture,
>> 32 cores, 125G memory):
>> base: mm-unstable
>> user-time    sys_time    faults_per_sec_per_cpu     faults_per_sec
>> 0.04s        3.10s         83516.416                  2669684.890
>>
>> mm-unstable + patchset, anon shmem mTHP disabled
>> user-time    sys_time    faults_per_sec_per_cpu     faults_per_sec
>> 0.02s        3.14s         82936.359                  2630746.027
>>
>> mm-unstable + patchset, anon shmem 64K mTHP enabled
>> user-time    sys_time    faults_per_sec_per_cpu     faults_per_sec
>> 0.08s        0.31s         678630.231                 17082522.495
>>
>> From the data above, the patchset has a minimal impact when mTHP is
>> not enabled (some fluctuations were observed during testing). When
>> 64K mTHP is enabled, page fault latency improves significantly.
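To put numbers on that comparison, the short Python sketch below works out the ratios from the faults_per_sec and sys_time columns quoted above. The figures are copied from the three runs; the variable and function names are illustrative only, and a single run per configuration says nothing about variance.

```python
# faults_per_sec from the three quoted runs
BASE = 2669684.890              # mm-unstable
PATCHED_MTHP_OFF = 2630746.027  # patchset applied, anon shmem mTHP disabled
PATCHED_MTHP_64K = 17082522.495 # patchset applied, 64K mTHP enabled

# sys_time (seconds) for the base run and the 64K mTHP run
BASE_SYS_TIME = 3.10
MTHP_64K_SYS_TIME = 0.31


def relative_change(new, old):
    """Fractional change of new relative to old (negative means lower)."""
    return (new - old) / old


overhead_when_disabled = relative_change(PATCHED_MTHP_OFF, BASE)
throughput_ratio_64k = PATCHED_MTHP_64K / BASE
sys_time_ratio = BASE_SYS_TIME / MTHP_64K_SYS_TIME

if __name__ == "__main__":
    print(f"mTHP disabled vs base: {overhead_when_disabled:+.2%}")
    print(f"64K mTHP vs base:      {throughput_ratio_64k:.2f}x faults/sec")
    print(f"sys time reduction:    {sys_time_ratio:.1f}x")
```

On these figures the disabled case is roughly 1.5% below base, and the 64K case handles about 6.4 times as many faults per second with about a tenth of the system time.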
> 
> Let me summarize the takeaway from the bi-weekly MM meeting as I 
> understood it, that includes Hugh's feedback on per-block tracking vs. 

Thanks David for the summarization.

> mTHP:
> 
> (1) Per-block tracking
> 
> Per-block tracking is currently considered unwarranted complexity in 
> shmem.c. We should try to get it done without that. For any test cases 
> that fail, we should consider if they are actually valid for shmem.
> 
> To optimize FALLOC_FL_PUNCH_HOLE for the cases where splitting+freeing
> is not possible at fallocate() time, detecting zeropages later and
> retrying to split+free might be an option, without per-block tracking.
> 
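The "detect zeropages later" idea in point (1) can be pictured with a small user-space analogy in Python. This is not how shmem.c would do it: the 4096-byte page size, the function names, and the byte-buffer view of a large folio are all assumptions, chosen only to show that zero-filled sub-pages can be found by scanning contents, without keeping per-block state.

```python
PAGE_SIZE = 4096  # assumed base page size


def zero_subpages(folio, page_size=PAGE_SIZE):
    """Return the indexes of sub-pages in `folio` that contain only zero bytes."""
    if len(folio) % page_size:
        raise ValueError("folio length must be a multiple of the page size")
    zero_page = bytes(page_size)
    return [offset // page_size
            for offset in range(0, len(folio), page_size)
            if folio[offset:offset + page_size] == zero_page]


def fully_zero(folio, page_size=PAGE_SIZE):
    """True when every sub-page is zero-filled, so the whole folio could go."""
    return len(zero_subpages(folio, page_size)) == len(folio) // page_size


if __name__ == "__main__":
    folio = bytearray(16 * PAGE_SIZE)   # a 64K folio, all zeroes
    folio[5 * PAGE_SIZE + 1] = 0xFF     # dirty one byte in sub-page 5
    print(zero_subpages(bytes(folio)))  # every index except 5
```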
> (2) mTHP controls
> 
> As a default, we should not be using large folios / mTHP for any shmem, 
> just like we did with THP via shmem_enabled. This is what this series 
> currently does, and is part of the whole mTHP user-space interface design.
> 
> Further, the mTHP controls should control all of shmem, not only 
> "anonymous shmem".

Yes, that's what I thought, and it is on my TODO list.

> 
> Also, we should properly fall back within the configured sizes, and not 
> jump "over" configured sizes. Unless there is a good reason.
> 
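One way to read the fallback rule in point (2) is sketched below in Python: only orders that are actually enabled are ever attempted, from largest to smallest, before dropping to a base page. This is a model under stated assumptions, not the series' code: the function names are hypothetical, `try_alloc` stands in for whatever allocation attempt the kernel makes, and order 0 is assumed to be always available as the final fallback.

```python
def candidate_orders(enabled_orders, max_order):
    """Enabled orders that fit within max_order, largest first."""
    return sorted((o for o in set(enabled_orders) if 0 < o <= max_order),
                  reverse=True)


def allocate_with_fallback(enabled_orders, max_order, try_alloc):
    """Try each enabled order in turn; never attempt an order that is not enabled.

    try_alloc(order) is a caller-supplied callable returning True on success.
    Returns the order that succeeded, or 0 for the base-page fallback.
    """
    for order in candidate_orders(enabled_orders, max_order):
        if try_alloc(order):
            return order
    return 0


if __name__ == "__main__":
    attempted = []

    def try_alloc(order):
        attempted.append(order)
        return order <= 4  # pretend anything above order 4 fails

    # Orders 9 (2M) and 4 (64K) enabled, assuming 4K base pages.
    print(allocate_with_fallback({9, 4}, 9, try_alloc), attempted)
```

In the usage example the failed order-9 attempt is followed directly by order 4; orders 5 through 8 are never tried because they were not configured.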
> (3) khugepaged
> 
> khugepaged needs to handle larger folios properly as well. Until fixed, 
> using smaller THP sizes as fallback might prohibit collapsing a 
> PMD-sized THP later. But really, khugepaged needs to be fixed to handle 
> that.
> 
> (4) force/disable
> 
> These settings are rather testing artifacts from the old ages. We should 
> not add them to the per-size toggles. We might "inherit" it from the 
> global one, though.

Sorry, I missed this. So I should remove the 'force' and 'deny' options 
for each mTHP size, right?

> 
> "within_size" might have value, and especially for consistency, we 
> should have them per size.
> 
> 
> 
> So, this series only tackles anonymous shmem, which is a good starting 
> point. Ideally, we'd get support for other shmem (especially during 
> fault time) soon afterwards, because we won't be adding separate toggles 
> for that from the interface POV, and having inconsistent behavior 
> between kernel versions would be a bit unfortunate.
> 
> 
> @Baolin, this series likely does not consider (4) yet. And I suggest we 
> have to take a lot of the "anonymous thp" terminology out of this 
> series, especially when it comes to documentation.

Sure. I will remove the "anonymous thp" terminology from the 
documentation, but I still want to keep it in the commit message, because 
I want to start with anonymous shmem.

> 
> @Daniel, Pankaj, what are your plans regarding that? It would be great 
> if we could get an understanding on the next steps on !anon shmem.
> 

