linux-mm.kvack.org archive mirror
From: "Huang, Ying" <ying.huang@intel.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>,
	 David Hildenbrand <david@redhat.com>,
	 akpm@linux-foundation.org,  shuah@kernel.org,
	linux-mm@kvack.org,  chrisl@kernel.org,  hughd@google.com,
	kaleshsingh@google.com,  kasong@tencent.com,
	linux-kernel@vger.kernel.org,  linux-kselftest@vger.kernel.org,
	 Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH] selftests/mm: Introduce a test program to assess swap entry allocation for thp_swapout
Date: Mon, 24 Jun 2024 14:59:09 +0800	[thread overview]
Message-ID: <871q4m25du.fsf@yhuang6-desk2.ccr.corp.intel.com> (raw)
In-Reply-To: <CAGsJ_4y9JinvzA6Wd2aXe_FRYhxED0vkkvU2HwWW8WBEX+8oqw@mail.gmail.com> (Barry Song's message of "Mon, 24 Jun 2024 16:05:44 +1200")

Barry Song <21cnbao@gmail.com> writes:

> On Mon, Jun 24, 2024 at 3:44 PM Huang, Ying <ying.huang@intel.com> wrote:
>>
>> Barry Song <21cnbao@gmail.com> writes:
>>
>> > On Fri, Jun 21, 2024 at 9:24 PM Huang, Ying <ying.huang@intel.com> wrote:
>> >>
>> >> Barry Song <21cnbao@gmail.com> writes:
>> >>
>> >> > On Fri, Jun 21, 2024 at 7:25 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>> >> >>
>> >> >> On 20/06/2024 12:34, David Hildenbrand wrote:
>> >> >> > On 20.06.24 11:04, Ryan Roberts wrote:
>> >> >> >> On 20/06/2024 01:26, Barry Song wrote:
>> >> >> >>> From: Barry Song <v-songbaohua@oppo.com>
>> >> >> >>>
>> >> >> >>> Both Ryan and Chris have been utilizing the small test program to aid
>> >> >> >>> in debugging and identifying issues with swap entry allocation. While
>> >> >> >>> a real or intricate workload might be more suitable for assessing the
>> >> >> >>> correctness and effectiveness of the swap allocation policy, a small
>> >> >> >>> test program presents a simpler means of understanding the problem and
>> >> >> >>> initially verifying the improvements being made.
>> >> >> >>>
>> >> >> >>> Let's endeavor to integrate it into the self-test suite. Although it
>> >> >> >>> presently only accommodates 64KB and 4KB, I'm optimistic that we can
>> >> >> >>> expand its capabilities to support multiple sizes and simulate more
>> >> >> >>> complex systems in the future as required.
>> >> >> >>
>> >> >> >> I'll try to summarize the thread with Huang Ying by suggesting this test program
>> >> >> >> is "necessary but not sufficient" to exhaustively test the mTHP swap-out path.
>> >> >> >> I've certainly found it useful and think it would be a valuable addition to the
>> >> >> >> tree.
>> >> >> >>
>> >> >> >> That said, I'm not convinced it is a selftest; IMO a selftest should provide a
>> >> >> >> clear pass/fail result against some criteria and must be able to be run
>> >> >> >> automatically by (e.g.) a CI system.
>> >> >> >
>> >> >> > Likely we should then consider moving other such performance-related thingies
>> >> >> > out of the selftests?
>> >> >>
>> >> >> Yes, that would get my vote. But of the 4 tests you mentioned that use
>> >> >> clock_gettime(), it looks like transhuge-stress is the only one that doesn't
>> >> >> have a pass/fail result, so is probably the only candidate for moving.
>> >> >>
>> >> >> The others either use the time as a timeout and report failure if the
>> >> >> action didn't occur within the timeout (e.g. ksm_tests.c) or use it to add some
>> >> >> supplemental performance information to an otherwise functionality-oriented test.
>> >> >
>> >> > Thank you very much, Ryan. I think you've found a better home for this
>> >> > tool. I will send v2, relocating it to tools/mm and adding an option,
>> >> > "-a" (aligned swap-in), to swap in either whole mTHPs or a portion of
>> >> > them.
>> >> >
>> >> > So basically, we will have
>> >> >
>> >> > 1. Use MADV_PAGEOUT for rapid swap-out, exercising the swap allocation
>> >> > code heavily in a short time.
>> >> >
>> >> > 2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in freeing
>> >> > memory, as well as for munmap, app exits, or OOM killer scenarios. This ensures
>> >> > new mTHP is always generated, released or swapped out, similar to the behavior
>> >> > on a PC or Android phone where many applications are frequently started and
>> >> > terminated.
>> >>
>> >> Calling MADV_DONTNEED on 64KB of memory and then memset()ing it
>> >> simulates exactly the large folio swap-in, which hasn't been merged
>> >> upstream.  I don't think that this kind of trick is a good idea.
>> >
>> > I disagree. This is how userspace heaps can manage memory
>> > deallocation.
>>
>> Sorry, I don't understand how.  Can you show some examples?  Such as
>> strace log with 64KB aligned MADV_DONTNEED?
>
> In Java heap and memory allocators such as jemalloc and Scudo, memory is freed
> using the MADV_DONTNEED flag when either free() is called or garbage collection
> occurs. In Android, the Java heap is freed in chunks aligned to 64KB
> or larger.

Originally, I had heard that jemalloc uses MADV_FREE.  Now I know that
it uses MADV_DONTNEED too.  Thanks!

I still doubt that the libc/Java allocators free pages in exactly 64KB
units (IIUC, they should free pages in much larger chunks).  However, I
agree that MADV_DONTNEED is a way to create fragmentation in swap
devices.
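
To make the fragmentation point concrete, here is a minimal sketch; it
is not taken from the patch under review.  It maps a 64KB-aligned
region, touches it, and then releases every second 64KB chunk with
MADV_DONTNEED, the way an allocator with a 64KB management granularity
might.  CHUNK_SIZE and the helper names map_aligned_chunks(),
touch_chunks() and free_alternate_chunks() are assumptions made for
illustration.  If the chunks had been swapped out beforehand, the
released ones would leave holes between the swap entries still in use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define CHUNK_SIZE (64UL * 1024)	/* assumed mTHP size under test */

/*
 * Map nr_chunks chunks of anonymous memory whose start is aligned to
 * CHUNK_SIZE.  Returns NULL on failure.  The mapping is deliberately
 * never unmapped in this sketch.
 */
unsigned char *map_aligned_chunks(size_t nr_chunks)
{
	size_t len = nr_chunks * CHUNK_SIZE;
	uintptr_t start;
	void *raw;

	assert(nr_chunks > 0);
	/* Over-allocate by one chunk so that an aligned start always fits. */
	raw = mmap(NULL, len + CHUNK_SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;
	start = ((uintptr_t)raw + CHUNK_SIZE - 1) & ~(uintptr_t)(CHUNK_SIZE - 1);
	return (unsigned char *)start;
}

/* Write to every byte so that each chunk is backed by real memory. */
void touch_chunks(unsigned char *base, size_t nr_chunks, int value)
{
	memset(base, value, nr_chunks * CHUNK_SIZE);
}

/*
 * Release chunks 0, 2, 4, ... with MADV_DONTNEED and keep the odd ones.
 * Returns the number of chunks released, or -1 if madvise() fails.
 */
int free_alternate_chunks(unsigned char *base, size_t nr_chunks)
{
	int released = 0;
	size_t i;

	for (i = 0; i < nr_chunks; i += 2) {
		if (madvise(base + i * CHUNK_SIZE, CHUNK_SIZE, MADV_DONTNEED))
			return -1;
		released++;
	}
	return released;
}
```

A caller would map, say, 8 chunks, call touch_chunks(), and then call
free_alternate_chunks().  The released chunks read back as zero on the
next access, while the odd-numbered chunks keep their contents.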

> In Scudo and jemalloc, there is a configuration option to set the
> management granularity. This granularity is set to match the mTHP size
> (though the default value is 16KB in the latest Android if we don't run
> mTHP). Otherwise, you could end up with millions of partial unmap
> operations, which would severely degrade the performance of mTHP.
>
> Imagine libc/Java functioning like a slab allocator. When kfree() is
> called, some pages may become completely unoccupied and can be returned
> to the buddy allocator. In userspace, memory is given back to the kernel
> in a similar manner, typically using MADV_DONTNEED. Therefore,
> MADV_DONTNEED is the most common memory reclamation behavior in Android,
> coming with free(), delete or GC.
>
> Imagine a system with extensive malloc, free, new, and delete operations,
> where objects are constantly being created and destroyed.
>
> On the other hand, whether libc/Java use MADV_DONTNEED to free memory is not
> crucial, although they do. We need a method to simulate the lifecycle of
> applications (exiting and starting anew) on PCs or Android phones. It doesn't
> matter whether you use MADV_DONTNEED or munmap to achieve this.
>
> It is important to note that mTHP currently operates on a one-shot basis
> (after swap-out, you never get them back as mTHP, as we don't support large
> folio swap-in). For the test program, we need a method to generate new mTHPs
> continuously. Without this, after the initial iterations, we would be left
> with only small folios, rendering the entire test program *pointless*.

I understand the requirements for new mTHPs.
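
For reference, the cycle being discussed can be sketched roughly as
below.  This is an illustration only, not the test program from the
patch: run_cycles() and struct cycle_stats are hypothetical names, and
the fallback values for MADV_PAGEOUT and MADV_HUGEPAGE assume the
asm-generic numbering in case older libc headers lack them.  Whether
the refaulted memory really consists of 64KB folios depends on the mTHP
settings under /sys/kernel/mm/transparent_hugepage/, which the sketch
does not check.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14	/* asm-generic value, assumed for old headers */
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21		/* asm-generic value, assumed for old headers */
#endif

struct cycle_stats {
	int iterations;		/* completed touch/pageout/release cycles */
	int pageout_failures;	/* MADV_PAGEOUT calls that returned an error */
};

/*
 * Repeat nr_iters times:
 *   1. write to the whole buffer so that fresh folios are faulted in,
 *   2. ask the kernel to reclaim them with MADV_PAGEOUT (swap-out),
 *   3. drop the range with MADV_DONTNEED so that any swap entries are
 *      freed and the next write allocates new folios again.
 * A MADV_PAGEOUT failure is only counted, because it is a hint that old
 * kernels or restricted environments may reject.  Returns 0 on success
 * and -1 if MADV_DONTNEED fails.
 */
int run_cycles(unsigned char *buf, size_t len, int nr_iters,
	       struct cycle_stats *st)
{
	int i;

	assert(buf != NULL && st != NULL);
	st->iterations = 0;
	st->pageout_failures = 0;

	/* Best-effort request for huge pages; the result is ignored. */
	(void)madvise(buf, len, MADV_HUGEPAGE);

	for (i = 0; i < nr_iters; i++) {
		memset(buf, 0x5a, len);
		if (madvise(buf, len, MADV_PAGEOUT))
			st->pageout_failures++;
		if (madvise(buf, len, MADV_DONTNEED))
			return -1;
		st->iterations++;
	}
	return 0;
}
```

Without step 3, the pages faulted back in after the first swap-out
would be small folios, so later iterations would no longer exercise
large swap entry allocation.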

>>
>> > Additionally, in the event of an application exit, munmap, or OOM killer, the
>> > amount of freed memory can be much larger than 64KB. The primary purpose
>> > of using MADV_DONTNEED is to release anonymous memory and generate
>> > new mTHP so that the iteration can continue. Otherwise, the test program
>> > becomes entirely pointless, as we only have large folios at the beginning.
>> > That is exactly why Chris has failed to find his bugs by using other small
>> > programs.
>>
>> I still don't understand how 64KB-aligned MADV_DONTNEED is used for
>> libc/Java heaps or munmap in a practical way.  After more thought, though,
>> I think 64KB-aligned MADV_DONTNEED can simulate, to some degree, the
>> fragmentation effect of process exit if the 64KB folios in these
>> processes are swapped out without splitting.  If you have no other
>> practical use cases, I suggest making that explicit with comments in the
>> program.
>>

[snip]

--
Best Regards,
Huang, Ying


