linux-mm.kvack.org archive mirror
From: Barry Song <21cnbao@gmail.com>
To: David Hildenbrand <david@redhat.com>
Cc: akpm@linux-foundation.org, justinjiang@vivo.com,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	opensource.kernel@vivo.com,  willy@infradead.org
Subject: Re: [PATCH v7] mm: shrink skip folio mapped by an exiting process
Date: Wed, 10 Jul 2024 16:02:32 +1200
Message-ID: <CAGsJ_4zkt5wKk-JhEpZgqpQgNK--50jwpZFK4E_eXgBpKkMKmQ@mail.gmail.com>
In-Reply-To: <dc2c3395-e514-40ad-b9d8-b76cf04ba0df@redhat.com>

On Wed, Jul 10, 2024 at 3:59 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 10.07.24 05:32, Barry Song wrote:
> > On Wed, Jul 10, 2024 at 9:23 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >>
> >> On Tue,  9 Jul 2024 20:31:15 +0800 Zhiguo Jiang <justinjiang@vivo.com> wrote:
> >>
> >>> The release of a non-shared anonymous folio mapped solely by an
> >>> exiting process may go through two flows: 1) the anonymous folio is
> >>> first swapped out to swap space and converted into a swp_entry in
> >>> shrink_folio_list; 2) the swp_entry is then released in the process
> >>> exit flow. This results in high CPU load when releasing a non-shared
> >>> anonymous folio mapped solely by an exiting process.
> >>>
> >>> This is likely to happen when system memory is low while a process is
> >>> exiting, because the non-shared anonymous folio mapped solely by the
> >>> exiting process may be reclaimed by shrink_folio_list.
> >>>
> >>> With this patch, shrink skips non-shared anonymous folios mapped solely
> >>> by an exiting process, so that such a folio is released directly in
> >>> the process exit flow. This saves swap-out time and reduces the load
> >>> of process exit.
> >>
> >> It would be helpful to provide some before-and-after runtime
> >> measurements, please.  It's a performance optimization so please let's
> >> see what effect it has.
> >
> > Hi Andrew,
> >
> > This was something I was curious about too, so I created a small test program
> > that allocates and continuously writes to 256MB of memory. Using QEMU, I set
> > up a small machine with only 300MB of RAM to trigger kswapd.
> >
> > qemu-system-aarch64 -M virt,gic-version=3,mte=off -nographic \
> >   -smp cpus=4 -cpu max \
> >   -m 300M -kernel arch/arm64/boot/Image
> >
> > The test program is killed by its child process after a random delay
> > (10 to 20 seconds) to trigger the use case of this patch.
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <unistd.h>
> > #include <string.h>
> > #include <sys/types.h>
> > #include <sys/wait.h>
> > #include <time.h>
> > #include <signal.h>
> >
> > #define MEMORY_SIZE (256 * 1024 * 1024)
> >
> > unsigned char *memory;
> >
> > /* Allocate 256MB and rewrite it forever to keep reclaim busy; never returns. */
> > void allocate_and_write_memory(void)
> > {
> >      memory = (unsigned char *)malloc(MEMORY_SIZE);
> >      if (memory == NULL) {
> >          perror("malloc");
> >          exit(EXIT_FAILURE);
> >      }
> >
> >      while (1)
> >          memset(memory, 0x11, MEMORY_SIZE);
> > }
> >
> > int main(void)
> > {
> >      pid_t pid;
> >      srand(time(NULL));
> >
> >      pid = fork();
> >
> >      if (pid < 0) {
> >          perror("fork");
> >          exit(EXIT_FAILURE);
> >      }
> >
> >      if (pid == 0) {
> >          /* child: wait 10000-19999 ms before killing the parent */
> >          int delay = (rand() % 10000) + 10000;
> >          usleep(delay * 1000);
> >
> >          /* kill parent when it is busy on swapping */
> >          kill(getppid(), SIGKILL);
> >          _exit(0);
> >      } else {
> >          allocate_and_write_memory();
> >
> >          /* not reached: the parent loops until the child kills it */
> >          wait(NULL);
> >
> >          free(memory);
> >      }
> >
> >      return 0;
> > }
> >
> > I tracked the number of folios that could be redundantly
> > swapped out by adding a simple counter as shown below:
> >
> > @@ -879,6 +880,9 @@ static bool folio_referenced_one(struct folio *folio,
> >                      check_stable_address_space(vma->vm_mm)) &&
> >                      folio_test_swapbacked(folio) &&
> >                      !folio_likely_mapped_shared(folio)) {
> > +                       static long i, size;
> > +                       size += folio_size(folio);
> > +                       pr_err("index: %ld skipped folio:%lx total size:%ld\n", i++, (unsigned long)folio, size);
> >                          pra->referenced = -1;
> >                          page_vma_mapped_walk_done(&pvmw);
> >                          return false;
> >
> >
> > This is what I have observed:
> >
> > / # /home/barry/develop/linux/skip_swap_out_test
> > [   82.925645] index: 0 skipped folio:fffffdffc0425400 total size:65536
> > [   82.925960] index: 1 skipped folio:fffffdffc0425800 total size:131072
> > [   82.927524] index: 2 skipped folio:fffffdffc0425c00 total size:196608
> > [   82.928649] index: 3 skipped folio:fffffdffc0426000 total size:262144
> > [   82.929383] index: 4 skipped folio:fffffdffc0426400 total size:327680
> > [   82.929995] index: 5 skipped folio:fffffdffc0426800 total size:393216
> > ...
> > [   88.469130] index: 6112 skipped folio:fffffdffc0390080 total size:97230848
> > [   88.469966] index: 6113 skipped folio:fffffdffc038d000 total size:97296384
> > [   89.023414] index: 6114 skipped folio:fffffdffc0366cc0 total size:97300480
> >
> > I observed that this patch effectively skipped 6115 folios (index 0 through
> > 6114, each either 4KB or 64KB mTHP), potentially reducing the swap-out by
> > up to about 92.8MB (97,300,480 bytes) during the process exit.
> >
> > Despite the numerous mistakes Zhiguo made in sending this patch, it is still
> > quite valuable. Please consider pulling his v9 into the mm tree for testing.
>
> BTW, we dropped the folio_test_anon() check, but what about shmem? They
> also do __folio_set_swapbacked()?

My point is that the purpose is to skip redundant swap-out; if a shmem folio
is singly mapped, it could also be skipped.

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry

