From: Yin Fengwei <fengwei.yin@intel.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org,
	willy@infradead.org, mike.kravetz@oracle.com,
	sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
	jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v4 0/5] batched remove rmap in try_to_unmap_one()
Date: Mon, 13 Mar 2023 20:45:21 +0800	[thread overview]
Message-ID: <20230313124526.1207490-1-fengwei.yin@intel.com> (raw)

This series brings batched rmap removal to try_to_unmap_one().
Removing the rmap for a whole range at once is expected to perform
better than removing it one page at a time.

The series restructures try_to_unmap_one() from:
  loop:
     clear and update PTE
     unmap one page
     goto loop
to:
  loop:
     clear and update PTE
     goto loop
  unmap the range of the folio in one call
This is one step toward always mapping/unmapping the entire folio in
one call, which simplifies the folio mapcount handling by avoiding
per-page map/unmap accounting.
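
Expressed with function names, the change looks roughly like this (a
simplified sketch of the control flow, not the literal diff; start and
nr_pages are illustrative, and the folio_remove_rmap_range() arguments
are paraphrased from patch 4):

  /* Before: rmap and refcount dropped per page inside the PTE walk. */
  while (page_vma_mapped_walk(&pvmw)) {
          /* ... clear and flush the PTE, install swap/migration entry ... */
          page_remove_rmap(subpage, vma, false);
          folio_put(folio);
  }

  /* After: the walk only clears PTEs and counts the pages ... */
  while (page_vma_mapped_walk(&pvmw)) {
          /* ... clear and flush the PTE, install swap/migration entry ... */
          nr_pages++;
  }
  /* ... and rmap and refcount are dropped once for the whole range. */
  folio_remove_rmap_range(folio, start, nr_pages, vma);
  folio_ref_sub(folio, nr_pages);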


The changes are organized as follows:

Patches 1 and 2 move the hugetlb and regular page unmap paths into
dedicated functions, to make the try_to_unmap_one() logic clearer and
easier to extend with batched rmap removal. To keep code review easy,
there is no functional change.

Patch 3 cleans up try_to_unmap_one_page() and removes some duplicated
function calls.

Patch 4 adds folio_remove_rmap_range(), which removes the rmap for a
range of pages of a folio in one call (see the interface sketch below).

Patch 5 makes try_to_unmap_one() use the batched rmap removal.
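
For reference, a rough sketch of the new interface (the parameter
names here are my paraphrase; the authoritative prototype is in
patch 4):

  /*
   * Remove the rmap for nr consecutive pages of folio, starting at
   * page, that are mapped into vma. The statistics updates
   * (__mod_lruvec_page_state() and friends) are done once for the
   * whole range instead of once per page.
   */
  void folio_remove_rmap_range(struct folio *folio, struct page *page,
                               int nr, struct vm_area_struct *vma);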

Functional testing was done with the v3 patchset in a qemu guest with
4G of memory:
  - kernel mm selftests, to exercise the vmscan path and eventually
    hit try_to_unmap_one().
  - hwpoison injection into a hugetlb page, to trigger the
    try_to_unmap_one() call against hugetlb.
  - 8 hours of stress testing: Firefox + kernel mm selftests + kernel
    build.

To demonstrate the performance gain, MADV_PAGEOUT was changed not to
split large page cache folios, and a micro-benchmark along the
following lines was used:

        #include <fcntl.h>
        #include <sys/mman.h>
        #include <unistd.h>

        #define FILESIZE (2 * 1024 * 1024)

        int main(int argc, char **argv)
        {
                long pgsize = sysconf(_SC_PAGESIZE);
                unsigned long i, count = 0;
                volatile char cc;
                int fd = open(argv[1], O_RDWR); /* one file per instance */
                char *c = mmap(NULL, FILESIZE, PROT_READ|PROT_WRITE,
                               MAP_PRIVATE, fd, 0);

                while (1) {
                        /* Fault every page of the 2M mapping in. */
                        for (i = 0; i < FILESIZE; i += pgsize)
                                cc = *(volatile char *)(c + i);
                        /* Page the range back out, hitting try_to_unmap_one(). */
                        madvise(c, FILESIZE, MADV_PAGEOUT);
                        count++;        /* reported metric */
                }
                munmap(c, FILESIZE);    /* unreachable; run killed after 1s */
                return 0;
        }
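
(The MADV_PAGEOUT change mentioned above is a test-only kernel tweak:
the pageout path is made to skip splitting large page cache folios, so
that try_to_unmap_one() sees whole 2M folios. It is not part of this
series.)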

The benchmark was run for 1 second with 96 instances against 96 files
on an xfs filesystem. The test platform was Ice Lake with 48C/96T +
192G memory.

The result (the loop count reported by the benchmark) improved by
around 7% (58865 -> 63247) with this patch series. perf shows the
following:

Without this series:
18.26%--try_to_unmap_one
        |          
        |--10.71%--page_remove_rmap
        |          |          
        |           --9.81%--__mod_lruvec_page_state
        |                     |          
        |                     |--1.36%--__mod_memcg_lruvec_state
        |                     |          |          
        |                     |           --0.80%--cgroup_rstat_updated
        |                     |          
        |                      --0.67%--__mod_lruvec_state
        |                                |          
        |                                 --0.59%--__mod_node_page_state
        |          
        |--5.41%--ptep_clear_flush
        |          |          
        |           --4.64%--flush_tlb_mm_range
        |                     |          
        |                      --3.88%--flush_tlb_func
        |                                |          
        |                                 --3.56%--native_flush_tlb_one_user
        |          
        |--0.75%--percpu_counter_add_batch
        |          
         --0.53%--PageHeadHuge

With this series:
9.87%--try_to_unmap_one
        |          
        |--7.14%--try_to_unmap_one_page.constprop.0.isra.0
        |          |          
        |          |--5.21%--ptep_clear_flush
        |          |          |          
        |          |           --4.36%--flush_tlb_mm_range
        |          |                     |          
        |          |                      --3.54%--flush_tlb_func
        |          |                                |          
        |          |                                 --3.17%--native_flush_tlb_one_user
        |          |          
        |           --0.82%--percpu_counter_add_batch
        |          
        |--1.18%--folio_remove_rmap_and_update_count.part.0
        |          |          
        |           --1.11%--folio_remove_rmap_range
        |                     |          
        |                      --0.53%--__mod_lruvec_page_state
        |          
         --0.57%--PageHeadHuge

As expected, the cost of __mod_lruvec_page_state() is reduced
significantly by the batched folio_remove_rmap_range(). The page
reclaim path should get the same benefit.


This series is based on next-20230310.

Changes from v3:
  - General
    - Rebase to next-20230310
    - Add performance testing result

  - Patch1
    - Fixed incorrect comments as Mike Kravetz pointed out
    - Use huge_pte_dirty() as Mike Kravetz suggested
    - Use true instead of folio_test_hugetlb() in
      try_to_unmap_one_hugetlb(), since the page there is guaranteed
      to be a hugetlb page, as Mike Kravetz suggested

Changes from v2:
  - General
    - Rebase the patch to next-20230303
    - Update cover letter about the preparation to unmap
      the entire folio in one call
    - No code change compared to v2, but fixed the patch apply
      conflict caused by the wrong patch order in v2.

Changes from v1:
  - General
    - Rebase the patch to next-20230228

  - Patch1
    - Removed the if (PageHWPoison(page) && !(flags & TTU_HWPOISON))
      check, as suggested by Mike Kravetz and HORIGUCHI NAOYA
    - Removed the mlock_drain_local() call, as suggested by Mike Kravetz
    - Removed the comments about the mm counter change, as suggested
      by Mike Kravetz

Yin Fengwei (5):
  rmap: move hugetlb try_to_unmap to dedicated function
  rmap: move page unmap operation to dedicated function
  rmap: cleanup exit path of try_to_unmap_one_page()
  rmap: add folio_remove_rmap_range()
  try_to_unmap_one: batched remove rmap, update folio refcount

 include/linux/rmap.h |   5 +
 mm/page_vma_mapped.c |  30 +++
 mm/rmap.c            | 623 +++++++++++++++++++++++++------------------
 3 files changed, 398 insertions(+), 260 deletions(-)

-- 
2.30.2



Thread overview: 17+ messages
2023-03-13 12:45 Yin Fengwei [this message]
2023-03-13 12:45 ` [PATCH v4 1/5] rmap: move hugetlb try_to_unmap to dedicated function Yin Fengwei
2023-03-13 12:45 ` [PATCH v4 2/5] rmap: move page unmap operation " Yin Fengwei
2023-03-13 12:45 ` [PATCH v4 3/5] rmap: cleanup exit path of try_to_unmap_one_page() Yin Fengwei
2023-03-13 12:45 ` [PATCH v4 4/5] rmap: add folio_remove_rmap_range() Yin Fengwei
2023-03-13 12:45 ` [PATCH v4 5/5] try_to_unmap_one: batched remove rmap, update folio refcount Yin Fengwei
2023-03-13 18:49 ` [PATCH v4 0/5] batched remove rmap in try_to_unmap_one() Andrew Morton
2023-03-14  3:09   ` Yin Fengwei
2023-03-14  9:16     ` David Hildenbrand
2023-03-14  9:48       ` Matthew Wilcox
2023-03-14  9:50         ` David Hildenbrand
2023-03-14 14:50         ` Yin, Fengwei
2023-03-14 15:01           ` Matthew Wilcox
2023-03-15  2:17             ` Yin Fengwei
2023-03-20 13:47     ` Yin, Fengwei
2023-03-21 14:17       ` David Hildenbrand
2023-03-22  1:31         ` Yin Fengwei
