linux-mm.kvack.org archive mirror
From: Mike Kravetz <mike.kravetz@oracle.com>
To: riel@surriel.com
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	muchun.song@linux.dev, leit@meta.com, willy@infradead.org
Subject: Re: [PATCH 3/3] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock
Date: Sun, 1 Oct 2023 22:22:42 -0700	[thread overview]
Message-ID: <20231002052242.GA103993@monkey> (raw)
In-Reply-To: <20231001005659.2185316-4-riel@surriel.com>

On 09/30/23 20:55, riel@surriel.com wrote:
> From: Rik van Riel <riel@surriel.com>
> 
> Replace the custom hugetlbfs VMA locking code with the recently
> introduced invalidate_lock. This greatly simplifies things.
> 
> However, this is a large enough change that it should probably go in
> separately from the other changes.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
>  fs/hugetlbfs/inode.c    |  68 +-----------
>  include/linux/fs.h      |   6 ++
>  include/linux/hugetlb.h |  21 +---
>  mm/hugetlb.c            | 227 ++++------------------------------------
>  4 files changed, 32 insertions(+), 290 deletions(-)
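
For anyone skimming just the diffstat, the shape of the change is roughly as
follows.  This is a hand-written sketch based on the patch description, not
lines from the actual diff; the filemap_invalidate_*() helpers are the
existing ones from include/linux/fs.h, but the call sites shown here are
purely illustrative.

/*
 * Sketch only: hugetlb fault paths stop taking the hugetlb-specific
 * per-VMA lock and instead take the per-inode invalidate_lock of the
 * mapping -- shared for faults, exclusive for truncation/hole punch.
 */
#include <linux/fs.h>
#include <linux/mm.h>

static void fault_side(struct vm_area_struct *vma)
{
	struct address_space *mapping = vma->vm_file->f_mapping;

	/* was: hugetlb_vma_lock_read(vma); */
	filemap_invalidate_lock_shared(mapping);
	/* ... handle the hugetlb fault ... */
	filemap_invalidate_unlock_shared(mapping);
}

static void truncate_side(struct address_space *mapping)
{
	/* was: hugetlb_vma_lock_write() on each VMA mapping the range */
	filemap_invalidate_lock(mapping);
	/* ... unmap and remove pages ... */
	filemap_invalidate_unlock(mapping);
}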

As noted elsewhere, there are issues with patch 2 of this series, and the
complete series does not pass the libhugetlbfs tests.  However, there were
questions about the performance characteristics of replacing the hugetlb vma
lock with the invalidate_lock.

This is from commit 188a39725ad7, which describes the performance gains from
the hugetlb vma lock:

    The recent regression report [1] notes page fault and fork latency of
    shared hugetlb mappings.  To measure this, I created two simple programs:
    1) map a shared hugetlb area, write fault all pages, unmap area
       Do this in a continuous loop to measure faults per second
    2) map a shared hugetlb area, write fault a few pages, fork and exit
       Do this in a continuous loop to measure forks per second
    These programs were run on a 48 CPU VM with 320GB memory.  The shared
    mapping size was 250GB.  For comparison, a single instance of the program
    was run.  Then, multiple instances were run in parallel to introduce
    lock contention.  Changing the locking scheme results in a significant
    performance benefit.
    
    test            instances       unmodified      revert          vma
    --------------------------------------------------------------------------
    faults per sec  1               393043          395680          389932
    faults per sec  24               71405           81191           79048
    forks per sec   1                 2802            2747            2725
    forks per sec   24                 439             536             500
    Combined faults 24                1621           68070           53662
    Combined forks  24                 358              67             142
    
    Combined test is when running both faulting program and forking program
    simultaneously.
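
The two test programs themselves were not posted with that commit.  For
concreteness, a minimal sketch of program 1 could look like the code below;
this is a hypothetical reconstruction, with the hugetlbfs path, 2MB huge
page size, and 1GB mapping size chosen purely for illustration (the actual
test used a 250GB shared mapping).  Program 2 is the same idea, but faults
only a few pages up front and then loops over fork()/waitpid(), with the
child exiting immediately, to measure forks per second.

/*
 * Hypothetical reconstruction of test program 1: map a shared hugetlb
 * area, write-fault every page, unmap, repeat, reporting faults/sec.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

#define HPAGE_SIZE	(2UL << 20)	/* assume 2MB huge pages */
#define MAP_SIZE	(1UL << 30)	/* 1GB for the sketch */

int main(void)
{
	int fd = open("/dev/hugepages/bench", O_CREAT | O_RDWR, 0600);
	unsigned long faults = 0;
	time_t start = time(NULL);

	if (fd < 0) {
		perror("open");
		return 1;
	}

	for (;;) {
		char *area = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
				  MAP_SHARED, fd, 0);
		if (area == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		/* write fault every huge page in the mapping */
		for (unsigned long off = 0; off < MAP_SIZE; off += HPAGE_SIZE)
			area[off] = 1;
		faults += MAP_SIZE / HPAGE_SIZE;

		munmap(area, MAP_SIZE);

		if (time(NULL) - start >= 10) {
			printf("%lu faults/sec\n", faults / 10);
			faults = 0;
			start = time(NULL);
		}
	}
}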

This series was 'stable enough' to run the test, although I did see some
bad PMD state warnings and threw out those runs.  Here are the results:

test            instances       next-20230925   next-20230925+series
--------------------------------------------------------------------------
faults per sec  1               382994          386884
faults per sec  24               97959           75427
forks per sec   1                 3105            3148
forks per sec   24                 693             715
Combined faults 24               74506           31648
Combined forks  24                 233             282
    
The significant measurement is 'Combined faults 24'.  There is a 50+% drop
(74506 -> 31648, roughly 57% fewer combined faults per second), which is
better than I expected.  It might be interesting to fix up all the issues in
the series and rerun these tests.

Do note that the performance issue was originally reported as an issue
with a database using hugetlbfs (not my employer).  I did not have access
to the actual DB to recreate the issue.  However, the user verified that changes
in the 'Combined faults 24' measurement reflected changes in their DB
performance.
-- 
Mike Kravetz



Thread overview: 22+ messages
2023-10-01  0:55 [PATCH v5 0/3] hugetlbfs: close race between MADV_DONTNEED and page fault riel
2023-10-01  0:55 ` [PATCH 1/3] hugetlbfs: extend hugetlb_vma_lock to private VMAs riel
2023-10-01  0:55 ` [PATCH 2/3] hugetlbfs: close race between MADV_DONTNEED and page fault riel
2023-10-02  4:39   ` Mike Kravetz
2023-10-02 13:13     ` Rik van Riel
2023-10-03 19:35     ` Rik van Riel
2023-10-03 20:19       ` Mike Kravetz
2023-10-04  0:20         ` Rik van Riel
2023-10-01  0:55 ` [PATCH 3/3] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock riel
2023-10-02  5:22   ` Mike Kravetz [this message]
2023-10-01  2:54 ` [PATCH v5 0/3] hugetlbfs: close race between MADV_DONTNEED and page fault Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2023-10-04  3:25 [PATCH v6 " riel
2023-10-04  3:25 ` [PATCH 3/3] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock riel
2023-10-06  0:19   ` Mike Kravetz
2023-10-06  3:28     ` Rik van Riel
2023-09-26  3:10 [PATCH v4 0/3] hugetlbfs: close race between MADV_DONTNEED and page fault riel
2023-09-26  3:10 ` [PATCH 3/3] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock riel
2023-09-25 20:28 [PATCH v3 0/3] hugetlbfs: close race between MADV_DONTNEED and page fault riel
2023-09-25 20:28 ` [PATCH 3/3] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock riel
2023-09-22 19:02 [PATCH v2 0/3] hugetlbfs: close race between MADV_DONTNEED and page fault riel
2023-09-22 19:02 ` [PATCH 3/3] hugetlbfs: replace hugetlb_vma_lock with invalidate_lock riel
2023-09-24  6:44   ` kernel test robot
2023-09-25  2:04   ` kernel test robot
2023-09-25 19:22     ` Rik van Riel
2023-09-25 20:06       ` Mike Kravetz
2023-09-25 20:11         ` Rik van Riel
