From: Michal Hocko <mhocko@suse.cz>
To: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: Linux MM <linux-mm@kvack.org>, Andi Kleen <ak@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@techsingularity.net>, Jan Kara <jack@suse.cz>,
	Davidlohr Bueso <dbueso@suse.de>, Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@ZenIV.linux.org.uk>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Subject: Re: mmap_sem bottleneck
Date: Mon, 17 Oct 2016 14:57:17 +0200
Message-ID: <20161017125717.GK23322@dhcp22.suse.cz>
In-Reply-To: <ea12b8ee-1892-fda1-8a83-20fdfdfa39c4@linux.vnet.ibm.com>

On Mon 17-10-16 14:33:53, Laurent Dufour wrote:
> Hi all,
> 
> I'm sorry to resurrect this topic, but with the increasing number of
> CPUs, the mmap_sem is more and more frequently a bottleneck,
> especially between page fault handling and the memory management
> calls made by other threads.
> 
> In the case I'm seeing, a lot of page faults occur while other
> threads are trying to manipulate the process memory layout through
> mmap/munmap.
> 
> There is no *real* conflict between these operations: the page faults
> are done on different pages and areas than the ones addressed by the
> mmap/munmap operations. Thus the threads are dealing with different
> parts of the process's memory space. However, since the page fault
> handlers and the mmap/munmap operations all grab the mmap_sem, page
> fault handling is serialized with the mmap operations, which impacts
> performance on large systems.

Could you quantify how much overhead we are talking about here?
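
A rough userspace way to put a number on this might look like the
sketch below. It is only an approximation of the reported workload: the
faults here are anonymous rather than file-backed, and run_faulters(),
REGION_PAGES and MAX_THREADS are names made up for this example. Faulter
threads repeatedly touch and drop their own private pages while mapper
threads loop over mmap/munmap of unrelated areas.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define MAX_THREADS  64
#define REGION_PAGES 256	/* pages touched per round; arbitrary choice */

static atomic_int stop_mappers;

/* Layout changer: every mmap/munmap takes mmap_sem for writing. */
static void *mapper(void *arg)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);

	(void)arg;
	while (!atomic_load(&stop_mappers)) {
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p != MAP_FAILED)
			munmap(p, len);
	}
	return NULL;
}

/* Faulter: touches each page of its own region, then drops the pages
 * so that the next round has to fault them in again. */
static void *faulter(void *arg)
{
	long rounds = *(const long *)arg;
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = page * REGION_PAGES;
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* A failed mapping would make the measurement meaningless. */
	assert(buf != MAP_FAILED);
	for (long r = 0; r < rounds; r++) {
		for (size_t off = 0; off < len; off += page)
			((volatile char *)buf)[off] = 1;
		madvise(buf, len, MADV_DONTNEED);
	}
	munmap(buf, len);
	return NULL;
}

/* Returns the seconds the faulters needed, or -1.0 on bad arguments or
 * if a thread could not be created. */
double run_faulters(int nfaulters, int nmappers, long rounds)
{
	pthread_t f[MAX_THREADS], m[MAX_THREADS];
	struct timespec t0, t1;
	int nf = 0, nm = 0;

	if (nfaulters < 1 || nfaulters > MAX_THREADS ||
	    nmappers < 0 || nmappers > MAX_THREADS || rounds < 1)
		return -1.0;

	atomic_store(&stop_mappers, 0);
	while (nm < nmappers &&
	       pthread_create(&m[nm], NULL, mapper, NULL) == 0)
		nm++;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while (nf < nfaulters &&
	       pthread_create(&f[nf], NULL, faulter, &rounds) == 0)
		nf++;
	for (int i = 0; i < nf; i++)
		pthread_join(f[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	atomic_store(&stop_mappers, 1);
	for (int i = 0; i < nm; i++)
		pthread_join(m[i], NULL);

	if (nf != nfaulters || nm != nmappers)
		return -1.0;
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing run_faulters(n, 0, rounds) against run_faulters(n, m, rounds)
for the same n gives a first-order estimate of the slowdown caused by
concurrent layout changes. Note that the faulters' own madvise() calls
also take the mmap_sem, so this does not isolate the fault path alone.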

> For the record, the page faults occur while reading data from a file
> system, and I/O is really impacted by this serialization when dealing
> with a large number of parallel threads, in my case 192 threads (1 per
> online CPU). But the source of the page faults doesn't really matter,
> I guess.

But we drop the mmap_sem for the IO and retry the page fault.
I am not sure I understood you correctly here, though.

> I took time trying to figure out how to get rid of this bottleneck, but
> this is definitely too complex for me.
> I read this mailing list's history and some LWN articles about it, and
> my feeling is that there is no clear way to limit the impact of this
> semaphore. The last discussion on this topic seems to have happened
> last March during the LSFMM summit (https://lwn.net/Articles/636334/).
> But this doesn't seem to have led to major changes, or maybe I missed
> them.

At least mmap/munmap write lock contention could be reduced by the
range locking proposed above. Jan Kara has implemented a prototype [1]
of the lock for mapping (which could be used for mmap_sem as well), but
it had some performance implications AFAIR. There hasn't been a strong
use case for this so far. If there is one, please describe it and we can
think about what to do about it.
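
To make the range locking idea concrete, here is a minimal userspace
sketch. It is not the prototype from [1] and says nothing about how a
kernel implementation would look; struct range_lock, range_lock(),
range_trylock() and range_unlock() are hypothetical names, and the lock
is exclusive only, with no fairness and a fixed-size table of held
ranges. The point is just that two callers block each other only when
their [start, end) ranges overlap.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_HELD 64	/* arbitrary cap on concurrently held ranges */

struct held_range {
	unsigned long start, end;	/* half-open: [start, end) */
	bool used;
};

struct range_lock {
	pthread_mutex_t mu;		/* protects held[] */
	pthread_cond_t released;	/* signalled on every unlock */
	struct held_range held[MAX_HELD];
};

void range_lock_init(struct range_lock *rl)
{
	pthread_mutex_init(&rl->mu, NULL);
	pthread_cond_init(&rl->released, NULL);
	for (int i = 0; i < MAX_HELD; i++)
		rl->held[i].used = false;
}

/* Caller holds rl->mu. True if [s, e) overlaps any held range. */
static bool conflicts(const struct range_lock *rl,
		      unsigned long s, unsigned long e)
{
	for (int i = 0; i < MAX_HELD; i++)
		if (rl->held[i].used &&
		    s < rl->held[i].end && rl->held[i].start < e)
			return true;
	return false;
}

/* Caller holds rl->mu. Returns a slot index, or -1 if the table is full. */
static int claim_slot(struct range_lock *rl,
		      unsigned long s, unsigned long e)
{
	for (int i = 0; i < MAX_HELD; i++) {
		if (!rl->held[i].used) {
			rl->held[i] = (struct held_range){ s, e, true };
			return i;
		}
	}
	return -1;
}

/* Non-blocking: returns a handle >= 0, or -1 on overlap or full table. */
int range_trylock(struct range_lock *rl,
		  unsigned long start, unsigned long end)
{
	int slot;

	assert(start < end);
	pthread_mutex_lock(&rl->mu);
	slot = conflicts(rl, start, end) ? -1 : claim_slot(rl, start, end);
	pthread_mutex_unlock(&rl->mu);
	return slot;
}

/* Blocking: waits until no held range overlaps [start, end). */
int range_lock(struct range_lock *rl,
	       unsigned long start, unsigned long end)
{
	int slot;

	assert(start < end);
	pthread_mutex_lock(&rl->mu);
	for (;;) {
		if (!conflicts(rl, start, end)) {
			slot = claim_slot(rl, start, end);
			if (slot >= 0)
				break;
		}
		pthread_cond_wait(&rl->released, &rl->mu);
	}
	pthread_mutex_unlock(&rl->mu);
	return slot;
}

/* Releases the range identified by a handle from lock/trylock. */
void range_unlock(struct range_lock *rl, int slot)
{
	pthread_mutex_lock(&rl->mu);
	rl->held[slot].used = false;
	pthread_cond_broadcast(&rl->released);
	pthread_mutex_unlock(&rl->mu);
}
```

With this, range_lock(&rl, 0, 4096) and range_lock(&rl, 4096, 8192) can
be held at the same time, while a request for [2048, 8192) has to wait
for both. The linear scan under a single mutex is also a hint at where
the extra cost of such a lock comes from compared to a plain semaphore.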

There were also some attempts to replace mmap_sem by RCU AFAIR but my
vague recollection is that they had some issues as well.

[1] http://linux-kernel.2935.n7.nabble.com/PATCH-0-6-RFC-Mapping-range-lock-td592872.html
-- 
Michal Hocko
SUSE Labs
