Date: Thu, 12 Dec 2019 07:40:02 -0800
From: Matthew Wilcox
To: "Kirill A. Shutemov"
Cc: linux-mm@kvack.org
Subject: Re: Splitting the mmap_sem
Message-ID: <20191212154002.GR32169@bombadil.infradead.org>
References: <20191203222147.GV20752@bombadil.infradead.org> <20191212142457.zqp4mawjz7frpyvk@box>
In-Reply-To: <20191212142457.zqp4mawjz7frpyvk@box>

On Thu, Dec 12, 2019 at 05:24:57PM +0300, Kirill A.
Shutemov wrote:
> On Tue, Dec 03, 2019 at 02:21:47PM -0800, Matthew Wilcox wrote:
> > My preferred solution to the mmap_sem scalability problem is to allow
> > VMAs to be looked up under the RCU read lock then take a per-VMA lock.
> > I've been focusing on the first half of this problem (looking up VMAs
> > in an RCU-safe data structure) and ignoring the second half (taking a
> > lock while holding the RCU lock).
>
> Do you see this approach to be regression-free for uncontended case?
> I doubt it will not cause regressions for single-threaded applications...

Which part of the approach do you think will cause a regression? The
maple tree is quicker to traverse than the rbtree (in our simulations).
Incrementing a refcount on a VMA is surely no slower than acquiring an
uncontended rwsem for read. mmap() and munmap() will get slower, but
is that a problem?

> > We currently only have one ->map_pages() callback, and it's
> > filemap_map_pages(). It only needs to sleep in one place -- to allocate
> > a PTE table. I think that can be allocated ahead of time if needed.
>
> No, filemap_map_pages() doesn't sleep. It cannot. Whole body of the
> function is under rcu_read_lock(). It uses pre-allocated page table.
> See do_fault_around().

Oh, thank you! That makes the ->map_pages() optimisation already workable
with no changes.
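
For illustration only, the "look up the VMA, then take a reference on it" pattern discussed above might be sketched in userspace C11 roughly as follows. This is not kernel code and not the proposed implementation: struct toy_vma, toy_vma_tryget(), toy_vma_put() and toy_find_vma_get() are hypothetical names, a linear array scan stands in for the RCU-safe tree walk, and there is no real RCU here. The assumption is that a refcount of zero marks a VMA that is being torn down, so a reader must only take a reference when the count is still positive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct toy_vma {
	unsigned long start;	/* first address covered (inclusive) */
	unsigned long end;	/* first address past the range (exclusive) */
	atomic_int refcount;	/* assumed: 0 means the VMA is going away */
};

/* Take a reference only if the count is still positive. */
static bool toy_vma_tryget(struct toy_vma *vma)
{
	int old = atomic_load_explicit(&vma->refcount, memory_order_relaxed);

	while (old > 0) {
		/* On failure the CAS reloads 'old', so the loop re-checks it. */
		if (atomic_compare_exchange_weak_explicit(&vma->refcount,
							  &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Drop a reference previously taken with toy_vma_tryget(). */
static void toy_vma_put(struct toy_vma *vma)
{
	int old = atomic_fetch_sub_explicit(&vma->refcount, 1,
					    memory_order_release);

	assert(old > 0);
}

/*
 * Find the VMA covering 'addr' and return it with a reference held,
 * or NULL if there is none or it is already being torn down.
 * The linear scan is a placeholder for an RCU-safe tree lookup.
 */
static struct toy_vma *toy_find_vma_get(struct toy_vma *vmas, size_t n,
					unsigned long addr)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_vma *vma = &vmas[i];

		if (addr >= vma->start && addr < vma->end)
			return toy_vma_tryget(vma) ? vma : NULL;
	}
	return NULL;
}
```

In the uncontended case the reader's cost in this sketch is one atomic read-modify-write on the way in and one on the way out, which is the cost being compared against an uncontended rwsem read acquire in the reply above. In a kernel setting the lookup would additionally sit between rcu_read_lock() and rcu_read_unlock().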