Date: Thu, 6 Feb 2020 12:15:36 -0800
From: Matthew Wilcox
To: Peter Zijlstra
Cc: SeongJae Park, Michal Hocko, Vlastimil Babka, "Kirill A. Shutemov", linux-mm@kvack.org
Subject: Re: Re: Splitting the mmap_sem
Message-ID: <20200206201536.GX8731@bombadil.infradead.org>
References: <20200109170715.GV4951@dhcp22.suse.cz> <20200109173206.3731-1-sj38.park@gmail.com> <20200109201320.GO6788@bombadil.infradead.org> <20200206135920.GS14914@hirez.programming.kicks-ass.net>
In-Reply-To: <20200206135920.GS14914@hirez.programming.kicks-ass.net>

On Thu, Feb 06, 2020 at 02:59:20PM +0100, Peter Zijlstra wrote:
> > The proposal consists of three phases. In phase 1, we convert the
> > rbtree to the maple tree, and leave the locking alone. In phase 2,
> > we change the locking to a per-VMA refcount, looked up under RCU.
> >
> > This problem arises during phase 3 where we attempt to handle page
> > faults entirely under the RCU read lock. If we encounter problems,
> > we can fall back to acquiring the VMA refcount, but we need the
> > page allocation to fail rather than sleep (or magically drop the
> > RCU lock and return an indication that it has done so, but that
> > doesn't seem to be an approach that would find any favour).
>
> So why not use SRCU? You can do full blocking faults under SRCU and
> don't need no 'stinkin' refcounts ;-)

I have to say, SRCU is not in my mental toolbox of "how to solve a
problem", so it simply hadn't occurred to me.  Thanks.

So, we'd

	DEFINE_SRCU(vma_srcu);

in mm/memory.c

then, at the beginning of a page fault, call srcu_read_lock(&vma_srcu)
and keep the index it returns; walk the tree as we do now, allocate
memory for PTEs, sleep waiting for pages to arrive back from disc, etc,
etc, then at the end of the fault, call srcu_read_unlock(&vma_srcu, idx).

munmap() would consist of removing the VMA from the tree, then calling
synchronize_srcu(&vma_srcu) to wait for all faults to finish, then
putting the backing file, etc, etc and freeing the VMA.

This seems pretty reasonable, and investigation could actually proceed
before the Maple tree work lands.  Today, that would be:

	idx = srcu_read_lock(&vma_srcu);
	down_read(&mm->mmap_sem);
	vma = find_vma(mm, address);
	up_read(&mm->mmap_sem);
	... rest of fault handler path ...
	srcu_read_unlock(&vma_srcu, idx);

Kind of a pain because we still call find_vma() in the per-arch page
fault handler, but for prototyping, we'd only have to do one or two
architectures.
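To make the ordering of the two sides concrete, here is a rough userspace sketch of the fault path and the munmap() path described above. It is not kernel code: struct srcu_struct, DEFINE_SRCU(), srcu_read_lock(), srcu_read_unlock() and synchronize_srcu() are replaced by single-threaded stand-ins that only log the call and count readers, and note(), trace, sketch_handle_fault() and sketch_munmap() are hypothetical names made up for the illustration. The only thing it is meant to show is that the read side covers the whole (possibly sleeping) fault, and that the update side unlinks the VMA before waiting and frees it only afterwards.

```c
#include <assert.h>
#include <string.h>

/*
 * Userspace stand-ins so the sketch builds outside the kernel. The real
 * API is declared in <linux/srcu.h>; these stubs only log and count.
 */
struct srcu_struct { int readers; };
#define DEFINE_SRCU(name) struct srcu_struct name = { 0 }

static char trace[256];		/* hypothetical call log for the demo */

static void note(const char *event)
{
	strncat(trace, event, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

static int srcu_read_lock(struct srcu_struct *sp)
{
	sp->readers++;
	note("lock");
	return 0;	/* the kernel returns an index to hand back on unlock */
}

static void srcu_read_unlock(struct srcu_struct *sp, int idx)
{
	(void)idx;	/* unused by the stub, required by the real API */
	assert(sp->readers > 0);
	sp->readers--;
	note("unlock");
}

static void synchronize_srcu(struct srcu_struct *sp)
{
	/*
	 * The kernel version sleeps until pre-existing readers are done.
	 * A single-threaded stub can only check that none are left.
	 */
	assert(sp->readers == 0);
	note("sync");
}

DEFINE_SRCU(vma_srcu);

/* Read side: the whole fault, including the parts that may sleep. */
static void sketch_handle_fault(void)
{
	int idx = srcu_read_lock(&vma_srcu);

	note("find_vma");	/* today this lookup is still under mmap_sem */
	note("handle_fault");	/* may allocate PTEs or wait for I/O */
	srcu_read_unlock(&vma_srcu, idx);
}

/* Update side: unlink first, wait for in-flight faults, then tear down. */
static void sketch_munmap(void)
{
	note("unlink_vma");
	synchronize_srcu(&vma_srcu);
	note("free_vma");
}
```

Calling sketch_handle_fault() and then sketch_munmap() leaves the call order in trace. In a real prototype the interesting case is the concurrent one, where synchronize_srcu() actually has to wait for a fault that took srcu_read_lock() before the VMA was unlinked; this stub does not model that.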