From: Linus Torvalds
Date: Sat, 9 Jan 2021 11:46:46 -0800
Subject: Re: [PATCH 0/2] page_count can't be used to decide when wp_page_copy
To: Matthew Wilcox
Cc: Jason Gunthorpe, Andrea Arcangeli, Linux-MM, Linux Kernel Mailing List, Yu Zhao, Andy Lutomirski, Peter Xu, Pavel Emelyanov, Mike Kravetz, Mike Rapoport, Minchan Kim, Will Deacon, Peter Zijlstra, Hugh Dickins, "Kirill A. Shutemov", Oleg Nesterov, Jann Horn, Kees Cook, John Hubbard, Leon Romanovsky, Jan Kara, Kirill Tkhai
References: <20210107200402.31095-1-aarcange@redhat.com> <20210107202525.GD504133@ziepe.ca> <20210109193224.GB35215@casper.infradead.org>
In-Reply-To: <20210109193224.GB35215@casper.infradead.org>

On Sat, Jan 9, 2021 at 11:33 AM Matthew Wilcox wrote:
>
> On Thu, Jan 07, 2021 at 01:05:19PM -0800, Linus Torvalds wrote:
> > Side note, and not really related to UFFD, but the mmap_sem in
> > general: I was at one point actually hoping that we could make the
> > mmap_sem a spinlock, or at least make the rule be that we never do any
> > IO under it. At which point a write lock hopefully really shouldn't be
> > such a huge deal.
>
> There's a (small) group of us working towards that.
> It has some prerequisites, but where we're hoping to go currently:
>
>  - Replace the vma rbtree with a b-tree protected with a spinlock
>  - Page faults walk the b-tree under RCU, like peterz/laurent's SPF patchset
>  - If we need to do I/O, take a refcount on the VMA
>
> After that, we can gradually move things out from mmap_sem protection
> to just the vma tree spinlock, or whatever makes sense for them. In a
> very real way the mmap_sem is the MM layer's BKL.

Well, we could do the "no IO" part first, and keep the semaphore part.

Some people actually prefer a semaphore to a spinlock, because it
doesn't end up causing preemption issues.

As long as you don't do IO (or memory allocations) under a semaphore
(ok, in this case it's a rwsem, same difference), it might even be
preferable to keep it as a semaphore rather than as a spinlock.

So it doesn't necessarily have to go all the way - we _could_ just try
something like "when taking the mmap_sem, set a thread flag" and then
have a "warn if doing allocations or IO under that flag".

And since this is about performance, not some hard requirement, it
might not even matter if we catch all cases. If we fix it so that any
regular load on most normal filesystems never sees the warning, we'd
already be golden.

Of course, I think we've had issues with rw_sems for _other_ reasons.
Waiman actually removed the reader optimistic spinning because it
caused bad interactions with mixed reader-writer loads.

So rwsemaphores may end up not working as well as spinlocks if the
common situation is "just wait a bit, you'll get it".

              Linus
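
The "set a thread flag when taking the mmap_sem, warn on allocation or
IO under that flag" idea can be illustrated with a small userspace
sketch. This is Python, not kernel code, and every name in it
(mmap_lock_held, might_block, read_block) is hypothetical. It uses a
plain mutex as a stand-in for the rwsem and only warns, rather than
failing, since the message treats this as a performance check and not
a hard requirement.

```python
import threading
import warnings
from contextlib import contextmanager

# Hypothetical names throughout; none of this is a kernel API.
_state = threading.local()      # per-thread storage, standing in for a thread flag
mmap_lock = threading.Lock()    # stand-in for mmap_sem (really a rwsem)


@contextmanager
def mmap_lock_held():
    """Take the lock and flag the current thread as holding it."""
    with mmap_lock:
        _state.holding = True
        try:
            yield
        finally:
            # Clear the flag before the lock itself is released.
            _state.holding = False


def might_block(what):
    """Check to place at the top of allocation/IO paths.

    Warns (does not raise) if the calling thread holds the lock.
    Returns True if a warning was issued, False otherwise.
    """
    if getattr(_state, "holding", False):
        warnings.warn(f"{what} while holding mmap_lock",
                      RuntimeWarning, stacklevel=2)
        return True
    return False


def read_block(data):
    """Pretend IO path, instrumented with the check."""
    might_block("IO")
    return bytes(data)
```

Calling read_block(b"x") inside "with mmap_lock_held():" emits a
RuntimeWarning; the same call outside the block, or from another
thread that does not hold the lock, stays silent because the flag is
per-thread.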