From: Linus Torvalds
Date: Thu, 7 Jan 2021 13:05:19 -0800
Subject: Re: [PATCH 0/2] page_count can't be used to decide when wp_page_copy
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Linux-MM, Linux Kernel Mailing List, Yu Zhao,
    Andy Lutomirski, Peter Xu, Pavel Emelyanov, Mike Kravetz,
    Mike Rapoport, Minchan Kim, Will Deacon, Peter Zijlstra,
    Hugh Dickins, "Kirill A. Shutemov", Matthew Wilcox, Oleg Nesterov,
    Jann Horn, Kees Cook, John Hubbard, Leon Romanovsky, Jan Kara,
    Kirill Tkhai
References: <20210107200402.31095-1-aarcange@redhat.com>
    <20210107202525.GD504133@ziepe.ca>

On Thu, Jan 7, 2021 at 12:32 PM Linus Torvalds wrote:
>
> Which is really why I think this needs to be fixed by just fixing UFFD
> to take the write lock.

Side note, and not really related to UFFD, but the mmap_sem in
general: I was at one point actually hoping that we could make the
mmap_sem a spinlock, or at least make the rule be that we never do any
IO under it. At which point a write lock hopefully really shouldn't be
such a huge deal.

The main source of IO under the mmap lock was traditionally the page
faults obviously needing to read the data in, but we now try to handle
that with the whole notion of page fault restart instead.

But I'm 100% sure we don't do as good a job of it as we _could_ do,
and there are probably a lot of other cases where we end up doing IO
under the mmap lock simply because we can and nobody has looked at it
very much.

So if taking the mmap_sem for writing is a huge deal - because it ends
up serializing with IO by people who take it for reading - I think
that is something that might be worth really looking into.

For example, right now I think we (still) only do the page fault retry
once - and the second time if the page still isn't available, we'll
actually wait with the mmap_sem held. That goes back to the very
original page fault retry logic, when I was worried that some infinite
retry would cause busy-waiting because somebody didn't do the proper
"drop mmap_sem, then wait, then return retry".

And if that actually causes problems, maybe we should just make sure
to fix it? Remove that FAULT_FLAG_TRIED bit entirely, and make the
rule be that we always drop the mmap_sem and retry?

Similarly, if there are users that don't set FAULT_FLAG_ALLOW_RETRY at
all (because they don't have the logic to check if it's a re-try and
re-do the mmap_sem etc), maybe we can just fix them. I think all the
architectures do it properly in their page fault paths (I think Peter
Xu converted them all - no?), but maybe there are cases of GUP that
don't have it.

Or maybe there is something else that I just didn't notice, where we
end up having bad latencies on the mmap_sem.

I think those would very much be worth fixing, so that if
UFFDIO_WRITEPROTECT taking the mmap_sem for writing causes problems,
we can _fix_ those problems.

But I think it's entirely wrong to treat UFFDIO_WRITEPROTECT as
specially as Andrea seems to want to treat it. Particularly with
absolutely zero use cases to back it up.

             Linus
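
Illustration of the retry rule discussed above, as a rough user-space
model rather than kernel code. It contrasts "retry once, then wait with
the lock held" against "always drop the lock and retry". Only the flag
names FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED come from the mail;
their values here, struct model_mm, model_handle_fault(), run_model()
and the idea of counting IO in "rounds" are all assumptions made up for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Model only: names mirror the flags in the mail, values are arbitrary. */
#define FAULT_FLAG_ALLOW_RETRY 0x1u
#define FAULT_FLAG_TRIED       0x2u

enum fault_result { FAULT_DONE, FAULT_RETRY };

/* Hypothetical stand-in for an address space and its mmap_sem. */
struct model_mm {
    bool lock_held;       /* mmap_sem held for reading */
    int  io_rounds_left;  /* waits needed before the page is available */
    int  unlocked_waits;  /* rounds waited after dropping the lock */
    int  locked_waits;    /* rounds waited while still holding the lock */
};

/*
 * One pass through the fault handler. If a retry is permitted, follow
 * "drop mmap_sem, then wait, then return retry". Otherwise wait for all
 * remaining IO with the lock held, blocking any would-be writer.
 */
static enum fault_result model_handle_fault(struct model_mm *mm,
                                            unsigned int flags,
                                            bool always_retry)
{
    assert(mm->lock_held);

    if (mm->io_rounds_left == 0)
        return FAULT_DONE;

    bool may_retry = (flags & FAULT_FLAG_ALLOW_RETRY) &&
                     (always_retry || !(flags & FAULT_FLAG_TRIED));
    if (may_retry) {
        mm->lock_held = false;   /* drop the lock first */
        mm->io_rounds_left--;    /* then wait for one round of IO */
        mm->unlocked_waits++;
        return FAULT_RETRY;      /* then ask the caller to retry */
    }

    mm->locked_waits += mm->io_rounds_left;
    mm->io_rounds_left = 0;
    return FAULT_DONE;
}

/* Caller side: retake the lock and retry until the fault completes. */
struct model_mm run_model(int io_rounds, bool always_retry)
{
    struct model_mm mm = { .lock_held = false, .io_rounds_left = io_rounds };
    unsigned int flags = FAULT_FLAG_ALLOW_RETRY;

    for (;;) {
        mm.lock_held = true;
        if (model_handle_fault(&mm, flags, always_retry) == FAULT_DONE)
            break;
        flags |= FAULT_FLAG_TRIED;  /* the handler dropped the lock */
    }
    mm.lock_held = false;
    return mm;
}
```

Calling run_model(3, false) models the retry-once behaviour, and
run_model(3, true) models the proposed "always drop and retry" rule;
comparing locked_waits between the two shows how much waiting happens
with the lock held in each case. The termination of the loop in the
second case relies on the model's assumption that every unlocked wait
makes progress, which is the busy-waiting concern the mail mentions.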