From: Linus Torvalds
Date: Tue, 21 Dec 2021 10:00:22 -0800
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)
To: David Hildenbrand
Cc: Jason Gunthorpe, Nadav Amit, Linux Kernel Mailing List, Andrew Morton,
    Hugh Dickins, David Rientjes, Shakeel Butt, John Hubbard, Mike Kravetz,
    Mike Rapoport, Yang Shi, "Kirill A . Shutemov", Matthew Wilcox,
    Vlastimil Babka, Jann Horn, Michal Hocko, Rik van Riel, Roman Gushchin,
    Andrea Arcangeli, Peter Xu, Donald Dutile, Christoph Hellwig,
    Oleg Nesterov, Jan Kara, Linux-MM,
    "open list:KERNEL SELFTEST FRAMEWORK", "open list:DOCUMENTATION"
In-Reply-To: <900b7d4a-a5dc-5c7b-a374-c4a8cc149232@redhat.com>

On Tue, Dec 21, 2021 at 9:40 AM David Hildenbrand wrote:
>
> > I do think the existing "maybe_pinned()" logic is fine for that. The
> > "exclusive to this VM" bit can be used to *help* that decision -
> > because only an exclusive page can be pinned - but I don't think it
> > should _replace_ that logic.
>
> The issue is that O_DIRECT uses FOLL_GET and cannot easily be changed to
> FOLL_PIN unfortunately. So I'm *trying* to make it more generic such
> that such corner cases can be handled as well correctly. But yeah, I'll
> see where this goes ... O_DIRECT has to be fixed one way or the other.
>
> John H. mentioned that he wants to look into converting that to
> FOLL_PIN. So maybe that will work eventually.

I'd really prefer that as the plan.

What exactly is the issue with O_DIRECT? Is it purely that it uses
"put_page()" instead of "unpin", or what?

I really think that if people look up pages and expect those pages to
stay coherent with the VM they looked it up for, they _have_ to
actively tell the VM layer - which means using FOLL_PIN.

Note that this is in absolutely no way a "new" issue. It has *always*
been true. If some O_DIRECT path depends on pinning behavior, it has
never worked correctly, and it is entirely on O_DIRECT, and not at all
a VM issue. We've had people doing GUP games forever, and being burnt
by those games not working reliably.

GUP (before we even had the notion of pinning) would always just take
a reference to the page, but it would not guarantee that that exact
page then kept an association with the VM.

Now, in *practice* this all works if:

 (a) the GUP user had always written to the page since the fork
     (either explicitly, or with FOLL_WRITE obviously acting as such)

 (b) the GUP user never forks afterwards until the IO is done

 (c) the GUP user plays no other VM games on that address

and it's also very possible that it has worked by pure luck (ie we've
had a lot of random code that actively mis-used things and it would
work in practice just because COW would happen to cut the right
direction etc).

Is there some particular GUP user you happen to care about more than
others?

I think it's a valid option to try to fix things up one by one, even
if you don't perhaps fix _all_ cases.

              Linus