From: Linus Torvalds
Date: Sun, 19 Dec 2021 09:44:41 -0800
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)
To: Nadav Amit
Cc: David Hildenbrand, Jason Gunthorpe, Linux Kernel Mailing List,
    Andrew Morton, Hugh Dickins, David Rientjes, Shakeel Butt,
    John Hubbard, Mike Kravetz, Mike Rapoport, Yang Shi,
    "Kirill A. Shutemov", Matthew Wilcox, Vlastimil Babka, Jann Horn,
    Michal Hocko, Rik van Riel, Roman Gushchin, Andrea Arcangeli,
    Peter Xu, Donald Dutile, Christoph Hellwig, Oleg Nesterov, Jan Kara,
    Linux-MM, "open list:KERNEL SELFTEST FRAMEWORK",
    "open list:DOCUMENTATION"

David, you said that you were working on some alternative model. Is it
perhaps along these same lines below?

I was thinking that a bit in the page tables to say "this page is
exclusive to this VM" would be a really simple thing to deal with for
fork() and swapout and friends.

But we don't have such a bit in general, since many architectures have
very limited sets of SW bits, and even when they exist we've spent
them on things like UFFD_WP.

But the more I think about the "bit doesn't even have to be in the
page tables", the more I think maybe that's the solution.

A bit in the 'struct page' itself.

For hugepages, you'd have to distribute said bit when you split the
hugepage.

But other than that it looks quite simple: anybody who does a virtual
copy will inevitably be messing with the page refcount, so clearing
the "exclusive ownership" bit wouldn't be costly: the 'struct page'
cacheline is already getting dirtied.

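As a rough illustration of the "exclusive bit in the 'struct page'" idea, here is a toy userspace model in C. It is only a sketch under stated assumptions: `struct toy_page` and the `toy_*` helpers are hypothetical names, not kernel APIs, and the write-fault reuse check is one guess at how such a bit might be consumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for 'struct page'; every name here is hypothetical. */
struct toy_page {
	int refcount;    /* references held on the page */
	bool exclusive;  /* "this page is exclusive to this VM" */
};

/* A COW event installs a fresh private copy, so the copy is owned outright. */
static void toy_cow_install(struct toy_page *copy)
{
	copy->refcount = 1;
	copy->exclusive = true;
}

/*
 * fork()-style virtual copy: the refcount is being modified anyway, so
 * clearing the exclusive bit touches the same (already dirtied) data.
 */
static void toy_share_for_fork(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount++;
	page->exclusive = false;
}

/* Assumed consumer: a write fault may reuse the page only while exclusive. */
static bool toy_can_reuse_on_write(const struct toy_page *page)
{
	return page->exclusive;
}
```

In this model a page is reusable right after `toy_cow_install()`, and stops being reusable as soon as `toy_share_for_fork()` has run on it, without ever consulting a mapcount.
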
Or what was your model you were implying you were thinking about in
your other email? You said

  "I might have had an idea yesterday on how to fix most of the issues
   without relying on the mapcount, doing it similar [..]"

but I didn't then reply to that email because I had just written this
other long email to Nadav.

               Linus

On Sun, Dec 19, 2021 at 9:27 AM Linus Torvalds wrote:
>
> Adding another bit in the page tables - *purely* to say "this VM owns
> the page outright" - would be fairly powerful. And fairly simple.
>
> Then any COW event will set that bit - because when you actually COW,
> the page you install is *yours*. No questions asked.
>
[ snip snip ]
>
> Btw, the extra bit doesn't really have to be in the page tables. It
> could be a bit in the page itself. We could add another page bit that
> we just clear when we do the "add ref to page as you make a virtual
> copy during fork() etc".
>
> And no, we can't use "pincount" either, because it's not exact. The
> fact that the page count is so elevated that we think it's pinned is a
> _heuristic_, and that's ok when you have the opposite problem, and ask
> "*might* this page be pinned". You want to never get a false negative,
> but it can get a false positive.
>
>             Linus
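
The point about "pincount" being a heuristic can be made concrete with a small toy model in C. This is a sketch under assumptions: it is loosely patterned on the idea of adding a large bias to the refcount for each pin, and `TOY_PIN_BIAS`, `struct toy_counted_page` and the `toy_*` helpers are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical bias: each pin adds this much to the shared refcount. */
#define TOY_PIN_BIAS 1024

struct toy_counted_page {
	int refcount;  /* ordinary references and pins share this counter */
};

/* Ordinary reference: +1. */
static void toy_get(struct toy_counted_page *p)
{
	p->refcount += 1;
}

/* Pin: adds the whole bias, so a pinned page always crosses the threshold. */
static void toy_pin(struct toy_counted_page *p)
{
	p->refcount += TOY_PIN_BIAS;
}

/*
 * "Might this page be pinned?" Never a false negative for a pinned page,
 * but enough ordinary references produce a false positive.
 */
static bool toy_maybe_pinned(const struct toy_counted_page *p)
{
	return p->refcount >= TOY_PIN_BIAS;
}
```

A page with one pin always reports "maybe pinned", but so does an unpinned page that has accumulated `TOY_PIN_BIAS` ordinary references, which is why the check cannot stand in for an exact "exclusive" bit.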