linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: David Hildenbrand <david@redhat.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Hugh Dickins <hughd@google.com>,
	David Rientjes <rientjes@google.com>,
	 Shakeel Butt <shakeelb@google.com>,
	John Hubbard <jhubbard@nvidia.com>,
	 Jason Gunthorpe <jgg@nvidia.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	 Mike Rapoport <rppt@linux.ibm.com>,
	Yang Shi <shy828301@gmail.com>,
	 "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	Michal Hocko <mhocko@kernel.org>,  Nadav Amit <namit@vmware.com>,
	Rik van Riel <riel@surriel.com>, Roman Gushchin <guro@fb.com>,
	 Andrea Arcangeli <aarcange@redhat.com>,
	Peter Xu <peterx@redhat.com>,  Donald Dutile <ddutile@redhat.com>,
	Christoph Hellwig <hch@lst.de>, Oleg Nesterov <oleg@redhat.com>,
	 Jan Kara <jack@suse.cz>, Linux-MM <linux-mm@kvack.org>,
	 "open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	 "open list:DOCUMENTATION" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)
Date: Fri, 17 Dec 2021 13:36:41 -0800	[thread overview]
Message-ID: <CAHk-=wiFhVXZH_ht_dYQ_g2WNuhvWVrv8MjZ8B8_g6Kz2cZrHw@mail.gmail.com> (raw)
In-Reply-To: <e93f3fc9-00fd-5404-83f9-136b372e4867@redhat.com>

On Fri, Dec 17, 2021 at 12:55 PM David Hildenbrand <david@redhat.com> wrote:
>
> If we have a shared anonymous page we cannot have GUP references, not
> even R/O ones. Because GUP would have unshared and copied the page,
> resulting in a R/O mapped anonymous page.

Doing a GUP on an actual shared page is wrong to begin with.

You even know that, you try to use "page_mapcount() > 1" to disallow it.

My point is that it's wrong regardless, and that "mapcount" is
dubious, and that COW cannot - and must not - use mapcount, and that I
think your shared case should strive to avoid it for the exact same
reason.

So, what I think should happen is:

 (a) GUP makes sure that it only ever looks up pages that can be
shared with this VM. This may in involve breaking COW early with any
past fork().

 (b) it marks such pages so that any future work will not cause them
to COW either

Note that (a) is not necessarily "always COW and have to allocate and
copy new page". In particular, if the page is already writable, you
know you already have exclusive access to it and don't need to COW.

And if it isn't writable, then the other common case is "the cow has
only one user, and it's us" - that's the "refcount == 1" case.

And (b) is what we do with that page_maybe_dma_pinned() logic for
fork(), but also for things like swap cache creation (eg see commit
feb889fb40fa: "mm: don't put pinned pages into the swap cache").

Note that this code all already exists, and already works - even
without getting the (very expensive) mmap_sem. So it works with
fast-GUP and it can race with concurrent forking by another thread,
which is why we also have that seqcount thing.

As far as I can tell, your "mapcount" logic fundamentally requires
mmap_sem for the fork() race avoidance, for example.

So this is why I don't like the mapcount games - I think they are very
fragile, and not at all as logical as the two simple rules a/b above.

I believe you can make mapcount games _work_ - we used to have
something like that. It was incredibly fragile, and it had its own set
of bugs, but with enough care it's doable.

But my argument really is that I think it's the wrong approach, and
that we should simply strive to follow the two simple conceptual rules
above.

            Linus


  reply	other threads:[~2021-12-17 21:37 UTC|newest]

Thread overview: 127+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-17 11:30 [PATCH v1 00/11] mm: COW fixes part 1: fix the COW security issue for THP and hugetlb David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 01/11] seqlock: provide lockdep-free raw_seqcount_t variant David Hildenbrand
2021-12-17 17:02   ` Nadav Amit
2021-12-17 17:29     ` David Hildenbrand
2021-12-17 17:49       ` David Hildenbrand
2021-12-17 18:01         ` Nadav Amit
2021-12-17 21:28   ` Thomas Gleixner
2021-12-17 22:02     ` David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 02/11] mm: thp: consolidate mapcount logic on THP split David Hildenbrand
2021-12-17 19:06   ` Yang Shi
2021-12-18 14:24   ` Kirill A. Shutemov
2021-12-17 11:30 ` [PATCH v1 03/11] mm: simplify hugetlb and file-THP handling in __page_mapcount() David Hildenbrand
2021-12-17 17:16   ` Nadav Amit
2021-12-17 17:30     ` David Hildenbrand
2021-12-17 18:06   ` Mike Kravetz
2021-12-17 18:11     ` David Hildenbrand
2021-12-17 19:07   ` Yang Shi
2021-12-18 14:31   ` Kirill A. Shutemov
2021-12-17 11:30 ` [PATCH v1 04/11] mm: thp: simlify total_mapcount() David Hildenbrand
2021-12-17 19:12   ` Yang Shi
2021-12-18 14:35   ` Kirill A. Shutemov
2021-12-17 11:30 ` [PATCH v1 05/11] mm: thp: allow for reading the THP mapcount atomically via a raw_seqlock_t David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb) David Hildenbrand
2021-12-17 19:04   ` Linus Torvalds
2021-12-17 19:22     ` Linus Torvalds
2021-12-17 20:17       ` David Hildenbrand
2021-12-17 20:36         ` Linus Torvalds
2021-12-17 20:39           ` Linus Torvalds
2021-12-17 20:43             ` Linus Torvalds
2021-12-17 20:42           ` David Hildenbrand
2021-12-17 20:45             ` Linus Torvalds
2021-12-18 22:52               ` Kirill A. Shutemov
2021-12-18 23:05                 ` Linus Torvalds
2021-12-17 20:47           ` Jason Gunthorpe
2021-12-17 20:56             ` Linus Torvalds
2021-12-17 21:17               ` David Hildenbrand
2021-12-17 21:04             ` David Hildenbrand
2021-12-18  0:50               ` Jason Gunthorpe
2021-12-17 21:15             ` Nadav Amit
2021-12-17 21:20               ` David Hildenbrand
2021-12-18  0:50               ` Jason Gunthorpe
2021-12-18  1:53               ` Linus Torvalds
2021-12-18  2:17                 ` Linus Torvalds
2021-12-18  2:42                   ` Linus Torvalds
2021-12-18  3:36                     ` Linus Torvalds
2021-12-18  3:05                 ` Jason Gunthorpe
2021-12-18  3:30                   ` Nadav Amit
2021-12-18  3:38                     ` Linus Torvalds
2021-12-18 18:42                       ` Jason Gunthorpe
2021-12-18 21:48                         ` Nadav Amit
2021-12-18 22:53                           ` Linus Torvalds
2021-12-19  0:19                             ` Nadav Amit
2021-12-19  0:35                               ` Linus Torvalds
2021-12-19  6:02                                 ` Nadav Amit
2021-12-19  8:01                                   ` John Hubbard
2021-12-19 11:30                                     ` Matthew Wilcox
2021-12-19 17:27                                   ` Linus Torvalds
2021-12-19 17:44                                     ` David Hildenbrand
2021-12-19 17:44                                     ` Linus Torvalds
2021-12-19 17:59                                       ` David Hildenbrand
2021-12-19 21:12                                         ` Matthew Wilcox
2021-12-19 21:27                                           ` Linus Torvalds
2021-12-19 21:47                                             ` Matthew Wilcox
2021-12-19 21:53                                               ` Linus Torvalds
2021-12-19 22:02                                                 ` Matthew Wilcox
2021-12-19 22:12                                                   ` Linus Torvalds
2021-12-19 22:26                                                     ` Matthew Wilcox
2021-12-20 18:37                                           ` Matthew Wilcox
2021-12-20 18:52                                             ` Matthew Wilcox
2021-12-20 19:38                                               ` Linus Torvalds
2021-12-20 19:15                                             ` Linus Torvalds
2021-12-20 21:02                                               ` Matthew Wilcox
2021-12-20 21:27                                                 ` Linus Torvalds
2021-12-21  1:03                                         ` Jason Gunthorpe
2021-12-21  3:29                                           ` Matthew Wilcox
2021-12-21  8:58                                           ` David Hildenbrand
2021-12-21 14:28                                             ` Jason Gunthorpe
     [not found]                                               ` <303f21d3-42b4-2f11-3f22-28f89f819080@redhat.com>
2021-12-21 23:54                                                 ` Jason Gunthorpe
2021-12-21 17:05                                             ` Linus Torvalds
2021-12-21 17:40                                               ` David Hildenbrand
2021-12-21 18:00                                                 ` Linus Torvalds
     [not found]                                                   ` <dda021c8-69ec-c660-46be-793ae345a5bb@redhat.com>
2021-12-21 21:11                                                     ` John Hubbard
2021-12-21 18:07                                                 ` Jan Kara
2021-12-21 18:30                                                   ` Linus Torvalds
     [not found]                                                     ` <d23ede12-5df7-2f28-00fd-ea58d85ae400@redhat.com>
2021-12-21 18:58                                                       ` Linus Torvalds
2021-12-21 21:16                                                     ` John Hubbard
2021-12-21 19:07                                                 ` Jason Gunthorpe
     [not found]                                                   ` <3e0868e6-c714-1bf8-163f-389989bf5189@redhat.com>
     [not found]                                                     ` <dfe1c8d5-6fac-9040-0272-6d77bafa6a16@redhat.com>
2021-12-22 12:41                                                       ` Jan Kara
     [not found]                                                         ` <4a28e8a0-2efa-8b5e-10b5-38f1fc143a98@redhat.com>
2021-12-22 14:42                                                           ` Jan Kara
     [not found]                                                             ` <505d3d0f-23ee-0eec-0571-8058b8eedb97@redhat.com>
2021-12-22 16:08                                                               ` Jan Kara
2021-12-22 16:44                                                                 ` Matthew Wilcox
2021-12-22 18:40                                                                 ` Linus Torvalds
2021-12-23 12:54                                                                   ` Jan Kara
2021-12-23 17:18                                                                     ` Linus Torvalds
2021-12-23  0:21                                                           ` Matthew Wilcox
2021-12-24  2:53                                                             ` Jason Gunthorpe
2021-12-24  4:53                                                               ` Matthew Wilcox
2022-01-04  0:33                                                                 ` Jason Gunthorpe
2021-12-21 23:59                                                 ` Jason Gunthorpe
2021-12-22 12:44                                                   ` Jan Kara
2021-12-17 20:45     ` David Hildenbrand
2021-12-17 20:51       ` Linus Torvalds
2021-12-17 20:55         ` David Hildenbrand
2021-12-17 21:36           ` Linus Torvalds [this message]
2021-12-17 21:47             ` David Hildenbrand
2021-12-17 21:50               ` Linus Torvalds
2021-12-17 22:29                 ` David Hildenbrand
2021-12-17 22:58                   ` Linus Torvalds
2021-12-17 23:29                     ` David Hildenbrand
2021-12-17 23:53                       ` Nadav Amit
2021-12-18  4:02                         ` Linus Torvalds
2021-12-18  4:52                           ` Nadav Amit
2021-12-18  5:03                             ` Matthew Wilcox
2021-12-18  5:23                               ` Nadav Amit
2021-12-18 18:37                               ` Linus Torvalds
2021-12-17 22:18               ` Linus Torvalds
2021-12-17 22:43                 ` David Hildenbrand
2021-12-17 23:20                   ` Linus Torvalds
2021-12-18  9:57                     ` David Hildenbrand
2021-12-18 19:21                       ` Linus Torvalds
2021-12-18 19:52                         ` Linus Torvalds
2021-12-19  8:43                           ` David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 07/11] mm: gup: trigger unsharing via FAULT_FLAG_UNSHARE when required (!hugetlb) David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 08/11] mm: hugetlb: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 09/11] mm: gup: trigger unsharing via FAULT_FLAG_UNSHARE when required (hugetlb) David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 10/11] mm: thp: introduce and use page_trans_huge_anon_shared() David Hildenbrand
2021-12-17 11:30 ` [PATCH v1 11/11] selftests/vm: add tests for the known COW security issues David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wiFhVXZH_ht_dYQ_g2WNuhvWVrv8MjZ8B8_g6Kz2cZrHw@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=ddutile@redhat.com \
    --cc=guro@fb.com \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=jannh@google.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=namit@vmware.com \
    --cc=oleg@redhat.com \
    --cc=peterx@redhat.com \
    --cc=riel@surriel.com \
    --cc=rientjes@google.com \
    --cc=rppt@linux.ibm.com \
    --cc=shakeelb@google.com \
    --cc=shy828301@gmail.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox