From: Jann Horn
Date: Tue, 26 Nov 2024 21:43:24 +0100
Subject: Re: [PATCH v3 0/2] rust: page: Add support for existing struct page mappings
To: Matthew Wilcox
Cc: Boqun Feng, Alice Ryhl, Abdiel Janulgue, rust-for-linux@vger.kernel.org,
    Miguel Ojeda, Alex Gaynor, Gary Guo, Björn Roy Baron, Benno Lossin,
    Andreas Hindborg, Trevor Gross, Danilo Krummrich, Wedson Almeida Filho,
    Valentin Obst, open list, Andrew Morton,
    "open list:MEMORY MANAGEMENT", airlied@redhat.com
References: <20241119112408.779243-1-abdiel.janulgue@gmail.com>

On Tue, Nov 26, 2024 at 9:31 PM Jann Horn wrote:
> On Wed, Nov 20, 2024 at 6:02 PM Matthew Wilcox wrote:
> > On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote:
> > > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote:
> > > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox wrote:
> > > > > We don't have a fully formed destination yet, so I can't give you a
> > > > > definite answer to a lot of questions.  Obviously I don't want to hold
> > > > > up the Rust project in any way, but I need to know that what we're trying
> > > > > to do will be expressible in Rust.
> > > > >
> > > > > Can we avoid referring to a page's refcount?
> > > >
> > > > I don't think this patch needs the refcount at all, and the previous
> > > > version did not expose it. This came out of the advice to use put_page
> > > > over free_page. Does this mean that we should switch to put_page but
> > > > not use get_page?
> >
> > Did I advise using put_page() over free_page()?  I hope I didn't say
> > that.  I don't see a reason why binder needs to refcount its pages (nor
> > use a mapcount on them), but I don't fully understand binder so maybe
> > it does need a refcount.
>
> I think that was me, at
> .
> Looking at the C binder version, binder_install_single_page() installs
> pages into userspace page tables in a VM_MIXEDMAP mapping using
> vm_insert_page(), and when you do that with pages from the page
> allocator, userspace can grab references to them through GUP-fast (and
> I think also through GUP). (See how vm_insert_page() and
> vm_get_page_prot() don't use pte_mkspecial(), which is pretty much the
> only thing that can stop GUP-fast on most architectures.)
>
> My understanding is that the combination VM_IO|VM_MIXEDMAP would stop
> normal GUP, but currently the only way to block GUP-fast is to use
> VM_PFNMAP. (Which, as far as I understand, is also why GPU drivers use
> VM_PFNMAP so much.) Maybe we should change that, so that VM_IO and/or
> VM_MIXEDMAP blocks GUP in the region and causes installed PTEs to be
> marked with pte_mkspecial()?
>
> I am not entirely sure about this stuff, but I was recently looking at
> net/packet/af_packet.c, and I tested that vmsplice() can grab
> references to the high-order compound pages that
> alloc_one_pg_vec_page() allocates with __get_free_pages(GFP_KERNEL |
> __GFP_COMP | __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY, order),
> packet_mmap() inserts with vm_insert_page(), and free_pg_vec() drops
> with free_pages(). (But that all happens to actually work fine,
> free_pages() actually handles refcounted compound pages properly.)

And also, the C binder driver wants to free pages in its shrinker
callback, but those pages might still be mapped into userspace. Binder
tries to zap such userspace mappings, but it does that by absolute
virtual address instead of going through the rmap (see
binder_alloc_free_page()), so it will miss page mappings in VMAs that
have been mremap()'d (though legitimate userspace never does that with
binder VMAs) or are concurrently being torn down by munmap(); so
currently the thing that keeps this from falling apart is that if page
mappings are left over somewhere, the page refcount ensures that this
userspace-mapped page doesn't get freed.
(I think the C binder code does its job, but is not exactly a great
model for how to write a clean driver that integrates nicely with the
rest of the kernel.)
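
To make the last point concrete, here is a rough userspace toy model of
that failure mode. It is only a sketch: every toy_* name is made up for
illustration and none of it is kernel API or actual binder code. It
assumes each installed mapping holds one page reference, that the driver
holds one of its own, and that the "shrinker" zaps by the address it
originally mapped at before dropping the driver's reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; all toy_* names are hypothetical, not kernel APIs. */
struct toy_page {
	int refcount;	/* references held by the driver and by mappings */
	bool freed;	/* set once the last reference is dropped */
};

struct toy_mapping {
	unsigned long vaddr;	/* where userspace currently sees the page */
	struct toy_page *page;	/* NULL when the slot is unused */
};

#define TOY_MAX_MAPPINGS 4
static struct toy_mapping toy_mappings[TOY_MAX_MAPPINGS];

/* Drop one reference; the page is only "freed" when none are left. */
static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0)
		page->freed = true;
}

/* Install a mapping; in this model each mapping holds its own reference. */
static bool toy_map(unsigned long vaddr, struct toy_page *page)
{
	for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
		if (!toy_mappings[i].page) {
			toy_mappings[i].vaddr = vaddr;
			toy_mappings[i].page = page;
			page->refcount++;
			return true;
		}
	}
	return false;
}

/* Zap by absolute address: only finds a mapping that is still at vaddr. */
static bool toy_zap_by_vaddr(unsigned long vaddr)
{
	for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
		if (toy_mappings[i].page && toy_mappings[i].vaddr == vaddr) {
			struct toy_page *page = toy_mappings[i].page;

			toy_mappings[i].page = NULL;
			toy_put_page(page);
			return true;
		}
	}
	return false;
}

/* Stand-in for mremap(): same page, same reference, new address. */
static void toy_move(unsigned long old_vaddr, unsigned long new_vaddr)
{
	for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
		if (toy_mappings[i].page && toy_mappings[i].vaddr == old_vaddr)
			toy_mappings[i].vaddr = new_vaddr;
	}
}

struct toy_result {
	bool freed_after_shrink;	/* page gone once the shrinker ran? */
	bool freed_after_unmap;		/* page gone once userspace unmapped? */
};

static struct toy_result toy_scenario(bool moved)
{
	struct toy_page page = { .refcount = 1, .freed = false }; /* driver ref */
	struct toy_result res;
	unsigned long user_vaddr = 0x1000;
	bool mapped;

	for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++)
		toy_mappings[i].page = NULL;

	mapped = toy_map(user_vaddr, &page);
	assert(mapped);
	(void)mapped;

	if (moved) {
		toy_move(user_vaddr, 0x9000);
		user_vaddr = 0x9000;
	}

	/* "Shrinker": zap at the original address, then drop the driver ref. */
	toy_zap_by_vaddr(0x1000);
	toy_put_page(&page);
	res.freed_after_shrink = page.freed;

	/* Userspace eventually unmaps wherever the page really lives. */
	toy_zap_by_vaddr(user_vaddr);
	res.freed_after_unmap = page.freed;
	return res;
}
```

In this model, toy_scenario(false) frees the page as soon as the
shrinker runs, while toy_scenario(true) leaves it alive until the moved
mapping goes away, because the by-address zap misses it and the
leftover mapping still holds a reference. A zap that walked the rmap
would not depend on the mapping still being at the original address.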