From: Fuad Tabba
Date: Mon, 11 Nov 2024 08:26:54 +0000
Subject: Re: [RFC PATCH v1 00/10] mm: Introduce and use folio_owner_ops
To: David Hildenbrand
Cc: Jason Gunthorpe, linux-mm@kvack.org, kvm@vger.kernel.org, nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org, rppt@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch, airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com, willy@infradead.org, jhubbard@nvidia.com, ackerleytng@google.com, vannapurve@google.com, mail@maciej.szmigiero.name, kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk
References: <20241108162040.159038-1-tabba@google.com> <20241108170501.GI539304@nvidia.com> <9dc212ac-c4c3-40f2-9feb-a8bcf71a1246@redhat.com>
In-Reply-To: <9dc212ac-c4c3-40f2-9feb-a8bcf71a1246@redhat.com>

Hi Jason and David,

On Fri, 8 Nov 2024 at 19:33, David Hildenbrand wrote:
>
> On 08.11.24 18:05, Jason Gunthorpe wrote:
> > On Fri, Nov 08, 2024 at 04:20:30PM +0000, Fuad Tabba wrote:
> >> Some folios, such as hugetlb folios and zone device folios,
> >> require special handling when the folio's reference count reaches
> >> 0, before being freed. Moreover, guest_memfd folios will likely
> >> require special handling to notify it once a folio's reference
> >> count reaches 0, to facilitate shared to private folio conversion
> >> [*]. Currently, each use case has a dedicated callback when the
> >> folio refcount reaches 0 to that effect. Adding yet more
> >> callbacks is not ideal.
> >
>
> Thanks for having a look!
>
> Replying to clarify some things. Fuad, feel free to add additional
> information.

Thanks for your comments, Jason, and for clarifying my cover letter,
David. I think David has covered everything, and I'll make sure to
clarify this in the cover letter when I respin.

Cheers,
/fuad

>
> > Honestly, I question this thesis. How complex would it be to have 'yet
> > more callbacks'? Is the challenge really that the mm can't detect when
> > guestmemfd is the owner of the page because the page will be
> > ZONE_NORMAL?
>
> Fuad might have been a bit imprecise here: We don't want an ever-growing
> list of checks+callbacks on the page freeing fast path.
>
> This series replaces the two cases we have by a single generic one,
> which is nice independent of guest_memfd, I think.
>
> >
> > So the point of this is really to allow ZONE_NORMAL pages to have a
> > per-allocator callback?
>
> To intercept the refcount going to zero independent of any zones or
> magic page types, with as little overhead as possible in the common
> page freeing path.
>
> It can be used to implement custom allocators, like factored out for
> hugetlb in this series. It's not necessarily limited to that, though. It
> can be used as a form of "asynchronous page ref freezing", where you get
> notified once all references are gone.
>
> (I might have another use case with PageOffline, where we want to
> prevent the virtio-mem ones from getting accidentally leaked into
> the buddy during memory offlining with speculative references --
> virtio_mem_fake_offline_going_offline() contains the interesting bits.
> But I did not look into the dirty details yet, just some thoughts on
> where we'd want to intercept the refcount going to 0.)
>
> >
> > But this is also why I suggested to shift them to ZONE_DEVICE for
> > guestmemfd, because then you get these things for free from the pgmap.
>
> With this series even hugetlb gets it for "free", and hugetlb is not
> quite the nail for the ZONE_DEVICE hammer IMHO :)
>
> For things we can statically set aside early during boot and never
> really want to return to the buddy/another allocator, I would agree that
> static ZONE_DEVICE would have been possible.
>
> Whenever the buddy or other allocators are involved, and we might have
> a granularity as small as a handful of pages (e.g., taken from the
> buddy), getting ZONE_DEVICE involved is not a good (or even feasible)
> approach.
>
> After all, all we want is to intercept the refcount going to 0.
>
> --
> Cheers,
>
> David / dhildenb
>
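
To make "intercept the refcount going to 0" a bit more concrete, here is
a minimal userspace sketch in C. It is an illustration only, not code
from the series: struct folio_sketch, struct owner_ops_sketch,
folio_put_sketch() and the hugetlb_like_* names are all hypothetical,
and the real kernel structures, locking and atomics are left out. The
point it tries to show is the single generic check on the freeing path
instead of one check per folio type.

```c
#include <assert.h>

/* All names below are hypothetical; this is not the API from the series. */
struct folio_sketch;

struct owner_ops_sketch {
	/* Invoked once, when the last reference to the folio is dropped. */
	void (*free)(struct folio_sketch *folio);
};

struct folio_sketch {
	int refcount;                              /* plain int; the kernel uses atomics */
	const struct owner_ops_sketch *owner_ops;  /* null for ordinary folios */
	int returned_to_owner;                     /* set by an owner's callback */
	int returned_to_buddy;                     /* set by the default free path */
};

/* Stand-in for handing the folio back to the buddy allocator. */
static void default_free_sketch(struct folio_sketch *folio)
{
	folio->returned_to_buddy = 1;
}

/* Drop one reference; on the last one, let the owner intercept the free. */
static void folio_put_sketch(struct folio_sketch *folio)
{
	assert(folio->refcount > 0);

	if (--folio->refcount != 0)
		return;

	/* One generic check, independent of zone or page type. */
	if (folio->owner_ops && folio->owner_ops->free) {
		folio->owner_ops->free(folio);
		return;
	}

	default_free_sketch(folio);
}

/* Example owner: a custom allocator that takes its folios back. */
static void hugetlb_like_free(struct folio_sketch *folio)
{
	folio->returned_to_owner = 1;
}

static const struct owner_ops_sketch hugetlb_like_ops = {
	.free = hugetlb_like_free,
};
```

In this sketch, a folio whose owner_ops points at hugetlb_like_ops goes
back to its owner on the final folio_put_sketch(), while a folio with no
owner_ops takes the default path. A guest_memfd-style owner would plug
in its own callback in the same way, without another type-specific check
being added to the freeing path.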