Message-ID: <9dc212ac-c4c3-40f2-9feb-a8bcf71a1246@redhat.com>
Date: Fri, 8 Nov 2024 20:33:49 +0100
Subject: Re: [RFC PATCH v1 00/10] mm: Introduce and use folio_owner_ops
From: David Hildenbrand
Organization: Red Hat
To: Jason Gunthorpe, Fuad Tabba
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
 dri-devel@lists.freedesktop.org, rppt@kernel.org, jglisse@redhat.com,
 akpm@linux-foundation.org, muchun.song@linux.dev, simona@ffwll.ch,
 airlied@gmail.com, pbonzini@redhat.com, seanjc@google.com,
 willy@infradead.org, jhubbard@nvidia.com, ackerleytng@google.com,
 vannapurve@google.com, mail@maciej.szmigiero.name,
 kirill.shutemov@linux.intel.com, quic_eberman@quicinc.com, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk
References: <20241108162040.159038-1-tabba@google.com>
 <20241108170501.GI539304@nvidia.com>
In-Reply-To: <20241108170501.GI539304@nvidia.com>

On 08.11.24 18:05, Jason Gunthorpe wrote:
> On Fri, Nov 08, 2024 at 04:20:30PM +0000, Fuad Tabba wrote:
>> Some folios, such as hugetlb folios and zone device folios,
>> require special handling when the folio's reference count reaches
>> 0, before being freed. Moreover, guest_memfd folios will likely
>> require special handling to notify it once a folio's reference
>> count reaches 0, to facilitate shared to private folio conversion
>> [*]. Currently, each usecase has a dedicated callback when the
>> folio refcount reaches 0 to that effect. Adding yet more
>> callbacks is not ideal.
>

Thanks for having a look!

Replying to clarify some things. Fuad, feel free to add additional
information.

> Honestly, I question this thesis. How complex would it be to have 'yet
> more callbacks'? Is the challenge really that the mm can't detect when
> guestmemfd is the owner of the page because the page will be
> ZONE_NORMAL?

Fuad might have been a bit imprecise here: We don't want an ever-growing
list of checks+callbacks on the page freeing fast path.

This series replaces the two cases we have with a single generic one,
which is nice independently of guest_memfd, I think.

>
> So the point of this is really to allow ZONE_NORMAL pages to have a
> per-allocator callback?

To intercept the refcount going to zero independent of any zones or
magic page types, with as little overhead as possible in the common
page freeing path.

It can be used to implement custom allocators, as factored out for
hugetlb in this series. It's not necessarily limited to that, though. It
can be used as a form of "asynchronous page ref freezing", where you get
notified once all references are gone.

(I might have another use case with PageOffline, where we want to
prevent the virtio-mem ones from accidentally leaking into the buddy
during memory offlining with speculative references --
virtio_mem_fake_offline_going_offline() contains the interesting bits.
But I did not look into the dirty details yet; it's just a thought about
where we'd want to intercept the refcount going to 0.)

>
> But this is also why I suggested to shift them to ZONE_DEVICE for
> guestmemfd, because then you get these things for free from the pgmap.

With this series even hugetlb gets it for "free", and hugetlb is not
quite the nail for the ZONE_DEVICE hammer IMHO :)

For things we can statically set aside early during boot and never
really want to return to the buddy/another allocator, I would agree that
static ZONE_DEVICE would have been possible.

Whenever the buddy or other allocators are involved, and we might have a
granularity as small as a handful of pages (e.g., taken from the buddy),
getting ZONE_DEVICE involved is not a good (or even feasible) approach.

After all, all we want is to intercept the refcount going to 0.

--
Cheers,

David / dhildenb
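
To illustrate "intercept the refcount going to 0" with a single generic
check on the freeing path, here is a minimal user-space sketch. It is a
toy model and not the code from this series: toy_folio, toy_owner_ops,
toy_pool_ops and toy_folio_put() are all hypothetical names, and the
refcount is a plain int rather than an atomic.

```c
#include <assert.h>

struct toy_folio;

/* Hypothetical per-owner operations; only a "free" hook is modeled. */
struct toy_owner_ops {
	void (*free)(struct toy_folio *folio);
};

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct toy_folio {
	int refcount;
	const struct toy_owner_ops *owner_ops;	/* unset for ordinary folios */
	int returned_to_buddy;			/* set by the default free path */
	int returned_to_owner;			/* set by the owner's free hook */
};

/* Example owner hook: the owner takes the folio back into its own pool. */
static void toy_owner_free(struct toy_folio *folio)
{
	folio->returned_to_owner = 1;
}

static const struct toy_owner_ops toy_pool_ops = {
	.free = toy_owner_free,
};

/* Drop one reference; on the last one, let the owner intercept the free. */
static void toy_folio_put(struct toy_folio *folio)
{
	assert(folio->refcount > 0);
	if (--folio->refcount > 0)
		return;

	/* One generic check instead of one check per special folio type. */
	if (folio->owner_ops && folio->owner_ops->free) {
		folio->owner_ops->free(folio);
		return;
	}

	/* Common case: nobody intercepts, the folio goes back to the buddy. */
	folio->returned_to_buddy = 1;
}
```

In this model an ordinary folio pays for one pointer test on its last
put, while a folio with toy_pool_ops set is handed to its owner instead
of the buddy, independent of any zone or page type.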