From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31889C7EE23 for ; Fri, 26 May 2023 09:00:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B6F37900002; Fri, 26 May 2023 05:00:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B1EF56B0074; Fri, 26 May 2023 05:00:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9E6CF900002; Fri, 26 May 2023 05:00:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 908E96B0072 for ; Fri, 26 May 2023 05:00:28 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 41660160DF7 for ; Fri, 26 May 2023 09:00:28 +0000 (UTC) X-FDA: 80831810136.14.1A6C8BA Received: from mail-wr1-f52.google.com (mail-wr1-f52.google.com [209.85.221.52]) by imf08.hostedemail.com (Postfix) with ESMTP id BF8AB16001E for ; Fri, 26 May 2023 09:00:25 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=MbStjYhH; spf=pass (imf08.hostedemail.com: domain of lstoakes@gmail.com designates 209.85.221.52 as permitted sender) smtp.mailfrom=lstoakes@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1685091625; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=qss+zxEtzgocrouex16BhRqrU2936RxdIAYJmgdoCvo=; b=DRPZKD4trHz4Su+vpGCyf5ulsRR2Mh84ZzbyADFYcZ9Zcu76pii5AXceUdr8JsUIvD7ujA suuh0qtUdbG/aLhzmh5KgrxuVda4jpVDuDhbO3sNk6pzlTTwDNVkYFd20KySaNMIsV4e+7 dQQuOiS6jDpu2roiVyUDkiHxveP3AQo= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1685091625; a=rsa-sha256; cv=none; b=8rGwpCdtC8oHrIOZi5Co1A+g62rIYsNkSVIBix4tGCsyCt5V055bNQhQASxh7/0D3yaujG iLWNKGQf1j3eYqL2hYxgzUHQwHvuQWSsutJTbLxKmpqi5QtCG0SWY0msuJ3Md3c5VIh45l uMR+3o7+PstH83amYNCb9xhgsgTl8lA= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=MbStjYhH; spf=pass (imf08.hostedemail.com: domain of lstoakes@gmail.com designates 209.85.221.52 as permitted sender) smtp.mailfrom=lstoakes@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-wr1-f52.google.com with SMTP id ffacd0b85a97d-30ab87a1897so245418f8f.1 for ; Fri, 26 May 2023 02:00:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685091624; x=1687683624; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=qss+zxEtzgocrouex16BhRqrU2936RxdIAYJmgdoCvo=; b=MbStjYhHUzFc8ILRIy/UykQjsaI3qWMrTCSLE4yyRrdoWKoyQHKS18s+UxIXuXz0ob 1+tBeGkdZzgxM0fnFYmZo8klcndWIROQO8qqQnuSJYjvMYGO7jZreoeWx7uylUqKgEEp 8ZnZIXPXpM3/P0CryvcxTY1dk7JK/h0qAf6KRgES785Fj2V/aul/qY2OQmqLHtBTTiXR 6egji2GEcafG7yYfFUZfPwmiGiHaOZCetl1jjVVH0eN1lZYxVG/em58qiy1wHY8qbb9o XWTtKi17FDMgoPMHhgYEaeFYHXdmmGQaQVjEy43D/xZNZHFj8mTyQ6mnAMNKxJDJSPk3 UafA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685091624; x=1687683624; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=qss+zxEtzgocrouex16BhRqrU2936RxdIAYJmgdoCvo=; b=UpA0fk17gp8OrEhK8NwjUd9h7wkybukyTGEp6ndtAdy6Sb7iFv2umcDLvjjKKEb9QI I3/bUbjicyzqrkXTyln2bFLFB4UHxbHkrV0aF0/I9rFfBNL65XiluPRDHUJ4nHKyLOoa htKRjqHIXrcLLR27hhOJJucTlLe+P94QgBcMuEc4KgjrkyhPzvLSLhl2ck6YVHnouZ9Y YKxJ9Pud5nqx2j7eG9IbRneIOs6pR3MuzHPw2K0YLzhH+ZQ2divC3Bfy0vcLQRytc6Jf 5/mfSE/qGhzL2UTUC7gzhhzY/vuPYw0X/6DdewwLMhvpCH6mahMxYqCOTnZPJHEyb9nv WZfQ== X-Gm-Message-State: AC+VfDwIlU7+xxZc57VoRy8Tkh4ZQZQERUsBSlgH1p/Evgou56QO8oB/ kSCb97QYLoMq/Z4eCHSCZsA= X-Google-Smtp-Source: ACHHUZ5EkZrkFXdnKG+OsSeolqc727IKNW5Swk1R+LzKGKOjvK629ibzmcfNdrXmo0Z/6uqU4u/JlQ== X-Received: by 2002:adf:dfcc:0:b0:30a:dd15:bb69 with SMTP id q12-20020adfdfcc000000b0030add15bb69mr75933wrn.18.1685091623852; Fri, 26 May 2023 02:00:23 -0700 (PDT) Received: from localhost (host81-154-179-160.range81-154.btcentralplus.com. [81.154.179.160]) by smtp.gmail.com with ESMTPSA id k7-20020adfe3c7000000b003062b2c5255sm4448182wrm.40.2023.05.26.02.00.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 May 2023 02:00:23 -0700 (PDT) Date: Fri, 26 May 2023 09:58:12 +0100 From: Lorenzo Stoakes To: David Howells Cc: Christoph Hellwig , David Hildenbrand , Jens Axboe , Al Viro , Matthew Wilcox , Jan Kara , Jeff Layton , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton Subject: Re: [RFC PATCH v2 1/3] mm: Don't pin ZERO_PAGE in pin_user_pages() Message-ID: <4f479af6-2865-4bb3-98b9-78bba9d2065f@lucifer.local> References: <89c7f535-8fc5-4480-845f-de94f335d332@lucifer.local> <20230525223953.225496-1-dhowells@redhat.com> <20230525223953.225496-2-dhowells@redhat.com> <520730.1685090615@warthog.procyon.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <520730.1685090615@warthog.procyon.org.uk> X-Rspamd-Queue-Id: BF8AB16001E X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: k8srrhq6wp1j9t5go3zj3hks8cc7kud1 X-HE-Tag: 1685091625-903735 X-HE-Meta: U2FsdGVkX1+PddDTpJ9XaoUY+Bd/WkEOyff+rl4d/XtZXN+wGZ8/mK+hDWtWEZf1P3Jquk+xN6Us+/k9oKqsUU0bDsoINHUhU8+dbrdzREMFh9CSi6je02eJ+m/xnNZ4o6bTuofn1KdhzdN8fZbW/AHoK4Lg9l3cds0d2SYdmT6GdWy3kPyvco4b/Ctc3zvUu2uwGx0QLFtOvgXxBbkr2DtcWx4rWx2bxZaXM/bDGYbjHLGyhLvwjZxQfYUiUQibcgsrpqIPmTUaPPfEVnab4b8PylshaMNSR80TyuxAqT8zZtpAqFpep26rn46x3ePUTjAWDgWxevk1k5E6ZyvEKrgQfsiYXLB9xcWBh6XjJkc8ChilM+sOVV9fH1N2UvF6SMQlxkZBHNGeVY3Aw94898JBsSChbPUlwIjQXdbteKx+kSF6o/y2QXTJaUve9I7doDwJf6wXoRUSdgpeRBU7QRgxcQs2D93KwWNOCwQ48tKGaMQOXQu659DvqWHw8Xp9bnFFh8UTUSmEoDc2cdZZEyvSkG8vP67RUA8098H5WWP1kSbSrvNNjqYPYO6n/yp8E0RwRLeYM+PhRWVvsOSkTMiX8ULs+tSJWMS5GTbOGPcu5No9WT8OoX8FLiehbjtx/7W3E1fIzB2WDgfsdJmcvR50Mpnx7rxIXpKCnyzFstlI1SMC1GxjhN9chrLrvsnbN1prZNCUZbJ8gIDvUIev+mXxi6OtGYDNJWR+S5fuVIrODvOPcG+7sJfXdqPDAmVqhNyOJGUpsJgtEvsoTaZXhUngSB3+QUjbufwCctzwsN8skEoBYX7l18UaSxrgkruMRmQ/dEL1byN1On/o5iNZORjX1MlxaDD3mIKUonYEOIAejmcEBglzd0zF7QLOUM8y9JiVlNtUep/xXyQ1slrXARJQALCBPonlV3+nwOxsz/bBY4/AIyGmBaE9/pC2n9qCrWULjzUOJbQyh8AFWkt LEIQGotp 72WV4j3YsE/R3N+1SGGIzWkEsW7hAKaBI/2sNX6U4r12fvgm4IBrT/mMAX6+cFo98a9U98QszKRa9sHrneU4RtXivUFtP4bz7oIKzQAzzuyr5KnH9BcYda6KhvMl20gYUgRfYJOfiAR7rq56b2jkR14AjvkSNK2Xgeo//jsl2jHvNP+BuTZVppT3Yt0e56SxVeTd8VgJy5jgevO/+MNbGy8r7tyI+c8GURMUfB8zUsRnqSEugQ4xq3fS/FZB525EXTUBI9RNk4I5QvVWGWhl/EYGIGam5lQHOJVe0cWla/3IS+PEPPTixX24FBSwOX0oQrZQEwGOOWwUBhGpgSCBo1GPq9mfgzhzhZBDmJkbyEPMq/MWwz/86Ivhxvo8TeG6XVuYuAvKWSTbeWhoEP0t9K3j7R1bS0IP1sHLBHNScIs2/r9Sk/qJGErcWCiasCMn4Nau2Mkx+8pIOX1VfiXrX7FrCzPKHVuulbgnxIT1iqUBVk5XuhhSaqwoAZg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, May 26, 2023 at 09:43:35AM +0100, David Howells wrote: > Lorenzo Stoakes wrote: > > > I guess we're not quite as concerned about FOLL_GET because FOLL_GET should > > be ephemeral and FOLL_PIN (horrifically) adds GUP_PIN_COUNTING_BIAS each > > time? > > It's not that - it's that iov_iter_get_pages*() is a lot more commonly used at > the moment, and we'd have to find *all* the places that things using that hand > refs around. > > iov_iter_extract_pages(), on the other hand, is only used in two places with > these patches and the pins are always released with unpin_user_page*() so it's > a lot easier to audit. Thanks for the clarification. I guess these are the cases where you're likely to see zero page usage, but since this is changing all PUP*() callers don't you need to audit all of those too? > > I could modify put_page(), folio_put(), etc. to ignore the zero pages, but > that might have a larger performance impact. > > > > + if (is_zero_page(page)) > > > + return page_folio(page); > > > + > > > > This will capture huge page cases too which have folio->_pincount and thus > > don't suffer the GUP_PIN_COUNTING_BIAS issue, however it is equally logical > > to simply skip these when pinning. > > I'm not sure I understand. The zero page(s) is/are single-page folios? I'm actually a little unsure of how huge zero pages are handled (not an area I'm hugely familiar with) so this might just be mistaken, I mean the point was more so that hugetlb calls into this, but seems then not an issue. > > > This does make me think that we should just skip pinning for FOLL_GET cases > > too - there's literally no sane reason we should be pinning zero pages in > > any case (unless I'm missing something!) > > As mentioned above, there's a code auditing issue and a potential performance > issue, depending on how it's done. Ack, makes sense. It'd be good to have this documented somewhere though in commit msg or docs so this trade-off is clear. > > > Another nitty thing that I noticed is, in is_longterm_pinnable_page():- > > > > /* The zero page may always be pinned */ > > if (is_zero_pfn(page_to_pfn(page))) > > return true; > > > > Which, strictly speaking I suppose we are 'pinning' it or rather allowing > > the pin to succeed without actually pinning, but to be super pedantic > > perhaps it's worth updating this comment too. > > Yeah. It is "pinnable" but no pin will actually be added. The comment striks me as misleading, previously it literally meant that you could pin the zero page. Now it means that we just don't. I do think for the sake of avoiding confusion this should be tweaked. Obviously something of a nit, however! I did dig into this change a fair bit and kept adding then deleting comments since you cover all the bases well, so overall this is nice + I can but nit it :) Nice to see further improvements to GUP which is crying out for that. > > David > >