Date: Wed, 19 Apr 2023 18:23:19 +0100
From: Lorenzo Stoakes
To: Jens Axboe
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, Matthew Wilcox, David Hildenbrand, Pavel Begunkov, io-uring@vger.kernel.org, Jason Gunthorpe
Subject: Re: [PATCH v4 4/6] io_uring: rsrc: avoid use of vmas parameter in pin_user_pages()
References: <956f4fc2204f23e4c00e9602ded80cb4e7b5df9b.1681831798.git.lstoakes@gmail.com> <936e8f52-00be-6721-cb3e-42338f2ecc2f@kernel.dk>

On Wed, Apr 19, 2023 at 10:59:27AM -0600, Jens Axboe wrote:
> On 4/19/23 10:35 AM, Jens Axboe wrote:
> > On 4/18/23 9:49 AM, Lorenzo Stoakes wrote:
> >> We are shortly to remove pin_user_pages(), and instead perform the required
> >> VMA checks ourselves.
> >> In most cases there will be a single VMA so this
> >> should cause no undue impact on an already slow path.
> >>
> >> Doing this eliminates the one instance of vmas being used by
> >> pin_user_pages().
> >
> > First up, please don't just send single patches from a series. It's
> > really annoying when you are trying to get the full picture. Just CC the
> > whole series, so reviews don't have to look it up separately.
> >
> > So when you're doing a respin for what I'll mention below and the issue
> > that David found, please don't just show us patch 4+5 of the series.
>
> I'll reply here too rather than keep some of this conversation
> out-of-band.
>
> I don't necessarily think that making io buffer registration dumber and
> less efficient by needing a separate vma lookup after the fact is a huge
> deal, as I would imagine most workloads register buffers at setup time
> and then don't change them. But if people do switch sets at runtime,
> it's not necessarily a slow path. That said, I suspect the other bits
> that we do in here, like the GUP, is going to dominate the overhead
> anyway.

Thanks, and indeed I expect the GUP will dominate.

>
> My main question is, why don't we just have a __pin_user_pages or
> something helper that still takes the vmas argument, and drop it from
> pin_user_pages() only? That'd still allow the cleanup of the other users
> that don't care about the vma at all, while retaining the bundled
> functionality for the case/cases that do? That would avoid needing
> explicit vma iteration in io_uring.
>

The desire here is to completely eliminate vmas as an externally available
parameter from GUP. We do have a newly introduced helper that returns a
VMA, which replaces the manual lookup in all the other vma cases (each of
which looks up a single page and vma), but that is more a helper that sits
outside of GUP.
Having a separate function that still bundled the vmas would essentially
undermine the purpose of the series altogether, which is not just to clean
up some NULLs but rather to eliminate vmas as part of the GUP interface
altogether.

The reason for this is that by doing so we simplify the GUP interface,
eliminate a whole class of possible future bugs with people holding onto
pointers to vmas which may dangle, and lead the way to future changes in
GUP which might be more impactful, such as trying to find means to use the
fast paths in more areas with an eye to gradual eradication of the use of
mmap_lock.

While we return VMAs, none of this is possible, and it also makes the
interface more confusing. Without vmas, GUP takes flags which define its
behaviour and in most cases returns page objects. The odd rules about what
can and cannot return vmas under what circumstances are not helpful for new
users.

Another point here is that Jason suggested adding a new
FOLL_ALLOW_BROKEN_FILE_MAPPINGS flag which would, by default, not be
set. This could assert that only shmem/hugetlb file mappings are permitted,
which would eliminate the need for you to perform this check at all.

This leads into the larger point that GUP-writing file mappings is
fundamentally broken due to e.g. GUP not honouring write notify, so this
check should at least in theory not be necessary.

So it may be the case that, should such a flag be added, this code will
simply be deleted at a future point :)

> --
> Jens Axboe
>
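
For readers following the thread, the kind of check being discussed (walk
the VMAs covering the registered range and refuse file-backed mappings
other than shmem/hugetlb) can be modelled in plain userspace C. This is
only an illustrative sketch, not kernel code and not the patch itself:
struct model_vma, enum backing and both functions are hypothetical names
invented here, and the sorted-array walk stands in for whatever VMA
iteration the real code performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel VMA; NOT struct vm_area_struct. */
enum backing { BACKING_ANON, BACKING_SHMEM, BACKING_HUGETLB, BACKING_FILE };

struct model_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	enum backing backing;
};

/* Anonymous, shmem and hugetlb mappings pass; other file mappings do not. */
static bool model_vma_allowed(const struct model_vma *vma)
{
	return vma->backing != BACKING_FILE;
}

/*
 * Walk an array of VMAs (assumed sorted by address and non-overlapping)
 * and return true only if [start, end) is fully covered and every VMA
 * touching the range is allowed.
 */
static bool model_range_allowed(const struct model_vma *vmas, size_t n,
				unsigned long start, unsigned long end)
{
	unsigned long next = start;	/* first address not yet checked */

	assert(n == 0 || vmas != NULL);
	if (start >= end)
		return false;

	for (size_t i = 0; i < n && next < end; i++) {
		const struct model_vma *vma = &vmas[i];

		if (vma->end <= next)
			continue;	/* entirely below the range */
		if (vma->start > next)
			return false;	/* unmapped hole in the range */
		if (!model_vma_allowed(vma))
			return false;	/* disallowed file-backed mapping */
		next = vma->end;
	}
	return next >= end;
}
```

In the common case mentioned in the patch description, the range lies in
a single VMA and the loop body does its check exactly once, which is why
the extra walk is expected to be cheap next to the GUP itself.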