From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83DD8C7EE23 for ; Tue, 23 May 2023 07:25:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F0265900002; Tue, 23 May 2023 03:25:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EB2E66B007E; Tue, 23 May 2023 03:25:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D522C900002; Tue, 23 May 2023 03:25:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id C1D366B007D for ; Tue, 23 May 2023 03:25:35 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 18D5AC06B1 for ; Tue, 23 May 2023 07:25:32 +0000 (UTC) X-FDA: 80820684504.10.490B596 Received: from mail-wm1-f50.google.com (mail-wm1-f50.google.com [209.85.128.50]) by imf27.hostedemail.com (Postfix) with ESMTP id 2323F40006 for ; Tue, 23 May 2023 07:25:29 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=MbWTUJK0; spf=pass (imf27.hostedemail.com: domain of lstoakes@gmail.com designates 209.85.128.50 as permitted sender) smtp.mailfrom=lstoakes@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1684826730; a=rsa-sha256; cv=none; b=ohR3mk8YnjYzdGXyESNUqmbgBXAH5MHs7ncLfHq9ozoXcyt5pMhl+ihTSFCl0iyCbLI2n/ 8mlZpjhF2vc6g5RA+RSqvBAbxhl3AWzuu2cNjF9i4jd58IiyM2evXcV1iedTMDNip9vXjK qUW3MDWfQnOrNAn/i4Ky4zl+h37a84c= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=MbWTUJK0; spf=pass (imf27.hostedemail.com: domain of lstoakes@gmail.com designates 209.85.128.50 as permitted sender) smtp.mailfrom=lstoakes@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1684826730; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=hh1M/PyhqnPF3gHXcsPzIKqPXVy0TQ5XWDVGt8r34Cw=; b=QtX+jGJznUpEFBohZ0x7XeajF7k4vBuUaelsOi+NocNH0TbR13uXgiq78XqgariXxC/15z GeEWzs3xsD8wF+e6fL34/8jW9tBslhizWmw3XW4b9zJRWGEdolX3+xEYvydULZRNV0Gb1C El5ilR84tnxrlfeyTpjzF7juVWZ1jy4= Received: by mail-wm1-f50.google.com with SMTP id 5b1f17b1804b1-3f607dc98cdso10441075e9.1 for ; Tue, 23 May 2023 00:25:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684826728; x=1687418728; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=hh1M/PyhqnPF3gHXcsPzIKqPXVy0TQ5XWDVGt8r34Cw=; b=MbWTUJK0onitNaT4PQy7LWYs3/eCrjGWOztlNDkwLuKrPc2NB3ITFEKXZNPSCQBVkm IsDCDKehWZ8npbFaRSGkKIfRvHD8P75fyk5si+zzsFeFE8w/tJz3S7O1mOUp7Mt7HTdT r6LF5FjChUepkHIWCN4SCllJcZNcSGNLmnC3zqHu1xZ7+jWeJ7IYlfbR7gxLtDkH4JKX mxay8Vb1CJPUnjMgALNdyy5NHyJTWR0JvVa/hBUlmukAz3edKWeYRZ8sytnO6b0QbvCw JlsLc6mnPtF81xYjLlCDrGVNDeM1Zij5DB32GhWHgEDPh620Z76Sx5nLWt4mZOCnoR8i ZMOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684826728; x=1687418728; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=hh1M/PyhqnPF3gHXcsPzIKqPXVy0TQ5XWDVGt8r34Cw=; b=hKtIv4TB628sP7qXRlUK36WwCuis3Ena9SgxMWuSP7Crymlubzn03YNI91ktcbd7Yw c/GII9rriFf5fevEQRJRwuXawG5g8ksz1h1LMNXb8HjULg1fHVHVU74IKnPmCh+xBUm3 CtCO360I64cfNphLG0oqBBcK3IgkwCXoApFoGdhUcRhHA3lbJhmaiXdkdkjIGditv8zo 0bgwb22pyd7nWzOtlsZcmv2UKzIFSNSUNzP1pUq5u6RIXHOTC3LfpP0xt8d/bwlHkRuR fFL0V9t+appxaxZEpAhpKeQMRrjy/XlI7/sgkPKVTbDE79Pd8TDmZ2ZP6k+MaVgx+vPl d9ZQ== X-Gm-Message-State: AC+VfDwi3JcaTlJrVkQvE+VADSZGjO49JJqVUa17GC7A/umwHwSc9AJA E44lpjGnDQNbQOZggOAiT50= X-Google-Smtp-Source: ACHHUZ6TvcH4VxLvgQZWHCNak2I8mUxXM2ty6AS9EVucuJhjX5yJLuwW0XNYUI3IR6a9XcTU9oOJTg== X-Received: by 2002:a1c:4c10:0:b0:3f4:5058:a033 with SMTP id z16-20020a1c4c10000000b003f45058a033mr9372191wmf.24.1684826728204; Tue, 23 May 2023 00:25:28 -0700 (PDT) Received: from localhost (host81-154-179-160.range81-154.btcentralplus.com. [81.154.179.160]) by smtp.gmail.com with ESMTPSA id h1-20020a1ccc01000000b003eddc6aa5fasm13988260wmb.39.2023.05.23.00.25.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 May 2023 00:25:27 -0700 (PDT) Date: Tue, 23 May 2023 08:25:26 +0100 From: Lorenzo Stoakes To: John Hubbard Cc: David Hildenbrand , Sumit Garg , Christoph Hellwig , Xiaoming Ding , Jens Wiklander , Matthias Brugger , AngeloGioacchino Del Regno , op-tee@lists.trustedfirmware.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, fei.xu@mediatek.com, srv_heupstream@mediatek.com, linux-mm@kvack.org Subject: Re: FOLL_LONGTERM vs FOLL_EPHEMERAL Re: [PATCH] tee: add FOLL_LONGTERM for CMA case when alloc shm Message-ID: <106e7886-ef02-4979-b96d-66d33cfa119c@lucifer.local> References: <8b898768-a3c0-198d-a70c-9aa96b7f4a29@redhat.com> <1ced7f32-553e-2a5b-eec9-f794d7983d56@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1ced7f32-553e-2a5b-eec9-f794d7983d56@nvidia.com> X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 2323F40006 X-Stat-Signature: nzm6bxzmpkgcrarefujdczu8nocnrdwm X-Rspam-User: X-HE-Tag: 1684826729-965370 X-HE-Meta: U2FsdGVkX1801/JdJq4VIAgGkuuN8DqtLqp8sygIVmNuAh2Zp5bKIMeaz0J0get/qn6s6JttQgJxsR1irE7+MF/UH3C3s5KCIjs/9DWP1/hH+2ccfi+hGsSTZLDbQND4jnL6Sca+zoW9NBuZ8o52GNNp8AeXKp0rvHLXX7Fnau1Dw9vffoAC4Mb0+5OHDsq3WQK/s6+ULHOezmqCMXYTyhLdshpGr209dw+VcNWNMb+OAg0teliz0yz23FOi6gzxsAry5vkMQ7sruqHjGhSMH7DJowHy9gofO0yWogSRJW51I5Ty7jmp6Y5hc4rAQjZqgQgjWUnGu8hMn6cmD/1Mxer7DTLI3s7VaxKQoJcvN///xh+w7seG7GvNWUo4XQVRFL+Iv9N++U42GhaalPEN6OV1hN67w9qhTnQ9WvNx68i5XyoJpnZom6ewD9kYka/c187JIa3/X4ESzwsCcaojFyLvdOyti2XudXnqKuNapzP/9V12SdH6c+Pu5Ua5yhUIjFLwTzjMLFC5dJk5FnosJ2Omox7/3ddmacDRJMxQvVcjjfj9w0Awh2JKUkay8u9V3RS+dupgh0zL9UFqHY44CRdtO72IO0aRfq6GI6f6dIa+L7sb+f9fwdNZW8YLu+Cj+NVtdN0whSCSANg6By4VJfrdkF5W5gF9eC8ChoZ4ZeSMUtmx0ZJiTIGZJ7lTWjOIxfm34JE3S8rV+Fh8NudH5pbgixY7C2fHM2LqFEDdwaFELZ4NuZM9J+Dh/U7uppGWvGMO2WO976i+4FQPf4lcbtkP6qq1wUF05r0OnWJ7ZRSvRHRsGebrazuDJE+D74Gazn1gubnI0EkWkHE9hfVpe3A5bkqHP64TC8gV25O+QmqR5vIsMyq10n1hADBmNi6m6gFFm6ZURzkwE8rW1CoDHX7JJ8WNV+ayak4eyONv4CoCrDamOusvC1itj7QY0qdLCMLlqR7egwN2dbL8hSC 2aRBxUtg 0HNs8sQ5tAi3gye8beCtdwlD4TKYr/Z9X5rEBeqdOBo9x035a5Vtx7Mqgduszy3FwJXNOjLFzedqAPGIPu+6dyUh0TAG1EDbu23yK+1f1qx4F5vDN94+EH1jSEx41i1+W9ldABos27/TW5rqJHN57pRhT3/OaKXpgq2WU0VLx5huECpyuG6wjyu/2iZKCRbZWGQ/EWsuNohuwuOBFasDZHOsEwKturG/pGAveOqb9GpIJelfNnPuvbZ5h7A9/vrmOwmh2nQR41umVGlI0B1Iy9PYbYbkE42i5leos2AzSXt+0ISnqQseieNIZAb16dd+KJcz5VxD5rNuJ3VWQMuIulSRmJIeo4xpU+0/yVLm+ZjP6sYUN/Mph1Zbc5X4Z68hcVD23V8765/Wzbpa5tWaftwdhi/gSFqcIRyl81nuxAvgLtgZzffQKlM9qlkEUcZjjVaokguKE17g3XABYDYZ/orKSF9JTI+irqMLQWUpI4/r4fYa3Dg5bT2s8D0kO7tCcJfwaM+5jHw8TXEa0qKRi0ZJHefhURlSN63eLP8qk/i5ipUOMaWBqb6XF1xPsb2GTbD/h X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, May 22, 2023 at 06:54:29PM -0700, John Hubbard wrote: > On 5/18/23 06:56, David Hildenbrand wrote: > > On 18.05.23 08:08, Sumit Garg wrote: > > > On Thu, 18 May 2023 at 09:51, Christoph Hellwig wrote: > > > > > > > > On Wed, May 17, 2023 at 08:23:33PM +0200, David Hildenbrand wrote: > > > > > In general: if user space controls it -> possibly forever -> long-term. Even > > > > > if in most cases it's a short delay: there is no trusting on user space. > > > > > > > > > > For example, iouring fixed buffers keep pages pinned until user space > > > > > decides to unregistered the buffers -> long-term. > > > > > > > > > > Short-term is, for example, something like O_DIRECT where we pin -> DMA -> > > > > > unpin in essentially one operation. > > > > > > > > Btw, one thing that's been on my mind is that I think we got the > > > > polarity on FOLL_LONGTERM wrong.  Instead of opting into the long term > > > > behavior it really should be the default, with a FOLL_EPHEMERAL flag > > > > to opt out of it.  And every users of this flag is required to have > > > > a comment explaining the life time rules for the pin.. I couldn't agree more, based on my recent forays into GUP the interface continues to strike me as odd:- - FOLL_GET is a wing and a prayer that nothing that [folio|page]_maybe_dma_pinned() prevents happens in the brief period the page is pinned/manipulated. So agree completely with David's concept of unexporting that and perhaps carefully considering our use of it. Obviously the comments around functions like gup_remote() make clear that 'this page not be what you think it is' but I wonder whether many callers of GUP _truly_ take that on board. - FOLL_LONGTERM is entirely optional for PUP and you can just go ahead and fragment page blocks to your heart's content. Of course this would be an abuse, but abuses happen. - With the recent change to PUP/FOLL_LONGTERM disallowing dirty tracked file-backed mappings we're now really relying on this flag indicating a _long term_ pin semantically. By defaulting to this being switched on, we avoid cases of callers who might end up treating the won't reclaim/etc. aspect of PUP as all they care about while ignoring the MIGRATE_MOVABLE aspect. > > I see maybe 10 or 20 call sites today. So it is definitely feasible to add > documentation at each, explaining the why it wants a long term pin. > Yeah, my efforts at e.g. dropping vmas has been eye-opening in actually quite how often a refactoring like this often ends up being more straightforward than you might imagine. > > > > > > It does look like a better approach to me given the very nature of > > > user space pages. > > > > Yeah, there is a lot of historical baggage. For example, FOLL_GET should be inaccessible to kernel modules completely at one point, to be only used by selected core-mm pieces. > > Yes. When I first mass-converted call sites from gup to pup, I just > preserved FOLL_GET behavior in order to keep from changing too much at > once. But I agree that that it would be nice to make FOLL_GET an > mm internal-only flag like FOLL_PIN. Very glad you did that work! And totally understandable as to you being conservative with that, but I think we're at a point where there's more acceptance of incremental improvements to GUP as a whole. I have another patch series saved up for _yet more_ changes on this. But mindful of churn I am trying to space them out... until Jason nudges me of course :) > > > > > Maybe we should even disallow passing in FOLL_LONGTERM as a flag and only provide functions like pin_user_pages() vs. pin_user_pages_longterm(). Then, discussions about conditional flag-setting are no more :) > > > > ... or even use pin_user_pages_shortterm() vs. pin_user_pages() ... to make the default be longterm. > > > > Yes, it is true that having most gup flags be internal to mm does tend > to avoid some bugs. But it's also a lot of churn. I'm still on the fence > as to whether it's really a good move to do this for FOLL_LONGTERM or > not. But it's really easy to push me off of fences. :) *nudge* ;) > > thanks, > -- > John Hubbard > NVIDIA > Looking at non-fast, non-FOLL_LONGTERM PUP callers (forgive me if I missed any):- - pin_user_pages_remote() in process_vm_rw_single_vec() for the process_vm_access functionality. - pin_user_pages_remote() in user_event_enabler_write() in kernel/trace/trace_events_user.c. - pin_user_pages_unlocked() in ivtv_udma_setup() in drivers/media/pci/ivtv/ivtv-udma.c and ivtv_yuv_prep_user_dma() in ivtv-yuv.c. And none that actually directly invoke PUP without FOLL_LOGNTERM... That suggests that we could simply disallow non-FOLL_LONGTERM non-fast PUP calls altogether and move to pin_user_pages_longterm() [I'm happy to write a patch series doing this]. The ivtv callers look like they really actually want FOLL_LONGTERM unless I'm missing something so we should probably change that too? I haven't surveyed the fast versions, but I think defaulting to FOLL_LONGTERM on them also makes sense.