Date: Fri, 2 May 2025 14:20:38 +0200
From: Jan Kara
To: Lorenzo Stoakes
Cc: Andrew Morton, David Hildenbrand, "Liam R . Howlett", Vlastimil Babka,
    Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
    Pedro Falcato, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Alexander Viro,
    Christian Brauner, Jan Kara, Matthew Wilcox
Subject: Re: [RFC PATCH v2 0/3] eliminate mmap() retry merge, add .mmap_prepare hook

On Thu 01-05-25 18:25:26, Lorenzo Stoakes wrote:
> During the mmap() of a file-backed mapping, we invoke the underlying driver
> file's mmap() callback in order to perform driver/file system
> initialisation of the underlying VMA.
>
> This has been a source of issues in the past, including a significant
> security concern relating to unwinding of error state discovered by Jann
> Horn, as fixed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> error path behaviour") which performed the recent, significant, rework of
> mmap() as a whole.
>
> However, we have had a fly in the ointment remain - drivers have a great
> deal of freedom in the .mmap() hook to manipulate VMA state (as well as
> page table state).
>
> This can be problematic, as we can no longer reason sensibly about VMA
> state once the call is complete (the ability to do - anything - here does
> rather interfere with that).
>
> In addition, callers may choose to do odd or unusual things which might
> interfere with subsequent steps in the mmap() process, and it may do so and
> then raise an error, requiring very careful unwinding of state about which
> we can make no assumptions.
>
> Rather than providing such an open-ended interface, this series provides an
> alternative, far more restrictive one - we expose a whitelist of fields
> which can be adjusted by the driver, along with immutable state upon which
> the driver can make such decisions:
>
> struct vm_area_desc {
> 	/* Immutable state. */
> 	struct mm_struct *mm;
> 	unsigned long start;
> 	unsigned long end;
>
> 	/* Mutable fields. Populated with initial state. */
> 	pgoff_t pgoff;
> 	struct file *file;
> 	vm_flags_t vm_flags;
> 	pgprot_t page_prot;
>
> 	/* Write-only fields. */
> 	const struct vm_operations_struct *vm_ops;
> 	void *private_data;
> };
>
> The mmap logic then updates the state used to either merge with a VMA or
> establish a new VMA based upon this logic.
>
> This is achieved via new file hook .mmap_prepare(), which is, importantly,
> invoked very early on in the mmap() process.
>
> If an error arises, we can very simply abort the operation with very little
> unwinding of state required.

Looks sensible. So is there a plan to transform existing .mmap hooks to
.mmap_prepare hooks? I agree that for most filesystems this should be just
an easy 1:1 replacement and AFAIU this would be preferred?

								Honza
-- 
Jan Kara
SUSE Labs, CR
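
For readers following the thread, here is a minimal userspace sketch of how a hook working on the quoted struct vm_area_desc might look. It is not code from the series. The hook signature, the typedef stand-ins, the DEMO_* flag values and size limit, and the demo_* function names are all assumptions made so the example compiles outside the kernel. Only the struct layout is taken from the quoted cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types (assumptions, for illustration only). */
typedef unsigned long pgoff_t;
typedef unsigned long vm_flags_t;
typedef struct { unsigned long val; } pgprot_t;
struct mm_struct;
struct file;
struct vm_operations_struct { const char *name; };

/* Layout as quoted in the cover letter. */
struct vm_area_desc {
	/* Immutable state. */
	struct mm_struct *mm;
	unsigned long start;
	unsigned long end;

	/* Mutable fields. Populated with initial state. */
	pgoff_t pgoff;
	struct file *file;
	vm_flags_t vm_flags;
	pgprot_t page_prot;

	/* Write-only fields. */
	const struct vm_operations_struct *vm_ops;
	void *private_data;
};

/* Hypothetical flag values and size limit; these are not the kernel's. */
#define DEMO_VM_WRITE      0x2UL
#define DEMO_VM_DONTEXPAND 0x4UL
#define DEMO_MAX_LEN       (1UL << 20)

static const struct vm_operations_struct demo_vm_ops = { "demo" };

/*
 * Hypothetical driver hook: validate the request using the immutable range,
 * then adjust only the whitelisted fields.
 */
static int demo_mmap_prepare(struct vm_area_desc *desc)
{
	if (desc->end - desc->start > DEMO_MAX_LEN)
		return -EINVAL;	/* no VMA exists yet, so nothing to unwind */

	desc->vm_flags |= DEMO_VM_DONTEXPAND;
	desc->vm_ops = &demo_vm_ops;
	return 0;
}

/*
 * Stand-in for the core mmap path: build the descriptor, call the hook
 * early, and only then act on the result.
 */
static int demo_core_mmap(unsigned long start, unsigned long len,
			  vm_flags_t flags, struct vm_area_desc *out)
{
	struct vm_area_desc desc = {
		.start = start,
		.end = start + len,
		.vm_flags = flags,
	};
	int err = demo_mmap_prepare(&desc);

	if (err)
		return err;	/* abort before any VMA state is created */

	/* The hook is expected to leave the immutable range untouched. */
	assert(desc.start == start && desc.end == start + len);
	*out = desc;
	return 0;
}
```

Calling demo_core_mmap(0x1000, 0x2000, DEMO_VM_WRITE, &d) fills d with the adjusted flags and vm_ops. A request longer than DEMO_MAX_LEN returns -EINVAL before anything has been set up, which is the "very little unwinding" property the cover letter describes.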