Date: Thu, 12 Oct 2023 10:38:57 +0200
From: Jan Kara
To: Lorenzo Stoakes
Cc: Jan Kara, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton, Mike Kravetz, Muchun Song, Alexander Viro,
	Christian Brauner, Matthew Wilcox, Hugh Dickins, Andy Lutomirski,
	linux-fsdevel@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH v3 3/3] mm: enforce the mapping_map_writable() check
	after call_mmap()
Message-ID: <20231012083857.ty66retpyhxkaem3@quack3>
References: <20231011094627.3xohlpe4gm2idszm@quack3>
	<512d8089-759c-47b7-864d-f4a38a9eacf3@lucifer.local>
In-Reply-To: <512d8089-759c-47b7-864d-f4a38a9eacf3@lucifer.local>

On Wed 11-10-23 19:14:10, Lorenzo Stoakes wrote:
> On Wed, Oct 11, 2023 at 11:46:27AM +0200, Jan Kara wrote:
> > On Sat 07-10-23 21:51:01, Lorenzo Stoakes wrote:
> > > In order for an F_SEAL_WRITE sealed memfd mapping to have an opportunity to
> > > clear VM_MAYWRITE in seal_check_write() we must be able to invoke either
> > > the shmem_mmap() or hugetlbfs_file_mmap() f_ops->mmap() handler to do so.
> > >
> > > We would otherwise fail the mapping_map_writable() check before we had
> > > the opportunity to clear VM_MAYWRITE.
> > >
> > > However, the existing logic in mmap_region() performs this check BEFORE
> > > calling call_mmap() (which invokes file->f_ops->mmap()). We must enforce
> > > this check AFTER the function call.
> > >
> > > In order to avoid any risk of breaking call_mmap() handlers which assume
> > > this will have been done first, we continue to mark the file writable
> > > first, simply deferring enforcement of it failing until afterwards.
> > >
> > > This enables mmap(..., PROT_READ, MAP_SHARED, fd, 0) mappings for memfd's
> > > sealed via F_SEAL_WRITE to succeed, whereas previously they were not
> > > permitted.
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217238
> > > Signed-off-by: Lorenzo Stoakes
> >
> > ...
> >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 6f6856b3267a..9fbee92aaaee 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -2767,17 +2767,25 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> > >  	vma->vm_pgoff = pgoff;
> > >  
> > >  	if (file) {
> > > -		if (is_shared_maywrite(vm_flags)) {
> > > -			error = mapping_map_writable(file->f_mapping);
> > > -			if (error)
> > > -				goto free_vma;
> > > -		}
> > > +		int writable_error = 0;
> > > +
> > > +		if (vma_is_shared_maywrite(vma))
> > > +			writable_error = mapping_map_writable(file->f_mapping);
> > >  
> > >  		vma->vm_file = get_file(file);
> > >  		error = call_mmap(file, vma);
> > >  		if (error)
> > >  			goto unmap_and_free_vma;
> > >  
> > > +		/*
> > > +		 * call_mmap() may have changed VMA flags, so retry this check
> > > +		 * if it failed before.
> > > +		 */
> > > +		if (writable_error && vma_is_shared_maywrite(vma)) {
> > > +			error = writable_error;
> > > +			goto close_and_free_vma;
> > > +		}
> >
> > Hum, this doesn't quite give me peace of mind ;). One bug I can see is
> > that if call_mmap() drops the VM_MAYWRITE flag, we seem to forget to drop
> > the i_mmap_writable counter here?
>
> This wouldn't be applicable in the F_SEAL_WRITE case, as the
> i_mmap_writable counter would already have been decremented, and thus an
> error would arise causing no further decrement, and everything would work
> fine.
>
> It'd be very odd for something to be writable here but the driver to make
> it not writable. But we do need to account for this.

Yeah, it may be odd but this is indeed what the i915 driver appears to be
doing in i915_gem_object_mmap():

	if (i915_gem_object_is_readonly(obj)) {
		if (vma->vm_flags & VM_WRITE) {
			i915_gem_object_put(obj);
			return -EINVAL;
		}
		vm_flags_clear(vma, VM_MAYWRITE);
	}

> > I've checked why your v2 version broke i915 and I think the reason maybe
> > has nothing to do with i915. Just in case call_mmap() failed, it ended up
> > jumping to unmap_and_free_vma which calls mapping_unmap_writable() but we
> > didn't call mapping_map_writable() yet so the counter became imbalanced.
>
> Yeah, that must be the cause. I thought perhaps somehow
> __remove_shared_vm_struct() got invoked by i915_gem_mmap() but I didn't
> trace it through to see if it was possible.
>
> Looking at it again, I don't think that is possible, as we hold a mmap/vma
> write lock, and the only operations that can cause
> __remove_shared_vm_struct() to run are things that would not be able to do
> so with this lock held.
>
> > So I'd be for returning to v2 version, just fix up the error handling
> > paths...
>
> So in conclusion, I agree, this is the better approach. Will respin in v4.

Thanks!

								Honza
-- 
Jan Kara
SUSE Labs, CR
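
To make the counter imbalance discussed above concrete, here is a small userspace sketch in C. It is only a toy model, not kernel code and not the v4 patch. Every toy_* name is hypothetical. The assumption is that the counter behaves as the thread describes: taking a writable reference fails once writes are denied (modelled as a negative count), and an error path that drops a reference it never took leaves the counter unbalanced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the mapping: only the writable-mapping counter. */
struct toy_mapping {
	int i_mmap_writable;	/* > 0: writable mappings; < 0: writes denied */
};

/* Take a writable reference, failing if writes are denied (e.g. sealed). */
static int toy_map_writable(struct toy_mapping *m)
{
	if (m->i_mmap_writable < 0)
		return -EPERM;
	m->i_mmap_writable++;
	return 0;
}

static void toy_unmap_writable(struct toy_mapping *m)
{
	m->i_mmap_writable--;
}

/*
 * Hypothetical mmap path that checks writability only AFTER the handler.
 * handler_error:           what the toy "call_mmap()" returns.
 * handler_clears_maywrite: models a handler dropping VM_MAYWRITE.
 * balanced:                false = error path always drops the reference,
 *                          true  = drops it only if it was actually taken.
 */
static int toy_mmap_region(struct toy_mapping *m, bool shared_maywrite,
			   int handler_error, bool handler_clears_maywrite,
			   bool balanced)
{
	bool counted = false;
	int error = handler_error;	/* toy call_mmap() */

	if (error)
		goto unmap;
	if (handler_clears_maywrite)
		shared_maywrite = false;

	if (shared_maywrite) {
		error = toy_map_writable(m);
		if (error)
			goto unmap;
		counted = true;
	}
	return 0;

unmap:
	if (balanced ? counted : shared_maywrite)
		toy_unmap_writable(m);
	return error;
}

/* Walks through the cases from the discussion. */
static void toy_demo(void)
{
	struct toy_mapping m = { 0 };

	/* Handler fails before the reference is taken: unbalanced drop. */
	assert(toy_mmap_region(&m, true, -EINVAL, false, false) == -EINVAL);
	assert(m.i_mmap_writable == -1);

	/* Same failure with balanced error handling: counter untouched. */
	m.i_mmap_writable = 0;
	assert(toy_mmap_region(&m, true, -EINVAL, false, true) == -EINVAL);
	assert(m.i_mmap_writable == 0);

	/* Sealed mapping whose handler clears VM_MAYWRITE: mmap succeeds. */
	m.i_mmap_writable = -1;
	assert(toy_mmap_region(&m, true, 0, true, true) == 0);
	assert(m.i_mmap_writable == -1);
}
```

Calling toy_demo() exercises the three cases. The first one is the interesting failure: after the unbalanced drop, the toy mapping looks write-denied, so a later toy_map_writable() on it returns -EPERM even though nothing sealed it.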