Date: Sun, 17 Nov 2024 22:11:47 +0100
From: Greg KH
To: Lorenzo Stoakes
Cc: Vlastimil Babka, stable@vger.kernel.org, Andrew Morton,
	"Liam R . Howlett", Jann Horn, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Linus Torvalds, Peter Xu, Catalin Marinas,
	Will Deacon, Mark Brown, "David S . Miller", Andreas Larsson,
	"James E . J . Bottomley", Helge Deller
Subject: Re: [PATCH 6.1.y 4/4] mm: resolve faulty mmap_region() error path behaviour
Message-ID: <2024111713-syndrome-impolite-d154@gregkh>
References: <4cb9b846f0c4efcc4a2b21453eea4e4d0136efc8.1731671441.git.lorenzo.stoakes@oracle.com>
	<2979df31-ce8c-4382-ab01-7e66f852099d@suse.cz>
	<01fbc3f2-bccb-4694-99ec-2ee8e9ff6e4e@lucifer.local>
In-Reply-To: <01fbc3f2-bccb-4694-99ec-2ee8e9ff6e4e@lucifer.local>

On Fri, Nov 15, 2024 at 07:28:34PM +0000, Lorenzo Stoakes wrote:
> On Fri, Nov 15, 2024 at 08:06:05PM +0100, Vlastimil Babka wrote:
> > On 11/15/24 13:40, Lorenzo Stoakes wrote:
> > > [ Upstream commit 5de195060b2e251a835f622759550e6202167641 ]
> > >
> > > The mmap_region() function is somewhat terrifying, with spaghetti-like
> > > control flow and numerous means by which issues can arise and incomplete
> > > state, memory leaks and other unpleasantness can occur.
> > >
> > > A large amount of the complexity arises from trying to handle errors late
> > > in the process of mapping a VMA, which forms the basis of recently
> > > observed issues with resource leaks and observable inconsistent state.
> > >
> > > Taking advantage of previous patches in this series we move a number of
> > > checks earlier in the code, simplifying things by moving the core of the
> > > logic into a static internal function __mmap_region().
> > >
> > > Doing this allows us to perform a number of checks up front before we do
> > > any real work, and allows us to unwind the writable unmap check
> > > unconditionally as required and to perform a CONFIG_DEBUG_VM_MAPLE_TREE
> > > validation unconditionally also.
> > >
> > > We move a number of things here:
> > >
> > > 1. We preallocate memory for the iterator before we call the file-backed
> > >    memory hook, allowing us to exit early and avoid having to perform
> > >    complicated and error-prone close/free logic. We carefully free
> > >    iterator state on both success and error paths.
> > >
> > > 2. The enclosing mmap_region() function handles the mapping_map_writable()
> > >    logic early. Previously the logic had the mapping_map_writable() at the
> > >    point of mapping a newly allocated file-backed VMA, and a matching
> > >    mapping_unmap_writable() on success and error paths.
> > >
> > >    We now do this unconditionally if this is a file-backed, shared writable
> > >    mapping. If a driver changes the flags to eliminate VM_MAYWRITE, however
> > >    doing so does not invalidate the seal check we just performed, and we in
> > >    any case always decrement the counter in the wrapper.
> > >
> > >    We perform a debug assert to ensure a driver does not attempt to do the
> > >    opposite.
> > >
> > > 3. We also move arch_validate_flags() up into the mmap_region()
> > >    function. This is only relevant on arm64 and sparc64, and the check is
> > >    only meaningful for SPARC with ADI enabled. We explicitly add a warning
> > >    for this arch if a driver invalidates this check, though the code ought
> > >    eventually to be fixed to eliminate the need for this.
> > >
> > > With all of these measures in place, we no longer need to explicitly close
> > > the VMA on error paths, as we place all checks which might fail prior to a
> > > call to any driver mmap hook.
> > >
> > > This eliminates an entire class of errors, makes the code easier to reason
> > > about and more robust.
> > >
> > > Link: https://lkml.kernel.org/r/6e0becb36d2f5472053ac5d544c0edfe9b899e25.1730224667.git.lorenzo.stoakes@oracle.com
> > > Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
> > > Signed-off-by: Lorenzo Stoakes
> > > Reported-by: Jann Horn
> > > Reviewed-by: Liam R. Howlett
> > > Reviewed-by: Vlastimil Babka
> > > Tested-by: Mark Brown
> > > Cc: Andreas Larsson
> > > Cc: Catalin Marinas
> > > Cc: David S. Miller
> > > Cc: Helge Deller
> > > Cc: James E.J. Bottomley
> > > Cc: Linus Torvalds
> > > Cc: Peter Xu
> > > Cc: Will Deacon
> > > Cc:
> > > Signed-off-by: Andrew Morton
> > > Signed-off-by: Lorenzo Stoakes
> > > ---
> > >  mm/mmap.c | 103 +++++++++++++++++++++++++++++-------------------------
> > >  1 file changed, 56 insertions(+), 47 deletions(-)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 322677f61d30..e457169c5cce 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -2652,7 +2652,7 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
> > >  	return do_mas_munmap(&mas, mm, start, len, uf, false);
> > >  }
> > >
> > > -unsigned long mmap_region(struct file *file, unsigned long addr,
> > > +static unsigned long __mmap_region(struct file *file, unsigned long addr,
> > >  		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
> > >  		struct list_head *uf)
> > >  {
> > > @@ -2750,26 +2750,28 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> > >  	vma->vm_page_prot = vm_get_page_prot(vm_flags);
> > >  	vma->vm_pgoff = pgoff;
> > >
> > > -	if (file) {
> > > -		if (vm_flags & VM_SHARED) {
> > > -			error = mapping_map_writable(file->f_mapping);
> > > -			if (error)
> > > -				goto free_vma;
> > > -		}
> > > +	if (mas_preallocate(&mas, vma, GFP_KERNEL)) {
> > > +		error = -ENOMEM;
> > > +		goto free_vma;
> > > +	}
> > >
> > > +	if (file) {
> > >  		vma->vm_file = get_file(file);
> > >  		error = mmap_file(file, vma);
> > >  		if (error)
> > > -			goto unmap_and_free_vma;
> > > +			goto unmap_and_free_file_vma;
> > > +
> > > +		/* Drivers cannot alter the address of the VMA. */
> > > +		WARN_ON_ONCE(addr != vma->vm_start);
> > >
> > >  		/*
> > > -		 * Expansion is handled above, merging is handled below.
> > > -		 * Drivers should not alter the address of the VMA.
> > > +		 * Drivers should not permit writability when previously it was
> > > +		 * disallowed.
> > >  		 */
> > > -		if (WARN_ON((addr != vma->vm_start))) {
> > > -			error = -EINVAL;
> > > -			goto close_and_free_vma;
> > > -		}
> > > +		VM_WARN_ON_ONCE(vm_flags != vma->vm_flags &&
> > > +				!(vm_flags & VM_MAYWRITE) &&
> > > +				(vma->vm_flags & VM_MAYWRITE));
> > > +
> > >  		mas_reset(&mas);
> > >
> > >  		/*
> > > @@ -2792,7 +2794,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> > >  				vma = merge;
> > >  				/* Update vm_flags to pick up the change. */
> > >  				vm_flags = vma->vm_flags;
>
> As far as I can tell we should add:
>
> +				mas_destroy(&mas);
>
> > > -				goto unmap_writable;
> > > +				goto file_expanded;
> >
> > I think we might need a mas_destroy() somewhere around here otherwise we
> > leak the prealloc? In later versions the merge operation takes our vma
> > iterator so it handles that if merge succeeds, but here we have to cleanup
> > our mas ourselves?
> >
>
> Sigh, yup. This code path is SO HORRIBLE. I think simply a
> mas_destroy(&mas) here would suffice (see above).
>
> I'm not sure how anything works with stable, I mean do we need to respin a
> v2 just for one line?

How else am I supposed to take a working patch that has actually been
tested? I can't hand-edit this...

thanks,

greg k-h