Date: Mon, 15 May 2023 09:40:50 +0300
From: Mike Rapoport
To: Lorenzo Stoakes
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
	Peter Xu, Mark Rutland, "Liam R. Howlett", Alexander Viro,
	Christian Brauner
Subject: Re: [PATCH for 6.4-rcX] mm: userfaultfd: avoid passing an invalid range to vma_merge()
References: <20230514172731.134188-1-lstoakes@gmail.com>
In-Reply-To: <20230514172731.134188-1-lstoakes@gmail.com>

On Sun, May 14, 2023 at 06:27:31PM +0100, Lorenzo Stoakes wrote:
> The userfaultfd_[un]register() functions will knowingly pass an invalid
> address range to vma_merge(), then rely on it failing to merge to indicate
> that the VMA should be split into a valid one.
> 
> This is not something that should be relied upon, as vma_merge() implicitly
> assumes in cases 5-8 that curr->vm_start == addr. This is now enforced
> since commit b0729ae0ae67 ("mm/mmap/vma_merge: explicitly assign res, vma,
> extend invariants") with an explicit VM_WARN_ON() check.
> 
> Since commit 29417d292bd0 ("mm/mmap/vma_merge: always check invariants")
> this check is performed unconditionally, which caused this assert to arise
> in tests performed by Mark [1].
> 
> This patch fixes the issue by performing the split operations before
> attempting to merge VMAs in both instances. The problematic operation is
> splitting the start of the VMA since we were clamping to the end of the VMA
> in any case, however it is useful to group both of the split operations
> together to avoid egregious goto's and to abstract the code between the
> functions.
> 
> As well as fixing the repro described in [1] this also continues to pass
> uffd unit tests.
> 
> [1]: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com
> 
> Reported-by: Mark Rutland
> Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> Signed-off-by: Lorenzo Stoakes
> ---
>  fs/userfaultfd.c | 111 +++++++++++++++++++++++++++--------------------
>  1 file changed, 63 insertions(+), 48 deletions(-)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 0fd96d6e39ce..4453e7040157 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1319,6 +1319,35 @@ static __always_inline int validate_range(struct mm_struct *mm,
>  	return 0;
>  }
>  
> +static int split_range(struct vma_iterator *vmi,
> +		       struct vm_area_struct *vma,
> +		       unsigned long start,
> +		       unsigned long end,
> +		       bool *can_merge)

Maybe clamp_range()? I'd also prefer to fill lines with parameters, rather
than have each on a separate line.
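To be concrete, something like this is what I mean (clamp_range() is only a
name suggestion, and the body would stay exactly as in your patch):

static int clamp_range(struct vma_iterator *vmi, struct vm_area_struct *vma,
		       unsigned long start, unsigned long end, bool *can_merge)
{
	/* same body as split_range() above */
}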
> +{
> +	int ret;
> +	bool merge = true;
> +
> +	/* The range must always be clamped to the start of a VMA. */
> +	if (vma->vm_start < start) {
> +		ret = split_vma(vmi, vma, start, 1);
> +		if (ret)
> +			return ret;
> +
> +		merge = false;
> +	}
> +
> +	/* It must also be clamped to the end of a VMA. */
> +	if (vma->vm_end > end) {
> +		ret = split_vma(vmi, vma, end, 0);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	*can_merge = merge;
> +	return 0;
> +}
> +
>  static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  				unsigned long arg)
>  {
> @@ -1330,7 +1359,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  	unsigned long vm_flags, new_flags;
>  	bool found;
>  	bool basic_ioctls;
> -	unsigned long start, end, vma_end;
> +	unsigned long start, end;
>  	struct vma_iterator vmi;
>  
>  	user_uffdio_register = (struct uffdio_register __user *) arg;
> @@ -1462,6 +1491,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  
>  	ret = 0;
>  	for_each_vma_range(vmi, vma, end) {
> +		bool can_merge;
> +
>  		cond_resched();
>  
>  		BUG_ON(!vma_can_userfault(vma, vm_flags));
> @@ -1477,32 +1508,22 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  		    (vma->vm_flags & vm_flags) == vm_flags)
>  			goto skip;
>  
> -		if (vma->vm_start > start)
> -			start = vma->vm_start;

I don't think this can be removed. Consider a request to register uffd for a
range that spans two disjoint VMAs. Then on the second iteration start will
be equal to vm_end of the first VMA, so it should be clamped to vm_start of
the second VMA.
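For example (hypothetical addresses, and assuming the loop still advances
start to vma->vm_end at the skip: label, as it does now):

	/*
	 * VMA1: [0x1000, 0x3000)   VMA2: [0x5000, 0x8000)
	 * UFFDIO_REGISTER range:   [0x2000, 0x6000)
	 *
	 * Iteration 1: vma == VMA1, start == 0x2000, VMA1 is split at 0x2000;
	 *              at skip: start is advanced to VMA1->vm_end == 0x3000.
	 * Iteration 2: vma == VMA2, start == 0x3000 while VMA2->vm_start ==
	 *              0x5000, so split_range() is handed a start below the
	 *              VMA unless start is clamped to vma->vm_start first.
	 */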
> -		vma_end = min(end, vma->vm_end);
> +		ret = split_range(&vmi, vma, start, end, &can_merge);
> +		if (ret)
> +			break;
>  
>  		new_flags = (vma->vm_flags & ~__VM_UFFD_FLAGS) | vm_flags;
> -		prev = vma_merge(&vmi, mm, prev, start, vma_end, new_flags,
> -				 vma->anon_vma, vma->vm_file, vma->vm_pgoff,
> -				 vma_policy(vma),
> -				 ((struct vm_userfaultfd_ctx){ ctx }),
> -				 anon_vma_name(vma));
> -		if (prev) {
> +		if (can_merge) {
> +			prev = vma_merge(&vmi, mm, prev, vma->vm_start, vma->vm_end, new_flags,
> +					 vma->anon_vma, vma->vm_file, vma->vm_pgoff,
> +					 vma_policy(vma),
> +					 ((struct vm_userfaultfd_ctx){ ctx }),
> +					 anon_vma_name(vma));
> +
>  			/* vma_merge() invalidated the mas */
> -			vma = prev;
> -			goto next;
> -		}
> -		if (vma->vm_start < start) {
> -			ret = split_vma(&vmi, vma, start, 1);
> -			if (ret)
> -				break;
> -		}
> -		if (vma->vm_end > end) {
> -			ret = split_vma(&vmi, vma, end, 0);
> -			if (ret)
> -				break;
> +			if (prev)
> +				vma = prev;
>  		}
> -	next:
>  		/*
>  		 * In the vma_merge() successful mprotect-like case 8:
>  		 * the next vma was merged into the current one and
> @@ -1560,7 +1581,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  	struct uffdio_range uffdio_unregister;
>  	unsigned long new_flags;
>  	bool found;
> -	unsigned long start, end, vma_end;
> +	unsigned long start, end;
>  	const void __user *buf = (void __user *)arg;
>  	struct vma_iterator vmi;
>  
> @@ -1627,6 +1648,8 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  	prev = vma_prev(&vmi);
>  	ret = 0;
>  	for_each_vma_range(vmi, vma, end) {
> +		bool can_merge;
> +
>  		cond_resched();
>  
>  		BUG_ON(!vma_can_userfault(vma, vma->vm_flags));
> @@ -1640,9 +1663,9 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  
>  		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
>  
> -		if (vma->vm_start > start)
> -			start = vma->vm_start;

Ditto

> -		vma_end = min(end, vma->vm_end);
> +		ret = split_range(&vmi, vma, start, end, &can_merge);
> +		if (ret)
> +			break;
>  
>  		if (userfaultfd_missing(vma)) {
>  			/*
> @@ -1652,35 +1675,27 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  			 * UFFDIO_WAKE explicitly.
>  			 */
>  			struct userfaultfd_wake_range range;
> -			range.start = start;
> -			range.len = vma_end - start;
> +			range.start = vma->vm_start;
> +			range.len = vma->vm_end - vma->vm_start;
>  			wake_userfault(vma->vm_userfaultfd_ctx.ctx, &range);
>  		}
>  
>  		/* Reset ptes for the whole vma range if wr-protected */
>  		if (userfaultfd_wp(vma))
> -			uffd_wp_range(vma, start, vma_end - start, false);
> +			uffd_wp_range(vma, vma->vm_start,
> +				      vma->vm_end - vma->vm_start, false);
>  
>  		new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
> -		prev = vma_merge(&vmi, mm, prev, start, vma_end, new_flags,
> -				 vma->anon_vma, vma->vm_file, vma->vm_pgoff,
> -				 vma_policy(vma),
> -				 NULL_VM_UFFD_CTX, anon_vma_name(vma));
> -		if (prev) {
> -			vma = prev;
> -			goto next;
> -		}
> -		if (vma->vm_start < start) {
> -			ret = split_vma(&vmi, vma, start, 1);
> -			if (ret)
> -				break;
> -		}
> -		if (vma->vm_end > end) {
> -			ret = split_vma(&vmi, vma, end, 0);
> -			if (ret)
> -				break;
> +		if (can_merge) {
> +			prev = vma_merge(&vmi, mm, prev, vma->vm_start,
> +					 vma->vm_end, new_flags, vma->anon_vma,
> +					 vma->vm_file, vma->vm_pgoff,
> +					 vma_policy(vma),
> +					 NULL_VM_UFFD_CTX, anon_vma_name(vma));
> +			/* vma_merge() invalidated the mas */
> +			if (prev)
> +				vma = prev;
>  		}
> -	next:
>  		/*
>  		 * In the vma_merge() successful mprotect-like case 8:
>  		 * the next vma was merged into the current one and
> -- 
> 2.40.1

-- 
Sincerely yours,
Mike.