Date: Mon, 27 Jan 2025 19:45:26 +0300
From: Fedor Pchelkin
To: Lorenzo Stoakes
Cc: Daniil Dulov, lvc-project@linuxtesting.org, Jann Horn,
 linux-kernel@vger.kernel.org, Mina Almasry, Mike Kravetz,
 "Matthew Wilcox (Oracle)", linux-mm@kvack.org, "Liam R. Howlett",
 Andrew Morton, Vlastimil Babka, stable@vger.kernel.com
Subject: Re: [PATCH] mm/vma: Fix hugetlb accounting error in copy_vma()
Message-ID: <4rmkmv5bgryxawl4qnizozlhwnfkhlebut4n2dcf6cdpuvqacb@c73fcytj6dfi>
References: <20250127143201.45453-1-d.dulov@aladdin.ru>
 <83645f1b-cede-455c-abc0-6f105199eee9@lucifer.local>
In-Reply-To: <83645f1b-cede-455c-abc0-6f105199eee9@lucifer.local>

+Cc Mina Almasry and Mike Kravetz

On Mon, 27. Jan 14:46, Lorenzo Stoakes wrote:
> Thanks for the report.
>
> On Mon, Jan 27, 2025 at 05:32:01PM +0300, Daniil Dulov wrote:
> > In copy_vma() allocation of maple tree nodes may fail. Since page accounting
> > takes place at the close() operation for hugetlb, it is called at the error
> > path against the new_vma to account pages of the vma that was not successfully
> > copied and that shares the page_counter with the original vma. Then, when the
> > process is being terminated, vm_ops->close() is called once again against the
> > original vma, which results in a page_counter underflow.
>
> This seems like a bug in hugetlb.
>
> I really hate the solution here, it's hacky and assumes only these fields are
> meaningful for 'close twice' scenarios.
>
> We now use vma_close(), which assigns vma->vm_ops to vma_dummy_vm_ops, meaning
> no further close() invocations can occur.

Does the "close twice" scenario exactly mean ->close() is called twice for
the same object of struct vm_area_struct? For the observed case I think
that's not true.
->close() is called for two different objects of type vm_area_struct: the
first time for the new_vma on the error path of copy_vma(), the second
time for the original vma. It turns out that these objects share the same
reservation map, which holds the page_counters, at this point in time.

>
> If hugetlb is _still_ choosing to internally invoke this, it seems like it
> should have some if (vma->vm_ops == hugetlb_vm_ops) { ... } check first? That
> way it'll account for the closing twice issue.
>
> Can you easily repro in order to check a solution like that fixes your problem?
> I don't see why it shouldn't
>

It seems that wouldn't fix the problem (again, two different vma objects).

At a quick glance, there's presumably no obvious place in hugetlb
internals where this could be fixed elegantly. But yep, it does look like
a bug for hugetlb to care about. Perhaps somehow defer the reservation map
copying?

> >
> > page_counter underflow: -1024 nr_pages=1024
> > WARNING: CPU: 1 PID: 1086 at mm/page_counter.c:55 page_counter_cancel+0xd6/0x130 mm/page_counter.c:55
> > Modules linked in:
> > CPU: 1 PID: 1086 Comm: syz-executor200 Not tainted 6.1.108-syzkaller-00078-g9ce77c16947b #0
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> > Call Trace:
> >
> >  page_counter_uncharge+0x2e/0x70 mm/page_counter.c:158
> >  hugetlb_cgroup_uncharge_counter+0xd2/0x420 mm/hugetlb_cgroup.c:430
> >  hugetlb_vm_op_close+0x435/0x700 mm/hugetlb.c:4886
> >  remove_vma+0x84/0x130 mm/mmap.c:140
> >  exit_mmap+0x32f/0x7a0 mm/mmap.c:3249
> >  __mmput+0x11e/0x430 kernel/fork.c:1199
> >  mmput+0x61/0x70 kernel/fork.c:1221
> >  exit_mm kernel/exit.c:565 [inline]
> >  do_exit+0xa4a/0x2790 kernel/exit.c:858
> >  do_group_exit+0xd0/0x2a0 kernel/exit.c:1021
> >  __do_sys_exit_group kernel/exit.c:1032 [inline]
> >  __se_sys_exit_group kernel/exit.c:1030 [inline]
> >  __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:1030
> >  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
> >  do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
> >  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> >
> >
>
> > Since there is no sense in vm accounting for a bad copy of vma, set vm_start
> > to be equal vm_end and vm_pgoff to be equal 0. Previously, a similar issue
> > has been fixed in __split_vma() in the same way [1].
> >
> > [1]: https://lore.kernel.org/all/20220719201523.3561958-1-Liam.Howlett@oracle.com/T/
>
> Understood that we do this elsewhere, I think equally we should not do this
> there either! :)
>
> >
> > Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
> >
> > Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
> > Cc: stable@vger.kernel.com
> > Signed-off-by: Daniil Dulov
> > ---
> >  mm/vma.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/vma.c b/mm/vma.c
> > index bb2119e5a0d0..dbc68b7cd0ec 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -1772,6 +1772,9 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> >  	return new_vma;
> >
> >  out_vma_link:
> > +	/* Avoid vm accounting in close() operation */
> > +	new_vma->vm_start = new_vma->vm_end;
> > +	new_vma->vm_pgoff = 0;
> >  	vma_close(new_vma);
> >
> >  	if (new_vma->vm_file)
> > --
> > 2.34.1
> >
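
To make the "two different vma objects" point concrete, here is a toy
userspace model. It is only a sketch, not kernel code: every toy_* name is
made up for illustration, and the uncharge logic is a gross simplification
of what hugetlb_vm_op_close() really does. It assumes only what is
described in this thread: the copied vma shares the reservation map with
the original, ->close() uncharges against that shared map, and vma_close()
replaces vm_ops with dummy ops after the first call.

```c
#include <stddef.h>

/* Toy stand-in for a reservation map that carries a cgroup charge. */
struct toy_resv_map {
	long charged_pages;		/* models the page_counter usage */
};

struct toy_vma;

struct toy_vm_ops {
	void (*close)(struct toy_vma *vma);
};

struct toy_vma {
	const struct toy_vm_ops *ops;
	struct toy_resv_map *resv;	/* pointer is copied along with the vma */
	long nr_pages;
};

/* Dummy ops: ->close() does nothing. */
static void toy_dummy_close(struct toy_vma *vma)
{
	(void)vma;
}

static const struct toy_vm_ops toy_dummy_ops = { .close = toy_dummy_close };

/* Simplified hugetlb-like ->close(): uncharge against the shared map. */
static void toy_hugetlb_close(struct toy_vma *vma)
{
	vma->resv->charged_pages -= vma->nr_pages;
}

static const struct toy_vm_ops toy_hugetlb_ops = { .close = toy_hugetlb_close };

/* Models vma_close(): call ->close() once, then switch to dummy ops. */
static void toy_vma_close(struct toy_vma *vma)
{
	if (vma->ops && vma->ops->close)
		vma->ops->close(vma);
	vma->ops = &toy_dummy_ops;
}

/* Failed copy followed by process exit; returns the final counter value. */
static long toy_failed_copy_then_exit(void)
{
	struct toy_resv_map map = { .charged_pages = 1024 };
	struct toy_vma orig = {
		.ops = &toy_hugetlb_ops, .resv = &map, .nr_pages = 1024,
	};
	struct toy_vma copy = orig;	/* new_vma: shares orig's resv map */

	toy_vma_close(&copy);	/* error path of the copy: 1024 -> 0 */
	toy_vma_close(&copy);	/* same object again: no-op, dummy ops */
	toy_vma_close(&orig);	/* exit path, different object: 0 -> -1024 */

	return map.charged_pages;
}
```

In this model toy_failed_copy_then_exit() returns -1024. The dummy ops
swap guards only the object it was applied to, so a check on vma->vm_ops
never sees the second uncharge coming: it arrives through the original
vma, whose ops are still intact.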