Message-ID: <9f223f07-e1e5-4401-89bf-33f065b2ca0f@linux.alibaba.com>
Date: Fri, 10 Jan 2025 10:08:31 +0800
Subject: Re: [PATCH] mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma
From: Baolin Wang
To: Yang Shi , Liu Shixin , Andrew Morton , Chengming Zhou , Matthew Wilcox , Kefeng Wang , Nanyong Sun , Muchun Song , Qi Zheng , Johannes Weiner
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20250109070059.369257-1-liushixin2@huawei.com> <037d4442-4d2d-4aeb-8091-5efffc374d36@os.amperecomputing.com>
In-Reply-To: <037d4442-4d2d-4aeb-8091-5efffc374d36@os.amperecomputing.com>

On 2025/1/10 01:00, Yang Shi wrote:
>
> On 1/8/25 11:00 PM, Liu Shixin wrote:
>> syzkaller reported such a BUG_ON():
>>
>>   ------------[ cut here ]------------
>>   kernel BUG at mm/khugepaged.c:1835!
>>   Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
>> ...
>>   CPU: 6 UID: 0 PID: 8009 Comm: syz.15.106 Kdump: loaded Tainted: G        W          6.13.0-rc6 #22
>>   Tainted: [W]=WARN
>>   Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
>>   pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>   pc : collapse_file+0xa44/0x1400
>>   lr : collapse_file+0x88/0x1400
>>   sp : ffff80008afe3a60
>> ...
>>   Call trace:
>>    collapse_file+0xa44/0x1400 (P)
>>    hpage_collapse_scan_file+0x278/0x400
>>    madvise_collapse+0x1bc/0x678
>>    madvise_vma_behavior+0x32c/0x448
>>    madvise_walk_vmas.constprop.0+0xbc/0x140
>>    do_madvise.part.0+0xdc/0x2c8
>>    __arm64_sys_madvise+0x68/0x88
>>    invoke_syscall+0x50/0x120
>>    el0_svc_common.constprop.0+0xc8/0xf0
>>    do_el0_svc+0x24/0x38
>>    el0_svc+0x34/0x128
>>    el0t_64_sync_handler+0xc8/0xd0
>>    el0t_64_sync+0x190/0x198
>>
>> This indicates that the pgoff is unaligned. After analysis, I confirm
>> the vma is mapped to /dev/zero. Such a vma certainly has vm_file, but
>> it is set to anonymous by mmap_zero(). So even if it's mmapped by
>> 2m-unaligned, it can pass the check in thp_vma_allowable_order() as it
>> is an anonymous-mmap, but then be collapsed as a file-mmap.
>>
>> It seems the problem has existed for a long time, but actually, since
>> we have khugepaged_max_ptes_none check before, we will skip collapse it
>> as it is /dev/zero and so has no present page. But commit d8ea7cc8547c
>> limit the check for only khugepaged, so the BUG_ON() can be triggered
>> by madvise_collapse().
>>
>> Add vma_is_anonymous() check to make such vma be processed by
>> hpage_collapse_scan_pmd().
>>
>> Fixes: d8ea7cc8547c ("mm/khugepaged: add flag to predicate khugepaged-only behavior")
>> Signed-off-by: Liu Shixin
>> ---
>>   mm/khugepaged.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 653dbb1ff05c..eb9d240e42e8 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -2422,7 +2422,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
>>               VM_BUG_ON(khugepaged_scan.address < hstart ||
>>                     khugepaged_scan.address + HPAGE_PMD_SIZE >
>>                     hend);
>> -            if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {
>> +            if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file &&
>> +                !vma_is_anonymous(vma)) {
>
> Thanks for catching this. It sounds a little bit weird to have vm_file
> for an anonymous VMA. I'm not sure why we should keep such special case.
> It seems shared mapping is treated as shmem file mapping. So can we set
> vm_file to NULL when mmap'ing /dev/zero for private mapping? Something
> like:
>
> diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> index 169eed162a7f..fc332efc5c11 100644
> --- a/drivers/char/mem.c
> +++ b/drivers/char/mem.c
> @@ -527,6 +527,7 @@ static int mmap_zero(struct file *file, struct vm_area_struct *vma)
>         if (vma->vm_flags & VM_SHARED)
>                 return shmem_zero_setup(vma);
>         vma_set_anonymous(vma);
> +       vma->vm_file = NULL;
>         return 0;
>  }

Yes, this makes more sense to me.
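
To make the two options above concrete, here is a small userspace model of the dispatch decision. It is only a sketch, not kernel code: struct vma_model, pick_path_old(), pick_path_patched(), mmap_zero_private() and pgoff_is_pmd_aligned() are hypothetical names made up for illustration, the IS_ENABLED(CONFIG_SHMEM) part of the check is left out, and HPAGE_PMD_NR is assumed to be 512 (4K base pages, 2M PMD). The alignment helper only illustrates the "pgoff is unaligned" remark from the report and is not copied from collapse_file().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: keep just the two VMA fields that matter here. */
struct file { int unused; };
struct vm_operations_struct { int unused; };

struct vma_model {
	struct file *vm_file;                      /* backing file, if any */
	const struct vm_operations_struct *vm_ops; /* NULL means anonymous */
};

/* Same idea as the kernel helper: anonymous means no vm_ops. */
static bool vma_is_anonymous(const struct vma_model *vma)
{
	return vma->vm_ops == NULL;
}

enum scan_path { SCAN_PMD, SCAN_FILE };

/* Check before the patch: vm_file alone selects the file path. */
static enum scan_path pick_path_old(const struct vma_model *vma)
{
	return vma->vm_file ? SCAN_FILE : SCAN_PMD;
}

/* Check from the posted patch: also require a non-anonymous VMA. */
static enum scan_path pick_path_patched(const struct vma_model *vma)
{
	return (vma->vm_file && !vma_is_anonymous(vma)) ? SCAN_FILE : SCAN_PMD;
}

/*
 * Model of a private /dev/zero mapping. clear_file selects the
 * alternative suggested in the reply (drop vm_file in mmap_zero()).
 */
static void mmap_zero_private(struct vma_model *vma, struct file *zero,
			      bool clear_file)
{
	vma->vm_file = zero; /* set by mmap() before ->mmap is called */
	vma->vm_ops = NULL;  /* effect of vma_set_anonymous() */
	if (clear_file)
		vma->vm_file = NULL;
}

/* Assumption: 512 base pages per PMD-sized huge page. */
#define HPAGE_PMD_NR 512UL

/* The file collapse path expects a PMD-aligned page offset. */
static bool pgoff_is_pmd_aligned(unsigned long pgoff)
{
	return (pgoff & (HPAGE_PMD_NR - 1)) == 0;
}
```

In this model a private /dev/zero mapping goes to the file path under the old check, and to the pmd path with either the vma_is_anonymous() check or the vm_file = NULL change, so the unaligned pgoff never reaches the file collapse code.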