Date: Mon, 21 Oct 2024 12:49:28 -0700 (PDT)
From: Hugh Dickins
To: Roman Gushchin
cc: Andrew Morton, linux-mm@kvack.org, Vlastimil Babka,
    linux-kernel@vger.kernel.org, stable@vger.kernel.org, Hugh Dickins,
    Matthew Wilcox, Shakeel Butt
Subject: Re: [PATCH v2] mm: page_alloc: move mlocked flag clearance into free_pages_prepare()
In-Reply-To: <20241021173455.2691973-1-roman.gushchin@linux.dev>
References: <20241021173455.2691973-1-roman.gushchin@linux.dev>

On Mon, 21 Oct 2024, Roman Gushchin wrote:

> Syzbot reported a bad page state problem caused by a page
> being freed using free_page() still having a mlocked flag at
> free_pages_prepare() stage:
>
> BUG: Bad page state in process syz.0.15 pfn:1137bb
> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff8881137bb870 pfn:0x1137bb
> flags: 0x400000000080000(mlocked|node=0|zone=1)
> raw: 0400000000080000 0000000000000000 dead000000000122 0000000000000000
> raw: ffff8881137bb870 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> page_owner tracks the page as allocated
> page last allocated via order 0, migratetype Unmovable, gfp_mask
> 0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), pid 3005, tgid
> 3004 (syz.0.15), ts 61546 608067, free_ts 61390082085
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
> prep_new_page mm/page_alloc.c:1545 [inline]
> get_page_from_freelist+0x3008/0x31f0 mm/page_alloc.c:3457
> __alloc_pages_noprof+0x292/0x7b0 mm/page_alloc.c:4733
> alloc_pages_mpol_noprof+0x3e8/0x630 mm/mempolicy.c:2265
> kvm_coalesced_mmio_init+0x1f/0xf0 virt/kvm/coalesced_mmio.c:99
> kvm_create_vm virt/kvm/kvm_main.c:1235 [inline]
> kvm_dev_ioctl_create_vm virt/kvm/kvm_main.c:5500 [inline]
> kvm_dev_ioctl+0x13bb/0x2320 virt/kvm/kvm_main.c:5542
> vfs_ioctl fs/ioctl.c:51 [inline]
> __do_sys_ioctl fs/ioctl.c:907 [inline]
> __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0x69/0x110 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> page last free pid 951 tgid 951 stack trace:
> reset_page_owner include/linux/page_owner.h:25 [inline]
> free_pages_prepare mm/page_alloc.c:1108 [inline]
> free_unref_page+0xcb1/0xf00 mm/page_alloc.c:2638
> vfree+0x181/0x2e0 mm/vmalloc.c:3361
> delayed_vfree_work+0x56/0x80 mm/vmalloc.c:3282
> process_one_work kernel/workqueue.c:3229 [inline]
> process_scheduled_works+0xa5c/0x17a0 kernel/workqueue.c:3310
> worker_thread+0xa2b/0xf70 kernel/workqueue.c:3391
> kthread+0x2df/0x370 kernel/kthread.c:389
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>
> A reproducer is available here:
> https://syzkaller.appspot.com/x/repro.c?x=1437939f980000
>
> The problem was originally introduced by
> commit b109b87050df ("mm/munlock: replace clear_page_mlock() by final
> clearance"): it was focused on handling pagecache
> and anonymous memory and wasn't suitable for lower level
> get_page()/free_page() API's used for example by KVM, as with
> this reproducer.
>
> Fix it by moving the mlocked flag clearance down to
> free_pages_prepare().
>
> The bug itself is fairly old and harmless (aside from generating these
> warnings).
>
> Closes: https://syzkaller.appspot.com/x/report.txt?x=169a47d0580000
> Fixes: b109b87050df ("mm/munlock: replace clear_page_mlock() by final clearance")
> Signed-off-by: Roman Gushchin
> Cc:
> Cc: Hugh Dickins

Acked-by: Hugh Dickins

Thanks Roman - I'd been preparing a similar patch, so agree that this is
the right fix. I don't think there's any need to change your text, but
let me remind us that any "Bad page" report stops that page from being
allocated again (because it's in an undefined, potentially dangerous
state): so does amount to a small memory leak even if otherwise harmless.
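
To make the leak concrete, here is a minimal userspace sketch of the
behaviour described above. It is not the kernel's code: every name in it
(model_page, model_allocator, model_free_page, the MODEL_* flags) is
hypothetical, and the whole check-at-free logic is reduced to a single
flag test. It only shows that a page failing the check is counted and
kept off the free list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for the kernel's mlocked page flag. */
#define MODEL_PG_MLOCKED          (1u << 0)
/* Flags that must be clear at free time (models PAGE_FLAGS_CHECK_AT_FREE). */
#define MODEL_FLAGS_CHECK_AT_FREE (MODEL_PG_MLOCKED)

struct model_page {
	unsigned int flags;
	struct model_page *next_free;
};

struct model_allocator {
	struct model_page *free_list;	/* pages available for reallocation */
	size_t nr_free;			/* pages currently on free_list */
	size_t nr_leaked;		/* "Bad page" reports: pages never reused */
};

/*
 * Free one page. Returns true if the page went back on the free list,
 * false if it failed the flag check and was deliberately left out.
 */
static bool model_free_page(struct model_allocator *a, struct model_page *p)
{
	assert(a != NULL && p != NULL);

	if (p->flags & MODEL_FLAGS_CHECK_AT_FREE) {
		/* Undefined state: report it and do not hand it out again. */
		a->nr_leaked++;
		return false;
	}

	p->next_free = a->free_list;
	a->free_list = p;
	a->nr_free++;
	return true;
}
```

Freeing a page with MODEL_PG_MLOCKED still set bumps nr_leaked and
leaves nr_free alone, which is the small leak in question: one page
lost per report.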
> Cc: Matthew Wilcox
> Cc: Vlastimil Babka
> ---
>  mm/page_alloc.c | 15 +++++++++++++++
>  mm/swap.c       | 14 --------------
>  2 files changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bc55d39eb372..7535d78862ab 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1044,6 +1044,7 @@ __always_inline bool free_pages_prepare(struct page *page,
>  	bool skip_kasan_poison = should_skip_kasan_poison(page);
>  	bool init = want_init_on_free();
>  	bool compound = PageCompound(page);
> +	struct folio *folio = page_folio(page);
>
>  	VM_BUG_ON_PAGE(PageTail(page), page);
>
> @@ -1053,6 +1054,20 @@ __always_inline bool free_pages_prepare(struct page *page,
>  	if (memcg_kmem_online() && PageMemcgKmem(page))
>  		__memcg_kmem_uncharge_page(page, order);
>
> +	/*
> +	 * In rare cases, when truncation or holepunching raced with
> +	 * munlock after VM_LOCKED was cleared, Mlocked may still be
> +	 * found set here. This does not indicate a problem, unless
> +	 * "unevictable_pgs_cleared" appears worryingly large.
> +	 */
> +	if (unlikely(folio_test_mlocked(folio))) {
> +		long nr_pages = folio_nr_pages(folio);
> +
> +		__folio_clear_mlocked(folio);
> +		zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
> +		count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages);
> +	}
> +
>  	if (unlikely(PageHWPoison(page)) && !order) {
>  		/* Do not let hwpoison pages hit pcplists/buddy */
>  		reset_page_owner(page, order);
> diff --git a/mm/swap.c b/mm/swap.c
> index 835bdf324b76..7cd0f4719423 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -78,20 +78,6 @@ static void __page_cache_release(struct folio *folio, struct lruvec **lruvecp,
>  		lruvec_del_folio(*lruvecp, folio);
>  		__folio_clear_lru_flags(folio);
>  	}
> -
> -	/*
> -	 * In rare cases, when truncation or holepunching raced with
> -	 * munlock after VM_LOCKED was cleared, Mlocked may still be
> -	 * found set here. This does not indicate a problem, unless
> -	 * "unevictable_pgs_cleared" appears worryingly large.
> -	 */
> -	if (unlikely(folio_test_mlocked(folio))) {
> -		long nr_pages = folio_nr_pages(folio);
> -
> -		__folio_clear_mlocked(folio);
> -		zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
> -		count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages);
> -	}
>  }
>
>  /*
> --
> 2.47.0.105.g07ac214952-goog
>
>
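
For readers following along without the tree to hand, the effect of the
moved hunk can be sketched in plain userspace C. This is only a model
under stated assumptions: sketch_folio, sketch_stats,
sketch_free_prepare and the SKETCH_* flags are hypothetical names, and
the two counters merely stand in for NR_MLOCK and UNEVICTABLE_PGCLEARED.
The point is the ordering: the stale mlocked flag is cleared and
accounted before the check-at-free test looks at the flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; only the mlocked one is cleared at free time. */
#define SKETCH_PG_MLOCKED          (1u << 0)
#define SKETCH_PG_OTHER            (1u << 1)
#define SKETCH_FLAGS_CHECK_AT_FREE (SKETCH_PG_MLOCKED | SKETCH_PG_OTHER)

struct sketch_folio {
	unsigned int flags;
	long nr_pages;
};

struct sketch_stats {
	long nr_mlock;			/* stands in for the NR_MLOCK counter */
	long unevictable_pgcleared;	/* stands in for UNEVICTABLE_PGCLEARED */
};

/*
 * Prepare a folio for freeing. Returns true if it passes the flag check
 * and may be freed, false if it would be reported as a bad page.
 */
static bool sketch_free_prepare(struct sketch_folio *folio,
				struct sketch_stats *stats)
{
	assert(folio != NULL && stats != NULL);

	/* Clear a leftover mlocked flag and fix up the counters first. */
	if (folio->flags & SKETCH_PG_MLOCKED) {
		folio->flags &= ~SKETCH_PG_MLOCKED;
		stats->nr_mlock -= folio->nr_pages;
		stats->unevictable_pgcleared += folio->nr_pages;
	}

	/* Only then apply the check-at-free test to whatever remains. */
	return (folio->flags & SKETCH_FLAGS_CHECK_AT_FREE) == 0;
}
```

With this ordering every caller of the free path gets the clearance,
including the low-level get_page()/free_page() users that never pass
through __page_cache_release(). Other unexpected flags, modelled here by
SKETCH_PG_OTHER, still fail the check as before.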