Message-ID: <618edcc2-c73e-4902-95ff-947f2d63838e@arm.com>
Date: Tue, 5 Sep 2023 14:26:54 +0100
Subject: Re: [RFC PATCH 00/14] Rearrange batched folio freeing
To: Matthew Wilcox
Cc: linux-mm@kvack.org
References: <20230825135918.4164671-1-willy@infradead.org>
From: Ryan Roberts

On 05/09/2023 14:15, Matthew Wilcox wrote:
> On Mon, Sep 04, 2023 at 02:25:41PM +0100, Ryan Roberts wrote:
>> I've been doing some benchmarking of this series, as promised, but have hit an
>> oops. It doesn't appear to be easily reproducible, and I'm struggling to figure
>> out the root cause, so thought I would share in case you have any ideas?
>
> I didn't hit that with my testing.  Admittedly I was using xfs rather
> than ext4, but ...

I've only seen it once. I have a bit of a hybrid setup - my rootfs is xfs (and
using large folios), but the linux tree (which is being built during the
benchmark) is on an ext4 partition.
Large anon folios is enabled in this config, so there will be plenty of large
folios in the system. I'm not sure if the fact that this fired from the ext4
path is too relevant - the page with the dodgy index is already on the PCP list
so may or may not be large.

>
>> UBSAN: array-index-out-of-bounds in mm/page_alloc.c:668:46
>> index 10283 is out of range for type 'list_head [6]'
>> pstate: 004000c9 (nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : free_pcppages_bulk+0x330/0x7f8
>> lr : free_pcppages_bulk+0x7a8/0x7f8
>> sp : ffff8000aeef3680
>> x29: ffff8000aeef3680 x28: 000000000000282b x27: 00000000000000fc
>> x26: 000000008015a39a x25: ffff08181ef9e840 x24: ffff0818836caf80
>> x23: 0000000000000001 x22: 0000000000000000 x21: ffff08181ef9e850
>> x20: fffffc200368e680 x19: fffffc200368e6c0 x18: 0000000000000000
>> x17: 3d3d3d3d3d3d3d3d x16: 3d3d3d3d3d3d3d3d x15: 3d3d3d3d3d3d3d3d
>> x14: 3d3d3d3d3d3d3d3d x13: 3d3d3d3d3d3d3d3d x12: 3d3d3d3d3d3d3d3d
>> x11: 3d3d3d3d3d3d3d3d x10: 3d3d3d3d3d3d3d3d x9 : fffffc200368e688
>> x8 : fffffc200368e680 x7 : 205d343737333639 x6 : ffff08181dee0000
>> x5 : ffff0818836caf80 x4 : 0000000000000000 x3 : 0000000000000001
>> x2 : ffff0818836f3330 x1 : ffff0818836f3230 x0 : 006808190c066707
>> Call trace:
>>  free_pcppages_bulk+0x330/0x7f8
>>  free_unref_page_commit+0x15c/0x250
>>  free_unref_folios+0x37c/0x4a8
>>  release_unref_folios+0xac/0xf8
>>  folios_put+0xe0/0x1f0
>>  __folio_batch_release+0x34/0x88
>>  truncate_inode_pages_range+0x160/0x540
>>  truncate_inode_pages_final+0x58/0x90
>>  ext4_evict_inode+0x164/0x900
>>  evict+0xac/0x160
>>  iput+0x170/0x228
>>  do_unlinkat+0x1d0/0x290
>>  __arm64_sys_unlinkat+0x48/0x98
>>
>> UBSAN is complaining about migratetype being out of range here:
>>
>> /* Used for pages not on another list */
>> static inline void add_to_free_list(struct page *page, struct zone *zone,
>> 				    unsigned int order, int migratetype)
>> {
>> 	struct free_area *area = &zone->free_area[order];
>>
>> 	list_add(&page->buddy_list, &area->free_list[migratetype]);
>> 	area->nr_free++;
>> }
>>
>> And I think that is called from __free_one_page(), which is called
>> from free_pcppages_bulk() at the top of the stack trace. migratetype
>> originates from get_pcppage_migratetype(page), which is page->index. But
>> I can't see where this might be getting corrupted, or how yours or my
>> changes could affect this.
>
> Agreed with your analysis.
>
> My best guess is that page->index still contains the file index from
> when this page was in the page cache instead of being overwritten with
> the migratetype.

Yeah that was my guess too. But I couldn't see how that was possible. So started
thinking it could be some separate corruption somehow...

> This is ext4, so large folios aren't in use.
>
> I'll look more later, but I don't immediately see the problem.
>
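
To make the guess above concrete, here is a rough user-space sketch of the
suspected failure mode: one field holds the file index while the page is in the
page cache and the migratetype once the page sits on a PCP list, so a value
that was never overwritten ends up indexing a 6-entry array. Everything named
model_* and NR_FREE_LISTS is hypothetical and only for illustration; none of it
is kernel code, and it does not show where the stale value comes from. One
thing it does pin down: x28 in the register dump is 0x282b, which is 10283
decimal, the same value UBSAN reports as the bad index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in: matches the 'list_head [6]' type in the UBSAN report. */
#define NR_FREE_LISTS 6

/* x28 in the register dump (0x282b) equals the reported index 10283. */
static_assert(0x282b == 10283, "x28 matches the UBSAN index");

/*
 * Simplified model of a page: a single field with two meanings.
 * While in the page cache it is the file offset; while on a PCP
 * list it is expected to have been overwritten with the migratetype.
 */
struct model_page {
	unsigned long index;
};

/* Per-list counters stand in for the real list heads. */
struct model_free_area {
	unsigned long nr_free[NR_FREE_LISTS];
};

static bool model_migratetype_valid(unsigned long mt)
{
	return mt < NR_FREE_LISTS;
}

/*
 * Model of the add-to-free-list step with an explicit range check.
 * Returns false rather than indexing out of bounds when the field
 * still holds something that is not a migratetype.
 */
static bool model_add_to_free_list(struct model_free_area *area,
				   const struct model_page *page)
{
	unsigned long mt = page->index;

	if (!model_migratetype_valid(mt)) {
		fprintf(stderr, "index %lu (0x%lx) is not a migratetype\n",
			mt, mt);
		return false;
	}

	area->nr_free[mt]++;
	return true;
}
```

A page whose field was set to a small migratetype value is accepted, while one
still carrying 0x282b is rejected instead of writing past the array. In the
real allocator a similar range check at the point where the page is taken off
the PCP list might help catch the stale value closer to its source, but that is
an assumption, not something tested here.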