Date: Sun, 10 Mar 2024 19:57:08 +0000
From: Matthew Wilcox
To: Ryan Roberts
Cc: Andrew Morton, linux-mm@kvack.org
Subject: Re: [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed

On Sun, Mar 10, 2024 at 04:31:25PM +0000, Ryan Roberts wrote:
> That's exactly how I discovered the original problem, and was hoping
> that with your fix, this would unblock me. Given I can only repro this
> when my changes are on top, I guess my code is most likely buggy,
> but perhaps you can take a quick look at the oops and tell me what
> you think?

Well, now my code isn't implicated, I have no interest in helping you.
Just kidding ;-)

> [ 96.372503] BUG: Bad page state in process usemem pfn:be502
> [ 96.373336] page: refcount:0 mapcount:0 mapping:000000005abfa8d5 index:0x0 pfn:0xbe502
> [ 96.374341] aops:0x0 ino:fffffc0001f940c8
> [ 96.374893] flags: 0x7fff8000000000(node=0|zone=0|lastcpupid=0xffff)
> [ 96.375653] page_type: 0xffffffff()
> [ 96.376071] raw: 007fff8000000000 0000000000000000 fffffc0001f94090 ffff0000c99ee860
> [ 96.377055] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [ 96.378650] page dumped because: non-NULL mapping

OK, so page->mapping is ffff0000c99ee860 which does look plausible.
At least it's not a deferred_list (although it is a pfn suitable for
having a deferred_list ... for any allocation up to order-9)

> [ 96.390688] dump_stack_lvl+0x78/0xc8
> [ 96.391163] dump_stack+0x18/0x28
> [ 96.391545] bad_page+0x88/0x128
> [ 96.391893] get_page_from_freelist+0xa94/0x1bc0
> [ 96.392407] __alloc_pages+0x194/0x10b0

> [ 113.131515] ------------[ cut here ]------------
> [ 113.132190] UBSAN: array-index-out-of-bounds in mm/vmscan.c:1654:14
> [ 113.132892] index 7 is out of range for type 'long unsigned int [5]'
> [ 113.133617] CPU: 9 PID: 528 Comm: kswapd0 Tainted: G B 6.8.0-rc5-ryarob01-swap-out-v4 #2
> [ 113.134705] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [ 113.135500] Call trace:
> [ 113.135776] dump_backtrace+0x9c/0x128
> [ 113.136218] show_stack+0x20/0x38
> [ 113.136574] dump_stack_lvl+0x78/0xc8
> [ 113.136964] dump_stack+0x18/0x28
> [ 113.137322] __ubsan_handle_out_of_bounds+0xa0/0xd8
> [ 113.137885] isolate_lru_folios+0x57c/0x658

I wish it weren't UBSAN reporting this, then we could get the folio
dumped. I suppose we could put in an explicit check for folio_zonenum()
being > 5. Does it usually happen in isolate_lru_folios()?

> nr_skipped is a stack array of 5 elements. So I guess folio_zonenum(folio) is
> returning 7. That comes from the flags.
> I guess this is most likely just a side effect of the corrupted folio
> due to someone writing to it while it's on the free list?

Or it's a pointer to something that's not a folio? Are we taking the
wrong lock somewhere again?
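
For what it's worth, the explicit check could look roughly like the userspace sketch below. It is not the mm/vmscan.c code: SKETCH_MAX_NR_ZONES, sketch_zone_index_ok() and sketch_account_skipped() are hypothetical names made up for illustration, and the array size of 5 is taken only from the 'long unsigned int [5]' type in the UBSAN report. Valid indices run from 0 to 4, so the sketch rejects any zone number of 5 or more before touching the array. In the kernel, the failure branch is where the folio could be dumped (for instance with something like VM_BUG_ON_FOLIO) rather than just printing a message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's zone count; 5 matches the
 * 'long unsigned int [5]' type named in the UBSAN report. */
#define SKETCH_MAX_NR_ZONES 5

/* Hypothetical helper: true if zid is a valid index into nr_skipped[]. */
static bool sketch_zone_index_ok(unsigned int zid)
{
	return zid < SKETCH_MAX_NR_ZONES;
}

/*
 * Account nr_pages skipped pages against zone zid.
 * Returns false and leaves the array untouched when zid is out of
 * range; a kernel version would dump the folio at this point instead
 * of indexing past the end of the array.
 */
static bool sketch_account_skipped(unsigned long nr_skipped[SKETCH_MAX_NR_ZONES],
				   unsigned int zid, unsigned long nr_pages)
{
	if (!sketch_zone_index_ok(zid)) {
		fprintf(stderr, "bad zone number %u (valid: 0..%d)\n",
			zid, SKETCH_MAX_NR_ZONES - 1);
		return false;
	}
	nr_skipped[zid] += nr_pages;
	return true;
}
```

With the zone number 7 from the report, sketch_account_skipped() returns false and nr_skipped[] is left alone, so the bad value gets caught at the point of use instead of by UBSAN.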