From: Ryan Roberts
Date: Mon, 11 Mar 2024 12:36:00 +0000
Subject: Re: [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed
To: Matthew Wilcox
Cc: Andrew Morton, linux-mm@kvack.org
Message-ID: <79dad067-1d26-4867-8eb1-941277b9a77b@arm.com>
References: <20240227174254.710559-11-willy@infradead.org> <367a14f7-340e-4b29-90ae-bc3fcefdd5f4@arm.com> <8cd67a3d-81a7-4127-9d17-a1d465c3f9e8@arm.com> <02e820c2-8a1d-42cc-954b-f9e041c4417a@arm.com> <9dfd3b3b-4733-4c7c-b09c-5e6388531e49@arm.com>
On 11/03/2024 12:26, Matthew Wilcox wrote:
> On Mon, Mar 11, 2024 at 09:01:16AM +0000, Ryan Roberts wrote:
>> [  153.499149] Call trace:
>> [  153.499470]  uncharge_folio+0x1d0/0x2c8
>> [  153.500045]  __mem_cgroup_uncharge_folios+0x5c/0xb0
>> [  153.500795]  move_folios_to_lru+0x5bc/0x5e0
>> [  153.501275]  shrink_lruvec+0x5f8/0xb30
>
>> And that code is from your commit 29f3843026cf ("mm: free folios
>> directly in move_folios_to_lru()"), which is another patch in the same
>> series. This suffers from the same problem: uncharge before removing
>> the folio from the deferred list, so the wrong lock is used - there are
>> two sites in this function that do this.
>
> Two sites, but basically the same thing; one is for "the batch is full"
> and the other is "we finished the list".
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a0e53999a865..f60c5b3977dc 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1842,6 +1842,9 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec,
>  		if (unlikely(folio_put_testzero(folio))) {
>  			__folio_clear_lru_flags(folio);
>
> +			if (folio_test_large(folio) &&
> +			    folio_test_large_rmappable(folio))
> +				folio_undo_large_rmappable(folio);
>  			if (folio_batch_add(&free_folios, folio) == 0) {
>  				spin_unlock_irq(&lruvec->lru_lock);
>  				mem_cgroup_uncharge_folios(&free_folios);
>
>> A quick grep over the entire series has a lot of hits for "uncharge". I
>> wonder if we need a full audit of that series for other places that
>> could potentially be doing the same thing?
>
> I think this assertion will catch all occurrences of the same thing,
> as long as people who are testing are testing in a memcg. My setup
> doesn't use a memcg, so I never saw any of this ;-(
>
> If you confirm this fixes it, I'll send two patches; a respin of the patch
> I sent on Sunday that calls undo_large_rmappable in this one extra place,
> and then a patch to add the assertions.

Good timing on your response - I've just finished testing! My patch included
both the site you have above and another that I fixed up speculatively in
shrink_folio_list(), based on reviewing all the mem_cgroup_uncharge_folios()
call sites.

I haven't been able to reproduce any issue with this patch (and the extra
asserts) in place. I've run ~50 iterations over ~2 hours; the previous record
was about 30 iterations before catching an oops. Given how difficult it now is
to repro, I can't be sure this has definitely fixed all possible places, but
it's looking positive.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a0e53999a865..cf7d4cf47f1a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1436,6 +1436,9 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 		 */
 		nr_reclaimed += nr_pages;

+		if (folio_test_large(folio) && folio_test_large_rmappable(folio))
+			folio_undo_large_rmappable(folio);
+
 		if (folio_batch_add(&free_folios, folio) == 0) {
 			mem_cgroup_uncharge_folios(&free_folios);
 			try_to_unmap_flush();
@@ -1842,6 +1845,9 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec,
 		if (unlikely(folio_put_testzero(folio))) {
 			__folio_clear_lru_flags(folio);

+			if (folio_test_large(folio) && folio_test_large_rmappable(folio))
+				folio_undo_large_rmappable(folio);
+
 			if (folio_batch_add(&free_folios, folio) == 0) {
 				spin_unlock_irq(&lruvec->lru_lock);
 				mem_cgroup_uncharge_folios(&free_folios);

There is also a call to mem_cgroup_uncharge() in delete_from_lru_cache(),
which I couldn't convince myself was safe. Perhaps you could do a quick audit
of the call sites?

But taking a step back, I wonder whether the charge() and uncharge() functions
should really be checking to see if the folio is on a deferred split list and,
if so, moving the folio to the correct list?