From: Zhaoyang Huang
Date: Sun, 17 Mar 2024 12:07:40 +0800
Subject: Re: [PATCH] mm: fix a race scenario in folio_isolate_lru
To: Matthew Wilcox
Cc: "zhaoyang.huang", Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, steve.kang@unisoc.com
References: <20240314083921.1146937-1-zhaoyang.huang@unisoc.com>

On Sat, Mar 16, 2024 at 10:59 PM Matthew Wilcox wrote:
>
> On Sat, Mar 16, 2024 at 04:53:09PM +0800, Zhaoyang Huang wrote:
> > On Fri, Mar 15, 2024 at 8:46 PM Matthew Wilcox wrote:
> > >
> > > On Thu, Mar 14, 2024 at 04:39:21PM +0800, zhaoyang.huang wrote:
> > > >
> > > > From: Zhaoyang Huang
> > > >
> > > > Panic[1] reported which is caused by lruvec->list break. Fix the race
> > > > between folio_isolate_lru and release_pages.
> > > >
> > > > race condition:
> > > > release_pages could meet a non-refered folio which escaped from being
> > > > deleted from LRU but add to another list_head
> > >
> > > I don't think the bug is in folio_isolate_lru() but rather in its
> > > caller.
> > >
> > >  * Context:
> > >  *
> > >  * (1) Must be called with an elevated refcount on the folio. This is a
> > >  *     fundamental difference from isolate_lru_folios() (which is called
> > >  *     without a stable reference).
> > >
> > > So when release_pages() runs, it must not see a refcount decremented to
> > > zero, because the caller of folio_isolate_lru() is supposed to hold one.
> > >
> > > Your stack trace is for the thread which is calling release_pages(), not
> > > the one calling folio_isolate_lru(), so I can't help you debug further.
> > Thanks for the comments. According to my understanding,
> > folio_put_testzero does the decrement before test which makes it
> > possible to have release_pages see refcnt equal zero and proceed
> > further(folio_get in folio_isolate_lru has not run yet).
>
> No, that's not possible.
>
> In the scenario below, at entry to folio_isolate_lru(), the folio has
> refcount 2. It has one refcount from thread 0 (because it must own one
> before calling folio_isolate_lru()) and it has one refcount from thread 1
> (because it's about to call release_pages()). If release_pages() were
> not running, the folio would have refcount 3 when folio_isolate_lru()
> returned.
Could it be this scenario, where the folio is reached from the pte
(thread 0), the local fbatch (thread 1) and the page cache (thread 2)
concurrently, and the three threads proceed intermixed without a lock's
protection? Actually, IMO, thread 1 could also see the folio with
refcnt == 1, since it doesn't care whether the page is in the page
cache or not.
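
To make the refcount arithmetic in this disagreement concrete, here is a
small userspace sketch. It is only a toy model, not kernel code: struct
toy_folio, toy_get(), toy_put_testzero() and toy_thread1_sees_zero() are
hypothetical names, the starting count of 2 (page cache + fbatch) is an
assumption taken from the scenario described above, and the three
"threads" are simply replayed in program order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct folio; only the refcount is modelled. */
struct toy_folio {
	atomic_int refcount;
};

/* Take one reference (modelled on folio_get()). */
static void toy_get(struct toy_folio *f)
{
	atomic_fetch_add(&f->refcount, 1);
}

/*
 * Drop one reference and report whether it was the last one.
 * The decrement happens before the test (modelled on
 * folio_put_testzero()).
 */
static bool toy_put_testzero(struct toy_folio *f)
{
	/* atomic_fetch_sub() returns the old value: 1 means we hit zero. */
	return atomic_fetch_sub(&f->refcount, 1) == 1;
}

/*
 * Replay the interleaving in program order (no real threads).
 * Returns true if thread 1's put observes the count reaching zero.
 */
bool toy_thread1_sees_zero(bool thread0_holds_own_ref)
{
	struct toy_folio f;
	bool last;

	/* ASSUMPTION: one reference for the page cache, one for the fbatch. */
	atomic_init(&f.refcount, 2);

	/* Thread 0: the caller of folio_isolate_lru() may or may not own one. */
	if (thread0_holds_own_ref)
		toy_get(&f);

	/* Thread 2: removal from the page cache drops that reference. */
	last = toy_put_testzero(&f);
	assert(!last);

	/* Thread 1: release_pages() drops the fbatch reference. */
	return toy_put_testzero(&f);
}
```

In this model toy_thread1_sees_zero(false) returns true and
toy_thread1_sees_zero(true) returns false, so everything hinges on
whether thread 0 really enters folio_isolate_lru() without a reference
of its own, which is exactly the point under discussion.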
madvise_cold_or_pageout does no explicit folio_get, since the folio
comes from the pte, which implies it holds one refcnt from the page cache.

#thread 0(madvise_cold_or_pageout)    #1 (lru_add_drain->fbatch_release_pages)    #2(read_pages->filemap_remove_folio)
refcnt == 1 (represents page cache)   refcnt == 2 (another one represents LRU)    folio comes from page cache
folio_isolate_lru                     release_pages                               filemap_free_folio
                                                                                  refcnt == 1 (decrease the one of page cache)
                                      folio_put_testzero == true
                                      list_add(folio->lru, pages_to_free)
                                      //current folio will break LRU's integrity
                                      //since it has not been deleted

In case gmail wraps the lines, the chart above is split into two parts:

#thread 0(madvise_cold_or_pageout)    #1 (lru_add_drain->fbatch_release_pages)
refcnt == 1 (represents page cache)   refcnt == 2 (another one represents LRU)
folio_isolate_lru                     release_pages
                                      folio_put_testzero == true
                                      list_add(folio->lru, pages_to_free)
                                      //current folio will break LRU's integrity
                                      //since it has not been deleted

#1 (lru_add_drain->fbatch_release_pages)    #2(read_pages->filemap_remove_folio)
refcnt == 2 (another one represents LRU)    folio comes from page cache
release_pages                               filemap_free_folio
                                            refcnt == 1 (decrease the one of page cache)
folio_put_testzero == true
list_add(folio->lru, pages_to_free)
//current folio will break LRU's integrity
//since it has not been deleted

>
> > #0 folio_isolate_lru          #1 release_pages
> > BUG_ON(!folio_refcnt)
> >                               if (folio_put_testzero())
> > folio_get(folio)
> > if (folio_test_clear_lru())
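
As an illustration of why the list_add() step in the charts above is
harmful, the following userspace sketch links one node into a second
list without unlinking it from the first. It is a toy, not the kernel's
<linux/list.h>: struct toy_list_head and the toy_* helpers are
hypothetical names that only mirror the usual circular doubly-linked
list semantics, with "lru" standing in for the lruvec list.

```c
#include <stdbool.h>

/* Toy circular doubly-linked list node, in the style of list_head. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

static void toy_list_init(struct toy_list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right after head. */
static void toy_list_add(struct toy_list_head *entry,
			 struct toy_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry from whatever list it is on. */
static void toy_list_del(struct toy_list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Check that every node's next->prev points back at it (bounded walk). */
static bool toy_list_consistent(const struct toy_list_head *head)
{
	const struct toy_list_head *p = head;
	int steps;

	for (steps = 0; steps < 16; steps++) {
		if (p->next->prev != p)
			return false;
		p = p->next;
		if (p == head)
			return true;
	}
	return false; /* never got back to head */
}

/*
 * Move a folio's list node from "lru" to "pages_to_free", with or
 * without deleting it from "lru" first, then check "lru".
 */
bool toy_lru_consistent_after_move(bool del_first)
{
	struct toy_list_head lru, pages_to_free, folio_lru;

	toy_list_init(&lru);
	toy_list_init(&pages_to_free);
	toy_list_add(&folio_lru, &lru);

	if (del_first)
		toy_list_del(&folio_lru);
	toy_list_add(&folio_lru, &pages_to_free);

	return toy_list_consistent(&lru);
}
```

With del_first == true the "lru" list stays consistent; with
del_first == false, lru.next still points at the folio's node while that
node's prev already points into pages_to_free, which is the kind of
break in the list's integrity that the charts describe.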