From: Zhaoyang Huang
Date: Thu, 28 Mar 2024 12:03:02 +0800
Subject: Re: summarize all information again at bottom//reply: reply: [PATCH] mm: fix a race scenario in folio_isolate_lru
To: Matthew Wilcox
Cc: 黄朝阳 (Zhaoyang Huang), Andrew Morton, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", 康纪滨 (Steve Kang)

On Thu, Mar 28, 2024 at 11:18 AM Matthew Wilcox wrote:
>
> On Thu, Mar 28, 2024 at 09:27:31AM +0800, Zhaoyang Huang wrote:
> > ok, I missed the refcnt from alloc_pages.
> > However, I still think it is
> > a bug to call readahead_folio in read_pages as the refcnt obtained by
> > alloc_pages should be its final guard which is paired to the one which
> > checked in shrink_folio_list->__remove_mapping->folio_ref_freeze(2)(this
> > 2 represent alloc_pages & page cache). If we removed this one without
>
> __remove_mapping() requires that the caller holds the folio locked.
> Since the readahead code unlocks the folio, __remove_mapping() cannot
> be run because the caller of __remove_mapping() will wait for the folio
> lock.

I am reposting the whole timing sequence to make it clearer and to fix
the incorrect description in my previous feedback.

Follow the refcount through.

In page_cache_ra_unbounded():

                folio = filemap_alloc_folio(gfp_mask, 0);
(folio has refcount 1)
                ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
(folio has refcount 2, PG_lru)

Then we call read_pages()
First we call ->readahead() which for some reason stops early.
Then we call readahead_folio() which calls folio_put()
(folio has refcount 1)
Then we call folio_get()
(folio has refcount 2)
Then we call filemap_remove_folio()
(folio has refcount 1)
Then we call folio_unlock()
Then we call folio_put()

The amended steps of the previous timing sequence are below, where [1]
races with [2]; this has nothing to do with __remove_mapping(). IMO, no
file folio should be freed by folio_put(), because the refcount obtained
by alloc_pages keeps the count unbalanced until
shrink_folio_list->__remove_mapping, where folio_ref_freeze(2) implies
that the alloc_pages refcount and the isolation refcount should be the
last two. release_pages is a special case in which the alloc_pages
refcount is dropped implicitly in
delete_from_page_cache_batch->filemap_free_folio.

   folio_put()
   {
      if(folio_put_test_zero())
*** we should NOT be here as the refcnt of alloc_pages should NOT be dropped ***
         if (folio_test_lru())
*** preempted here with refcnt == 0 and pass PG_lru check ***
[1]
             lruvec_del_folio()
   Then thread_isolate call folio_isolate_lru()
   folio_isolate_lru()
   {
      folio_test_clear_lru()
      folio_get()
[2]
     lruvec_del_folio()
   }
--------------------------------------------------------------------------------------------
   shrink_folio_list()
   {
      __remove_mapping()
      {
         refcount = 1 + folio_nr_pages;
*** the refcount = 1 + 1 implies there should be only the refcnt of
alloc_pages and previous isolation for a no-busy folio as all PTE has
gone***
         if (!folio_ref_freeze(refcount))
             goto keeplock;
      }
   }