From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Matthew Wilcox
Cc: Linux Memory Management List, linux-fsdevel@vger.kernel.org,
    linux-btrfs@vger.kernel.org
Subject: Re: Forcing vmscan to drop more (related) pages?
Date: Wed, 31 Jul 2024 06:48:57 +0930
Message-ID: <2f6a2670-cf09-4750-9578-9198eea8dff6@gmx.com>
References: <7e68a0b2-0bee-4562-a29f-4dd7d8713cd9@gmx.com>

On 2024/7/31 01:24, Matthew Wilcox wrote:
> On Tue, Jul 30, 2024 at 03:35:31PM +0930, Qu Wenruo wrote:
>> Hi,
>>
>> With recent btrfs attempt to utilize larger folios (for its metadata), I
>> am hitting a case like this:
>>
>> - Btrfs allocated an order 2 folio for metadata X
>>
>> - Btrfs tries to add the order 2 folio at filepos X
>>   Then filemap_add_folio() returns -EEXIST
>>   for filepos X.
>>
>> - Btrfs tries to grab the existing metadata
>>   Then filemap_lock_folio() returns -ENOENT for filepos X.
>>
>> The above case can have two causes:
>>
>> a) The folio at filepos X is released between add and lock
>>    This is pretty rare, but still possible
>>
>> b) Some folios exist at range [X+4K, X+16K)
>>    In my observation, this is way more common than case a).
>>
>> Case b) can be caused by the following situation:
>>
>> - There is an extent buffer at filepos X
>>   And it is consisted of 4 order 0 folios.
>>
>> - vmscan wants to free folio at filepos X
>>   It calls into the btrfs callback, btree_release_folio().
>>   And btrfs did all the checks, release the metadata.
>>
>>   Now all the 4 folios at file pos [X, X+16K) have their private
>>   flags cleared.
>>
>> - vmscan freed folio at filepos X
>>   However the remaining 3 folios X+4K, X+8K, X+12K are still attached
>>   to the filemap, and in theory we should free all 4 folios in one go.
>>
>>   And later cause the conflicts with the larger folio we want to insert.
>>
>> I'm wondering if there is anyway to make sure we can release all
>> involved folios in one go?
>> I guess it will need a new callback, and return a list of folios to be
>> released?
>
> I feel like we're missing a few pieces of this puzzle:
>
> - Why did btrfs decide to create four order-0 folios in the first
>   place?

Maybe the larger folio allocation failed (we use
__GFP_NORETRY | __GFP_NOWARN for larger folio allocations), so it fell
back to order 0 directly.

> - Why isn't there an EEXIST fallback from order-2 to order-1 to order-0
>   folios?

Mostly because of the cross-folio handling.

We have existing code to handle multiple order 0 folios, but that's all.

A single order 2 folio is also easy to handle, as it covers the full
metadata range.

If we supported other orders, we would need to handle mixed orders as
well, which doesn't bring much benefit.

So here we only support order 0, or order 2 (for 16K nodesize).
And that's why we're not using __filemap_get_folio() with FGP_CREAT to
allocate the filemap folios.

Maybe it would be better for FGP_CREAT to take a bitmap of allowed
orders instead?

Certain future use cases (e.g. a fs supporting a block size larger than
the page size) will require a minimum folio size anyway, and falling
below that is not acceptable.

>
> But there's no need for a new API.  You can remove folios from the page
> cache whenever you like.  See delete_from_page_cache_batch() as an
> example.

So you mean manually truncating the other pages inside the
release_folio() callback?

That sounds feasible; let me experiment with that solution.

Thanks,
Qu
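
Illustration only, not btrfs or mm code: a minimal user-space sketch of
case b) from the quoted report, assuming a toy "page cache" that is just
an array with one slot per 4K page index. Every toy_* name is
hypothetical and invented for this sketch; only the error codes and the
sequence of events come from the thread. It shows why an order-2 insert
at X can fail with -EEXIST while a lookup at X returns -ENOENT, and why
dropping the whole [X, X+16K) range makes the insert succeed.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TOY_NR_SLOTS 16   /* toy page cache: one slot per 4K page index */

/* toy_slot[i] is the id of the folio covering index i, or 0 if empty. */
int toy_slot[TOY_NR_SLOTS];

void toy_reset(void)
{
	memset(toy_slot, 0, sizeof(toy_slot));
}

/* Insert a folio of 2^order pages; fails if ANY covered slot is occupied. */
int toy_add_folio(int id, unsigned int index, unsigned int order)
{
	unsigned int nr = 1u << order;
	unsigned int i;

	assert(id > 0);   /* 0 is reserved for "empty" */
	if (index % nr || index + nr > TOY_NR_SLOTS)
		return -EINVAL;
	for (i = index; i < index + nr; i++)
		if (toy_slot[i])
			return -EEXIST;
	for (i = index; i < index + nr; i++)
		toy_slot[i] = id;
	return 0;
}

/* Look up the folio at exactly this index; returns its id or -ENOENT. */
int toy_lock_folio(unsigned int index)
{
	if (index >= TOY_NR_SLOTS || !toy_slot[index])
		return -ENOENT;
	return toy_slot[index];
}

/* Drop everything cached in [start, start + nr). */
void toy_release_range(unsigned int start, unsigned int nr)
{
	unsigned int i;

	for (i = start; i < start + nr && i < TOY_NR_SLOTS; i++)
		toy_slot[i] = 0;
}

/* Replay the sequence; returns 0 when every step behaves as described. */
int toy_demo(void)
{
	const unsigned int x = 4;   /* 16K extent buffer at page indexes 4..7 */
	unsigned int i;

	toy_reset();
	for (i = 0; i < 4; i++)     /* old extent buffer: four order-0 folios */
		if (toy_add_folio(100 + (int)i, x + i, 0))
			return -1;

	toy_release_range(x, 1);    /* reclaim frees only the folio at X */
	if (toy_add_folio(200, x, 2) != -EEXIST)  /* X+4K..X+12K still cached */
		return -1;
	if (toy_lock_folio(x) != -ENOENT)         /* but nothing is at X itself */
		return -1;

	toy_release_range(x, 4);    /* drop the whole extent buffer range */
	return toy_add_folio(200, x, 2);          /* now the order-2 insert fits */
}
```

Calling toy_demo() from a main() returns 0 when the three observations
hold. The real page cache is an xarray with locking and reference
counts, so this models only the index-occupancy argument, not how
release_folio() or delete_from_page_cache_batch() actually work.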