Date: Tue, 14 Jan 2025 17:50:10 -0800
From: "Darrick J. Wong"
To: Joanne Koong
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, "Matthew Wilcox (Oracle)"
Subject: Re: [LSF/MM/BPF TOPIC] Improving large folio writeback performance
Message-ID: <20250115015010.GD3561231@frogsfrogsfrogs>

On Tue, Jan 14, 2025 at 04:50:53PM -0800, Joanne Koong wrote:
> Hi all,
>
> I would like to propose a discussion topic about improving large folio
> writeback performance. As more filesystems adopt large folios, it
> becomes increasingly important that writeback is made to be as
> performant as possible. There are two areas I'd like to discuss:
>
>
> == Granularity of dirty pages writeback ==
> Currently, the granularity of writeback is at the folio level. If one
> byte in a folio is dirty, the entire folio will be written back. This
> becomes unscalable for larger folios and significantly degrades
> performance, especially for workloads that employ random writes.
>
> One idea is to track dirty pages at a smaller granularity using a
> 64-bit bitmap stored inside the folio struct where each bit tracks a
> smaller chunk of pages (e.g. for 2 MB folios, each bit would track 32 KB
> worth of pages), and only write back dirty chunks rather than the
> entire folio.
>
>
> == Balancing dirty pages ==
> It was observed that the dirty page balancing logic used in
> balance_dirty_pages() fails to scale for large folios [1]. For
> example, fuse saw around a 125% drop in throughput for writes when
> using large folios vs small folios on 1MB block sizes, which was
> attributed to scheduled io waits in the dirty page balancing logic. In
> generic_perform_write(), dirty pages are balanced after every write to
> the page cache by the filesystem. With large folios, each write
> dirties a larger number of pages which can grossly exceed the
> ratelimit, whereas with small folios each write is one page and so
> pages are balanced more incrementally and adhere more closely to the
> ratelimit. In order to accommodate large folios, likely the logic in
> balancing dirty pages needs to be reworked.

Hmrmm....
it's a pity that folio_account_dirtied charges the process for all
the pages in the folio even if it only wrote one byte, and then the
ratelimit thresholds haven't caught up to filesystems batching calls to
balance_dirty_pages.

But I'm no expert on how that ratelimiting stuff works so that's all I
have to say about that. :/

--D

>
> Thanks,
> Joanne
>
> [1] https://lore.kernel.org/linux-fsdevel/Z1N505RCcH1dXlLZ@casper.infradead.org/T/#m9e3dd273aa202f9f4e12eb9c96602b5fec2d383d
>
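
For illustration only, the 64-bit bitmap idea from the quoted proposal can
be modelled in userspace C roughly as below. This is a sketch under stated
assumptions, not kernel code: struct folio_sketch, sketch_mark_dirty(),
sketch_writeback_bytes(), FOLIO_SIZE and CHUNK_SIZE are all hypothetical
names made up for this example, and a fixed 2 MB folio is assumed. The
only arithmetic taken from the proposal is 2 MB / 64 bits = 32 KB per bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a fixed 2 MB folio tracked by one 64-bit bitmap. */
#define FOLIO_SIZE  (2UL << 20)
#define CHUNK_BITS  64UL
#define CHUNK_SIZE  (FOLIO_SIZE / CHUNK_BITS)

static_assert(CHUNK_SIZE == 32UL * 1024, "2 MB / 64 bits is 32 KB per bit");

/* Hypothetical stand-in for per-folio state; this is NOT struct folio. */
struct folio_sketch {
	uint64_t dirty_chunks;	/* bit i set => chunk i holds dirty data */
};

/* Mark every chunk touched by the byte range [off, off + len) as dirty. */
void sketch_mark_dirty(struct folio_sketch *f, size_t off, size_t len)
{
	size_t first, last, i;

	if (len == 0 || off >= FOLIO_SIZE)
		return;
	if (len > FOLIO_SIZE - off)
		len = FOLIO_SIZE - off;	/* clamp to the end of the folio */

	first = off / CHUNK_SIZE;
	last = (off + len - 1) / CHUNK_SIZE;
	for (i = first; i <= last; i++)
		f->dirty_chunks |= UINT64_C(1) << i;
}

/* Bytes that writeback would need to submit at chunk granularity. */
size_t sketch_writeback_bytes(const struct folio_sketch *f)
{
	uint64_t bits = f->dirty_chunks;
	size_t nr = 0;

	while (bits) {
		bits &= bits - 1;	/* clear the lowest set bit */
		nr++;
	}
	return nr * CHUNK_SIZE;
}
```

In this model a one-byte write marks a single 32 KB chunk, so chunk-level
writeback would submit 32 KB where folio-level writeback submits the full
2 MB. Whether dirty accounting could be charged at the same granularity is
a separate question that the sketch does not try to answer.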