From: Joanne Koong
Date: Tue, 14 Jan 2025 16:50:53 -0800
Subject: [LSF/MM/BPF TOPIC] Improving large folio writeback performance
To: lsf-pc@lists.linux-foundation.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, "Matthew Wilcox (Oracle)"

Hi all,

I would like to propose a discussion topic about improving large folio writeback performance. As more filesystems adopt large folios, it becomes increasingly important that writeback is made to be as performant as possible. There are two areas I'd like to discuss:

== Granularity of dirty pages writeback ==

Currently, the granularity of writeback is at the folio level.
If one byte in a folio is dirty, the entire folio will be written back. This scales poorly as folios get larger and significantly degrades performance, especially for workloads that employ random writes.

One idea is to track dirty pages at a smaller granularity using a 64-bit bitmap stored inside the folio struct, where each bit tracks a smaller chunk of the folio (e.g. for 2 MB folios, each bit would track a 32 KB chunk, which is eight pages with 4 KB pages), and to write back only the dirty chunks rather than the entire folio.

== Balancing dirty pages ==

It was observed that the dirty page balancing logic used in balance_dirty_pages() fails to scale for large folios [1]. For example, fuse saw around a 125% drop in throughput for writes when using large folios vs small folios on 1MB block sizes, which was attributed to scheduled io waits in the dirty page balancing logic.

In generic_perform_write(), dirty pages are balanced after every write to the page cache by the filesystem. With large folios, each write dirties a larger number of pages, which can grossly exceed the ratelimit, whereas with small folios each write is one page, so pages are balanced more incrementally and adhere more closely to the ratelimit. To accommodate large folios, the dirty page balancing logic likely needs to be reworked.

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/Z1N505RCcH1dXlLZ@casper.infradead.org/T/#m9e3dd273aa202f9f4e12eb9c96602b5fec2d383d
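
To make the arithmetic behind the 64-bit bitmap idea in the first section concrete, here is a minimal userspace sketch. It is only a model under stated assumptions: struct folio_dirty_map and the dirty_map_*() helpers are hypothetical names that do not exist in the kernel, the chunk size is assumed to be the folio size divided by 64, and locking and the real struct folio layout are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: one bit per folio_size / 64 chunk. */
#define DIRTY_BITS 64u

struct folio_dirty_map {
	size_t folio_size;	/* folio size in bytes, e.g. 2 MB */
	size_t chunk_size;	/* bytes tracked by each bit */
	uint64_t bits;		/* bit i set => chunk i is dirty */
};

static void dirty_map_init(struct folio_dirty_map *m, size_t folio_size)
{
	/* Assumption: the folio splits evenly into 64 non-empty chunks. */
	assert(folio_size >= DIRTY_BITS && folio_size % DIRTY_BITS == 0);
	m->folio_size = folio_size;
	m->chunk_size = folio_size / DIRTY_BITS;
	m->bits = 0;
}

/* Mark every chunk overlapping the byte range [off, off + len) as dirty. */
static void dirty_map_mark(struct folio_dirty_map *m, size_t off, size_t len)
{
	assert(len > 0 && off + len <= m->folio_size);

	size_t first = off / m->chunk_size;
	size_t last = (off + len - 1) / m->chunk_size;

	for (size_t i = first; i <= last; i++)
		m->bits |= (uint64_t)1 << i;
}

/* Bytes that would be written back if only dirty chunks were written. */
static size_t dirty_map_writeback_bytes(const struct folio_dirty_map *m)
{
	size_t bytes = 0;

	for (unsigned int i = 0; i < DIRTY_BITS; i++)
		if (m->bits & ((uint64_t)1 << i))
			bytes += m->chunk_size;
	return bytes;
}
```

In this model, dirtying a single byte of a 2 MB folio sets one bit, so 32 KB would be written back instead of the full 2 MB. A write that straddles a chunk boundary sets two bits and costs 64 KB.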