From: Zhiguo Zhou <zhiguo.zhou@intel.com>
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: willy@infradead.org, akpm@linux-foundation.org, david@kernel.org,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, muchun.song@linux.dev, osalvador@suse.de,
linux-kernel@vger.kernel.org, tianyou.li@intel.com,
tim.c.chen@linux.intel.com, gang.deng@intel.com,
Zhiguo Zhou <zhiguo.zhou@intel.com>
Subject: [PATCH 0/2] mm/readahead: batch folio insertion to improve performance
Date: Mon, 19 Jan 2026 14:50:23 +0800
Message-ID: <20260119065027.918085-1-zhiguo.zhou@intel.com>

This patch series improves readahead performance by batching folio
insertions into the page cache's xarray, reducing cacheline transfers
and improving execution efficiency in the critical section.

PROBLEM
=======
When the `readahead` syscall is invoked, `page_cache_ra_unbounded`
currently inserts folios into the page cache individually. Each insertion
requires acquiring and releasing the `xa_lock`, which can lead to:
1. Significant lock contention when running on multi-core systems
2. Cross-core cacheline transfers for the lock and associated data
3. Increased execution time due to frequent lock operations
These overheads become particularly noticeable in high-throughput storage
workloads where readahead is frequently used.
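
For illustration, a simplified sketch of the current per-folio pattern
(variables as in page_cache_ra_unbounded(); error handling and
readahead bookkeeping elided, see mm/readahead.c for the real code):

	for (i = 0; i < nr_to_read; i++) {
		struct folio *folio = filemap_alloc_folio(gfp_mask, 0);

		if (!folio)
			break;
		/* Takes and drops the xa_lock internally, per folio. */
		if (filemap_add_folio(mapping, folio, index + i,
				      gfp_mask) < 0) {
			folio_put(folio);
			break;
		}
	}
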
SOLUTION
========
This series introduces batched folio insertion for contiguous ranges in
the page cache. The key changes are:
Patch 1/2: Refactor __filemap_add_folio to separate critical section
- Extract the core xarray insertion logic into
__filemap_add_folio_xa_locked()
- Allow callers to control locking granularity via an 'xa_locked'
  parameter (sketched after this list)
- Maintain existing functionality while preparing for batch insertion
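
A rough sketch of the resulting shape (signatures are illustrative
and may not match the patch exactly):

	int __filemap_add_folio_xa_locked(struct address_space *mapping,
			struct folio *folio, pgoff_t index, gfp_t gfp,
			void **shadowp);

	int __filemap_add_folio(struct address_space *mapping,
			struct folio *folio, pgoff_t index, gfp_t gfp,
			void **shadowp, bool xa_locked)
	{
		int err;

		/* Callers already holding the lock pass xa_locked == true. */
		if (!xa_locked)
			xa_lock_irq(&mapping->i_pages);
		err = __filemap_add_folio_xa_locked(mapping, folio, index,
						    gfp, shadowp);
		if (!xa_locked)
			xa_unlock_irq(&mapping->i_pages);
		return err;
	}
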
Patch 2/2: Batch folio insertion in page_cache_ra_unbounded
- Introduce filemap_add_folio_range() for batch insertion of folios
  (sketched after this list)
- Pre-allocate folios before entering the critical section
- Insert multiple folios while holding the xa_lock only once
- Update page_cache_ra_unbounded to use the new batching interface
- Insert folios individually when memory is under pressure
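
An illustrative sketch of the batched flow (the function name is taken
from the description above; its exact signature and the fallback logic
in the patch may differ):

	static int filemap_add_folio_range(struct address_space *mapping,
			struct folio **folios, unsigned int nr_folios,
			pgoff_t start, gfp_t gfp)
	{
		unsigned int i;
		int err = 0;

		/* One lock acquisition covers the whole contiguous batch. */
		xa_lock_irq(&mapping->i_pages);
		for (i = 0; i < nr_folios; i++) {
			err = __filemap_add_folio_xa_locked(mapping,
					folios[i], start + i, gfp, NULL);
			if (err)
				break;
		}
		xa_unlock_irq(&mapping->i_pages);
		return err;
	}

The folios array is filled via filemap_alloc_folio() before the lock
is taken; if allocation fails partway under memory pressure, folios
are inserted individually as before.
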
PERFORMANCE RESULTS
===================
Testing was performed using RocksDB's `db_bench` (readseq workload) on a
32-vCPU Intel Ice Lake server with 256GB memory:
1. Throughput improved by 1.51x (ops/sec)
2. Latency:
- P50: 63.9% reduction (6.15 usec → 2.22 usec)
- P75: 42.1% reduction (13.38 usec → 7.75 usec)
- P99: 31.4% reduction (507.95 usec → 348.54 usec)
3. IPC of page_cache_ra_unbounded (excluding lock overhead) improved by
2.18x
TESTING DETAILS
===============
- Kernel: v6.19-rc5 (0f61b1, tip of mm.git:mm-stable on Jan 14, 2026)
- Hardware: Intel Ice Lake server, 32 vCPUs, 256GB RAM
- Workload: RocksDB db_bench readseq
- Command: ./db_bench --benchmarks=readseq,stats --use_existing_db=1
--num_multi_db=32 --threads=32 --num=1600000 --value_size=8192
--cache_size=16GB
IMPLEMENTATION NOTES
====================
- The existing single-folio insertion API remains unchanged for
compatibility
- Hugetlb folio handling is preserved through the refactoring
- Error injection (BPF) support is maintained for __filemap_add_folio
Zhiguo Zhou (2):
mm/filemap: refactor __filemap_add_folio to separate critical section
mm/readahead: batch folio insertion to improve performance
include/linux/pagemap.h | 4 +-
mm/filemap.c | 238 ++++++++++++++++++++++++++++------------
mm/hugetlb.c | 3 +-
mm/readahead.c | 196 ++++++++++++++++++++++++++-------
4 files changed, 325 insertions(+), 116 deletions(-)
--
2.43.0