From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@redhat.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	 "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
	 Suleiman Souhlal <suleiman@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 Sergey Senozhatsky <senozhatsky@chromium.org>
Subject: mm: swapin read-ahead and zram
Date: Fri, 15 Aug 2025 14:51:49 +0900	[thread overview]
Message-ID: <7ftwasufn2w3bgesfbp66vlchhpiuctxkhdxp24y5nzzgz2oip@pi4kdyqkl5ss> (raw)

Hello,

We are seeing an unexpected behavior under standard memory pressure
test with zram being configured as a swap device (I tested on several
LTS kernels: 6.12, 6.6, 5.4).  Namely, we observe multiple, repetitive
reads of (compressed) zram entries, sometimes in a very short time span:

...
[ 1523.345784] zram: decompress entry idx:1615265 zsmalloc handle:ffffa28c8be3ee70 obj_size:986 num_reads:188
[ 1523.365401] zram: decompress entry idx:1615265 zsmalloc handle:ffffa28c8be3ee70 obj_size:986 num_reads:189
[ 1523.385934] zram: decompress entry idx:1307291 zsmalloc handle:ffffa28c70100b50 obj_size:788 num_reads:227
[ 1523.405098] zram: decompress entry idx:150916 zsmalloc handle:ffffa28c70114fc0 obj_size:436 num_reads:230
[ 1523.475162] zram: decompress entry idx:266372 zsmalloc handle:ffffa28c4566e5e0 obj_size:437 num_reads:192
[ 1523.476785] zram: decompress entry idx:1615262 zsmalloc handle:ffffa28c8be3efe0 obj_size:518 num_reads:99
[ 1523.476899] zram: decompress entry idx:1294524 zsmalloc handle:ffffa28c475825d0 obj_size:436 num_reads:97
[ 1523.477323] zram: decompress entry idx:266373 zsmalloc handle:ffffa28c4566e828 obj_size:434 num_reads:111
[ 1523.478081] zram: decompress entry idx:1638538 zsmalloc handle:ffffa28c70100c40 obj_size:930 num_reads:40
[ 1523.478631] zram: decompress entry idx:1307301 zsmalloc handle:ffffa28c70100348 obj_size:0 num_reads:87
[ 1523.507349] zram: decompress entry idx:1307293 zsmalloc handle:ffffa28c701007c8 obj_size:989 num_reads:98
[ 1523.540930] zram: decompress entry idx:1294528 zsmalloc handle:ffffa28c47582e60 obj_size:441 num_reads:386
[ 1523.540930] zram: decompress entry idx:266372 zsmalloc handle:ffffa28c4566e5e0 obj_size:437 num_reads:193
[ 1523.540958] zram: decompress entry idx:1294534 zsmalloc handle:ffffa28c47582b30 obj_size:520 num_reads:176
[ 1523.540998] zram: decompress entry idx:1615262 zsmalloc handle:ffffa28c8be3efe0 obj_size:518 num_reads:100
[ 1523.541063] zram: decompress entry idx:1615259 zsmalloc handle:ffffa28c8be3e970 obj_size:428 num_reads:171
[ 1523.541101] zram: decompress entry idx:1294524 zsmalloc handle:ffffa28c475825d0 obj_size:436 num_reads:98
[ 1523.541212] zram: decompress entry idx:150916 zsmalloc handle:ffffa28c70114fc0 obj_size:436 num_reads:231
[ 1523.541379] zram: decompress entry idx:1638538 zsmalloc handle:ffffa28c70100c40 obj_size:930 num_reads:41
[ 1523.541412] zram: decompress entry idx:1294521 zsmalloc handle:ffffa28c47582548 obj_size:936 num_reads:70
[ 1523.541771] zram: decompress entry idx:1592754 zsmalloc handle:ffffa28c43a94738 obj_size:0 num_reads:72
[ 1523.541840] zram: decompress entry idx:1615265 zsmalloc handle:ffffa28c8be3ee70 obj_size:986 num_reads:190
[ 1523.547630] zram: decompress entry idx:1307298 zsmalloc handle:ffffa28c70100940 obj_size:797 num_reads:112
[ 1523.547771] zram: decompress entry idx:1307291 zsmalloc handle:ffffa28c70100b50 obj_size:788 num_reads:228
[ 1523.550138] zram: decompress entry idx:1307296 zsmalloc handle:ffffa28c70100f20 obj_size:682 num_reads:61
[ 1523.555016] zram: decompress entry idx:266385 zsmalloc handle:ffffa28c4566e7c0 obj_size:679 num_reads:103
[ 1523.566361] zram: decompress entry idx:1294524 zsmalloc handle:ffffa28c475825d0 obj_size:436 num_reads:99
[ 1523.566428] zram: decompress entry idx:1294528 zsmalloc handle:ffffa28c47582e60 obj_size:441 num_reads:387
...

For instance, notice how entry 1615265 is read and decompressed, then
presumably evicted from memory, and read/decompressed again almost
immediately.  Also notice that entry 1615265 has already gone through
this cycle 189 times.  It's not entirely clear why this happens.

As far as I can tell, it seems that these extra zram reads are coming from
the swapin read-ahead:
 handle_mm_fault
  do_swap_page
   swapin_readahead
    swap_read_folio
     submit_bio_wait
      submit_bio_noacct_nocheck
       __submit_bio
        zram_submit_bio
         zram_read_page
          zram_read_from_zspool

There are several issues with this.

First, on systems with zram-backed swap devices, these extra reads result
in extra decompressions, which translates into excessive CPU (software
decompression) and battery usage.  On top of that, each decompression first
requires a zsmalloc map() call, which may involve a memcpy() (when the
compressed object spans two physical pages).

Second, the read-ahead pages are likely to increase memory pressure, as
each read-ahead object decompresses into a PAGE_SIZE page, while we
also keep the compressed object in the zsmalloc pool (until the
slot-free notification).

Setting `sysctl -w vm.page-cluster=0` doesn't seem to help: page-cluster 0
merely limits the read-ahead window to one page, so we still go through
the read-ahead path.

Can swapin read-ahead be entirely disabled for zram swap devices?

