linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Minchan Kim <minchan@kernel.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	Kairui Song <ryncsn@gmail.com>
Subject: Re: [PATCHv4 14/17] zsmalloc: make zspage lock preemptible
Date: Wed, 12 Feb 2025 14:00:26 +0900	[thread overview]
Message-ID: <droaoze6w4atf7guiv6t4imhcmkpteyvoaigdnw5p3vdg75ebx@m56xi2y527i4> (raw)
In-Reply-To: <Z6Z2l9uovxAiED6q@google.com>

On (25/02/07 21:09), Yosry Ahmed wrote:
> Can we do some perf testing to make sure this custom locking is not
> regressing performance (selfishly I'd like some zswap testing too)?

So for zsmalloc I (usually) write some simple testing code which is
triggered via sysfs (device attr) and that is completely reproducible,
so that I compares apples to apples.  In this particular case I just
have a loop that creates objects (we don't need to compress or decompress
anything, zsmalloc doesn't really care)

-	echo 1 > /sys/ ... / test_prepare

	for (sz = 32; sz < PAGE_SIZE; sz += 64) {
		for (i = 0; i < 4096; i++) {
			ent->handle = zs_malloc(zram->mem_pool, sz)
			list_add(ent)
		}
	}


And now I just `perf stat` writes:

-	perf stat echo 1 > /sys/ ... / test_exec_old

	list_for_each_entry
		zs_map_object(ent->handle, ZS_MM_RO);
		zs_unmap_object(ent->handle)

	list_for_each_entry
		dst = zs_map_object(ent->handle, ZS_MM_WO);
		memcpy(dst, tmpbuf, ent->sz)
		zs_unmap_object(ent->handle)



-	perf stat echo 1 > /sys/ ... / test_exec_new

	list_for_each_entry
		dst = zs_obj_read_begin(ent->handle, loc);
		zs_obj_read_end(ent->handle, dst);

	list_for_each_entry
		zs_obj_write(ent->handle, tmpbuf, ent->sz);


-	echo 1 > /sys/ ... / test_finish

	free all handles and ent-s


The nice part is that we don't depend on any of the upper layers, we
don't even need to compress/decompress anything; we allocate objects
of required sizes and memcpy static data there (zsmalloc doesn't have
any opinion on that) and that's pretty much it.


OLD API
=======

10 runs

       369,205,778      instructions                     #    0.80  insn per cycle            
        40,467,926      branches                         #  113.732 M/sec                     

       369,002,122      instructions                     #    0.62  insn per cycle            
        40,426,145      branches                         #  189.361 M/sec                     

       369,051,170      instructions                     #    0.45  insn per cycle            
        40,434,677      branches                         #  157.574 M/sec                     

       369,014,522      instructions                     #    0.63  insn per cycle            
        40,427,754      branches                         #  201.464 M/sec                     

       369,019,179      instructions                     #    0.64  insn per cycle            
        40,429,327      branches                         #  198.321 M/sec                     

       368,973,095      instructions                     #    0.64  insn per cycle            
        40,419,245      branches                         #  234.210 M/sec                     

       368,950,705      instructions                     #    0.64  insn per cycle            
        40,414,305      branches                         #  231.460 M/sec                     

       369,041,288      instructions                     #    0.46  insn per cycle            
        40,432,599      branches                         #  155.576 M/sec                     

       368,964,080      instructions                     #    0.67  insn per cycle            
        40,417,025      branches                         #  245.665 M/sec                     

       369,036,706      instructions                     #    0.63  insn per cycle            
        40,430,860      branches                         #  204.105 M/sec                     


NEW API
=======

10 runs

       265,799,293      instructions                     #    0.51  insn per cycle            
        29,834,567      branches                         #  170.281 M/sec                     

       265,765,970      instructions                     #    0.55  insn per cycle            
        29,829,019      branches                         #  161.602 M/sec                     

       265,764,702      instructions                     #    0.51  insn per cycle            
        29,828,015      branches                         #  189.677 M/sec                     

       265,836,506      instructions                     #    0.38  insn per cycle            
        29,840,650      branches                         #  124.237 M/sec                     

       265,836,061      instructions                     #    0.36  insn per cycle            
        29,842,285      branches                         #  137.670 M/sec                     

       265,887,080      instructions                     #    0.37  insn per cycle            
        29,852,881      branches                         #  126.060 M/sec                     

       265,769,869      instructions                     #    0.57  insn per cycle            
        29,829,873      branches                         #  210.157 M/sec                     

       265,803,732      instructions                     #    0.58  insn per cycle            
        29,835,391      branches                         #  186.940 M/sec                     

       265,766,624      instructions                     #    0.58  insn per cycle            
        29,827,537      branches                         #  212.609 M/sec                     

       265,843,597      instructions                     #    0.57  insn per cycle            
        29,843,650      branches                         #  171.877 M/sec                     


x old-api-insn
+ new-api-insn
+-------------------------------------------------------------------------------------+
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|A                                                                                   A|
+-------------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10  3.689507e+08 3.6920578e+08 3.6901918e+08 3.6902586e+08     71765.519
+  10  2.657647e+08 2.6588708e+08 2.6580373e+08 2.6580734e+08     42187.024
Difference at 95.0% confidence
	-1.03219e+08 +/- 55308.7
	-27.9705% +/- 0.0149878%
	(Student's t, pooled s = 58864.4)


> Perhaps Kairui can help with that since he was already testing this
> series.

Yeah, would be great.


  reply	other threads:[~2025-02-12  5:00 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-31  9:05 [PATCHv4 00/17] zsmalloc/zram: there be preemption Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 01/17] zram: switch to non-atomic entry locking Sergey Senozhatsky
2025-01-31 11:41   ` Hillf Danton
2025-02-03  3:21     ` Sergey Senozhatsky
2025-02-03  3:52       ` Sergey Senozhatsky
2025-02-03 12:39       ` Sergey Senozhatsky
2025-01-31 22:55   ` Andrew Morton
2025-02-03  3:26     ` Sergey Senozhatsky
2025-02-03  7:11       ` Sergey Senozhatsky
2025-02-03  7:33         ` Sergey Senozhatsky
2025-02-04  0:19       ` Andrew Morton
2025-02-04  4:22         ` Sergey Senozhatsky
2025-02-06  7:01     ` Sergey Senozhatsky
2025-02-06  7:38       ` Sebastian Andrzej Siewior
2025-02-06  7:47         ` Sergey Senozhatsky
2025-02-06  8:13           ` Sebastian Andrzej Siewior
2025-02-06  8:17             ` Sergey Senozhatsky
2025-02-06  8:26               ` Sebastian Andrzej Siewior
2025-02-06  8:29                 ` Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 02/17] zram: do not use per-CPU compression streams Sergey Senozhatsky
2025-02-01  9:21   ` Kairui Song
2025-02-03  3:49     ` Sergey Senozhatsky
2025-02-03 21:00       ` Yosry Ahmed
2025-02-06 12:26         ` Sergey Senozhatsky
2025-02-06  6:55       ` Kairui Song
2025-02-06  7:22         ` Sergey Senozhatsky
2025-02-06  8:22           ` Sergey Senozhatsky
2025-02-06 16:16           ` Yosry Ahmed
2025-02-07  2:56             ` Sergey Senozhatsky
2025-02-07  6:12               ` Sergey Senozhatsky
2025-02-07 21:07                 ` Yosry Ahmed
2025-02-08 16:20                   ` Sergey Senozhatsky
2025-02-08 16:41                     ` Sergey Senozhatsky
2025-02-09  6:22                     ` Sergey Senozhatsky
2025-02-09  7:42                       ` Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 03/17] zram: remove crypto include Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 04/17] zram: remove max_comp_streams device attr Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 05/17] zram: remove two-staged handle allocation Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 06/17] zram: permit reclaim in zstd custom allocator Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 07/17] zram: permit reclaim in recompression handle allocation Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 08/17] zram: remove writestall zram_stats member Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 09/17] zram: limit max recompress prio to num_active_comps Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 10/17] zram: filter out recomp targets based on priority Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 11/17] zram: unlock slot during recompression Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 12/17] zsmalloc: factor out pool locking helpers Sergey Senozhatsky
2025-01-31 15:46   ` Yosry Ahmed
2025-02-03  4:57     ` Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 13/17] zsmalloc: factor out size-class " Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 14/17] zsmalloc: make zspage lock preemptible Sergey Senozhatsky
2025-01-31 15:51   ` Yosry Ahmed
2025-02-03  3:13     ` Sergey Senozhatsky
2025-02-03  4:56       ` Sergey Senozhatsky
2025-02-03 21:11       ` Yosry Ahmed
2025-02-04  6:59         ` Sergey Senozhatsky
2025-02-04 17:19           ` Yosry Ahmed
2025-02-05  2:43             ` Sergey Senozhatsky
2025-02-05 19:06               ` Yosry Ahmed
2025-02-06  3:05                 ` Sergey Senozhatsky
2025-02-06  3:28                   ` Sergey Senozhatsky
2025-02-06 16:19                   ` Yosry Ahmed
2025-02-07  2:48                     ` Sergey Senozhatsky
2025-02-07 21:09                       ` Yosry Ahmed
2025-02-12  5:00                         ` Sergey Senozhatsky [this message]
2025-02-12 15:35                           ` Yosry Ahmed
2025-02-13  2:18                             ` Sergey Senozhatsky
2025-02-13  2:57                               ` Yosry Ahmed
2025-02-13  7:21                                 ` Sergey Senozhatsky
2025-02-13  8:22                                   ` Sergey Senozhatsky
2025-02-13 15:25                                     ` Yosry Ahmed
2025-02-14  3:33                                       ` Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 15/17] zsmalloc: introduce new object mapping API Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 16/17] zram: switch to new zsmalloc " Sergey Senozhatsky
2025-01-31  9:06 ` [PATCHv4 17/17] zram: add might_sleep to zcomp API Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=droaoze6w4atf7guiv6t4imhcmkpteyvoaigdnw5p3vdg75ebx@m56xi2y527i4 \
    --to=senozhatsky@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=ryncsn@gmail.com \
    --cc=yosry.ahmed@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox