Date: Wed, 12 Feb 2025 14:00:26 +0900
From: Sergey Senozhatsky
To: Yosry Ahmed
Cc: Sergey Senozhatsky, Andrew Morton, Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kairui Song
Subject: Re: [PATCHv4 14/17] zsmalloc: make zspage lock preemptible

On (25/02/07 21:09), Yosry Ahmed wrote:
> Can we do some perf testing to make sure this custom locking is not
> regressing performance (selfishly I'd like some zswap testing too)?

So for zsmalloc I (usually) write some simple testing code which is
triggered via sysfs (device attr) and is completely reproducible, so
that I can compare apples to apples.

In this particular case I just have a loop that creates objects (we don't
need to compress or decompress anything, zsmalloc doesn't really care)

-   echo 1 > /sys/ ... / test_prepare

    for (sz = 32; sz < PAGE_SIZE; sz += 64) {
        for (i = 0; i < 4096; i++) {
            ent->handle = zs_malloc(zram->mem_pool, sz)
            list_add(ent)
        }
    }

And now I just `perf stat` writes:

-   perf stat echo 1 > /sys/ ... / test_exec_old

    list_for_each_entry
        zs_map_object(ent->handle, ZS_MM_RO);
        zs_unmap_object(ent->handle)

    list_for_each_entry
        dst = zs_map_object(ent->handle, ZS_MM_WO);
        memcpy(dst, tmpbuf, ent->sz)
        zs_unmap_object(ent->handle)

-   perf stat echo 1 > /sys/ ... / test_exec_new

    list_for_each_entry
        dst = zs_obj_read_begin(ent->handle, loc);
        zs_obj_read_end(ent->handle, dst);

    list_for_each_entry
        zs_obj_write(ent->handle, tmpbuf, ent->sz);

-   echo 1 > /sys/ ... / test_finish

    free all handles and ent-s

The nice part is that we don't depend on any of the upper layers, we
don't even need to compress/decompress anything; we allocate objects
of required sizes and memcpy static data there (zsmalloc doesn't have
any opinion on that) and that's pretty much it.

OLD API
=======

10 runs

       369,205,778      instructions      #    0.80  insn per cycle
        40,467,926      branches          #  113.732 M/sec

       369,002,122      instructions      #    0.62  insn per cycle
        40,426,145      branches          #  189.361 M/sec

       369,051,170      instructions      #    0.45  insn per cycle
        40,434,677      branches          #  157.574 M/sec

       369,014,522      instructions      #    0.63  insn per cycle
        40,427,754      branches          #  201.464 M/sec

       369,019,179      instructions      #    0.64  insn per cycle
        40,429,327      branches          #  198.321 M/sec

       368,973,095      instructions      #    0.64  insn per cycle
        40,419,245      branches          #  234.210 M/sec

       368,950,705      instructions      #    0.64  insn per cycle
        40,414,305      branches          #  231.460 M/sec

       369,041,288      instructions      #    0.46  insn per cycle
        40,432,599      branches          #  155.576 M/sec

       368,964,080      instructions      #    0.67  insn per cycle
        40,417,025      branches          #  245.665 M/sec

       369,036,706      instructions      #    0.63  insn per cycle
        40,430,860      branches          #  204.105 M/sec

NEW API
=======

10 runs

       265,799,293      instructions      #    0.51  insn per cycle
        29,834,567      branches          #  170.281 M/sec

       265,765,970      instructions      #    0.55  insn per cycle
        29,829,019      branches          #  161.602 M/sec

       265,764,702      instructions      #    0.51  insn per cycle
        29,828,015      branches          #  189.677 M/sec

       265,836,506      instructions      #    0.38  insn per cycle
        29,840,650      branches          #  124.237 M/sec

       265,836,061      instructions      #    0.36  insn per cycle
        29,842,285      branches          #  137.670 M/sec

       265,887,080      instructions      #    0.37  insn per cycle
        29,852,881      branches          #  126.060 M/sec

       265,769,869      instructions      #    0.57  insn per cycle
        29,829,873      branches          #  210.157 M/sec

       265,803,732      instructions      #    0.58  insn per cycle
        29,835,391      branches          #  186.940 M/sec

       265,766,624      instructions      #    0.58  insn per cycle
        29,827,537      branches          #  212.609 M/sec

       265,843,597      instructions      #    0.57  insn per cycle
        29,843,650      branches          #  171.877 M/sec

x old-api-insn
+ new-api-insn
+-------------------------------------------------------------------------------------+
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|+                                                                                   x|
|A                                                                                   A|
+-------------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10  3.689507e+08 3.6920578e+08 3.6901918e+08 3.6902586e+08     71765.519
+  10  2.657647e+08 2.6588708e+08 2.6580373e+08 2.6580734e+08     42187.024
Difference at 95.0% confidence
        -1.03219e+08 +/- 55308.7
        -27.9705% +/- 0.0149878%
        (Student's t, pooled s = 58864.4)

> Perhaps Kairui can help with that since he was already testing this
> series.

Yeah, would be great.