Date: Mon, 19 Sep 2022 21:37:01 +0000
In-Reply-To: <20220919180634.45958-3-ryncsn@gmail.com>
References: <20220919180634.45958-1-ryncsn@gmail.com> <20220919180634.45958-3-ryncsn@gmail.com>
Message-ID: <20220919213701.fwx4tfgpit6lcpn2@google.com>
Subject: Re: [PATCH v2 2/2] mm: memcontrol: make cgroup_memory_noswap a static key
From: Shakeel Butt
To: Kairui Song
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Andrew Morton, linux-kernel@vger.kernel.org, Michal Hocko

On Tue, Sep 20, 2022 at 02:06:34AM +0800, Kairui Song wrote:
> From: Kairui Song
>
> cgroup_memory_noswap is used in many hot paths, so make it a static key
> to lower the kernel overhead.
>
> Using 8G of ZRAM as SWAP, benchmark using `perf stat -d -d -d --repeat 100`
> with the following code snippet in a non-root cgroup:
>
> #include
> #include
> #include
> #include
> #define MB 1024UL * 1024UL
> int main(int argc, char **argv){
>     void *p = mmap(NULL, 8000 * MB, PROT_READ | PROT_WRITE,
>                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>     memset(p, 0xff, 8000 * MB);
>     madvise(p, 8000 * MB, MADV_PAGEOUT);
>     memset(p, 0xff, 8000 * MB);
>     return 0;
> }
>
> Before:
>         7,021.43 msec task-clock          # 0.967 CPUs utilized ( +- 0.03% )
>            4,010 context-switches         # 573.853 /sec ( +- 0.01% )
>                0 cpu-migrations           # 0.000 /sec
>        2,052,057 page-faults              # 293.661 K/sec ( +- 0.00% )
>   12,616,546,027 cycles                   # 1.805 GHz ( +- 0.06% ) (39.92%)
>      156,823,666 stalled-cycles-frontend  # 1.25% frontend cycles idle ( +- 0.10% ) (40.25%)
>      310,130,812 stalled-cycles-backend   # 2.47% backend cycles idle ( +- 4.39% ) (40.73%)
>   18,692,516,591 instructions             # 1.49 insn per cycle
>                                           # 0.01 stalled cycles per insn ( +- 0.04% ) (40.75%)
>    4,907,447,976 branches                 # 702.283 M/sec ( +- 0.05% ) (40.30%)
>       13,002,578 branch-misses            # 0.26% of all branches ( +- 0.08% ) (40.48%)
>    7,069,786,296 L1-dcache-loads          # 1.012 G/sec ( +- 0.03% ) (40.32%)
>      649,385,847 L1-dcache-load-misses    # 9.13% of all L1-dcache accesses ( +- 0.07% ) (40.10%)
>    1,485,448,688 L1-icache-loads          # 212.576 M/sec ( +- 0.15% ) (39.49%)
>       31,628,457 L1-icache-load-misses    # 2.13% of all L1-icache accesses ( +- 0.40% ) (39.57%)
>        6,667,311 dTLB-loads               # 954.129 K/sec ( +- 0.21% ) (39.50%)
>        5,668,555 dTLB-load-misses         # 86.40% of all dTLB cache accesses ( +- 0.12% ) (39.03%)
>              765 iTLB-loads               # 109.476 /sec ( +- 21.81% ) (39.44%)
>        4,370,351 iTLB-load-misses         # 214320.09% of all iTLB cache accesses ( +- 1.44% ) (39.86%)
>      149,207,254 L1-dcache-prefetches     # 21.352 M/sec ( +- 0.13% ) (40.27%)
>
>          7.25869 +- 0.00203 seconds time elapsed ( +- 0.03% )
>
> After:
>         6,576.16 msec task-clock          # 0.953 CPUs utilized ( +- 0.10% )
>            4,020 context-switches         # 605.595 /sec ( +- 0.01% )
>                0 cpu-migrations           # 0.000 /sec
>        2,052,056 page-faults              # 309.133 K/sec ( +- 0.00% )
>   11,967,619,180 cycles                   # 1.803 GHz ( +- 0.36% ) (38.76%)
>      161,259,240 stalled-cycles-frontend  # 1.38% frontend cycles idle ( +- 0.27% ) (36.58%)
>      253,605,302 stalled-cycles-backend   # 2.16% backend cycles idle ( +- 4.45% ) (34.78%)
>   19,328,171,892 instructions             # 1.65 insn per cycle
>                                           # 0.01 stalled cycles per insn ( +- 0.10% ) (31.46%)
>    5,213,967,902 branches                 # 785.461 M/sec ( +- 0.18% ) (30.68%)
>       12,385,170 branch-misses            # 0.24% of all branches ( +- 0.26% ) (34.13%)
>    7,271,687,822 L1-dcache-loads          # 1.095 G/sec ( +- 0.12% ) (35.29%)
>      649,873,045 L1-dcache-load-misses    # 8.93% of all L1-dcache accesses ( +- 0.11% ) (41.41%)
>    1,950,037,608 L1-icache-loads          # 293.764 M/sec ( +- 0.33% ) (43.11%)
>       31,365,566 L1-icache-load-misses    # 1.62% of all L1-icache accesses ( +- 0.39% ) (45.89%)
>        6,767,809 dTLB-loads               # 1.020 M/sec ( +- 0.47% ) (48.42%)
>        6,339,590 dTLB-load-misses         # 95.43% of all dTLB cache accesses ( +- 0.50% ) (46.60%)
>              736 iTLB-loads               # 110.875 /sec ( +- 1.79% ) (48.60%)
>        4,314,836 iTLB-load-misses         # 518653.73% of all iTLB cache accesses ( +- 0.63% ) (42.91%)
>      144,950,156 L1-dcache-prefetches     # 21.836 M/sec ( +- 0.37% ) (41.39%)
>
>          6.89935 +- 0.00703 seconds time elapsed ( +- 0.10% )
>
> The performance is clearly better. There is no significant hotspot
> improvement according to perf report, as there are quite a few
> callers of memcg_swap_enabled and do_memsw_account (which calls
> memcg_swap_enabled). Many pieces of minor optimizations resulted
> in lower overhead for the branch predictor, and better performance.
>
> Acked-by: Michal Hocko
> Signed-off-by: Kairui Song

Acked-by: Shakeel Butt