From: Adrian Huang <adrianhuang0701@gmail.com>
X-Google-Original-From: Adrian Huang <ahuang12@lenovo.com>
To: urezki@gmail.com
Cc: adrianhuang0701@gmail.com, ahuang12@lenovo.com, akpm@linux-foundation.org, hch@infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 1/1] mm: vmalloc: Optimize vmap_lazy_nr arithmetic when purging each vmap_area
Date: Mon, 2 Sep 2024 20:00:46 +0800
Message-Id: <20240902120046.26478-1-ahuang12@lenovo.com>
On Fri, Aug 30, 2024 at 3:00 AM Uladzislau Rezki wrote:

> atomic_long_add_return() might also introduce a high contention. We can
> optimize by splitting into more light atomics. Can you check it on your
> 448-cores system?

Interestingly, the following result shows the latency of
free_vmap_area_noflush() is just 26 usecs (the worst case is 16ms-32ms).
  /home/git-repo/bcc/tools/funclatency.py -u free_vmap_area_noflush & pid1=$! && \
      sleep 8 && modprobe test_vmalloc nr_threads=$(nproc) run_test_mask=0x7; \
      kill -SIGINT $pid1

     usecs               : count     distribution
         0 -> 1          : 18166     |                                        |
         2 -> 3          : 41929818  |**                                      |
         4 -> 7          : 181203439 |***********                             |
         8 -> 15         : 464242836 |*****************************           |
        16 -> 31         : 620077545 |****************************************|
        32 -> 63         : 442133041 |****************************            |
        64 -> 127        : 111432597 |*******                                 |
       128 -> 255        : 3441649   |                                        |
       256 -> 511        : 302655    |                                        |
       512 -> 1023       : 738       |                                        |
      1024 -> 2047       : 73        |                                        |
      2048 -> 4095       : 0         |                                        |
      4096 -> 8191       : 0         |                                        |
      8192 -> 16383      : 0         |                                        |
     16384 -> 32767      : 196       |                                        |

  avg = 26 usecs, total: 49415657269 usecs, count: 1864782753

free_vmap_area_noflush() executes the lock prefix only once, so even the worst
case might be just about a hundred clock cycles.

The problem with purge_vmap_node() is that some cores are busy purging each
vmap_area of the *long* purge_list, executing atomic_long_sub() for each
vmap_area, while other cores free vmalloc allocations and execute
atomic_long_add_return() in free_vmap_area_noflush(). The following crash log
shows that 22 cores are busy purging vmap_area structs [1]:

  crash> bt -a | grep "purge_vmap_node+291" | wc -l
  22

So, the latency of purge_vmap_node() dramatically increases because it executes
the lock prefix over 6,000,000 times. The issue is easier to reproduce when
more cores execute purge_vmap_node() simultaneously.

[1] https://gist.github.com/AdrianHuang/c0030dd7755e673ed00cb197b76ce0a7

Tested the following patch with the light atomics.
However, the average latency did not improve (though the worst case did):

     usecs               : count     distribution
         0 -> 1          : 7146      |                                        |
         2 -> 3          : 31734187  |**                                      |
         4 -> 7          : 161408609 |***********                             |
         8 -> 15         : 461411377 |*********************************       |
        16 -> 31         : 557005293 |****************************************|
        32 -> 63         : 435518485 |*******************************         |
        64 -> 127        : 175033097 |************                            |
       128 -> 255        : 42265379  |***                                     |
       256 -> 511        : 399112    |                                        |
       512 -> 1023       : 734       |                                        |
      1024 -> 2047       : 72        |                                        |

  avg = 32 usecs, total: 59952713176 usecs, count: 1864783491

[free_vmap_area_noflush() assembly instructions wo/ the test patch]

  ffffffff813e6e90 <free_vmap_area_noflush>:
  ...
  ffffffff813e6ed4: 4c 8b 65 08           mov    0x8(%rbp),%r12
  ffffffff813e6ed8: 4c 2b 65 00           sub    0x0(%rbp),%r12
  ffffffff813e6edc: 49 c1 ec 0c           shr    $0xc,%r12
  ffffffff813e6ee0: 4c 89 e2              mov    %r12,%rdx
  ffffffff813e6ee3: f0 48 0f c1 15 f4 2a  lock xadd %rdx,0x2bb2af4(%rip)  # ffffffff83f999e0
  ffffffff813e6eea: bb 02
  ffffffff813e6eec: 8b 0d 0e b1 a5 01     mov    0x1a5b10e(%rip),%ecx     # ffffffff82e42000
  ffffffff813e6ef2: 49 01 d4              add    %rdx,%r12
  ffffffff813e6ef5: 39 c8                 cmp    %ecx,%eax
  ...

[free_vmap_area_noflush() assembly instructions w/ the test patch: no lock prefix]

  ffffffff813e6e90 <free_vmap_area_noflush>:
  ...
  ffffffff813e6edb: 48 8b 5d 08           mov    0x8(%rbp),%rbx
  ffffffff813e6edf: 48 2b 5d 00           sub    0x0(%rbp),%rbx
  ffffffff813e6ee3: 8b 0d d7 b0 a5 01     mov    0x1a5b0d7(%rip),%ecx     # ffffffff82e41fc0
  ffffffff813e6ee9: 48 c1 eb 0c           shr    $0xc,%rbx
  ffffffff813e6eed: 4c 8b 25 b4 f1 92 01  mov    0x192f1b4(%rip),%r12     # ffffffff82d160a8
  ffffffff813e6ef4: 48 01 d3              add    %rdx,%rbx
  ffffffff813e6ef7: 48 89 1d e2 2a bb 02  mov    %rbx,0x2bb2ae2(%rip)     # ffffffff83f999e0
  ffffffff813e6efe: 39 c8                 cmp    %ecx,%eax
  ...
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3f9b6bd707d2..3927c541440b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2357,8 +2357,9 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 	if (WARN_ON_ONCE(!list_empty(&va->list)))
 		return;
 
-	nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
-				PAGE_SHIFT, &vmap_lazy_nr);
+	nr_lazy = atomic_long_read(&vmap_lazy_nr);
+	nr_lazy += (va->va_end - va->va_start) >> PAGE_SHIFT;
+	atomic_long_set(&vmap_lazy_nr, nr_lazy);
 
 	/*
 	 * If it was request by a certain node we would like to