References: <20240730093630.5603-1-ahuang12@lenovo.com>
From: Huang Adrian
Date: Wed, 31 Jul 2024 00:27:27 +0800
Subject: Re: [PATCH 1/1] mm/vmalloc: Combine all TLB flush operations of KASAN shadow virtual address into one operation
To: Uladzislau Rezki
Cc: ahuang12@lenovo.com, akpm@linux-foundation.org, andreyknvl@gmail.com, bhe@redhat.com, dvyukov@google.com, glider@google.com, hch@infradead.org, kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, ryabinin.a.a@gmail.com, sunjw10@lenovo.com, vincenzo.frascino@arm.com

On Tue, Jul 30, 2024 at 7:38 PM Uladzislau Rezki wrote:
>
> > On Mon, Jul 29, 2024 at 7:29 PM Uladzislau Rezki wrote:
> > > It would be really good if Adrian could run the "compiling workload" on
> > > his big system and post the statistics here.
> > >
> > > For example:
> > >   a) v6.11-rc1 + KASAN.
> > >   b) v6.11-rc1 + KASAN + patch.
> >
> > Sure, please see the statistics below.
> >
> > Test Result (based on 6.11-rc1)
> > ===============================
> >
> > 1. Profile purge_vmap_node()
> >
> >    A. Command: trace-cmd record -p function_graph -l purge_vmap_node make -j $(nproc)
> >
> >    B. Average execution time of purge_vmap_node():
> >
> >       no patch (us)    patched (us)    saved
> >       -------------    ------------    -----
> >           147885.02         3692.51      97%
> >
> >    C. Total execution time of purge_vmap_node():
> >
> >       no patch (us)    patched (us)    saved
> >       -------------    ------------    -----
> >           194173036         5114138      97%
> >
> >    [ftrace log] Without patch: https://gist.github.com/AdrianHuang/a5bec861f67434e1024bbf43cea85959
> >    [ftrace log] With patch: https://gist.github.com/AdrianHuang/a200215955ee377288377425dbaa04e3
> >
> > 2. Use `time` utility to measure execution time
> >
> >    A. Command: make clean && time make -j $(nproc)
> >
> >    B. The following result is the average kernel execution time of five-time
> >       measurements. ('sys' field of `time` output):
> >
> >       no patch (seconds)    patched (seconds)    saved
> >       ------------------    -----------------    -----
> >                36932.904            31403.478      15%
> >
> >    [`time` log] Without patch: https://gist.github.com/AdrianHuang/987b20fd0bd2bb616b3524aa6ee43112
> >    [`time` log] With patch: https://gist.github.com/AdrianHuang/da2ea4e6aa0b4dcc207b4e40b202f694
> >
> I meant another statistics. As noted here https://lore.kernel.org/linux-mm/ZogS_04dP5LlRlXN@pc636/T/#m5d57f11d9f69aef5313f4efbe25415b3bae4c818
> i came to conclusion that below place and lock:
>
>
> static void exit_notify(struct task_struct *tsk, int group_dead)
> {
>         bool autoreap;
>         struct task_struct *p, *n;
>         LIST_HEAD(dead);
>
>         write_lock_irq(&tasklist_lock);
> ...
>
>
> keeps IRQs disabled, so it means that the purge_vmap_node() does the progress
> but it can be slow.
>
> CPU_1:
>   disables IRQs
>   trying to grab the tasklist_lock
>
> CPU_2:
>   Sends an IPI to CPU_1
>   waits until the specified callback is executed on CPU_1
>
> Since CPU_1 has disabled IRQs, serving an IPI and completion of callback
> takes time until CPU_1 enables IRQs back.
>
> Could you please post lock statistics for kernel compiling use case?
> KASAN + patch is enough, IMO. This just to double check whether a
> tasklist_lock is a problem or not.

Sorry for the misunderstanding. Both experiments are shown below. I saw
that you consider KASAN + patch to be enough, but I am including the
unpatched run as well in case you need it. ;-)

a) v6.11-rc1 + KASAN

The result is different from yours, so I ran two tests (making sure the
soft lockup warning was triggered).

Test #1: waittime-max = 5.4ms

...
      class name    con-bounces    contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg
...
tasklist_lock-W:         118762         120090           0.44        5443.22    24807413.37         206.57         429757         569051           2.27        3222.00    69914505.87         122.86
tasklist_lock-R:         108262         108300           0.41        5381.34    23613372.10         218.04         489132         541541           0.20        5543.40    10095470.68          18.64
---------------
  tasklist_lock     44594   [<0000000099d3ea35>] exit_notify+0x82/0x900
  tasklist_lock     32041   [<0000000058f753d8>] release_task+0x104/0x3f0
  tasklist_lock     99240   [<000000008524ff80>] __do_wait+0xd8/0x710
  tasklist_lock     43435   [<00000000f6e82dcf>] copy_process+0x2a46/0x50f0
---------------
  tasklist_lock     98334   [<0000000099d3ea35>] exit_notify+0x82/0x900
  tasklist_lock     82649   [<0000000058f753d8>] release_task+0x104/0x3f0
  tasklist_lock         2   [<00000000da5a7972>] mm_update_next_owner+0xc0/0x430
  tasklist_lock     26708   [<00000000f6e82dcf>] copy_process+0x2a46/0x50f0
...

Test #2: waittime-max = 5.7ms

...
      class name    con-bounces    contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg
...
tasklist_lock-W:         121742         123167           0.43        5713.02    25252257.61         205.02         432111         569762           2.25        3083.08    70711022.74         124.11
tasklist_lock-R:         111479         111523           0.39        5050.50    24557264.88         220.20         491404         542221           0.20        5611.81    10007782.09          18.46
---------------
  tasklist_lock    102317   [<000000008524ff80>] __do_wait+0xd8/0x710
  tasklist_lock     44606   [<00000000f6e82dcf>] copy_process+0x2a46/0x50f0
  tasklist_lock     45584   [<0000000099d3ea35>] exit_notify+0x82/0x900
  tasklist_lock     32969   [<0000000058f753d8>] release_task+0x104/0x3f0
---------------
  tasklist_lock    100498   [<0000000099d3ea35>] exit_notify+0x82/0x900
  tasklist_lock     27401   [<00000000f6e82dcf>] copy_process+0x2a46/0x50f0
  tasklist_lock     85473   [<0000000058f753d8>] release_task+0x104/0x3f0
  tasklist_lock       650   [<000000004d0b9f6b>] tty_open_proc_set_tty+0x23/0x210
...

b) v6.11-rc1 + KASAN + patch: waittime-max = 5.7ms

...
      class name    con-bounces    contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg
...
tasklist_lock-W:         108876         110087           0.33        5688.64    18622460.43         169.16         426740         568715           1.94        2930.76    62560515.48         110.00
tasklist_lock-R:          99864          99909           0.43        5868.69    17849478.20         178.66         487654         541328           0.20        5709.98     9207504.90          17.01
---------------
  tasklist_lock     91655   [<00000000a622e532>] __do_wait+0xd8/0x710
  tasklist_lock     41100   [<00000000ccf53925>] exit_notify+0x82/0x900
  tasklist_lock      8254   [<00000000093ccded>] tty_open_proc_set_tty+0x23/0x210
  tasklist_lock     39542   [<00000000a0e6bf4d>] copy_process+0x2a46/0x50f0
---------------
  tasklist_lock     90525   [<00000000ccf53925>] exit_notify+0x82/0x900
  tasklist_lock     76934   [<00000000cb7ca00c>] release_task+0x104/0x3f0
  tasklist_lock     23723   [<00000000a0e6bf4d>] copy_process+0x2a46/0x50f0
  tasklist_lock     18223   [<00000000a622e532>] __do_wait+0xd8/0x710
...
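
The derived columns in the tables above can be cross-checked with a few
lines of Python. This is only a sketch: the helper names saved_pct() and
avg() are made up for illustration. It assumes that the "saved" column is
the relative reduction cut to a whole percent, and that lock_stat's
waittime-avg equals waittime-total / contentions. All input numbers are
copied from the tables above.

```python
def saved_pct(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


def avg(total: float, count: int) -> float:
    """Mean per event, e.g. waittime-total / contentions (assumed formula)."""
    return total / count


# purge_vmap_node() average time (us) and `time` sys field (seconds),
# no patch vs. patched.
purge_saved = saved_pct(147885.02, 3692.51)
sys_saved = saved_pct(36932.904, 31403.478)

# tasklist_lock-W rows: Test #1 without the patch vs. the patched run.
wait_avg_unpatched = avg(24807413.37, 120090)
wait_avg_patched = avg(18622460.43, 110087)
wait_total_saved = saved_pct(24807413.37, 18622460.43)

if __name__ == "__main__":
    print(f"purge_vmap_node() avg time saved:  {purge_saved:.1f}%")
    print(f"'sys' time saved:                  {sys_saved:.1f}%")
    print(f"tasklist_lock-W waittime-avg:      "
          f"{wait_avg_unpatched:.2f} -> {wait_avg_patched:.2f} us")
    print(f"tasklist_lock-W waittime-total cut: {wait_total_saved:.1f}%")
```

With these inputs the write-side waittime-total drops by roughly a quarter
with the patch, while waittime-max stays between about 5.1 ms and 5.9 ms in
all three runs.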