From: Barry Song <21cnbao@gmail.com>
Date: Mon, 21 Oct 2024 18:38:11 +1300
Subject: Re: [PATCH] mm: shmem: convert to use folio_zero_range()
To: Kefeng Wang
Cc: Matthew Wilcox, Andrew Morton, Hugh Dickins, David Hildenbrand, Baolin Wang, linux-mm@kvack.org
On Mon, Oct 21, 2024 at 6:16 PM Kefeng Wang wrote:
>
> On 2024/10/21 12:15, Barry Song wrote:
> > On Fri, Oct 18, 2024 at 8:48 PM Kefeng Wang wrote:
> >>
> >> On 2024/10/18 15:32, Kefeng Wang wrote:
> >>>
> >>> On 2024/10/18 13:23, Barry Song wrote:
> >>>> On Fri, Oct 18, 2024 at 6:20 PM Kefeng Wang wrote:
> >>>>>
> >>>>> On 2024/10/17 23:09, Matthew Wilcox wrote:
> >>>>>> On Thu, Oct 17, 2024 at 10:25:04PM +0800, Kefeng Wang wrote:
> >>>>>>> Directly use folio_zero_range() to clean up the code.
> >>>>>>
> >>>>>> Are you sure there's no performance regression introduced by this?
> >>>>>> clear_highpage() is often optimised in ways that we can't optimise for
> >>>>>> a plain memset().  On the other hand, if the folio is large, maybe a
> >>>>>> modern CPU will be able to do better than clear-one-page-at-a-time.
> >>>>>
> >>>>> Right, I missed this; clear_page might be better than memset. I changed
> >>>>> this one while looking at shmem_writepage(), which was already converted
> >>>>> from clear_highpage() to folio_zero_range(). Grepping for
> >>>>> folio_zero_range(), there are some other users as well:
> >>>>>
> >>>>> fs/bcachefs/fs-io-buffered.c:  folio_zero_range(folio, 0, folio_size(folio));
> >>>>> fs/bcachefs/fs-io-buffered.c:  folio_zero_range(f, 0, folio_size(f));
> >>>>> fs/bcachefs/fs-io-buffered.c:  folio_zero_range(f, 0, folio_size(f));
> >>>>> fs/libfs.c:                    folio_zero_range(folio, 0, folio_size(folio));
> >>>>> fs/ntfs3/frecord.c:            folio_zero_range(folio, 0, folio_size(folio));
> >>>>> mm/page_io.c:                  folio_zero_range(folio, 0, folio_size(folio));
> >>>>> mm/shmem.c:                    folio_zero_range(folio, 0, folio_size(folio));
> >>>>>
> >>>>>> IOW, what performance testing have you done with this patch?
> >>>>>
> >>>>> No performance testing before, but I wrote a testcase that:
> >>>>>
> >>>>> 1) allocates N large folios (folio_alloc(PMD_ORDER))
> >>>>> 2) measures the time (us) to clear all N folios with
> >>>>>    clear_highpage/folio_zero_range/folio_zero_user
> >>>>> 3) releases the N folios
> >>>>>
> >>>>> The results (5 runs) on my machine:
> >>>>>
> >>>>> N=1
> >>>>>       clear_highpage  folio_zero_range  folio_zero_user
> >>>>> 1     69              74                177
> >>>>> 2     57              62                168
> >>>>> 3     54              58                234
> >>>>> 4     54              58                157
> >>>>> 5     56              62                148
> >>>>> avg   58              62.8              176.8
> >>>>>
> >>>>> N=100
> >>>>>       clear_highpage  folio_zero_range  folio_zero_user
> >>>>> 1     11015           11309             32833
> >>>>> 2     10385           11110             49751
> >>>>> 3     10369           11056             33095
> >>>>> 4     10332           11017             33106
> >>>>> 5     10483           11000             49032
> >>>>> avg   10516.8         11098.4           39563.4
> >>>>>
> >>>>> N=512
> >>>>>       clear_highpage  folio_zero_range  folio_zero_user
> >>>>> 1     55560           60055             156876
> >>>>> 2     55485           60024             157132
> >>>>> 3     55474           60129             156658
> >>>>> 4     55555           59867             157259
> >>>>> 5     55528           59932             157108
> >>>>> avg   55520.4         60001.4           157006.6
> >>>>>
> >>>>> folio_zero_user() does many cond_resched() calls, so its times fluctuate
> >>>>> a lot; clear_highpage is better than folio_zero_range, as you said.
> >>>>>
> >>>>> Maybe add a new helper so that all
> >>>>> folio_zero_range(folio, 0, folio_size(folio)) users can be converted
> >>>>> to clear_highpage + flush_dcache_folio?
> >>>>
> >>>> If this also improves performance for other existing callers of
> >>>> folio_zero_range(), then that's a positive outcome.
> >>>
> >>> rm -f /tmp/test && fallocate -l 20G /tmp/test && \
> >>> fallocate -d -l 20G /tmp/test && time fallocate -l 20G /tmp/test
> >>>
> >>> 1) mount always (2M folio)
> >>>          with patch    without patch
> >>> real     0m1.214s      0m1.111s
> >>> user     0m0.000s      0m0.000s
> >>> sys      0m1.210s      0m1.109s
> >>>
> >>> With this patch there is a performance regression:
> >>> folio_zero_range() is worse than clear_highpage + flush_dcache_folio.
> >>>
> >>> with patch
> >>
> >> Oh, this one should be "without patch", since it uses clear_highpage,
> >>
> >>> 99.95%  0.00%  fallocate  [kernel.vmlinux]  [k] vfs_fallocate
> >>>    vfs_fallocate
> >>>    - shmem_fallocate
> >>>         98.54% __pi_clear_page
> >>>       - 1.38% shmem_get_folio_gfp
> >>>            filemap_get_entry
> >>>
> >> and this one is with patch:
> >>
> >>> 99.89%  0.00%  fallocate  [kernel.vmlinux]  [k] shmem_fallocate
> >>>    shmem_fallocate
> >>>    - shmem_get_folio_gfp
> >>>         90.12% __memset
> >>>       - 9.42% zero_user_segments.constprop.0
> >>>            8.16% flush_dcache_page
> >>>         1.03% flush_dcache_folio
> >>>
> >>> 2) mount never (4K folio)
> >>>          with patch    without patch
> >>> real     0m3.159s      0m3.176s
> >>> user     0m0.000s      0m0.000s
> >>> sys      0m3.150s      0m3.169s
> >>>
> >>> But with this patch the performance improves a little:
> >>> folio_zero_range() is better than clear_highpage + flush_dcache_folio.
> >>
> >> For 4K, the results fluctuate, so there is maybe no difference.
> >
> > hi Kefeng,
> > what's your point? providing a helper like clear_highfolio() or similar?
>
> Yes, from the above tests, using clear_highpage/flush_dcache_folio is
> better than using folio_zero_range() to zero a folio (especially a large
> folio), so I'd like to add a new helper, maybe named folio_zero(),
> since it zeroes the whole folio.

we already have a helper like folio_zero_user()?
is it not good enough?

> >>> with patch
> >>> 97.77%  3.37%  fallocate  [kernel.vmlinux]  [k] shmem_fallocate
> >>> - 94.40% shmem_fallocate
> >>>    - 93.70% shmem_get_folio_gfp
> >>>         66.60% __memset
> >>>    - 7.43% filemap_get_entry
> >>>         3.49% xas_load
> >>>      1.32% zero_user_segments.constprop.0
> >>>
> >>> without patch
> >>> 97.82%  3.22%  fallocate  [kernel.vmlinux]  [k] shmem_fallocate
> >>> - 94.61% shmem_fallocate
> >>>      68.18% __pi_clear_page
> >>>    - 25.60% shmem_get_folio_gfp
> >>>       - 7.64% filemap_get_entry
> >>>            3.51% xas_load
> >>>>
> >>>>>>>  	if (sgp != SGP_WRITE && !folio_test_uptodate(folio)) {
> >>>>>>> -		long i, n = folio_nr_pages(folio);
> >>>>>>> -
> >>>>>>> -		for (i = 0; i < n; i++)
> >>>>>>> -			clear_highpage(folio_page(folio, i));
> >>>>>>> -		flush_dcache_folio(folio);
> >>>>>>> +		folio_zero_range(folio, 0, folio_size(folio));
> >>>>>>>  		folio_mark_uptodate(folio);
> >>>>>>>  	}
> >>>>
> >>>> Thanks
> >>>> Barry
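
For reference, a minimal sketch of the helper discussed above, assuming
the name folio_zero() and the clear_highpage + flush_dcache_folio pattern
from this thread; this is illustrative only, not a patch that was posted:

/*
 * Hypothetical helper: zero an entire folio.
 *
 * Sketch only; the name and placement are assumptions from the
 * discussion. clear_highpage() goes through the architecture's
 * optimized clear_page(), which the benchmarks above show beating a
 * plain memset()-based folio_zero_range() on large folios.
 */
static inline void folio_zero(struct folio *folio)
{
	long i, n = folio_nr_pages(folio);

	for (i = 0; i < n; i++)
		clear_highpage(folio_page(folio, i));
	flush_dcache_folio(folio);
}

Callers such as mm/shmem.c could then use folio_zero(folio) in place of
folio_zero_range(folio, 0, folio_size(folio)).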