References: <20241017142504.1170208-1-wangkefeng.wang@huawei.com> <20241017142504.1170208-2-wangkefeng.wang@huawei.com> <789aba5c-e2dd-4b4c-bfac-8d534c7a9211@huawei.com> <198b9258-8d5d-4b13-9bc5-21f170b43940@huawei.com>
In-Reply-To: <198b9258-8d5d-4b13-9bc5-21f170b43940@huawei.com>
From: Barry Song <21cnbao@gmail.com>
Date: Tue, 22 Oct 2024 09:32:17 +1300
Subject: Re: [PATCH] mm: shmem: convert to use folio_zero_range()
To: Kefeng Wang
Cc: Matthew Wilcox, Andrew Morton, Hugh Dickins, David Hildenbrand, Baolin Wang, linux-mm@kvack.org
On Tue, Oct 22, 2024 at 4:33 AM Kefeng Wang wrote:
>
>
>
> On 2024/10/21 17:17, Barry Song wrote:
> > On Mon, Oct 21, 2024 at 9:14 PM Kefeng Wang wrote:
> >>
> >>
> >>
> >> On 2024/10/21 15:55, Barry Song wrote:
> >>> On Mon, Oct 21, 2024 at 8:47 PM Barry Song <21cnbao@gmail.com> wrote:
> >>>>
> >>>> On Mon, Oct 21, 2024 at 7:09 PM Kefeng Wang wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 2024/10/21 13:38, Barry Song wrote:
> >>>>>> On Mon, Oct 21, 2024 at 6:16 PM Kefeng Wang wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 2024/10/21 12:15, Barry Song wrote:
> >>>>>>>> On Fri, Oct 18, 2024 at 8:48 PM Kefeng Wang wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 2024/10/18 15:32, Kefeng Wang wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 2024/10/18 13:23, Barry Song wrote:
> >>>>>>>>>>> On Fri, Oct 18, 2024 at 6:20 PM Kefeng Wang
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 2024/10/17 23:09, Matthew Wilcox wrote:
> >>>>>>>>>>>>> On Thu, Oct 17, 2024 at 10:25:04PM +0800, Kefeng Wang wrote:
> >>>>>>>>>>>>>> Directly use folio_zero_range() to cleanup code.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Are you sure there's no performance regression introduced by this?
> >>>>>>>>>>>>> clear_highpage() is often optimised in ways that we can't optimise for
> >>>>>>>>>>>>> a plain memset().  On the other hand, if the folio is large, maybe a
> >>>>>>>>>>>>> modern CPU will be able to do better than clear-one-page-at-a-time.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Right, I missed this; clear_page might be better than memset. I changed
> >>>>>>>>>>>> this one when looking at shmem_writepage(), which was already converted
> >>>>>>>>>>>> from clear_highpage() to folio_zero_range(). Grepping for
> >>>>>>>>>>>> folio_zero_range() shows some other users:
> >>>>>>>>>>>>
> >>>>>>>>>>>> fs/bcachefs/fs-io-buffered.c:  folio_zero_range(folio, 0, folio_size(folio));
> >>>>>>>>>>>> fs/bcachefs/fs-io-buffered.c:  folio_zero_range(f, 0, folio_size(f));
> >>>>>>>>>>>> fs/bcachefs/fs-io-buffered.c:  folio_zero_range(f, 0, folio_size(f));
> >>>>>>>>>>>> fs/libfs.c:                    folio_zero_range(folio, 0, folio_size(folio));
> >>>>>>>>>>>> fs/ntfs3/frecord.c:            folio_zero_range(folio, 0, folio_size(folio));
> >>>>>>>>>>>> mm/page_io.c:                  folio_zero_range(folio, 0, folio_size(folio));
> >>>>>>>>>>>> mm/shmem.c:                    folio_zero_range(folio, 0, folio_size(folio));
> >>>>>>>>>>>>
> >>>>>>>>>>>>> IOW, what performance testing have you done with this patch?
> >>>>>>>>>>>>
> >>>>>>>>>>>> No performance test before, but I wrote a testcase:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1) allocate N large folios (folio_alloc(PMD_ORDER))
> >>>>>>>>>>>> 2) measure the time (us) to clear all N folios with
> >>>>>>>>>>>>    clear_highpage/folio_zero_range/folio_zero_user
> >>>>>>>>>>>> 3) release the N folios
> >>>>>>>>>>>>
> >>>>>>>>>>>> The results (run 5 times) on my machine are shown below.
> >>>>>>>>>>>>
> >>>>>>>>>>>> N=1
> >>>>>>>>>>>>       clear_highpage   folio_zero_range   folio_zero_user
> >>>>>>>>>>>> 1     69               74                 177
> >>>>>>>>>>>> 2     57               62                 168
> >>>>>>>>>>>> 3     54               58                 234
> >>>>>>>>>>>> 4     54               58                 157
> >>>>>>>>>>>> 5     56               62                 148
> >>>>>>>>>>>> avg   58               62.8               176.8
> >>>>>>>>>>>>
> >>>>>>>>>>>> N=100
> >>>>>>>>>>>>       clear_highpage   folio_zero_range   folio_zero_user
> >>>>>>>>>>>> 1     11015            11309              32833
> >>>>>>>>>>>> 2     10385            11110              49751
> >>>>>>>>>>>> 3     10369            11056              33095
> >>>>>>>>>>>> 4     10332            11017              33106
> >>>>>>>>>>>> 5     10483            11000              49032
> >>>>>>>>>>>> avg   10516.8          11098.4            39563.4
> >>>>>>>>>>>>
> >>>>>>>>>>>> N=512
> >>>>>>>>>>>>       clear_highpage   folio_zero_range   folio_zero_user
> >>>>>>>>>>>> 1     55560            60055              156876
> >>>>>>>>>>>> 2     55485            60024              157132
> >>>>>>>>>>>> 3     55474            60129              156658
> >>>>>>>>>>>> 4     55555            59867              157259
> >>>>>>>>>>>> 5     55528            59932              157108
> >>>>>>>>>>>> avg   55520.4          60001.4            157006.6
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> folio_zero_user has many cond_resched() calls, so its timing
> >>>>>>>>>>>> fluctuates a lot; clear_highpage is better than folio_zero_range,
> >>>>>>>>>>>> as you said.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Maybe add a new helper to convert all
> >>>>>>>>>>>> folio_zero_range(folio, 0, folio_size(folio)) callers
> >>>>>>>>>>>> to use clear_highpage + flush_dcache_folio?
> >>>>>>>>>>>
> >>>>>>>>>>> If this also improves performance for other existing callers of
> >>>>>>>>>>> folio_zero_range(), then that's a positive outcome.
> >>>>>>>>>>
> >>>>> ...
> >>>>>
> >>>>>>>> hi Kefeng,
> >>>>>>>> what's your point? providing a helper like clear_highfolio() or similar?
> >>>>>>>
> >>>>>>> Yes, from the above test, using clear_highpage/flush_dcache_folio is
> >>>>>>> better than using folio_zero_range() for zeroing a folio (especially a
> >>>>>>> large folio), so I'd like to add a new helper, maybe named folio_zero(),
> >>>>>>> since it zeroes the whole folio.
> >>>>>>
> >>>>>> we already have a helper like folio_zero_user()?
> >>>>>> it is not good enough?
> >>>>>
> >>>>> Since it has many cond_resched() calls, the performance is the worst...
> >>>>
> >>>> Not exactly? It should have zero cost for a preemptible kernel.
> >>>> For a non-preemptible kernel, it helps avoid clearing the folio
> >>>> from occupying the CPU and starving other processes, right?
> >>>
> >>> --- a/mm/shmem.c
> >>> +++ b/mm/shmem.c
> >>>
> >>> @@ -2393,10 +2393,7 @@ static int shmem_get_folio_gfp(struct inode
> >>> *inode, pgoff_t index,
> >>>          * it now, lest undo on failure cancel our earlier guarantee.
> >>>          */
> >>>
> >>>         if (sgp != SGP_WRITE && !folio_test_uptodate(folio)) {
> >>> -               long i, n = folio_nr_pages(folio);
> >>> -
> >>> -               for (i = 0; i < n; i++)
> >>> -                       clear_highpage(folio_page(folio, i));
> >>> +               folio_zero_user(folio, vmf->address);
> >>>                 flush_dcache_folio(folio);
> >>>                 folio_mark_uptodate(folio);
> >>>         }
> >>>
> >>> Do we perform better or worse with the following?
> >>
> >> This path is for SGP_FALLOC, where vmf = NULL, so we could use
> >> folio_zero_user(folio, 0). I think the performance is worse; I will
> >> retest once I can access the hardware.
> >
> > Perhaps, since the current code uses clear_hugepage(). Does using
> > index << PAGE_SHIFT as the addr_hint offer any benefit?
> >
>
> When using folio_zero_user(), the performance is very bad with the above
> fallocate test (mount huge=always):
>
>           folio_zero_range   clear_highpage   folio_zero_user
> real      0m1.214s           0m1.111s         0m3.159s
> user      0m0.000s           0m0.000s         0m0.000s
> sys       0m1.210s           0m1.109s         0m3.152s
>
> I tried with addr_hint = 0 and index << PAGE_SHIFT, no obvious difference.

Interesting. Does your kernel have preemption disabled or
preemption_debug enabled? If not, it makes me wonder whether
folio_zero_user() in alloc_anon_folio() is actually improving
performance as expected, compared to the simpler folio_zero()
you plan to implement. :-)