From: Kairui Song
Date: Fri, 26 Jan 2024 01:51:44 +0800
Subject: Re: [PATCH v3 1/3] mm, lru_gen: try to prefetch next page when scanning LRU
To: Chris Li
Cc: linux-mm@kvack.org, Andrew Morton, Yu Zhao, Wei Xu, Matthew Wilcox, linux-kernel@vger.kernel.org
References: <20240123184552.59758-1-ryncsn@gmail.com> <20240123184552.59758-2-ryncsn@gmail.com>

On Thu, Jan 25, 2024 at 3:33 PM Chris Li wrote:
>
> On Tue, Jan 23, 2024 at 10:46 AM Kairui Song wrote:
> >
> > From: Kairui Song
> >
> > Prefetch for inactive/active LRU have been long existing, apply the same
> > optimization for MGLRU.
> >
> > Test 1: Ramdisk fio ro test in a 4G memcg on a EPYC 7K62:
> >   fio -name=mglru --numjobs=16 --directory=/mnt --size=960m \
> >     --buffered=1 --ioengine=io_uring --iodepth=128 \
> >     --iodepth_batch_submit=32 --iodepth_batch_complete=32 \
> >     --rw=randread --random_distribution=zipf:0.5 --norandommap \
> >     --time_based --ramp_time=1m --runtime=6m --group_reporting
> >
> > Before this patch:
> > bw ( MiB/s): min= 7758, max= 9239, per=100.00%, avg=8747.59, stdev=16.51, samples=11488
> > iops : min=1986251, max=2365323, avg=2239380.87, stdev=4225.93, samples=11488
> >
> > After this patch (+7.2%):
> > bw ( MiB/s): min= 8360, max= 9771, per=100.00%, avg=9381.31, stdev=15.67, samples=11488
> > iops : min=2140296, max=2501385, avg=2401613.91, stdev=4010.41, samples=11488
> >
> > Test 2: Ramdisk fio hybrid test for 30m in a 4G memcg on a EPYC 7K62 (3 times):
> >   fio --buffered=1 --numjobs=8 --size=960m --directory=/mnt \
> >     --time_based --ramp_time=1m --runtime=30m \
> >     --ioengine=io_uring --iodepth=128 --iodepth_batch_submit=32 \
> >     --iodepth_batch_complete=32 --norandommap \
> >     --name=mglru-ro --rw=randread --random_distribution=zipf:0.7 \
> >     --name=mglru-rw --rw=randrw --random_distribution=zipf:0.7
> >
> > Before this patch:
> > READ: 6622.0 MiB/s. Stdev: 22.090722
> > WRITE: 1256.3 MiB/s. Stdev: 5.249339
> >
> > After this patch (+4.6%, +3.3%):
> > READ: 6926.6 MiB/s, Stdev: 37.950260
> > WRITE: 1297.3 MiB/s, Stdev: 7.408704
> >
> > Test 3: 30m of MySQL test in 6G memcg (12 times):
> >   echo 'set GLOBAL innodb_buffer_pool_size=16106127360;' | \
> >     mysql -u USER -h localhost --password=PASS
> >
> >   sysbench /usr/share/sysbench/oltp_read_only.lua \
> >     --mysql-user=USER --mysql-password=PASS --mysql-db=DB \
> >     --tables=48 --table-size=2000000 --threads=16 --time=1800 run
> >
> > Before this patch
> > Avg: 134743.714545 qps. Stdev: 582.242189
> >
> > After this patch (+0.2%):
> > Avg: 135005.779091 qps. Stdev: 295.299027
> >
> > Test 4: Build linux kernel in 2G memcg with make -j48 with SSD swap
> > (for memory stress, 18 times):
> >
> > Before this patch:
> > Avg: 1456.768899 s. Stdev: 20.106973
> >
> > After this patch (+0.0%):
> > Avg: 1455.659254 s. Stdev: 15.274481
> >
> > Test 5: Memtier test in a 4G cgroup using brd as swap (18 times):
> >   memcached -u nobody -m 16384 -s /tmp/memcached.socket \
> >     -a 0766 -t 16 -B binary &
> >   memtier_benchmark -S /tmp/memcached.socket \
> >     -P memcache_binary -n allkeys \
> >     --key-minimum=1 --key-maximum=16000000 -d 1024 \
> >     --ratio=1:0 --key-pattern=P:P -c 1 -t 16 --pipeline 8 -x 3
> >
> > Before this patch:
> > Avg: 50317.984000 Ops/sec. Stdev: 2568.965458
> >
> > After this patch (-5.7%):
> > Avg: 47691.343500 Ops/sec. Stdev: 3925.772473
> >
> > It seems prefetch is helpful in most cases, but the memtier test is
> > either hitting a case where prefetch causes higher cache miss or it's
> > just too noisy (high stdev).
> >
> > Signed-off-by: Kairui Song
> > ---
> >  mm/vmscan.c | 30 ++++++++++++++++++++++++++----
> >  1 file changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 4f9c854ce6cc..03631cedb3ab 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3681,15 +3681,26 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, bool can_swap)
> >  	/* prevent cold/hot inversion if force_scan is true */
> >  	for (zone = 0; zone < MAX_NR_ZONES; zone++) {
> >  		struct list_head *head = &lrugen->folios[old_gen][type][zone];
> > +		struct folio *prev = NULL;
> >
> > -		while (!list_empty(head)) {
> > -			struct folio *folio = lru_to_folio(head);
> > +		if (!list_empty(head))
> > +			prev = lru_to_folio(head);
> > +
> > +		while (prev) {
> > +			struct folio *folio = prev;
> >
> >  			VM_WARN_ON_ONCE_FOLIO(folio_test_unevictable(folio), folio);
> >  			VM_WARN_ON_ONCE_FOLIO(folio_test_active(folio), folio);
> >  			VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio);
> >  			VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio);
> >
> > +			if (unlikely(list_is_first(&folio->lru, head))) {
> > +				prev = NULL;
> > +			} else {
> > +				prev = lru_to_folio(&folio->lru);
> > +				prefetchw(&prev->flags);
> > +			}
>
> This makes the code flow much harder to follow. Also for architecture
> that does not support prefetch, this will be a net loss.
>
> Can you use prefetchw_prev_lru_folio() instead? It will make the code
> much easier to follow. It also turns into no-op when prefetch is not
> supported.
>
> Chris
>

Hi Chris,

Thanks for the suggestion. Yes, that's doable. I did it this way
because in the previous series (V1 & V2) I applied the bulk move patch
first, which needed and introduced the `prev` variable here, so the
prefetch logic just reused it. For V3 I rebased and moved the prefetch
commit to be the first one, since it seems to be the most effective
one, and kept the code style to avoid redundant changes between patches.
I can update in V4 to make this individual patch better with your suggestion.
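
For illustration only, the loop shape suggested above can be sketched in
userspace C. This is not kernel code and not the V4 patch: `struct node`,
`prefetchw_prev()` and `drain_from_tail()` are hypothetical names, and
GCC/Clang's `__builtin_prefetch` stands in for the kernel's `prefetchw()`.
The idea is that the loop keeps its plain "while the list is not empty"
form, and the prefetch of the previous entry is a single helper call
that can be dropped without changing behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio on a circular doubly linked LRU list. */
struct node {
	struct node *prev, *next;
	unsigned long flags;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_head(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/*
 * Prefetch-for-write the entry before n, unless n is the first entry.
 * __builtin_prefetch is only a hint, so correctness never depends on it.
 */
static inline void prefetchw_prev(struct node *n, struct node *head)
{
	if (n->prev != head)
		__builtin_prefetch(&n->prev->flags, 1);
}

/* Drain the list from the tail, setting a flag bit on each entry. */
static unsigned int drain_from_tail(struct node *head, unsigned long bit)
{
	unsigned int moved = 0;

	assert(head != NULL);
	while (head->prev != head) {
		struct node *n = head->prev;	/* current tail entry */

		prefetchw_prev(n, head);	/* warm the next entry to visit */
		n->flags |= bit;
		list_del(n);
		moved++;
	}
	return moved;
}
```

The guard inside the helper plays the role of the `list_is_first()` check
in the quoted diff, so the loop body itself needs no extra branch or
`prev` bookkeeping.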