From mboxrd@z Thu Jan 1 00:00:00 1970
References: <1719038884-1903-1-git-send-email-yangge1116@126.com> <0f9f7a2e-23c3-43fe-b5c1-dab3a7b31c2d@126.com> <00a27e2b-0fc2-4980-bc4e-b383f15d3ad9@126.com>
From: Kairui Song
Date: Sun, 4 Aug 2024 20:21:42 +0800
Subject: Re: [PATCH V2] mm/gup: Clear the LRU flag of a page before adding to LRU batch
To: Ge Yang, Yu Zhao, Chris Li
Cc: Andrew Morton, linux-mm, LKML, stable@vger.kernel.org, Barry Song <21cnbao@gmail.com>, David Hildenbrand, baolin.wang@linux.alibaba.com, liuzixing@hygon.cn, Hugh Dickins

On Sun, Aug 4, 2024 at 4:03 AM Kairui Song wrote:
>
> On Sun, Aug 4, 2024 at 1:09 AM Yu Zhao wrote:
> > On Sat, Aug 3, 2024 at 2:31 AM Ge Yang wrote:
> > > On 2024/8/3 4:18, Chris Li wrote:
> > > > On Thu, Aug 1, 2024 at 6:56 PM Ge Yang wrote:
> > > >>
> > > >>>> I can't
> > > >>>> reproduce this problem, using tmpfs to compile linux.
> > > >>>> Seems you limit the memory size used to compile linux, which leads to
> > > >>>> OOM. May I ask why the memory size is limited to 481280kB? Do I also
> > > >>>> need to limit the memory size to 481280kB to test?
> > > >>>
> > > >>> Yes, you need to limit the cgroup memory size to force the swap
> > > >>> action. I am using memory.max = 470M.
> > > >>>
> > > >>> I believe other values e.g. 800M can trigger it as well. The reason to
> > > >>> limit the memory is to cause the swap action.
> > > >>> The goal is to intentionally overwhelm the memory load and let the
> > > >>> swap system do its job. The 470M is chosen to cause a lot of swap
> > > >>> action but not too high to cause OOM kills in normal kernels.
> > > >>> In other words, high enough swap pressure but not too high to bust
> > > >>> into OOM kill. e.g. I verify that, with your patch reverted, the
> > > >>> mm-stable kernel can sustain this level of swap pressure (470M)
> > > >>> without OOM kill.
> > > >>>
> > > >>> I borrowed the 470M magic value from Hugh and verified it works with
> > > >>> my test system. Hugh has a similar swap test setup which is more
> > > >>> complicated than mine. It is the inspiration of my swap stress test
> > > >>> setup.
> > > >>>
> > > >>> FYI, I am using "make -j32" on a machine with 12 cores (24
> > > >>> hyperthreading). My typical swap usage is about 3-5G. I set my
> > > >>> swapfile size to about 20G.
> > > >>> I am using zram or ssd as the swap backend. Hope that helps you
> > > >>> reproduce the problem.
> > > >>>
> > > >> Hi Chris,
> > > >>
> > > >> I try to construct the experiment according to your suggestions above.
> > > >
> > > > Hi Ge,
> > > >
> > > > Sorry to hear that you were not able to reproduce it.
> > > >
> > > >> High swap pressure can be triggered, but OOM can't be reproduced.
> > > >> The specific steps are as follows:
> > > >> root@ubuntu-server-2204:/home/yangge# cp workspace/linux/ /dev/shm/ -rf
> > > >
> > > > I use a slightly different way to setup the tmpfs:
> > > >
> > > > Here is section of my script:
> > > >
> > > >         if ! [ -d $tmpdir ]; then
> > > >                 sudo mkdir -p $tmpdir
> > > >                 sudo mount -t tmpfs -o size=100% nodev $tmpdir
> > > >         fi
> > > >
> > > >         sudo mkdir -p $cgroup
> > > >         sudo sh -c "echo $mem > $cgroup/memory.max" || echo setup memory.max error
> > > >         sudo sh -c "echo 1 > $cgroup/memory.oom.group" || echo setup oom.group error
> > > >
> > > > Per run:
> > > >
> > > >         # $workdir is under $tmpdir
> > > >         sudo rm -rf $workdir
> > > >         mkdir -p $workdir
> > > >         cd $workdir
> > > >         echo "Extracting linux tree"
> > > >         XZ_OPT='-T0 -9 --memory=75%' tar xJf $linux_src || die "xz extract failed"
> > > >
> > > >         sudo sh -c "echo $BASHPID > $cgroup/cgroup.procs"
> > > >         echo "Cleaning linux tree, setup defconfig"
> > > >         cd $workdir/linux
> > > >         make -j$NR_TASK clean
> > > >         make defconfig > /dev/null
> > > >         echo Kernel compile run $i
> > > >         /usr/bin/time -a -o $log make --silent -j$NR_TASK || die "make failed"
> > > >
> > >
> > > Thanks.
> > >
> > > >> root@ubuntu-server-2204:/home/yangge# sync
> > > >> root@ubuntu-server-2204:/home/yangge# echo 3 > /proc/sys/vm/drop_caches
> > > >> root@ubuntu-server-2204:/home/yangge# cd /sys/fs/cgroup/
> > > >> root@ubuntu-server-2204:/sys/fs/cgroup/# mkdir kernel-build
> > > >> root@ubuntu-server-2204:/sys/fs/cgroup/# cd kernel-build
> > > >> root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# echo 470M > memory.max
> > > >> root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# echo $$ > cgroup.procs
> > > >> root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# cd /dev/shm/linux/
> > > >> root@ubuntu-server-2204:/dev/shm/linux# make clean && make -j24
> > > >
> > > > I am using make -j 32.
> > > >
> > > > Your step should work.
> > > >
> > > > Did you enable MGLRU in your .config file? Mine did. I attached my
> > > > config file here.
> > > >
> > >
> > > The above test didn't enable MGLRU.
> > >
> > > When MGLRU is enabled, I can reproduce OOM very soon. The cause of
> > > triggering OOM is being analyzed.
>
> Hi Ge,
>
> Just in case, maybe you can try to revert your patch and run the test
> again? I'm also seeing OOM with MGLRU with this test, Active/Inactive
> LRU is fine. But after reverting your patch, the OOM issue still
> exists.
>
> > I think this is one of the potential side effects -- Hugh mentioned
> > earlier about isolate_lru_folios():
> > https://lore.kernel.org/linux-mm/503f0df7-91e8-07c1-c4a6-124cad9e65e7@google.com/
> >
> > Try this:
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index cfa839284b92..778bf5b7ef97 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -4320,7 +4320,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
> >         }
> >
> >         /* ineligible */
> > -       if (zone > sc->reclaim_idx || skip_cma(folio, sc)) {
> > +       if (!folio_test_lru(folio) || zone > sc->reclaim_idx || skip_cma(folio, sc)) {
> >                 gen = folio_inc_gen(lruvec, folio, false);
> >                 list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
> >                 return true;
>
> Hi Yu, I tested your patch, on my system, the OOM still exists (96
> core and 256G RAM), test memcg is limited to 512M and 32 thread ().
>
> And I found the OOM seems irrelevant to either your patch or Ge's
> patch.
> (it may have changed the OOM chance slightly though)
>
> After the very quick OOM (it failed to untar the linux source code),
> checking lru_gen_full:
> memcg    47 /build-kernel-tmpfs
>  node     0
>         442       1691      29405           0
>             0          0r          0e          0p         57r        617e          0p
>             1          0r          0e          0p          0r          4e          0p
>             2          0r          0e          0p          0r          0e          0p
>             3          0r          0e          0p          0r          0e          0p
>                         0           0           0           0           0           0
>         443       1683      57748         832
>             0          0           0           0           0           0           0
>             1          0           0           0           0           0           0
>             2          0           0           0           0           0           0
>             3          0           0           0           0           0           0
>                         0           0           0           0           0           0
>         444       1670      30207         133
>             0          0           0           0           0           0           0
>             1          0           0           0           0           0           0
>             2          0           0           0           0           0           0
>             3          0           0           0           0           0           0
>                         0           0           0           0           0           0
>         445       1662          0           0
>             0          0R         34T          0          57R        238T          0
>             1          0R          0T          0           0R          0T          0
>             2          0R          0T          0           0R          0T          0
>             3          0R          0T          0           0R         81T          0
>                     13807L        324O        867Y       2538N         63F         18A
>
> If I repeat the test many times, it may succeed by chance, but the
> untar process is very slow and generates about 7000 generations.
>
> But if I change the untar cmdline to:
> python -c "import sys; sys.stdout.buffer.write(open('$linux_src', mode='rb').read())" | tar zx
>
> Then the problem is gone, it can untar the file successfully and very fast.
>
> This might be a different issue from the one reported by Chris, I'm not sure.

After more testing, I think these are two separate problems (note that I
later changed the memcg limit to 600M so the compile test can run
smoothly):

1. OOM during the untar process (can be worked around with the untar
   cmdline I mentioned above).
2. OOM during the compile process (this should be the one Chris
   encountered).

Both 1 and 2 only exist with MGLRU. 2 is caused by Ge's patch, and 1 is
not.

I can confirm that Yu's patch fixes 2 on my system, but 1 still seems to
be a problem. It is not related to this patch, so maybe it can be
discussed elsewhere.
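
For anyone who wants to script the untar workaround instead of pasting
the one-liner, below is a rough Python sketch of the same idea: read the
whole archive with a single read, then feed it to tar over a pipe. The
function name untar_via_pipe and the explicit "-f -" / "-C" options are
illustrative additions, not part of the cmdline above, and it assumes a
gzip tarball as implied by "tar zx". It has not been run against the OOM
case, so treat it as a starting point only.

```python
import subprocess
from pathlib import Path


def untar_via_pipe(archive: str, dest: str) -> None:
    """Read the whole archive first, then feed it to tar on stdin.

    Hypothetical helper mirroring the python | tar one-liner.
    """
    # One big read of the archive, as open(..., mode='rb').read() does.
    data = Path(archive).read_bytes()
    Path(dest).mkdir(parents=True, exist_ok=True)
    # "-f -" tells tar to read the archive from stdin explicitly;
    # "-z" assumes gzip compression; "-C" extracts into dest.
    subprocess.run(["tar", "-xzf", "-", "-C", dest], input=data, check=True)
```

Usage would be something like untar_via_pipe(linux_src, workdir), run
from inside the test memcg so the read and the extraction are both
charged to it.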