From: Kairui Song
Date: Sun, 4 Aug 2024 04:03:36 +0800
Subject: Re: [PATCH V2] mm/gup: Clear the LRU flag of a page before adding to LRU batch
To: Ge Yang, Chris Li, Yu Zhao
Cc: Andrew Morton, linux-mm, LKML, stable@vger.kernel.org, Barry Song <21cnbao@gmail.com>, David Hildenbrand, baolin.wang@linux.alibaba.com, liuzixing@hygon.cn, Hugh Dickins
References: <1719038884-1903-1-git-send-email-yangge1116@126.com> <0f9f7a2e-23c3-43fe-b5c1-dab3a7b31c2d@126.com> <00a27e2b-0fc2-4980-bc4e-b383f15d3ad9@126.com>

On Sun, Aug 4, 2024 at 1:09 AM Yu Zhao wrote:
> On Sat, Aug 3, 2024 at 2:31 AM Ge Yang wrote:
> > On 2024/8/3 4:18, Chris Li wrote:
> > > On Thu, Aug 1, 2024 at 6:56 PM Ge Yang wrote:
> > >>
> > >>
> > >>
> > >>>> I can't reproduce this problem, using tmpfs to compile linux.
> > >>>> Seems you limit the memory size used to compile linux, which leads to
> > >>>> OOM. May I ask why the memory size is limited to 481280kB? Do I also
> > >>>> need to limit the memory size to 481280kB to test?
> > >>>
> > >>> Yes, you need to limit the cgroup memory size to force the swap
> > >>> action. I am using memory.max = 470M.
> > >>>
> > >>> I believe other values, e.g. 800M, can trigger it as well. The reason to
> > >>> limit the memory is to cause the swap action.
> > >>> The goal is to intentionally overwhelm the memory load and let the
> > >>> swap system do its job. The 470M is chosen to cause a lot of swap
> > >>> action but not so much that it causes OOM kills in normal kernels.
> > >>> In other words, high enough swap pressure but not so high as to tip
> > >>> into OOM kill. E.g. I verified that, with your patch reverted, the
> > >>> mm-stable kernel can sustain this level of swap pressure (470M)
> > >>> without OOM kill.
> > >>>
> > >>> I borrowed the 470M magic value from Hugh and verified it works with
> > >>> my test system. Hugh has a similar swap test setup which is more
> > >>> complicated than mine. It is the inspiration for my swap stress test
> > >>> setup.
> > >>>
> > >>> FYI, I am using "make -j32" on a machine with 12 cores (24
> > >>> hyperthreads). My typical swap usage is about 3-5G. I set my
> > >>> swapfile size to about 20G.
> > >>> I am using zram or ssd as the swap backend. Hope that helps you
> > >>> reproduce the problem.
> > >>>
> > >> Hi Chris,
> > >>
> > >> I tried to construct the experiment according to your suggestions above.
> > >
> > > Hi Ge,
> > >
> > > Sorry to hear that you were not able to reproduce it.
> > >
> > >> High swap pressure can be triggered, but OOM can't be reproduced. The
> > >> specific steps are as follows:
> > >> root@ubuntu-server-2204:/home/yangge# cp workspace/linux/ /dev/shm/ -rf
> > >
> > > I use a slightly different way to set up the tmpfs:
> > >
> > > Here is a section of my script:
> > >
> > >         if ! [ -d $tmpdir ]; then
> > >                 sudo mkdir -p $tmpdir
> > >                 sudo mount -t tmpfs -o size=100% nodev $tmpdir
> > >         fi
> > >
> > >         sudo mkdir -p $cgroup
> > >         sudo sh -c "echo $mem > $cgroup/memory.max" || echo setup memory.max error
> > >         sudo sh -c "echo 1 > $cgroup/memory.oom.group" || echo setup oom.group error
> > >
> > > Per run:
> > >
> > >         # $workdir is under $tmpdir
> > >         sudo rm -rf $workdir
> > >         mkdir -p $workdir
> > >         cd $workdir
> > >         echo "Extracting linux tree"
> > >         XZ_OPT='-T0 -9 --memory=75%' tar xJf $linux_src || die "xz extract failed"
> > >
> > >         sudo sh -c "echo $BASHPID > $cgroup/cgroup.procs"
> > >         echo "Cleaning linux tree, setup defconfig"
> > >         cd $workdir/linux
> > >         make -j$NR_TASK clean
> > >         make defconfig > /dev/null
> > >         echo Kernel compile run $i
> > >         /usr/bin/time -a -o $log make --silent -j$NR_TASK || die "make failed"
> > >
> >
> > Thanks.
> >
> > >> root@ubuntu-server-2204:/home/yangge# sync
> > >> root@ubuntu-server-2204:/home/yangge# echo 3 > /proc/sys/vm/drop_caches
> > >> root@ubuntu-server-2204:/home/yangge# cd /sys/fs/cgroup/
> > >> root@ubuntu-server-2204:/sys/fs/cgroup/# mkdir kernel-build
> > >> root@ubuntu-server-2204:/sys/fs/cgroup/# cd kernel-build
> > >> root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# echo 470M > memory.max
> > >> root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# echo $$ > cgroup.procs
> > >> root@ubuntu-server-2204:/sys/fs/cgroup/kernel-build# cd /dev/shm/linux/
> > >> root@ubuntu-server-2204:/dev/shm/linux# make clean && make -j24
> > >
> > > I am using make -j 32.
> > >
> > > Your steps should work.
> > >
> > > Did you enable MGLRU in your .config file? Mine did. I attached my
> > > config file here.
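For reference, the 481280kB limit Ge asks about and the 470M written to
memory.max appear to be the same value, assuming both figures use binary
units (1M = 1024 kB). A minimal Python sketch of that arithmetic follows;
parse_size is a hypothetical helper written for this note, not part of any
kernel or cgroup tooling.

```python
def parse_size(text: str) -> int:
    """Convert a size such as '470M' or '481280kB' to bytes.

    Assumption: suffixes are binary multiples (k = 2**10, m = 2**20, g = 2**30).
    """
    units = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}
    s = text.strip().lower()
    if s.endswith("b"):
        # Accept 'kB'-style spellings by dropping the trailing 'b'.
        s = s[:-1]
    if s and s[-1] in units:
        return int(s[:-1]) * units[s[-1]]
    # No suffix: treat the value as a plain byte count.
    return int(s)


if __name__ == "__main__":
    # Compare the limit from the OOM report with the memory.max setting.
    print(parse_size("470M") == parse_size("481280kB"))
```

So limiting the cgroup to 470M should be enough to match the limit seen in
the report; no separate 481280kB setting is needed under that assumption.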
> > >
> > >
> > The above test didn't enable MGLRU.
> >
> > When MGLRU is enabled, I can reproduce OOM very quickly. The cause of
> > triggering OOM is being analyzed.

Hi Ge,

Just in case, maybe you can try to revert your patch and run the test
again? I'm also seeing OOM with MGLRU with this test; Active/Inactive
LRU is fine. But after reverting your patch, the OOM issue still
exists.

> I think this is one of the potential side effects -- Hugh mentioned
> earlier about isolate_lru_folios():
> https://lore.kernel.org/linux-mm/503f0df7-91e8-07c1-c4a6-124cad9e65e7@google.com/
>
> Try this:
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index cfa839284b92..778bf5b7ef97 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4320,7 +4320,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
>         }
>
>         /* ineligible */
> -       if (zone > sc->reclaim_idx || skip_cma(folio, sc)) {
> +       if (!folio_test_lru(folio) || zone > sc->reclaim_idx || skip_cma(folio, sc)) {
>                 gen = folio_inc_gen(lruvec, folio, false);
>                 list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
>                 return true;

Hi Yu,

I tested your patch. On my system (96 cores and 256G RAM), the OOM
still exists; the test memcg is limited to 512M and the build uses 32
threads.

And I found the OOM seems unrelated to either your patch or Ge's patch.
(It may have changed the OOM chance slightly, though.)

After the very quick OOM (it failed to untar the linux source code),
checking lru_gen_full:

memcg    47 /build-kernel-tmpfs
 node     0
        442       1691      29405          0
            0          0r          0e          0p         57r        617e          0p
            1          0r          0e          0p          0r          4e          0p
            2          0r          0e          0p          0r          0e          0p
            3          0r          0e          0p          0r          0e          0p
                        0           0           0           0           0           0
        443       1683      57748        832
            0           0           0           0           0           0           0
            1           0           0           0           0           0           0
            2           0           0           0           0           0           0
            3           0           0           0           0           0           0
                        0           0           0           0           0           0
        444       1670      30207        133
            0           0           0           0           0           0           0
            1           0           0           0           0           0           0
            2           0           0           0           0           0           0
            3           0           0           0           0           0           0
                        0           0           0           0           0           0
        445       1662          0          0
            0          0R         34T           0         57R        238T           0
            1          0R          0T           0          0R          0T           0
            2          0R          0T           0          0R          0T           0
            3          0R          0T           0          0R         81T           0
                   13807L        324O        867Y       2538N         63F         18A

If I repeat the test many times, it may succeed by chance, but the
untar process is very slow and generates about 7000 generations.

But if I change the untar cmdline to:

python -c "import sys; sys.stdout.buffer.write(open('$linux_src', mode='rb').read())" | tar zx

Then the problem is gone; it can untar the file successfully and very
fast.

This might be a different issue from the one reported by Chris; I'm not
sure.
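
To make the lru_gen_full dump above easier to compare between runs, here
is a rough Python sketch that pulls out the per-generation page counts. It
assumes the generation header rows are the ones with exactly four integer
fields (sequence number, age in ms, anon pages, file pages), as in the
output quoted above. SAMPLE is a trimmed copy of that output, and
parse_generations is a hypothetical helper, not an existing tool.

```python
# Trimmed copy of the dump from this mail; most tier rows are omitted.
SAMPLE = """\
memcg    47 /build-kernel-tmpfs
 node     0
        442       1691      29405          0
            0          0r          0e          0p         57r        617e          0p
                        0           0           0           0           0           0
        443       1683      57748        832
        444       1670      30207        133
        445       1662          0          0
"""


def parse_generations(text):
    """Return one dict per generation header row found in the dump."""
    gens = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: only generation rows have exactly four integer fields.
        # Tier rows carry seven fields and the per-generation stats row six.
        if len(fields) == 4 and all(f.isdigit() for f in fields):
            seq, age_ms, nr_anon, nr_file = map(int, fields)
            gens.append(
                {"seq": seq, "age_ms": age_ms, "anon": nr_anon, "file": nr_file}
            )
    return gens


if __name__ == "__main__":
    gens = parse_generations(SAMPLE)
    for g in gens:
        print(g["seq"], g["age_ms"], g["anon"], g["file"])
    print("total anon pages:", sum(g["anon"] for g in gens))
    print("total file pages:", sum(g["file"] for g in gens))
```

On this sample, the anon lists hold far more pages than the file lists,
and the youngest generation (445) is empty on both.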