From: Yu Zhao
Date: Thu, 5 Dec 2024 12:17:07 -0700
Subject: Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
To: liuye, Hugh Dickins
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240815025226.8973-1-liuye@kylinos.cn> <20240823020443.7379-1-liuye@kylinos.cn>

On Thu, Dec 5, 2024 at 8:19 AM liuye wrote:
>
>
> Friendly ping.
>
> Thanks.

Hugh has responded on your "v2 RESEND":
https://lore.kernel.org/linux-mm/dae8ea77-2bc1-8ee9-b94b-207e2c8e1b8d@google.com/

> On 2024/9/6 9:16 AM, liuye wrote:
> >
> >
> > On 2024/8/23 10:04 AM, liuye wrote:
> >> I'm sorry to bother you about that, but it looks like the following email, sent 7 days ago,
> >> did not receive a response from you.
> >> Do you mind having a look at this when you have a bit of free time, please?
> >>
> >>>>> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
> >>>>
> >>>> Merged in 2016.
> >>>>
> >>>> Under what circumstances does it occur?
> >>>
> >>> User processes are requesting a large amount of memory and keeping pages active.
> >>> Then a module continuously requests memory from the ZONE_DMA32 area.
> >>> Memory reclaim is triggered because the ZONE_DMA32 watermark is reached.
> >>> However, pages in the LRU(active_anon) list are mostly from
> >>> the ZONE_NORMAL area.
> >>>
> >>>> Can you please describe how to reproduce this?
> >>>
> >>> Terminal 1: Continuously increase the number of active(anon) pages.
> >>> mkdir /tmp/memory
> >>> mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
> >>> dd if=/dev/zero of=/tmp/memory/block bs=4M
> >>> tail /tmp/memory/block
> >>>
> >>> Terminal 2:
> >>> vmstat -a 1
> >>> The "active" column will increase.
> >>> procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
> >>>  r  b swpd       free    inact    active si so bi bo   in   cs us sy  id wa st gu
> >>>  1  0    0 1445623076 45898836  83646008  0  0  0  0 1807 1682  0  0 100  0  0  0
> >>>  1  0    0 1445623076 43450228  86094616  0  0  0  0 1677 1468  0  0 100  0  0  0
> >>>  1  0    0 1445623076 41003480  88541364  0  0  0  0 1985 2022  0  0 100  0  0  0
> >>>  1  0    0 1445623076 38557088  90987756  0  0  0  4 1731 1544  0  0 100  0  0  0
> >>>  1  0    0 1445623076 36109688  93435156  0  0  0  0 1755 1501  0  0 100  0  0  0
> >>>  1  0    0 1445619552 33663256  95881632  0  0  0  0 2015 1678  0  0 100  0  0  0
> >>>  1  0    0 1445619804 31217140  98327792  0  0  0  0 2058 2212  0  0 100  0  0  0
> >>>  1  0    0 1445619804 28769988 100774944  0  0  0  0 1729 1585  0  0 100  0  0  0
> >>>  1  0    0 1445619804 26322348 103222584  0  0  0  0 1774 1575  0  0 100  0  0  0
> >>>  1  0    0 1445619804 23875592 105669340  0  0  0  4 1738 1604  0  0 100  0  0  0
> >>>
> >>> cat /proc/meminfo | head
> >>> Active(anon) increases.
> >>> MemTotal:       1579941036 kB
> >>> MemFree:        1445618500 kB
> >>> MemAvailable:   1453013224 kB
> >>> Buffers:              6516 kB
> >>> Cached:          128653956 kB
> >>> SwapCached:              0 kB
> >>> Active:          118110812 kB
> >>> Inactive:         11436620 kB
> >>> Active(anon):    115345744 kB
> >>> Inactive(anon):     945292 kB
> >>>
> >>> When Active(anon) is 115345744 kB, insmod of the module triggers the ZONE_DMA32 watermark.
> >>>
> >>> perf shows nr_scanned=28835844.
> >>> 28835844 * 4 kB = 115343376 kB, approximately equal to 115345744 kB.
> >>>
> >>> perf record -e vmscan:mm_vmscan_lru_isolate -aR
> >>> perf script
> >>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2 nr_skipped=2 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0 nr_skipped=0 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844 nr_skipped=28835844 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844 nr_skipped=28835844 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29 nr_skipped=29 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0 nr_skipped=0 nr_taken=0 lru=active_anon
> >>>
> >>> If Active(anon) is increased to 1000G and insmod of the module then triggers the ZONE_DMA32 watermark, a hard lockup will occur.
> >>>
> >>> On my device, nr_scanned = 0x0000000003e3e937 at the time of the hard lockup. Converted to memory size: 0x0000000003e3e937 * 4 kB = 261072092 kB.
> >>>
> >>> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> >>>     ffffc90006fb7c30: 0000000000000020 0000000000000000
> >>>     ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
> >>>     ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
> >>>     ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
> >>>     ffffc90006fb7c70: 0000000000000000 0000000000000000
> >>>     ffffc90006fb7c80: 0000000000000000 0000000000000000
> >>>     ffffc90006fb7c90: 0000000000000000 0000000000000000
> >>>     ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
> >>>     ffffc90006fb7cb0: 0000000000000000 0000000000000000
> >>>     ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
> >>>
> >>>> Why do you think it took eight years to be discovered?
> >>>
> >>> The problem requires the following conditions to occur:
> >>> 1. The device memory should be large enough.
> >>> 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
> >>> 3. The memory in ZONE_DMA32 needs to reach the watermark.
> >>>
> >>> If the memory is not large enough, or if the usage design of ZONE_DMA32 area memory is reasonable, this problem is difficult to detect.
> >>>
> >>> notes:
> >>> The problem is most likely to occur in ZONE_DMA32 and ZONE_NORMAL, but other suitable scenarios may also trigger the problem.
> >>>
> >>>> It looks like that will fix, but perhaps something more fundamental
> >>>> needs to be done - we're doing a tremendous amount of pretty pointless
> >>>> work here. Answers to my above questions will help us resolve this.
> >>>>
> >>>> Thanks.
> >>>
> >>> Please refer to the above explanation for details.
> >>>
> >>> Thanks.
> >>
> >> Thanks.
> >>
> > Friendly ping.
> >
>
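
For anyone following the quoted trace, here is a minimal sketch of why
nr_scanned equals nr_skipped while nr_taken stays at 0, plus a check of the
page-count arithmetic. It is a toy model, not the kernel's code: the function
names, the zone index values (taken to match the usual x86_64 layout, where
classzone=1 would be ZONE_DMA32) and the 4 kB page size are all assumptions
made for illustration, and every eligible folio is treated as isolated.

```python
# Toy model only: names and structure are illustrative, not the kernel's code.
ZONE_DMA, ZONE_DMA32, ZONE_NORMAL = 0, 1, 2  # assumed x86_64 zone indices
PAGE_KB = 4  # assumed 4 kB base pages, as in the quoted arithmetic


def pages_to_kb(nr_pages):
    """Convert a page count to kB under the 4 kB page assumption."""
    return nr_pages * PAGE_KB


def isolate(lru, reclaim_idx, nr_to_scan):
    """Walk a list of per-folio zone indices.

    Folios above reclaim_idx are skipped and do not count toward
    nr_to_scan, so a list made only of such folios is walked in full.
    Returns (nr_scanned, nr_skipped, nr_taken).
    """
    nr_scanned = nr_skipped = nr_taken = 0
    for zone in lru:
        if nr_taken >= nr_to_scan:
            break  # enough eligible folios were found
        nr_scanned += 1
        if zone > reclaim_idx:
            nr_skipped += 1  # ineligible zone: skipped, budget untouched
            continue
        nr_taken += 1
    return nr_scanned, nr_skipped, nr_taken


if __name__ == "__main__":
    # An LRU holding only ZONE_NORMAL folios, reclaim limited to ZONE_DMA32.
    print(isolate([ZONE_NORMAL] * 1000, ZONE_DMA32, 32))  # (1000, 1000, 0)
    print(pages_to_kb(28835844))   # 115343376
    print(pages_to_kb(0x3E3E937))  # 261072092
```

In this model a request for 32 folios walks all 1000 entries, which mirrors
the shape of the quoted mm_vmscan_lru_isolate lines; with tens of millions of
entries the same walk would take correspondingly longer.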