From: Yu Zhao
Date: Tue, 21 Nov 2023 23:13:35 -0700
Subject: Re: high kswapd CPU usage with symmetrical swap in/out pattern with multi-gen LRU
To: Jaroslav Pulchart
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, Igor Raits, Daniel Secik, Charan Teja Kalla, Kalesh Singh
On Mon, Nov 20, 2023 at 1:42 AM Jaroslav Pulchart wrote:
>
> > On Tue, Nov 14, 2023 at 12:30 AM Jaroslav Pulchart
> > wrote:
> > >
> > > > On Mon, Nov 13, 2023 at 1:36 AM Jaroslav Pulchart
> > > > wrote:
> > > > >
> > > > > > On Thu, Nov 9, 2023 at 3:58 AM Jaroslav Pulchart
> > > > > > wrote:
> > > > > > >
> > > > > > > > On Wed, Nov 8, 2023 at 10:39 PM Jaroslav Pulchart
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > On Wed, Nov 8, 2023 at 12:04 PM Jaroslav Pulchart
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Jaroslav,
> > > > > > > > > > >
> > > > > > > > > > > Hi Yu Zhao
> > > > > > > > > > >
> > > > > > > > > > > thanks for the response, see answers inline:
> > > > > > > > > > >
> > > > > > > > > > > > On Wed, Nov 8, 2023 at 6:35 AM Jaroslav Pulchart
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hello,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I would like to report to you an unpleasant behavior of multi-gen LRU
> > > > > > > > > > > > > with strange swap in/out usage on my Dell 7525 two-socket AMD 74F3
> > > > > > > > > > > > > system (16 NUMA domains).
> > > > > > > > > > > >
> > > > > > > > > > > > Kernel version please?
> > > > > > > > > > >
> > > > > > > > > > > 6.5.y, but we saw it sooner, as it has been under investigation since 23rd May
> > > > > > > > > > > (6.4.y and maybe even 6.3.y).
> > > > > > > > > >
> > > > > > > > > > v6.6 has a few critical fixes for MGLRU; I can backport them to v6.5
> > > > > > > > > > for you if you run into other problems with v6.6.
> > > > > > > > >
> > > > > > > > > I will give it a try using 6.6.y. If it works, we can switch to
> > > > > > > > > 6.6.y instead of backporting the fixes to 6.5.y.
> > > > > > > > >
> > > > > > > > > > > > > Symptoms of my issue are
> > > > > > > > > > > > >
> > > > > > > > > > > > > /A/ if multi-gen LRU is enabled
> > > > > > > > > > > > > 1/ [kswapd3] is consuming 100% CPU
> > > > > > > > > > > >
> > > > > > > > > > > > Just thinking out loud: kswapd3 means the fourth node was under memory pressure.
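The "thinking out loud" remark above relies on the kernel's naming convention: each NUMA node N gets its own reclaim thread named kswapdN, so kswapd3 serves node 3, the fourth node. A minimal sketch (not from the thread; the helper name is made up for illustration) of mapping a thread name back to its node id:

```python
import re

def kswapd_node(comm: str) -> int:
    """Map a kswapd thread name (e.g. 'kswapd3') to its NUMA node id.

    Nodes are numbered from 0, so kswapd3 reclaims for the fourth node.
    """
    m = re.fullmatch(r"kswapd(\d+)", comm)
    if m is None:
        raise ValueError(f"not a kswapd thread name: {comm!r}")
    return int(m.group(1))

print(kswapd_node("kswapd3"))   # -> 3
print(kswapd_node("kswapd15"))  # -> 15
```

On a live box the same mapping is what makes `ps ax | grep [k]swapd` useful for spotting which node is under pressure.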
> > > > > > > > > > > > >
> > > > > > > > > > > > > top - 15:03:11 up 34 days,  1:51,  2 users,  load average: 23.34, 18.26, 15.01
> > > > > > > > > > > > > Tasks: 1226 total,   2 running, 1224 sleeping,   0 stopped,   0 zombie
> > > > > > > > > > > > > %Cpu(s): 12.5 us,  4.7 sy,  0.0 ni, 82.1 id,  0.0 wa,  0.4 hi,  0.4 si,  0.0 st
> > > > > > > > > > > > > MiB Mem : 1047265.+total,  28382.7 free, 1021308.+used,    767.6 buff/cache
> > > > > > > > > > > > > MiB Swap:   8192.0 total,   8187.7 free,      4.2 used.  25956.7 avail Mem
> > > > > > > > > > > > > ...
> > > > > > > > > > > > >     765 root      20   0       0      0      0 R  98.3   0.0  34969:04 kswapd3
> > > > > > > > > > > > > ...
> > > > > > > > > > > > > 2/ swap space usage is low, about ~4MB of the 8GB swap in zram (it was
> > > > > > > > > > > > > observed with a swap disk as well, causing IO latency issues due to
> > > > > > > > > > > > > some kind of locking)
> > > > > > > > > > > > > 3/ swap in/out is huge and symmetrical, ~12MB/s in and ~12MB/s out
> > > > > > > > > > > > >
> > > > > > > > > > > > > /B/ if multi-gen LRU is disabled
> > > > > > > > > > > > > 1/ [kswapd3] is consuming 3%-10% CPU
> > > > > > > > > > > > > top - 15:02:49 up 34 days,  1:51,  2 users,  load average: 23.05, 17.77, 14.77
> > > > > > > > > > > > > Tasks: 1226 total,   1 running, 1225 sleeping,   0 stopped,   0 zombie
> > > > > > > > > > > > > %Cpu(s): 14.7 us,  2.8 sy,  0.0 ni, 81.8 id,  0.0 wa,  0.4 hi,  0.4 si,  0.0 st
> > > > > > > > > > > > > MiB Mem : 1047265.+total,  28378.5 free, 1021313.+used,    767.3 buff/cache
> > > > > > > > > > > > > MiB Swap:   8192.0 total,   8189.0 free,      3.0 used.  25952.4 avail Mem
> > > > > > > > > > > > > ...
> > > > > > > > > > > > >     765 root      20   0       0      0      0 S   3.6   0.0  34966:46 [kswapd3]
> > > > > > > > > > > > > ...
> > > > > > > > > > > > > 2/ swap space usage is low (4MB)
> > > > > > > > > > > > > 3/ swap in/out is huge and symmetrical, ~500kB/s in and ~500kB/s out
> > > > > > > > > > > > >
> > > > > > > > > > > > > Both situations are wrong as they are using swap in/out extensively;
> > > > > > > > > > > > > however, the multi-gen LRU situation is 10 times worse.
> > > > > > > > > > > >
> > > > > > > > > > > > From the stats below, node 3 had the lowest free memory. So I think in
> > > > > > > > > > > > both cases, the reclaim activities were as expected.
> > > > > > > > > > >
> > > > > > > > > > > I do not see a reason for the memory pressure and reclaims. This node
> > > > > > > > > > > has the lowest free memory of all nodes (~302MB free), that is true;
> > > > > > > > > > > however, the swap space usage is just 4MB (still going in and out). So
> > > > > > > > > > > what can be the reason for that behaviour?
> > > > > > > > > >
> > > > > > > > > > The best analogy is that refueling (reclaim) happens before the tank
> > > > > > > > > > becomes empty, and it happens even sooner when there is a long road
> > > > > > > > > > ahead (high-order allocations).
> > > > > > > > > >
> > > > > > > > > > > The workers/application is running in pre-allocated HugePages and the
> > > > > > > > > > > rest is used for a small set of system services and drivers of
> > > > > > > > > > > devices. It is static and not growing. The issue persists when I stop
> > > > > > > > > > > the system services and free the memory.
> > > > > > > > > >
> > > > > > > > > > Yes, this helps.
> > > > > > > > > > Also could you attach /proc/buddyinfo from the moment
> > > > > > > > > > you hit the problem?
> > > > > > > > >
> > > > > > > > > I can. The problem is continuous: it is 100% of the time continuously
> > > > > > > > > doing in/out and consuming 100% of CPU and locking IO.
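The symmetric in/out rates quoted in the thread (~12MB/s each way) can be derived from the cumulative `pswpin`/`pswpout` counters in /proc/vmstat, which count 4 KiB pages swapped in and out. A hedged sketch (helper name and sample numbers are invented for illustration) of turning two snapshots into MB/s:

```python
def swap_rates(before: str, after: str, interval_s: float):
    """Swap-in/swap-out rates in MB/s from two /proc/vmstat snapshots
    taken interval_s seconds apart. pswpin/pswpout are cumulative
    counters of 4 KiB pages swapped in/out since boot."""
    def counters(text):
        fields = dict(line.split() for line in text.strip().splitlines())
        return int(fields["pswpin"]), int(fields["pswpout"])

    in0, out0 = counters(before)
    in1, out1 = counters(after)
    page_mb = 4096 / 1e6  # one page in megabytes
    return ((in1 - in0) * page_mb / interval_s,
            (out1 - out0) * page_mb / interval_s)

# Hypothetical snapshots one second apart: 3000 pages in and 3000 pages
# out, i.e. ~12 MB/s each way -- the symmetric pattern reported above.
s0 = "pswpin 1000000\npswpout 2000000\n"
s1 = "pswpin 1003000\npswpout 2003000\n"
print(swap_rates(s0, s1, 1.0))  # roughly 12.29 MB/s in and out
```

On a live system the snapshots would come from reading /proc/vmstat twice; equal and sustained in/out rates with tiny swap *usage* is exactly the thrashing signature being debugged here.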
> > > > > > > > >
> > > > > > > > > The output of /proc/buddyinfo is:
> > > > > > > > >
> > > > > > > > > # cat /proc/buddyinfo
> > > > > > > > > Node 0, zone      DMA      7     2     2     1     1     2     1     1     1     2     1
> > > > > > > > > Node 0, zone    DMA32   4567  3395  1357   846   439   190    93    61    43    23     4
> > > > > > > > > Node 0, zone   Normal     19   190   140   129   136    75    66    41     9     1     5
> > > > > > > > > Node 1, zone   Normal    194  1210  2080  1800   715   255   111    56    42    36    55
> > > > > > > > > Node 2, zone   Normal    204   768  3766  3394  1742   468   185   194   238    47    74
> > > > > > > > > Node 3, zone   Normal   1622  2137  1058   846   388   208    97    44    14    42    10
> > > > > > > >
> > > > > > > > Again, thinking out loud: there is only one zone on node 3, i.e., the
> > > > > > > > Normal zone, and this excludes the problem commit
> > > > > > > > 669281ee7ef731fb5204df9d948669bf32a5e68d ("Multi-gen LRU: fix per-zone
> > > > > > > > reclaim") fixed in v6.6.
> > > > > > >
> > > > > > > I built vanilla 6.6.1 and did a first fast test - spin up and destroy
> > > > > > > VMs only. This test does not always trigger the continuous kswapd3
> > > > > > > swap in/out usage, but it does use kswapd3, and it looks like there is a
> > > > > > > change:
> > > > > > >
> > > > > > > I can see non-continuous kswapd usage (15s and more) with 6.5.y
> > > > > > > # ps ax | grep [k]swapd
> > > > > > >   753 ?        S      0:00 [kswapd0]
> > > > > > >   754 ?        S      0:00 [kswapd1]
> > > > > > >   755 ?        S      0:00 [kswapd2]
> > > > > > >   756 ?        S      0:15 [kswapd3]  <<<<<<<<<
> > > > > > >   757 ?        S      0:00 [kswapd4]
> > > > > > >   758 ?        S      0:00 [kswapd5]
> > > > > > >   759 ?        S      0:00 [kswapd6]
> > > > > > >   760 ?        S      0:00 [kswapd7]
> > > > > > >   761 ?        S      0:00 [kswapd8]
> > > > > > >   762 ?        S      0:00 [kswapd9]
> > > > > > >   763 ?        S      0:00 [kswapd10]
> > > > > > >   764 ?        S      0:00 [kswapd11]
> > > > > > >   765 ?        S      0:00 [kswapd12]
> > > > > > >   766 ?        S      0:00 [kswapd13]
> > > > > > >   767 ?        S      0:00 [kswapd14]
> > > > > > >   768 ?        S      0:00 [kswapd15]
> > > > > > >
> > > > > > > and no kswapd usage with 6.6.1; that looks to be a promising path
> > > > > > >
> > > > > > > # ps ax | grep [k]swapd
> > > > > > >   808 ?        S      0:00 [kswapd0]
> > > > > > >   809 ?        S      0:00 [kswapd1]
> > > > > > >   810 ?        S      0:00 [kswapd2]
> > > > > > >   811 ?        S      0:00 [kswapd3]  <<<< nice
> > > > > > >   812 ?        S      0:00 [kswapd4]
> > > > > > >   813 ?        S      0:00 [kswapd5]
> > > > > > >   814 ?        S      0:00 [kswapd6]
> > > > > > >   815 ?        S      0:00 [kswapd7]
> > > > > > >   816 ?        S      0:00 [kswapd8]
> > > > > > >   817 ?        S      0:00 [kswapd9]
> > > > > > >   818 ?        S      0:00 [kswapd10]
> > > > > > >   819 ?        S      0:00 [kswapd11]
> > > > > > >   820 ?        S      0:00 [kswapd12]
> > > > > > >   821 ?        S      0:00 [kswapd13]
> > > > > > >   822 ?        S      0:00 [kswapd14]
> > > > > > >   823 ?        S      0:00 [kswapd15]
> > > > > > >
> > > > > > > I will install 6.6.1 on the server which is doing some work and
> > > > > > > observe it later today.
> > > > > >
> > > > > > Thanks. Fingers crossed.
> > > > >
> > > > > The 6.6.y was deployed and used from 9th Nov 3PM CEST. So far so good.
> > > > > Node 3 has 163MiB of free memory, and I see
> > > > > just a little swap in/out usage sometimes (which is expected) and minimal
> > > > > kswapd3 process usage for almost 4 days.
> > > >
> > > > Thanks for the update!
> > > >
> > > > Just to confirm:
> > > > 1. MGLRU was enabled, and
> > >
> > > Yes, MGLRU is enabled
> > >
> > > > 2. The v6.6 deployed did NOT have the patch I attached earlier.
> > >
> > > Vanilla 6.6, attached patch NOT applied.
> > >
> > > > Are both correct?
> > > >
> > > > If so, I'd very much appreciate it if you could try the attached patch on
> > > > top of v6.5 and see if it helps. My suspicion is that the problem is
> > > > compaction related, i.e., kswapd was woken up by high-order
> > > > allocations but didn't properly stop. But what causes the behavior
> > >
> > > Sure, I can try it. Will inform you about progress.
> >
> > Thanks!
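The /proc/buddyinfo rows quoted above can be decoded directly: column i of a row is the number of free order-i blocks, each 2**i pages of 4 KiB. A small sketch (the helper is illustrative, not part of any kernel tooling) that totals the free memory implied by node 3's row:

```python
def free_bytes(buddyinfo_line: str, page_size: int = 4096) -> int:
    """Total free memory implied by one 'zone Normal' /proc/buddyinfo row.

    Column i holds the count of free order-i buddy blocks; an order-i
    block spans 2**i pages. (Splitting on 'Normal' keeps this sketch
    simple; it only handles Normal-zone rows.)
    """
    counts = [int(c) for c in buddyinfo_line.split("Normal")[1].split()]
    return sum(n * (1 << order) * page_size
               for order, n in enumerate(counts))

# Node 3's row from the email above:
row = ("Node 3, zone   Normal   1622  2137  1058   846   388   "
       "208    97    44    14    42    10")
print(free_bytes(row) // (1 << 20))  # -> 300 (MiB)
```

The result, roughly 300 MiB, is consistent with the "~302MB free" figure Jaroslav quotes for node 3, and the thin tail of high-order columns is why high-order allocations could keep waking kswapd3.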
> > > > difference on v6.5 between MGLRU and the active/inactive LRU still
> > > > puzzles me -- the problem might be somehow masked rather than fixed on
> > > > v6.6.
> > >
> > > I'm not sure how I can help with the issue. Any suggestions on what to
> > > change/try?
> >
> > Trying the attached patch is good enough for now :)
>
> So far I have been running "6.5.y + patch" for 4 days without triggering
> the infinite swap in/out usage.
>
> I'm observing a similar pattern in kswapd usage - if it uses kswapd,
> then it is mostly kswapd3 - like vanilla 6.5.y, which was
> not observed with 6.6.y. (Node 3's free memory is 159 MB.)
> # ps ax | grep [k]swapd
>   750 ?        S      0:00 [kswapd0]
>   751 ?        S      0:00 [kswapd1]
>   752 ?        S      0:00 [kswapd2]
>   753 ?        S      0:02 [kswapd3]  <<<< it uses kswapd3; the good part is that it is not continuous
>   754 ?        S      0:00 [kswapd4]
>   755 ?        S      0:00 [kswapd5]
>   756 ?        S      0:00 [kswapd6]
>   757 ?        S      0:00 [kswapd7]
>   758 ?        S      0:00 [kswapd8]
>   759 ?        S      0:00 [kswapd9]
>   760 ?        S      0:00 [kswapd10]
>   761 ?        S      0:00 [kswapd11]
>   762 ?        S      0:00 [kswapd12]
>   763 ?        S      0:00 [kswapd13]
>   764 ?        S      0:00 [kswapd14]
>   765 ?        S      0:00 [kswapd15]
>
> The good news is that the system did not end up in a continuous loop of
> swap in/out usage (at least so far), which is great. See the attached
> swap_in_out_good_vs_bad.png. I will keep it running for the next 3
> days.

Thanks again, Jaroslav!

Just a note here: I suspect the problem still exists on v6.6 but somehow
is masked, possibly by reduced memory usage from the kernel itself and
more free memory for userspace. So to be on the safe side, I'll post the
patch and credit you as the reporter and tester.
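For readers reproducing the A/B comparison in this thread: MGLRU is toggled via /sys/kernel/mm/lru_gen/enabled, whose value is a hex bitmask (see Documentation/admin-guide/mm/multigen_lru.rst) -- 0x0007 means fully enabled, 0x0000 disabled. A sketch (the decoder function is illustrative) of interpreting that value:

```python
# Components of the /sys/kernel/mm/lru_gen/enabled bitmask, per
# Documentation/admin-guide/mm/multigen_lru.rst:
FEATURES = {
    0x0001: "main switch for the multi-gen LRU",
    0x0002: "clear the accessed bit in leaf page table entries in batches",
    0x0004: "clear the accessed bit in non-leaf page table entries as well",
}

def decode_lru_gen(value: str):
    """Decode the hex string read from /sys/kernel/mm/lru_gen/enabled."""
    mask = int(value, 16)
    return [name for bit, name in FEATURES.items() if mask & bit]

print(decode_lru_gen("0x0007"))  # all three components enabled
print(decode_lru_gen("0x0000"))  # -> [] (MGLRU disabled)
```

Writing `y`/`n` (or a mask) to the same sysfs file is how the "enabled" vs "disabled" scenarios /A/ and /B/ above would be switched at runtime.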