From: Barry Song <21cnbao@gmail.com>
Date: Thu, 31 Oct 2024 10:21:27 +1300
Subject: Re: [PATCH RFC] mm: mitigate large folios usage and swap thrashing for nearly full memcg
To: Yosry Ahmed
Cc: Usama Arif, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Barry Song, Kanchana P Sridhar, David Hildenbrand, Baolin Wang, Chris Li, "Huang, Ying", Kairui Song, Ryan Roberts, Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song
References: <20241027001444.3233-1-21cnbao@gmail.com> <33c5d5ca-7bc4-49dc-b1c7-39f814962ae0@gmail.com> <03b37d84-c167-48f2-9c18-24268b0e73e2@gmail.com>

On Thu, Oct 31, 2024 at 10:10 AM Yosry Ahmed wrote:
>
> [..]
> > >>> A crucial component is still missing: managing the compression and
> > >>> decompression of multiple pages as a larger block. This could
> > >>> significantly reduce system time and potentially resolve the kernel
> > >>> build issue within a small memory cgroup, even with swap thrashing.
> > >>>
> > >>> I'll send an update ASAP so you can rebase for zswap.
> > >>
> > >> Did you mean https://lore.kernel.org/all/20241021232852.4061-1-21cnbao@gmail.com/?
> > >> That won't benefit zswap, right?
> > >
> > > That's right. I assume we can also make it work with zswap?
> >
> > Hopefully yes. That's mainly why I was looking at that series, to try and
> > find a way to do something similar for zswap.
>
> I would prefer for these things to be done separately. We still need
> to evaluate the compression/decompression of large blocks. I am mainly
> concerned about having to decompress a large chunk to fault in one
> page.
>
> The obvious problems are fault latency, and wasted work having to
> consistently decompress the large chunk to take one page from it. We
> also need to decide if we'd rather split it after decompression and
> compress the parts that we didn't swap in separately.
>
> This can cause problems beyond the fault latency. Imagine the case
> where the system is under memory pressure, so we fall back to order-0
> swapin to avoid reclaim. Now we want to decompress a chunk that used
> to be 64K.

Yes, this could be an issue.

We had actually tried using several buffers for those partial swap-in
cases, holding the decompressed data in anticipation of the upcoming
swap-in. This approach could address the majority of partial swap-ins
in fallback scenarios.

>
> We need to allocate 64K of contiguous memory for a temporary
> allocation to be able to fault a 4K page. Now we either need to:
> - Go into reclaim, which we were trying to avoid to begin with.
> - Dip into reserves to allocate the 64K as it's a temporary
> allocation. This is probably risky because under memory pressure, many
> CPUs may be doing this concurrently.

This has been addressed by using contiguous memory that is prepared on
a per-CPU basis. Search for the following text in the series below:
"alloc_pages() might fail, so we don't depend on allocation:"
https://lore.kernel.org/all/20241021232852.4061-1-21cnbao@gmail.com/

Thanks
Barry
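
For readers who want the two ideas above in concrete form, here is a minimal
userspace sketch in C. It is not code from the series linked above. It only
models (a) a 64K scratch buffer reserved per CPU ahead of time, so the fault
path does no large allocation, and (b) keeping the last decompressed chunk
around so that a following 4K swap-in from the same chunk skips the
decompression. Every name here (cpu_scratch, swapin_one_page, NR_CPUS = 4,
the fake decompressor) is hypothetical, and real kernel code would also have
to deal with preemption, CPU migration and locking, which this model ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   4096u
#define CHUNK_PAGES 16u                       /* 64K chunk = 16 x 4K pages */
#define CHUNK_SIZE  (PAGE_SIZE * CHUNK_PAGES)
#define NR_CPUS     4u                        /* hypothetical CPU count */

/*
 * Hypothetical per-CPU scratch area. It is reserved statically, so the
 * swap-in path never asks the allocator for 64K of contiguous memory.
 */
struct cpu_scratch {
    uint8_t buf[CHUNK_SIZE];
    long cached_chunk;            /* chunk currently held in buf, -1 if none */
    unsigned long decompressions; /* how often this CPU had to decompress */
};

static struct cpu_scratch scratch[NR_CPUS];

/* Mark every per-CPU buffer as empty. Must run before the first swap-in. */
void scratch_init(void)
{
    for (unsigned cpu = 0; cpu < NR_CPUS; cpu++) {
        scratch[cpu].cached_chunk = -1;
        scratch[cpu].decompressions = 0;
    }
}

/*
 * Stand-in for a real decompressor (an assumption for this sketch): page i
 * of chunk c is filled with the byte value (c * 16 + i) & 0xff.
 */
static void fake_decompress(long chunk_id, uint8_t *dst)
{
    for (unsigned i = 0; i < CHUNK_PAGES; i++)
        memset(dst + (size_t)i * PAGE_SIZE,
               (int)((chunk_id * CHUNK_PAGES + i) & 0xff), PAGE_SIZE);
}

/*
 * Fault in one 4K page of a 64K chunk on the given CPU.
 * Returns 0 on success and -1 on bad arguments.
 */
int swapin_one_page(unsigned cpu, long chunk_id, unsigned page_idx,
                    uint8_t *dst_page)
{
    if (cpu >= NR_CPUS || page_idx >= CHUNK_PAGES || chunk_id < 0)
        return -1;

    struct cpu_scratch *s = &scratch[cpu];

    /* Decompress only when this CPU's buffer holds a different chunk. */
    if (s->cached_chunk != chunk_id) {
        fake_decompress(chunk_id, s->buf);
        s->cached_chunk = chunk_id;
        s->decompressions++;
    }

    memcpy(dst_page, s->buf + (size_t)page_idx * PAGE_SIZE, PAGE_SIZE);
    return 0;
}

/* Number of full-chunk decompressions performed on a CPU so far. */
unsigned long scratch_decompressions(unsigned cpu)
{
    assert(cpu < NR_CPUS);
    return scratch[cpu].decompressions;
}
```

In this model, two consecutive order-0 swap-ins from the same chunk on the
same CPU cost one decompression instead of two, while a swap-in of a
different chunk, or of the same chunk on another CPU, still pays the full
64K decompression. That is the part of the fault latency and wasted work
concern that buffering alone does not remove.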