From: Yosry Ahmed
Date: Wed, 30 Oct 2024 14:10:17 -0700
Subject: Re: [PATCH RFC] mm: mitigate large folios usage and swap thrashing for nearly full memcg
To: Usama Arif
Cc: Barry Song <21cnbao@gmail.com>, akpm@linux-foundation.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Barry Song,
    Kanchana P Sridhar, David Hildenbrand, Baolin Wang, Chris Li,
    "Huang, Ying", Kairui Song, Ryan Roberts, Johannes Weiner,
    Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song
References: <20241027001444.3233-1-21cnbao@gmail.com>
    <33c5d5ca-7bc4-49dc-b1c7-39f814962ae0@gmail.com>
    <03b37d84-c167-48f2-9c18-24268b0e73e2@gmail.com>
In-Reply-To: <03b37d84-c167-48f2-9c18-24268b0e73e2@gmail.com>

[..]
> >>> A crucial component is still missing -- managing the compression and decompression
> >>> of multiple pages as a larger block. This could significantly reduce
> >>> system time and
> >>> potentially resolve the kernel build issue within a small memory
> >>> cgroup, even with
> >>> swap thrashing.
> >>>
> >>> I’ll send an update ASAP so you can rebase for zswap.
> >>
> >> Did you mean https://lore.kernel.org/all/20241021232852.4061-1-21cnbao@gmail.com/?
> >> Thats wont benefit zswap, right?
> >
> > That's right. I assume we can also make it work with zswap?
>
> Hopefully yes. Thats mainly why I was looking at that series, to try and find
> a way to do something similar for zswap.

I would prefer for these things to be done separately. We still need
to evaluate the compression/decompression of large blocks. I am mainly
concerned about having to decompress a large chunk to fault in one
page.

The obvious problems are fault latency, and wasted work having to
consistently decompress the large chunk to take one page from it. We
also need to decide if we'd rather split it after decompression and
compress the parts that we didn't swap in separately.

This can cause problems beyond the fault latency. Imagine the case
where the system is under memory pressure, so we fall back to order-0
swapin to avoid reclaim. Now we want to decompress a chunk that used
to be 64K. We need to allocate 64K of contiguous memory for a
temporary allocation to be able to fault a 4K page. Now we either need
to:
- Go into reclaim, which we were trying to avoid to begin with.
- Dip into reserves to allocate the 64K as it's a temporary
  allocation. This is probably risky because under memory pressure,
  many CPUs may be doing this concurrently.
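
To put rough numbers on the 64K-for-4K concern, here is a minimal
userspace sketch, not kernel code. It assumes zlib as a stand-in for
whatever compressor zswap would use, a 4K base page, and a 16-page (64K)
chunk; the helper names (make_pages, fault_from_per_page,
fault_from_chunk) are hypothetical and only illustrate the two storage
layouts being compared.

```python
import os
import zlib

PAGE_SIZE = 4096     # assumed base page size
CHUNK_PAGES = 16     # 16 x 4K = 64K, the chunk size in the example above


def make_pages(n=CHUNK_PAGES):
    """Build n hypothetical pages of partly compressible data."""
    half = PAGE_SIZE // 2
    # Half random, half repeated bytes so zlib has something to compress.
    return [os.urandom(half) + bytes([i]) * half for i in range(n)]


def fault_from_per_page(store, index):
    """Per-page layout: decompress only the page being faulted in."""
    page = zlib.decompress(store[index])
    return page, len(page)        # (page, bytes decompressed)


def fault_from_chunk(blob, index):
    """Chunk layout: the whole chunk is decompressed into a temporary
    buffer before a single page can be copied out of it."""
    tmp = zlib.decompress(blob)   # temporary CHUNK_PAGES * PAGE_SIZE buffer
    page = tmp[index * PAGE_SIZE:(index + 1) * PAGE_SIZE]
    return page, len(tmp)


pages = make_pages()
per_page_store = [zlib.compress(p) for p in pages]
chunk_blob = zlib.compress(b"".join(pages))

page_a, work_a = fault_from_per_page(per_page_store, 5)
page_b, work_b = fault_from_chunk(chunk_blob, 5)

# Both layouts return the same page; only the work done differs.
assert page_a == page_b == pages[5]
print("per-page layout decompressed", work_a, "bytes")
print("chunk layout decompressed", work_b, "bytes")
print("decompressed but not needed:", work_b - PAGE_SIZE, "bytes")
```

Under these assumptions the chunk layout decompresses 65536 bytes to
return one 4096-byte page, so 61440 bytes of output are discarded on
every single-page fault. The sketch says nothing about latency or
allocation failure; it only shows the size of the temporary buffer and
the wasted decompression.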