From: Nhat Pham
Date: Tue, 24 Sep 2024 13:51:13 -0700
Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store().
To: Yosry Ahmed
Cc: Kanchana P Sridhar, linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org, chengming.zhou@linux.dev, usamaarif642@gmail.com, shakeel.butt@linux.dev, ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com, akpm@linux-foundation.org, nanhai.zou@intel.com, wajdi.k.feghali@intel.com, vinodh.gopal@intel.com
References: <20240924011709.7037-1-kanchana.p.sridhar@intel.com> <20240924011709.7037-7-kanchana.p.sridhar@intel.com>

On Tue, Sep 24, 2024 at 12:39 PM Yosry Ahmed wrote:
>
> On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar
> > +        * The cgroup zswap limit check is done once at the beginning of an
> > +        * mTHP store, and not within zswap_store_page() for each page
> > +        * in the mTHP. We do however check the zswap pool limits at the
> > +        * start of zswap_store_page(). What this means is, the cgroup
> > +        * could go over the limits by at most (HPAGE_PMD_NR - 1) pages.
> > +        * However, the per-store-page zswap pool limits check should
> > +        * hopefully trigger the cgroup aware and zswap LRU aware global
> > +        * reclaim implemented in the shrinker. If this assumption holds,
> > +        * the cgroup exceeding the zswap limits could potentially be
> > +        * resolved before the next zswap_store, and if it is not, the next
> > +        * zswap_store would fail the cgroup zswap limit check at the start.
> > +        */
>
> I do not really like this. Allowing going one page above the limit is
> one thing, but one THP above the limit seems too much. I also don't

Hmm what if you have multiple concurrent zswap stores, from different
tasks but the same cgroup? If none of them has charged, they would all
get greenlit, and charge towards the cgroup...

So technically the zswap limit checking is already best-effort only.
But now, instead of one page per violation, it's 512 pages per
violation :)

Yeah this can be bad. I think this is only safe if you only use
zswap.max as a binary knob (0 or max)...

> like relying on the repeated limit checking in zswap_store_page(), if
> anything I think that should be batched too.
>
> Is it too unreasonable to maintain the average compression ratio and
> use that to estimate limit checking for both memcg and global limits?
> Johannes, Nhat, any thoughts on this?

I remember asking about this, but past Nhat might have relented :)

https://lore.kernel.org/linux-mm/CAKEwX=PfAMZ2qJtwKwJsVx3TZWxV5z2ZaU1Epk1UD=DBdMsjFA@mail.gmail.com/

We can do limit checking and charging after compression is done, but
that's a lot of code change (might not even be possible)... It will,
however, allow us to do charging + checking in one go (rather than
doing it 8, 16, or 512 times)

Another thing we can do is to register a zswap writeback after the
zswap store attempts to clean up excess capacity. Not sure what will
happen if zswap writeback is disabled for the cgroup though :)

If it's too hard, the average estimate could be a decent compromise,
until we figure something smarter.
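
To make the average-estimate idea a bit more concrete, here is a rough
userspace sketch of a single up-front check for a whole folio. Everything
in it is an assumption for illustration: struct ratio_stats,
estimate_compressed_bytes(), batch_fits() and SKETCH_PAGE_SIZE are
hypothetical names, not existing zswap or memcg symbols, and real code
would also need locking and overflow handling that is left out here.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed 4 KiB base pages */

/* Hypothetical running totals; not real zswap or memcg fields. */
struct ratio_stats {
	uint64_t stored_pages;     /* pages compressed so far */
	uint64_t compressed_bytes; /* total compressed size of those pages */
};

/*
 * Estimate the compressed size of nr_pages from the average seen so far.
 * With no history, assume the worst case: every page is incompressible.
 */
static uint64_t estimate_compressed_bytes(const struct ratio_stats *s,
					  uint64_t nr_pages)
{
	if (s->stored_pages == 0)
		return nr_pages * SKETCH_PAGE_SIZE;

	/* Multiply before dividing to limit rounding loss. */
	return (s->compressed_bytes * nr_pages) / s->stored_pages;
}

/*
 * One check for the whole batch instead of one per page: does the
 * estimated compressed size fit in what is left under the limit?
 */
static bool batch_fits(const struct ratio_stats *s, uint64_t usage_bytes,
		       uint64_t limit_bytes, uint64_t nr_pages)
{
	uint64_t est = estimate_compressed_bytes(s, nr_pages);

	if (usage_bytes > limit_bytes)
		return false;

	return est <= limit_bytes - usage_bytes;
}
```

This is still best-effort: two concurrent stores from the same cgroup can
both pass batch_fits() before either charges, and the estimate can be off
for data that compresses worse than average. It only replaces the 8, 16,
or 512 per-page checks with one.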