From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2023BC02190 for ; Tue, 28 Jan 2025 19:04:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 52A58280247; Tue, 28 Jan 2025 14:04:13 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4D83B280220; Tue, 28 Jan 2025 14:04:13 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 37829280247; Tue, 28 Jan 2025 14:04:13 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 15115280220 for ; Tue, 28 Jan 2025 14:04:13 -0500 (EST) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 893571A03A4 for ; Tue, 28 Jan 2025 19:04:12 +0000 (UTC) X-FDA: 83057785944.18.375722F Received: from out-174.mta0.migadu.com (out-174.mta0.migadu.com [91.218.175.174]) by imf26.hostedemail.com (Postfix) with ESMTP id A4C6D140018 for ; Tue, 28 Jan 2025 19:04:10 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=LYBancFX; spf=pass (imf26.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.174 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738091050; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=AFRuLzECQi3X37aibekIYYwDV2KO+vl70+JQWMTjrzc=; b=zDew1mqCQFra7P2+xkye1nJpHbDG7oc0aotnCWnfTm+g5sewSm5VtIrloM/z4vToFk2nLX d/TW42f1/3wQxena2EzEYX5eHbh7bQOsFDqhrRhNlJAZuz7GdJiEQFYyA5oL1XqlZDlmXk iuMZBcC4b6kTGT80lUbsf48roOGvJDY= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=LYBancFX; spf=pass (imf26.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.174 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738091051; a=rsa-sha256; cv=none; b=cZVIUfT+SFsxWyr4FKGd2uMq8TVSnJkiNagypZ2J0svz3CUjht97WvBngN1ouSJIma93xG H8AkhAz1JWVYuO6AknR4kq4m+zyghop7hYzQPTQz4WezmqRDKWxxeFAPUsq7tkBROche95 /rynGjqvGHtgmJoAtKcUonWOb2qIcg0= Date: Tue, 28 Jan 2025 19:03:57 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1738091043; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=AFRuLzECQi3X37aibekIYYwDV2KO+vl70+JQWMTjrzc=; b=LYBancFXx7gyMq11rPnjg5evtJMSp8OangkBbA1ifr/6LjKpREua02y7Hjqk5/vTc1fm1O Gpy9gsKeKolCVhO1O/3+904YUT/XwtNwcLA/jlgms9IqcxHpiWHOs5E3BmdI71+K/0WFtV APwrtOBZiIwVdWjAV0QNrGhIMuPi5wY= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yosry Ahmed To: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Kanchana P Sridhar , Johannes Weiner , Nhat Pham , Chengming Zhou , Andrew Morton , linux-mm@kvack.org, stable@vger.kernel.org Subject: Re: [PATCH v2 mm-hotfixes] mm/zswap: fix inconsistent charging when zswap_store_page() fails Message-ID: References: <20250128185507.2176-1-42.hyeyoo@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250128185507.2176-1-42.hyeyoo@gmail.com> X-Migadu-Flow: FLOW_OUT X-Rspamd-Queue-Id: A4C6D140018 X-Stat-Signature: 6weeiock3mt5o5caz4g9csaizx5a4y98 X-Rspamd-Server: rspam08 X-Rspam-User: X-HE-Tag: 1738091050-535507 X-HE-Meta: U2FsdGVkX1/0Nq8gxkkW+N2jXYRL+GuS+7dy9N4s2tK9s5A2x/IF7YTeZ1YaCdNKvFNlNSQnhD0NumOAFPOAvHUbkGPoAnewaERHFbnyOVntv2T51ATRgoY8D5DS6i70+7VZjzVYDyYxHvNXqMHzCszOBB2Iv4vv+8pmxIgE9ur09ntTAuKfYDniDK0FVuxcGfjiFdGvqH03zxlWRizLjK3kYff+tfG1iJYNKDOr/Bk9axPvUGpuO75879C09vsTxWAgXY8zMg+sU0IhK6i5IAgq1I8SCNOgMmzJi/OBIo6BFymujEVArRmKm4v7QKD4p1e4d0cBAhblARhHeS0yhnmaY8kFMcOwYWJEy9Ag7NcLOL28Qk7TA+Sb5V55LfrTo+kN6jpbuf8wpr4hVBJhNgDsUl+jxOm2txUEj+VwJj1PGoOAJBaFzChbihfzmB0S6dfQw5jTSLpaAasJJB5zVez/IOCyiB62uxakDIsXhmR/nXmDHVbqFM3CKOwskmHvQpCRSqTwqC7Ionca5W36IwAYLXWyXkJR7/aXn5knl0vR1v9X8ln/OMQUwEhkymCUUHIOAS9Lf5nke3QFf//xeA4d1TWMfVmW4HpSAlsr4JlADwBLEQJfhzLK/k/N8DeqvcZf4fbwcYLpt6RladZSkkt4nkfmm+NNb+Cqu/dVHHyC4oBUUojkEd0a+DHZB5xNAdLhdI2eOHlwL5j5IXKPh/R80Vck9ZQ+EV7PTsweoAdNkivemqZwt30ev9Y8SHuraegHcbd6xnV+56jKlNU4aP8x5GQ2fS9PMZEet3Yv+u9rcxzSq09xJG8FfrA1/lNdstyL9ZixW4YD2CiMllY4WC+ad7CJ9fRHwtGgxezpU1wmIHDOAV8p2YDAXvQJr0Xmjo88yzGyAhm+Th6m4RuGqbHi0x6BFL04V6ON9T2vyHQthFZSW6nTOwWlV2f6htxa/q8t9J5OEoWmERT+IP+ jmgAJtY/ UO08+Bqb6k/IPAVdSJNtre5M48Hw2Z3Tq6TEQev6loO4e4ARpePYFblO9+Q+Jw36UAdoTO1QBhxhjeCO2jKLcHik2/EnhwgktBA23rfyrFED7MANj/1W4BN+P8ra4dyZ9TLtGHU4cy818eSyk0aEAIoeRio/Cxksy4QtO7MRL+kBwn31jAU3hGJ+vfZ0W2mGwNnGfjDTACohSi/zPop58LiExvJXQMQXK7SfDrZJ3naL02bWKLvWcqd4+6yBAfiotnXtZViKj0AvYyrSP1qO+2aFAjK/6Fn7s+/nPR5uhIGurD2Pz7FFGjT9R14sG5uzRs2/XsegmZTzd7MOqn2Mr/EqCk0s4unySO/lQ/FYu7EI7KaqAZDhAGzdb7HS8m97QQ4+DOptI+6Y/xDPunckp6ihdUQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Jan 29, 2025 at 03:55:07AM +0900, Hyeonggon Yoo wrote: > Commit b7c0ccdfbafd ("mm: zswap: support large folios in zswap_store()") > skips charging any zswapped base pages when it failed to zswap the entire > folio. > > However, when some base pages are zswapped but it failed to zswap > the entire folio, the zswap operation is rolled back. > When freeing zswap entries for those pages, zswap_entry_free() uncharges > the pages that were not previously charged, causing zswap charging to > become inconsistent. > > This inconsistency triggers two warnings with following steps: > # On a machine with 64GiB of RAM and 36GiB of zswap > $ stress-ng --bigheap 2 # wait until the OOM-killer kills stress-ng > $ sudo reboot > > Two warnings are: > in mm/memcontrol.c:163, function obj_cgroup_release(): > WARN_ON_ONCE(nr_bytes & (PAGE_SIZE - 1)); > > in mm/page_counter.c:60, function page_counter_cancel(): > if (WARN_ONCE(new < 0, "page_counter underflow: %ld nr_pages=%lu\n", > new, nr_pages)) > > While objcg events should only be accounted for when the entire folio is > zswapped, objcg charging should be performed regardlessly. > Fix accordingly. > > After resolving the inconsistency, these warnings disappear. > > Fixes: b7c0ccdfbafd ("mm: zswap: support large folios in zswap_store()") > Cc: stable@vger.kernel.org > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> > --- > > v1->v2: > > Fixed objcg events being accounted for on zswap failure. > > Fixed the incorrect description. I misunderstood that the base pages are > going to be stored in zswap, but their zswap entries are freed immediately. > > Added a comment on why it charges pages that are going to be removed > from zswap. > > mm/zswap.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/mm/zswap.c b/mm/zswap.c > index 6504174fbc6a..10b30ac46deb 100644 > --- a/mm/zswap.c > +++ b/mm/zswap.c > @@ -1568,20 +1568,26 @@ bool zswap_store(struct folio *folio) > > bytes = zswap_store_page(page, objcg, pool); > if (bytes < 0) > - goto put_pool; > + goto charge_zswap; > compressed_bytes += bytes; > } > > - if (objcg) { > - obj_cgroup_charge_zswap(objcg, compressed_bytes); > + if (objcg) > count_objcg_events(objcg, ZSWPOUT, nr_pages); > - } > > atomic_long_add(nr_pages, &zswap_stored_pages); > count_vm_events(ZSWPOUT, nr_pages); > > ret = true; > > +charge_zswap: > + /* > + * Charge zswapped pages even when it failed to zswap the entire folio, > + * because zswap_entry_free() will uncharge them anyway. > + * Otherwise zswap charging will become inconsistent. > + */ > + if (objcg) > + obj_cgroup_charge_zswap(objcg, compressed_bytes); Thanks for fixing this! Having to charge just to uncharge right after is annoying. Ideally we'd just clear entry->objcg if we fail before charging, but we don't have a direct reference to the entries here and another tree lookup is not ideal either. I guess we may be able to improve this handling once [1] lands, as we can move the charging logic into zswap_store_folio() where we'd have access to the entries. For now, would the control flow be easier if we move the charge ahead of the zswap_store_page() loop instead? There is an existing if (objcg) block there as well. [1]https://lore.kernel.org/linux-mm/20241221063119.29140-12-kanchana.p.sridhar@intel.com/ > put_pool: > zswap_pool_put(pool); > put_objcg: > -- > 2.47.1 > >