Date: Tue, 26 Sep 2023 14:24:36 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc: Nhat Pham, akpm@linux-foundation.org, cerasuolodomenico@gmail.com,
	sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com,
	mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com,
	muchun.song@linux.dev, linux-mm@kvack.org, kernel-team@meta.com,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Chris Li
Subject: Re: [PATCH v2 1/2] zswap: make shrinking memcg-aware
Message-ID: <20230926182436.GB348484@cmpxchg.org>
References: <20230919171447.2712746-1-nphamcs@gmail.com>
	<20230919171447.2712746-2-nphamcs@gmail.com>

On Mon, Sep 25, 2023 at 01:17:04PM -0700, Yosry Ahmed wrote:
> +Chris Li
>
> On Tue, Sep 19, 2023 at 10:14 AM Nhat Pham wrote:
> >
> > From: Domenico Cerasuolo
> >
> > Currently, we only have a single global LRU for zswap.
> > This makes it impossible to perform workload-specific shrinking - an
> > memcg cannot determine which pages in the pool it owns, and often
> > ends up writing pages from other memcgs. This issue has been
> > previously observed in practice and mitigated by simply disabling
> > memcg-initiated shrinking:
> >
> > https://lore.kernel.org/all/20230530232435.3097106-1-nphamcs@gmail.com/T/#u
> >
> > This patch fully resolves the issue by replacing the global zswap LRU
> > with memcg- and NUMA-specific LRUs, and modifying the reclaim logic:
> >
> > a) When a store attempt hits an memcg limit, it now triggers a
> >    synchronous reclaim attempt that, if successful, allows the new
> >    hotter page to be accepted by zswap.
> > b) If the store attempt instead hits the global zswap limit, it will
> >    trigger an asynchronous reclaim attempt, in which an memcg is
> >    selected for reclaim in a round-robin-like fashion.
>
> Hey Nhat,
>
> I didn't take a very close look as I am currently swamped, but going
> through the patch I have some comments/questions below.
>
> I am not very familiar with list_lru, but it seems like the existing
> API derives the node and memcg from the list item itself. Seems like
> we can avoid a lot of changes if we allocate struct zswap_entry from
> the same node as the page, and account it to the same memcg. Would
> this be too much of a change or too strong of a restriction? It's a
> slab allocation and we will free memory on that node/memcg right
> after.

My 2c, but I kind of hate that assumption made by list_lru. We ran
into problems with it with the THP shrinker as well. That one strings
up 'struct page', and virt_to_page(page) results in really fun to
debug issues.

IMO it would be less error prone to have memcg and nid as part of the
regular list_lru_add() function signature. And then have an explicit
list_lru_add_obj() that does a documented memcg lookup.

Because of the overhead, we've been selective about the memory we
charge. I'd hesitate to do it just to work around list_lru.
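
To make the split suggested above concrete, here is a rough userspace
sketch. It is a toy model, not kernel code: every toy_* name, the fixed
2x2 node/memcg grid and the alloc_*/page_* fields are made up for
illustration, and the real list_lru internals look different. The only
point is the shape of the two entry points: one where the caller states
nid and memcg, and one that derives them from the item's own allocation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NODES  2   /* assumed number of NUMA nodes in the toy model */
#define TOY_MEMCGS 2   /* assumed number of memcgs in the toy model */

/* Minimal circular doubly linked list, in the spirit of list_head. */
struct toy_list { struct toy_list *prev, *next; };

static void toy_list_init(struct toy_list *h) { h->prev = h->next = h; }

static void toy_list_add_tail(struct toy_list *n, struct toy_list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* One sublist and one counter per (node, memcg) pair. */
struct toy_lru {
    struct toy_list head[TOY_NODES][TOY_MEMCGS];
    unsigned long nr[TOY_NODES][TOY_MEMCGS];
};

static void toy_lru_init(struct toy_lru *lru)
{
    for (int n = 0; n < TOY_NODES; n++)
        for (int m = 0; m < TOY_MEMCGS; m++) {
            toy_list_init(&lru->head[n][m]);
            lru->nr[n][m] = 0;
        }
}

/*
 * Hypothetical entry: records where the entry itself was allocated
 * and which node/memcg the page it describes belongs to.
 */
struct toy_entry {
    struct toy_list lru;         /* must stay first: add_obj casts back */
    int alloc_nid, alloc_memcg;  /* where the entry's memory lives */
    int page_nid, page_memcg;    /* what the entry describes */
};

/* Explicit variant: the caller states the node and memcg. */
static void toy_lru_add(struct toy_lru *lru, struct toy_list *item,
                        int nid, int memcg)
{
    assert(nid >= 0 && nid < TOY_NODES);
    assert(memcg >= 0 && memcg < TOY_MEMCGS);
    toy_list_add_tail(item, &lru->head[nid][memcg]);
    lru->nr[nid][memcg]++;
}

/* Derived variant: a documented lookup based on the item's allocation. */
static void toy_lru_add_obj(struct toy_lru *lru, struct toy_list *item)
{
    struct toy_entry *e = (struct toy_entry *)item;

    toy_lru_add(lru, item, e->alloc_nid, e->alloc_memcg);
}
```

With an entry that was allocated on node 0 / memcg 0 but describes a
page on node 1 / memcg 1, toy_lru_add_obj() files it under (0, 0),
while toy_lru_add(lru, &e->lru, e->page_nid, e->page_memcg) files it
under (1, 1). In the explicit variant the caller decides which of the
two is meant, instead of the list code assuming it.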