From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0EAA6C3DA7F for ; Fri, 2 Aug 2024 18:52:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 557676B009B; Fri, 2 Aug 2024 14:52:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 506006B009C; Fri, 2 Aug 2024 14:52:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3CD576B009D; Fri, 2 Aug 2024 14:52:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 1EC726B009B for ; Fri, 2 Aug 2024 14:52:12 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id D81C61C233A for ; Fri, 2 Aug 2024 18:52:11 +0000 (UTC) X-FDA: 82408200462.14.9B60695 Received: from mail-pl1-f170.google.com (mail-pl1-f170.google.com [209.85.214.170]) by imf29.hostedemail.com (Postfix) with ESMTP id F4045120005 for ; Fri, 2 Aug 2024 18:52:09 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=YTn1pVSJ; spf=pass (imf29.hostedemail.com: domain of vishal.moola@gmail.com designates 209.85.214.170 as permitted sender) smtp.mailfrom=vishal.moola@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722624724; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=t+v3NL+zeA+TfLqsAOg0jF4pN/5zB5SP0NcF9oposHA=; b=66V3o5d8UyGQTnSw2fxvD0i7biYYr417ZgUYrucPR/8Kk6yVskZy9cu05bXoiyoqeE3Q39 bwfEvqlKDK75FOPOtlooD7XAdN5izep7jPqBfzP1IA+IDSD4HRK+hJpyt2yWIXZoavl16c I9krQu41lOAdS7UW1Q3BjvqCjqJkLGI= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=YTn1pVSJ; spf=pass (imf29.hostedemail.com: domain of vishal.moola@gmail.com designates 209.85.214.170 as permitted sender) smtp.mailfrom=vishal.moola@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722624724; a=rsa-sha256; cv=none; b=bTrJaDxvOyuHGlYmSmImN8T+0W3NN86TGF7IP9sAsRmbHBWo55hgr50eu+JHvfO4WLZ0Vt 9Xjiw4d9Fcooten7pPRMhOJaf4IcUJoecWyS4RcsHjLR0yF1df1Agqh6PPYEIHhgt0Ic6F q9X2dy6XHQykZ/M71uMyDUzLkiG1wNc= Received: by mail-pl1-f170.google.com with SMTP id d9443c01a7336-1fc66fc35f2so26114485ad.0 for ; Fri, 02 Aug 2024 11:52:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1722624728; x=1723229528; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:subject:cc :to:from:date:message-id:from:to:cc:subject:date:message-id:reply-to; bh=t+v3NL+zeA+TfLqsAOg0jF4pN/5zB5SP0NcF9oposHA=; b=YTn1pVSJO4fi8Oz00tY4Db0jPTeyswpmslOA73O7v11by4cUwen6B7vjvwT8BEJJV0 kHygOOYg0QonTZ2m39In6d/YqDiuEA4vlzmD8PMNWGV5HdJfJuNYQCauPjkUXr36IRxs WCU1LJDLHSUwMtUF1mJiwj2iqnd3XgjQ7jyRMnWHiH8GOcqnDujitEQOcP+VI4szykWp Wbh3yOUYh0A9MahZk2wi1PNSE/sYOsfTey7O7kdKlbttMpWJ0sQj2eNctUcDpy52iI4S KggXSeTA2+ko9Js71t6hvw47AV2ac3+zQUlf1dfh5Yrhr7pKvd0sltAIz//WGcRiaR8+ z0bA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722624728; x=1723229528; h=in-reply-to:content-disposition:mime-version:references:subject:cc :to:from:date:message-id:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=t+v3NL+zeA+TfLqsAOg0jF4pN/5zB5SP0NcF9oposHA=; b=b9E48kWEn896kekzXBqU5NLyBHI1imGpbNCfmVKarGte4jqJt9tDFgnRVd2oixC7Im BXiKB5bFD/Y59a8ipNaFEkpsTiho8p8TNhl0pvIRlaBgWA7WT5r0IcmIG0KCAkdmB8Vu L0zAEgjdHl0o/9rR7E2YB1eQb4B2YFGQThcvvtaFiNLzhAQCyyBoh9PehDEPk10YlY2x xaM/nuMmeY6rkyf71usJavHoUqwfV4zDxtyZ5okZchKwxEopGBBsWIYeDU+gDf3fEnri wbGbZ9/XwPQvAktxgY+kk3xzxLWeOxbrYSByacQvgjEiIDkDSAEU+R52v1yLc+/PVwDy XYvA== X-Forwarded-Encrypted: i=1; AJvYcCWAG49QOSfTjv4c1b9I75niZ8CVF3kWRzjjWnRTsAKSLpYrsB9lVDpdUlLaeTvtNcXhDK39CURjynA5ue+z0bEgo34= X-Gm-Message-State: AOJu0Yznz+iygWigkWilsZ/aq35Gvyv0RhuEunMXF3wD9yC5XS4TKC5y lN1oNTGoYjhp4UOL6pf+g7uWxms98jXKQ7Vh+F1S6s0LoB5wLte+ X-Google-Smtp-Source: AGHT+IH0BB6n3fiWsO3K5b0edDmVXwKFkws5lBhHDWc9OULFCdJ7o2ipq4hucS8+J99vDCijG4X8Kw== X-Received: by 2002:a17:903:41c8:b0:1fd:6ca4:f987 with SMTP id d9443c01a7336-1ff57bc504emr63979135ad.15.1722624728254; Fri, 02 Aug 2024 11:52:08 -0700 (PDT) Received: from DESKTOP-DUKSS9G. (c-76-133-131-165.hsd1.ca.comcast.net. [76.133.131.165]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1ff59294e92sm20488915ad.240.2024.08.02.11.52.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Aug 2024 11:52:07 -0700 (PDT) Message-ID: <66ad2ad7.170a0220.3a8e93.6a32@mx.google.com> X-Google-Original-Message-ID: Date: Fri, 2 Aug 2024 11:52:04 -0700 From: Vishal Moola To: alexs@kernel.org Cc: Vitaly Wool , Miaohe Lin , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, minchan@kernel.org, willy@infradead.org, senozhatsky@chromium.org, david@redhat.com, 42.hyeyoo@gmail.com, Yosry Ahmed , nphamcs@gmail.com Subject: Re: [PATCH v4 01/22] mm/zsmalloc: add zpdesc memory descriptor for zswap.zpool References: <20240729112534.3416707-1-alexs@kernel.org> <20240729112534.3416707-2-alexs@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240729112534.3416707-2-alexs@kernel.org> X-Rspam-User: X-Stat-Signature: cfjnboi9e3xy5epcx6zzcxwaw1edq5rc X-Rspamd-Queue-Id: F4045120005 X-Rspamd-Server: rspam11 X-HE-Tag: 1722624729-125019 X-HE-Meta: U2FsdGVkX1+SC4bw2pxbTVmLxaVkr+W/5XKGbAKOZ90Q+oqQ6Y1PHD76fJrHCSjtZwcjBcuMrOU451bmOJK0sMuQ2HtkHU1w8po18j9BFUO7a1i/yXlW4QA9zGyyjvYhN/KaLrhsVkPAzo60imk6rRuRLDV1SZ5w0MsAuuJAiRUszKcYIBj2M3X+k488whSci48WcBrHBdTNTYgbWYnor4ucRg0ERHvwh6F9cGl5Xzq7ugGh7oxPi34GcwW3lCcjdROPNKn+oLzOw+v+b63hfKVwBobLbJCxH5ydibfUrjFGRtGlLH3OaFd0pJk4wQEVnGK/N4VO+7Aeu4xjqrhebQ63N80jI9kcK4XjyEXB/txgeQ+neAnMQHc5z96dhcIfidvhvT6vrRFG5F2uDX637Etm1+OrihW+A06eiULhapLGyqCQsLOw/p+X4T0dswOi1ypHvIMsglhyllFzZ5qeLzLngB9IRMM5XsIoinXestFAiHAz74lwUwlgBvzodRsGTsYz7xMiVCI1H4JeaNfTPNba4XZeBCJ9Kq967gxBLFIge1/Cn4TXdT7hXFEvKCNbjZg+of71GoDwYc6oxKEwmpCrObz6tQ38TNYbtpvM4oeL37xeyeCpocN4UEomSLV0h4hYRCRhQSr1lsWZSyOPzOSBRnx+0biXFO7aLJx1ADrSWujflPzbQvYwaby25L8MGHU/I6/94fb2dH9kXpHPRJZlucb8NxCQKU/1Kl4iVCD3wwQZkuJzbxAWcuAW/RuilqVkKgcvtZkh4gpl54NGti3VFDDOIyfXkPBRiplswa+PgQ+EWBSWSc17SoBmyomupX98vm9W0w9IDZLX2YKsTYmzKZQzMkUnPdF4V0UgVPTVufEJqmWJI48JNsycKh+Ba2qFgIS8Wkztw9/dEkqA/dEx/QVQyZeMqP/17BJ4u9P+IkVxQBx4qveRkpAYdsS5bsY+/HRyPXD57DxZDJy 1ccxFlXU uldkXPWuzXBifJzeUNQUfwofc6q7GECZ6lcwWtDn4LrweoRFEamHV6cmON/TeqDUUWQJNcBbkQvadaU4PJsem2qf/Ekct/hMCDxEkHouW++0kG9ydyIhf0KTW/MCclW4cCVHwLyb+AE56mG8xF8oluY9U3v+ff7XmCDolFYMs2Mmj/SptdadFANB+QI9nWDK3JjYyTeenKKrKX7QY2lAOWeIPnb7PPGJz/eTnb0dhLfzTqohc7t8Ok81W2INtk4F4hk22I69l1+xJlZ7XsGLPdk3on+imqNrGztXwyqxW11FFw86VSlKpM5itcH0/cRON5JlKbwR44QDb7ZlQmm7ppy01HWDeg+IJ10uFzk6ONXOi9OyjWRfJYzMuuF1GvEHPmOHs/LvolD/hke4xQFuw3XOZA+aOq748T29iFW2wlzWIuflsaumT9xXB3f9pEH/mB5Xv0sTmnLKmxpeJ8Wo/de23vEfYloWoVXoP10RaDGnkHg/jgCiOAYWcmhN3S0mGMibhvLcHM5+TXk8= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Jul 29, 2024 at 07:25:13PM +0800, alexs@kernel.org wrote: > From: Alex Shi I've been busy with other things, so I haven't been able to review this until now. Thanks to both you and Hyeonggon for working on this memdesc :) > The 1st patch introduces new memory decriptor zpdesc and rename > zspage.first_page to zspage.first_zpdesc, no functional change. > > We removed PG_owner_priv_1 since it was moved to zspage after > commit a41ec880aa7b ("zsmalloc: move huge compressed obj from > page to zspage"). > > And keep the memcg_data member, since as Yosry pointed out: > "When the pages are freed, put_page() -> folio_put() -> __folio_put() will call > mem_cgroup_uncharge(). The latter will call folio_memcg() (which reads > folio->memcg_data) to figure out if uncharging needs to be done. > > There are also other similar code paths that will check > folio->memcg_data. It is currently expected to be present for all > folios. So until we have custom code paths per-folio type for > allocation/freeing/etc, we need to keep folio->memcg_data present and > properly initialized." > > Originally-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> > Signed-off-by: Alex Shi > --- > mm/zpdesc.h | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++ > mm/zsmalloc.c | 21 ++++++++-------- > 2 files changed, 76 insertions(+), 11 deletions(-) > create mode 100644 mm/zpdesc.h > > diff --git a/mm/zpdesc.h b/mm/zpdesc.h > new file mode 100644 > index 000000000000..2dbef231f616 > --- /dev/null > +++ b/mm/zpdesc.h > @@ -0,0 +1,66 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* zpdesc.h: zswap.zpool memory descriptor > + * > + * Written by Alex Shi > + * Hyeonggon Yoo <42.hyeyoo@gmail.com> > + */ > +#ifndef __MM_ZPDESC_H__ > +#define __MM_ZPDESC_H__ > + > +/* > + * struct zpdesc - Memory descriptor for zpool memory, now is for zsmalloc > + * @flags: Page flags, PG_private: identifies the first component page > + * @lru: Indirectly used by page migration > + * @mops: Used by page migration > + * @next: Next zpdesc in a zspage in zsmalloc zpool > + * @handle: For huge zspage in zsmalloc zpool > + * @zspage: Pointer to zspage in zsmalloc > + * @memcg_data: Memory Control Group data. > + * I think its a good idea to include comments for the padding (namely what aliases with it in struct page) here as well. It doesn't hurt, and will make them easier to remove in the future. > + * This struct overlays struct page for now. Do not modify without a good > + * understanding of the issues. > + */ > +struct zpdesc { > + unsigned long flags; > + struct list_head lru; > + struct movable_operations *mops; > + union { > + /* Next zpdescs in a zspage in zsmalloc zpool */ > + struct zpdesc *next; > + /* For huge zspage in zsmalloc zpool */ > + unsigned long handle; > + }; > + struct zspage *zspage; I like using pointers here, although I think the comments should be more precise about what the purpose of the pointer is. Maybe something like "Points to the zspage this zpdesc is a part of" or something. > + unsigned long _zp_pad_1; > +#ifdef CONFIG_MEMCG > + unsigned long memcg_data; > +#endif > +}; You should definitely fold your additions to the struct from patch 17 into this patch. It makes it easier to review, and better for anyone looking at the commit log in the future. > +#define ZPDESC_MATCH(pg, zp) \ > + static_assert(offsetof(struct page, pg) == offsetof(struct zpdesc, zp)) > + > +ZPDESC_MATCH(flags, flags); > +ZPDESC_MATCH(lru, lru); > +ZPDESC_MATCH(mapping, mops); > +ZPDESC_MATCH(index, next); > +ZPDESC_MATCH(index, handle); > +ZPDESC_MATCH(private, zspage); > +#ifdef CONFIG_MEMCG > +ZPDESC_MATCH(memcg_data, memcg_data); > +#endif > +#undef ZPDESC_MATCH > +static_assert(sizeof(struct zpdesc) <= sizeof(struct page)); > + > +#define zpdesc_page(zp) (_Generic((zp), \ > + const struct zpdesc *: (const struct page *)(zp), \ > + struct zpdesc *: (struct page *)(zp))) > + > +#define zpdesc_folio(zp) (_Generic((zp), \ > + const struct zpdesc *: (const struct folio *)(zp), \ > + struct zpdesc *: (struct folio *)(zp))) > + > +#define page_zpdesc(p) (_Generic((p), \ > + const struct page *: (const struct zpdesc *)(p), \ > + struct page *: (struct zpdesc *)(p))) > + > +#endif I'm don't think we need both page and folio cast functions for zpdescs. Sticking to pages will probably suffice (and be easiest) since all APIs zsmalloc cares about are already defined. We can stick to 1 "middle-man" descriptor for zpdescs since zsmalloc uses those pages as space to track zspages and nothing more. We'll likely end up completely removing it from zsmalloc once we can allocate memdescs on their own: It seems most (if not all) of the "indirect" members of zpdesc are used as indicators to the rest of core-mm telling them not to mess with that memory.