From: Nhat Pham
Date: Mon, 31 Mar 2025 16:03:09 -0700
Subject: Re: [RFC PATCH 1/2] zsmalloc: let callers select NUMA node to store the compressed objects
To: Dan Williams
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, hannes@cmpxchg.org,
    yosry.ahmed@linux.dev, chengming.zhou@linux.dev, sj@kernel.org,
    kernel-team@meta.com, linux-kernel@vger.kernel.org, gourry@gourry.net,
    willy@infradead.org, ying.huang@linux.alibaba.com,
    jonathan.cameron@huawei.com, linux-cxl@vger.kernel.org,
    minchan@kernel.org, senozhatsky@chromium.org

On Mon, Mar 31, 2025 at 3:18 PM Dan Williams wrote:
>
> Nhat Pham wrote:
> > Currently, zsmalloc does not specify any memory policy when it allocates
> > memory for the compressed objects.
> >
> > Let users select the NUMA node for the memory allocation, through the
> > zpool-based API. Direct callers (i.e. zram) should not observe any
> > behavioral change.
> >
> > Signed-off-by: Nhat Pham
> > ---
> >  include/linux/zpool.h |  4 ++--
> >  mm/zpool.c            |  8 +++++---
> >  mm/zsmalloc.c         | 28 +++++++++++++++++++++-------
> >  mm/zswap.c            |  2 +-
> >  4 files changed, 29 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/linux/zpool.h b/include/linux/zpool.h
> > index 52f30e526607..0df8722e13d7 100644
> > --- a/include/linux/zpool.h
> > +++ b/include/linux/zpool.h
> > @@ -22,7 +22,7 @@ const char *zpool_get_type(struct zpool *pool);
> >  void zpool_destroy_pool(struct zpool *pool);
> >
> >  int zpool_malloc(struct zpool *pool, size_t size, gfp_t gfp,
> > -                     unsigned long *handle);
> > +                     unsigned long *handle, int *nid);
>
> I agree with Johannes about the policy knob, so I'll just comment on the
> implementation.
>
> Why not just pass a "const int" for @nid, and use "NUMA_NO_NODE" for the
> "default" case. alloc_pages_node_noprof() is already prepared for a
> NUMA_NO_NODE argument.

Gregory and Johannes gave me that suggestion too! However, my
understanding was that alloc_pages_node(NUMA_NO_NODE, ...) !=
alloc_page(...), and I was trying to preserve the latter since it is the
"original behavior" (primarily for !same_node_mode, but also for zram,
which I tried not to change in this patch).

Please correct me if I'm mistaken, but IIUC:

1. alloc_pages_node(NUMA_NO_NODE, ...) would go to the local/closest node:

/*
 * Allocate pages, preferring the node given as nid. When nid == NUMA_NO_NODE,
 * prefer the current CPU's closest node. Otherwise node must be valid and
 * online.
 */
static inline struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask,
						   unsigned int order)
{
	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();

2. whereas, alloc_page(...)
(i.e. the "original" behavior) would actually adopt the allocation policy
of the task entering reclaim, by calling:

struct page *alloc_frozen_pages_noprof(gfp_t gfp, unsigned order)
{
	struct mempolicy *pol = &default_policy;

	/*
	 * No reference counting needed for current->mempolicy
	 * nor system default_policy
	 */
	if (!in_interrupt() && !(gfp & __GFP_THISNODE))
		pol = get_task_policy(current);

Now, I think the "original" behavior is dumb/broken, and should be
changed altogether. We should probably always pass the page's node id.
On the zswap side, in the next version I'll remove the same_node_mode
sysfs knob and always pass the page's node id to zsmalloc and the page
allocator. That will clean up the zpool path per your (and Johannes' and
Gregory's) suggestion.

That still leaves zram though. zram is more complicated than zswap - it
has multiple allocation paths, so I don't want to touch it quite yet
(and preferably a zram maintainer/developer should do it). :) Or if zram
maintainers are happy with NUMA_NO_NODE, then we can completely get rid
of the pointer arguments etc.
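
To make the difference between the two paths concrete, here is a small
userspace toy model. It is not kernel code: struct toy_task and both
pick_node_*() helpers are made up for illustration, and it flattens a
mempolicy down to a single "preferred node" (real policies such as
interleave or bind are richer, and __GFP_THISNODE is ignored here). It
only sketches which node each path would start from.

```c
/* Userspace toy model, NOT kernel code. All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for the task state that matters in this example. */
struct toy_task {
	int local_node;  /* node closest to the CPU the task is running on */
	int policy_node; /* node preferred by the task's mempolicy, or
			  * NUMA_NO_NODE if the task has no policy of its own */
};

/*
 * Models alloc_pages_node(nid, ...): an explicit nid wins, and
 * NUMA_NO_NODE falls back to the local/closest node.
 */
static int pick_node_explicit(const struct toy_task *t, int nid)
{
	if (nid == NUMA_NO_NODE)
		return t->local_node;
	assert(nid >= 0); /* otherwise the caller must pass a valid node */
	return nid;
}

/*
 * Models alloc_page(...): outside interrupt context the policy of the
 * current task (here: the task entering reclaim) is consulted first.
 */
static int pick_node_task_policy(const struct toy_task *t, bool in_interrupt)
{
	if (!in_interrupt && t->policy_node != NUMA_NO_NODE)
		return t->policy_node;
	return t->local_node;
}
```

In this model, a reclaimer running on node 0 whose policy prefers node 2
gets node 0 from pick_node_explicit(&t, NUMA_NO_NODE) but node 2 from
pick_node_task_policy(&t, false). Passing the page's own node (say 1) to
pick_node_explicit() returns 1 regardless of who is doing the reclaim,
which is the behavior proposed above for the zswap side.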