From: Yosry Ahmed <yosryahmed@google.com>
Date: Fri, 7 Jun 2024 10:24:23 -0700
Subject: Re: [PATCH] mm: zsmalloc: share slab caches for all zsmalloc zpools
To: Minchan Kim
Cc: Andrew Morton, Sergey Senozhatsky, Vlastimil Babka, David Rientjes, Christoph Lameter, Erhard Furtner, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yu Zhao
References: <20240604175340.218175-1-yosryahmed@google.com>
On Fri, Jun 7, 2024 at 9:14 AM Minchan Kim wrote:
>
> On Thu, Jun 06, 2024 at 04:03:55PM -0700, Yosry Ahmed wrote:
> > On Thu, Jun 6, 2024 at 3:36 PM Minchan Kim wrote:
> > >
> > > On Tue, Jun 04, 2024 at 05:53:40PM +0000, Yosry Ahmed wrote:
> > > > Zswap creates multiple zpools to improve concurrency. Each zsmalloc
> > > > zpool creates its own 'zs_handle' and 'zspage' slab caches. Currently
> > > > we end up with 32 slab caches of each type.
> > > >
> > > > Since each slab cache holds some free objects, we end up with a lot
> > > > of free objects distributed among the separate zpool caches. Slab
> > > > caches are designed to handle concurrent allocations by using percpu
> > > > structures, so having a single instance of each cache should be
> > > > enough, and avoids wasting more memory than needed due to
> > > > fragmentation.
> > > >
> > > > Additionally, having more slab caches than needed unnecessarily slows
> > > > down code paths that iterate slab_caches.
> > > >
> > > > In the results reported by Eric in [1], the amount of unused slab
> > > > memory in these caches goes down from 242808 bytes to 29216 bytes
> > > > (-88%). This is calculated by (num_objs - active_objs) * objsize for
> > > > each 'zs_handle' and 'zspage' cache. Although this patch did not help
> > > > with the allocation failure reported by Eric with zswap + zsmalloc, I
> > > > think it is still worth merging on its own.
> > > >
> > > > [1] https://lore.kernel.org/lkml/20240604134458.3ae4396a@yea/
> > >
> > > I doubt this is the right direction.
> > >
> > > Zsmalloc is used for various purposes, each with different object
> > > lifecycles. For example, swap operations typically involve short-lived
> > > objects, while filesystem use cases might have longer-lived objects.
> > > This mix of lifecycles could lead to fragmentation with this approach.
> >
> > Even in a swapfile, some objects can be short-lived and some can be
> > long-lived, and the line between swap and file systems becomes blurry
> > with shmem/tmpfs. I don't think having separate caches
>
> Many allocators differentiate object lifecycles to minimize
> fragmentation. While this isn't a new concept, you argue it's irrelevant
> without a clear-cut use case.
>
> > here is vital, but I am not generally familiar with the file system
> > use cases and I don't have data to prove/disprove it.
>
> The use case I had in mind was build output directories (e.g., Android).
> These consume object files in zram until the next build.
>
> Other potential scenarios involve separate zrams: one for foreground
> apps (short-term) and another for cached apps (long-term). Even
> zswap and zram could have different object lifecycles, as zswap might
> write back more aggressively.
>
> While you see no clear use cases, I disagree with dismissing this
> concept without strong justification.

I was just unaware of these use cases, as I mentioned. I didn't really
know how zram was used with file systems. Thanks for the examples :)

> > > I believe the original problem arose when zsmalloc reduced its lock
> > > granularity from the class level to a global level. Zswap then went
> > > on to mitigate the issue with multiple zpools, but that's essentially
> > > another band-aid on top of the existing problem, IMO.
> >
> > IIRC we reduced the granularity when we added writeback support to
> > zsmalloc, which was relatively recent. I think we have seen lock
> > contention with zsmalloc long before that. We have had a similar patch
> > internally to use multiple zpools in zswap for many years now.
> >
> > +Yu Zhao
> >
> > Yu has more historical context about this; I am hoping he will shed
> > more light on it.
> >
> > > The correct approach would be to further reduce the zsmalloc lock
> > > granularity.
> >
> > I definitely agree that the correct approach should be to fix the lock
> > contention at the source and drop zswap's usage of multiple zpools.
> > Nonetheless, I think this patch provides value in the meantime. The
> > fragmentation within the slab caches is real with zswap's use case.
> > OTOH, sharing a cache between swap and file system use cases, leading
> > to fragmentation within the same slab cache, is a less severe problem
> > in my opinion.
> >
> > That being said, I don't feel strongly.
> > If you really don't like this
> > patch, I am fine with dropping it.
>
> How about introducing a flag like "bool slab_merge" in zs_create_pool?
> This would allow zswap to unify slabs while other users keep separate
> caches.

Yeah, this should work, but I'll wait until we have more data and we know
whether we need to keep using multiple zpools for zswap.

I sent this patch because I thought it would be generally useful to share
caches (e.g. if we have zram and zswap on the same system), but given that
you said it is actually preferable for the caches to be separate, it may
not be. I'll wait for more data before sending any more patches to address
this.

Andrew, could you please take this out of mm-unstable? Thanks!