From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yosry Ahmed <yosryahmed@google.com>
Date: Thu, 20 Jul 2023 11:31:00 -0700
Subject: Re: [RFC PATCH v2 00/21] mm/zsmalloc: Split zsdesc from struct page
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Sergey Senozhatsky, Minchan Kim, Andrew Morton, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Matthew Wilcox, Mike Rapoport
References: <20230713042037.980211-1-42.hyeyoo@gmail.com>
 <20230720071826.GE955071@google.com>

On Thu, Jul 20, 2023 at 4:34 AM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> On Thu, Jul 20, 2023 at 4:55 PM Yosry Ahmed wrote:
> >
> > On Thu, Jul 20, 2023 at 12:18 AM Sergey Senozhatsky wrote:
> > >
> > > On (23/07/13 13:20), Hyeonggon Yoo wrote:
> > > > The purpose of this series is to define own memory descriptor for zsmalloc,
> > > > instead of re-using various fields of struct page. This is a part of the
> > > > effort to reduce the size of struct page to unsigned long and enable
> > > > dynamic allocation of memory descriptors.
> > > >
> > > > While [1] outlines this ultimate objective, the current use of struct page
> > > > is highly dependent on its definition, making it challenging to separately
> > > > allocate memory descriptors.
> > >
> > > I glanced through the series and it all looks pretty straight forward to
> > > me. I'll have a closer look. And we definitely need Minchan to ACK it.
> > >
> > > > Therefore, this series introduces new descriptor for zsmalloc, called
> > > > zsdesc. It overlays struct page for now, but will eventually be allocated
> > > > independently in the future.
> > >
> > > So I don't expect zsmalloc memory usage increase. On one hand for each
> > > physical page that zspage consists of we will allocate zsdesc (extra bytes),
> > > but at the same time struct page gets slimmer. So we should be even, or
> > > am I wrong?
> >
> > Well, it depends. Here is my understanding (which may be completely wrong):
> >
> > The end goal would be to have an 8-byte memdesc for each order-0 page,
> > and then allocate a specialized struct per-folio according to the use
> > case. In this case, we would have a memdesc and a zsdesc for each
> > order-0 page. If sizeof(zsdesc) is 64 bytes (on 64-bit), then it's a
> > net loss. The savings only start kicking in with higher order folios.
> > As of now, zsmalloc only uses order-0 pages as far as I can tell, so
> > the usage would increase if I understand correctly.
>
> I partially agree with you that the point of the memdesc stuff is
> allocating a use-case specific descriptor per folio, but I thought the
> primary gain from memdesc was from anon and file pages (where high
> order pages are more usable), rather than from zsmalloc.
>
> And I believe enabling a memory descriptor per folio would be
> impossible (or inefficient) if zsmalloc and other subsystems are using
> struct page in the current way (or please tell me I'm wrong?)
>
> So I expect the primary gain would be from high-order anon/file folios,
> while this series is a prerequisite for them to work sanely.

Right, I agree with that, sorry if I wasn't clear. I meant that,
generally speaking, we see the gains from memdesc with higher order
folios, so for zsmalloc specifically we probably won't be seeing any
savings, and *might* see some extra usage (which I might be wrong
about, see below).

>
> > It seems to me though the sizeof(zsdesc) is actually 56 bytes (on
> > 64-bit), so sizeof(zsdesc) + sizeof(memdesc) would be equal to the
> > current size of struct page. If that's true, then there is no loss,
>
> Yeah, zsdesc would be 56 bytes on 64-bit CPUs as the memcg_data field
> is not used in zsmalloc, so it's not a loss. More fields in the
> current struct page might not be needed in the future, although it's
> hard to say at the moment.

Is page->memcg_data something that we can drop? Aren't there code
paths that will check page->memcg_data even for kernel pages (e.g.
__folio_put() -> __folio_put_small() -> mem_cgroup_uncharge())?
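Either way, to spell out the arithmetic we keep referring to, here is a
purely illustrative, standalone snippet (the 8/56/64-byte figures are
just the numbers quoted in this thread, not measurements, and nothing
here is taken from the patch series):

#include <stdio.h>

#define SIZEOF_STRUCT_PAGE 64 /* struct page today, on 64-bit */
#define SIZEOF_MEMDESC      8 /* hypothetical per-page memdesc */

int main(void)
{
        int zsdesc_packed  = 56; /* memcg_data not carried over */
        int zsdesc_aligned = 64; /* padded back to a cache line */

        /* Cost per order-0 page used by zsmalloc: */
        printf("today:                    %d bytes\n", SIZEOF_STRUCT_PAGE);
        printf("memdesc + packed zsdesc:  %d bytes\n",
               SIZEOF_MEMDESC + zsdesc_packed);  /* 64: break-even */
        printf("memdesc + aligned zsdesc: %d bytes\n",
               SIZEOF_MEMDESC + zsdesc_aligned); /* 72: net loss */
        return 0;
}

So the break-even case is exactly the 56 + 8 = 64 above, and a loss
only appears if zsdesc ends up padded back out to 64 bytes.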
> > and there's potential gain if we start using higher order folios in
> > zsmalloc in the future.
>
> AFAICS zsmalloc should work even when the system memory is fragmented,
> so we may implement fallback allocation (as currently discussed in the
> large anon folios thread).

Of course, any usage of higher order folios in zsmalloc must have
fallback logic, although it might be simpler for zsmalloc than for anon
folios. I agree that's off topic here.

>
> It might work, but IMHO the purpose of this series is to enable memdesc
> for large anon/file folios, rather than seeing a large gain in zsmalloc
> itself. (But even in zsmalloc, it's not a loss)
>
> > (That is of course unless we want to maintain cache line alignment for
> > the zsdescs, then we might end up using 64 bytes anyway).
>
> We already don't require cache line alignment for struct page. The
> current alignment requirement is due to SLUB's cmpxchg128 operation,
> not cache line alignment.

I thought we want struct page to be cache line aligned (to avoid having
to fetch two cache lines for one struct page), but I can easily be
wrong.

>
> I might be wrong in some aspects, so please tell me if I am.
> And thank you and Sergey for taking a look at this!

Thanks to you for doing the work!

> --
> Hyeonggon
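P.S. Since fallback allocation came up, here is a hand-wavy sketch of
the general shape (this is not code from zsmalloc, from this series, or
from the large anon folios thread; the helper name and the gfp tweaks
are made up for illustration): try the higher order opportunistically,
and fall back to order-0 pages when memory is too fragmented.

#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical helper, only to illustrate the fallback shape. */
static struct page *zs_alloc_pages_fallback(gfp_t gfp, unsigned int order)
{
        struct page *page;

        /* Opportunistic high-order attempt: no reclaim, no warning. */
        page = alloc_pages((gfp | __GFP_NOWARN) & ~__GFP_DIRECT_RECLAIM,
                           order);
        if (page)
                return page;

        /* Fragmented system: fall back to a single order-0 page. */
        return alloc_pages(gfp, 0);
}

Whether the high-order attempt is worth doing at all (and how a zspage
would chain a compound page) is a separate policy question, so treat
this as a skeleton only.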