linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Minchan Kim <minchan@kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>,
	 Mike Rapoport <rppt@kernel.org>
Subject: Re: [RFC PATCH v2 00/21] mm/zsmalloc: Split zsdesc from struct page
Date: Fri, 21 Jul 2023 06:33:36 +0900	[thread overview]
Message-ID: <CAB=+i9R+BnePZWJGm-5xi+2HEiM=_5EBZtFGmaHye+iZvVR2Vw@mail.gmail.com> (raw)
In-Reply-To: <CAJD7tkbO0W4woJtidbQU0F2iOCQcDG024c6EZ1ZOb2sLOG1ovg@mail.gmail.com>

On Fri, Jul 21, 2023 at 3:31 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Thu, Jul 20, 2023 at 4:34 AM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >
> > On Thu, Jul 20, 2023 at 4:55 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> > >
> > > On Thu, Jul 20, 2023 at 12:18 AM Sergey Senozhatsky
> > > <senozhatsky@chromium.org> wrote:
> > > >
> > > > On (23/07/13 13:20), Hyeonggon Yoo wrote:
> > > > > The purpose of this series is to define own memory descriptor for zsmalloc,
> > > > > instead of re-using various fields of struct page. This is a part of the
> > > > > effort to reduce the size of struct page to unsigned long and enable
> > > > > dynamic allocation of memory descriptors.
> > > > >
> > > > > While [1] outlines this ultimate objective, the current use of struct page
> > > > > is highly dependent on its definition, making it challenging to separately
> > > > > allocate memory descriptors.
> > > >
> > > > I glanced through the series and it all looks pretty straight forward to
> > > > me. I'll have a closer look. And we definitely need Minchan to ACK it.
> > > >
> > > > > Therefore, this series introduces new descriptor for zsmalloc, called
> > > > > zsdesc. It overlays struct page for now, but will eventually be allocated
> > > > > independently in the future.
> > > >
> > > > So I don't expect zsmalloc memory usage increase. On one hand for each
> > > > physical page that zspage consists of we will allocate zsdesc (extra bytes),
> > > > but at the same time struct page gets slimmer. So we should be even, or
> > > > am I wrong?
> > >
> > > Well, it depends. Here is my understanding (which may be completely wrong):
> > >
> > > The end goal would be to have an 8-byte memdesc for each order-0 page,
> > > and then allocate a specialized struct per-folio according to the use
> > > case. In this case, we would have a memdesc and a zsdesc for each
> > > order-0 page. If sizeof(zsdesc) is 64 bytes (on 64-bit), then it's a
> > > net loss. The savings only start kicking in with higher order folios.
> > > As of now, zsmalloc only uses order-0 pages as far as I can tell, so
> > > the usage would increase if I understand correctly.
> >
> > I partially agree with you that the point of memdesc stuff is
> > allocating a use-case specific
> > descriptor per folio. but I thought the primary gain from memdesc was
> > from anon and file pages
> > (where high order pages are more usable), rather than zsmalloc.
> >
> > And I believe enabling a memory descriptor per folio would be
> > impossible (or inefficient)
> > if zsmalloc and other subsystems are using struct page in the current
> > way (or please tell me I'm wrong?)
> >
> > So I expect the primary gain would be from high-order anon/file folios,
> > while this series is a prerequisite for them to work sanely.
>
> Right, I agree with that, sorry if I wasn't clear. I meant that
> generally speaking, we see gains from memdesc from higher order
> folios, so for zsmalloc specifically we probably won't see seeing any
> savings, and *might* see some extra usage (which I might be wrong
> about, see below).

Yeah, even if I said, "oh, we don't necessarily need to use extra
memory for zsdesc"
below, a slight increase wouldn't hurt too much in that perspective,
because there
will be savings from other users of memdesc.

> > > It seems to me though the sizeof(zsdesc) is actually 56 bytes (on
> > > 64-bit), so sizeof(zsdesc) + sizeof(memdesc) would be equal to the
> > > current size of struct page. If that's true, then there is no loss,
> >
> > Yeah, zsdesc would be 56 bytes on 64 bit CPUs as memcg_data field is
> > not used in zsmalloc.
> > More fields in the current struct page might not be needed in the
> > future, although it's hard to say at the moment.
> > but it's not a loss.
>
> Is page->memcg_data something that we can drop? Aren't there code
> paths that will check page->memcg_data even for kernel pages (e.g.
> __folio_put() -> __folio_put_small() -> mem_cgroup_uncharge() ) ?

zsmalloc pages are not accounted for via __GFP_ACCOUNT,
and IIUC the current implementation of zswap memcg charging does not
use memcg_data
either - so I think it can be dropped.

I think we don't want to increase memdesc to 16 bytes by adding memcg_data.
It should be in use-case specific descriptors if it can be charged to memcg?

> > > and there's potential gain if we start using higher order folios in
> > > zsmalloc in the future.
> >
> > AFAICS zsmalloc should work even when the system memory is fragmented,
> > so we may implement fallback allocation (as currently discussed in
> > large anon folios thread).
>
> Of course, any usage of higher order folios in zsmalloc must have a
> fallback logic, although it might be simpler for zsmalloc than anon
> folios. I agree that's off topic here.
> > It might work, but IMHO the purpose of this series is to enable memdesc
> > for large anon/file folios, rather than seeing a large gain in zsmalloc itself.
> > (But even in zsmalloc, it's not a loss)
> >
> > > (That is of course unless we want to maintain cache line alignment for
> > > the zsdescs, then we might end up using 64 bytes anyway).
> >
> > we already don't require cache line alignment for struct page. the current
> > alignment requirement is due to SLUB's cmpxchg128 operation, not cache
> > line alignment.
>
> I thought we want  struct page to be cache line aligned (to avoid
> having to fetch two cache lines for one struct page), but I can easily
> be wrong.

Right. I admit that even if it's not required to be cache line
aligned, it is 64 bytes
in commonly used configurations. and changing it could affect some workloads.

But I think for zsdesc it would be better not to align by cache line
size, before
observing degradations due to alignment. By the time zsmalloc is intensively
used, it shouldn't be a huge issue.

> > I might be wrong in some aspects, so please tell me if I am.
> > And thank you and Sergey for taking a look at this!
>
> Thanks to you for doing the work!

No problem! :)


  reply	other threads:[~2023-07-20 21:33 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-13  4:20 Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 01/21] mm/zsmalloc: create new struct zsdesc Hyeonggon Yoo
2023-07-20  7:47   ` Sergey Senozhatsky
2023-07-13  4:20 ` [RFC PATCH v2 02/21] mm/zsmalloc: add utility functions for zsdesc Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 03/21] mm/zsmalloc: replace first_page to first_zsdesc in struct zspage Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 04/21] mm/zsmalloc: add alternatives of frequently used helper functions Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 05/21] mm/zsmalloc: convert {try,}lock_zspage() to use zsdesc Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 06/21] mm/zsmalloc: convert __zs_{map,unmap}_object() " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 07/21] mm/zsmalloc: convert obj_to_location() and its users " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 08/21] mm/zsmalloc: convert obj_malloc() " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 09/21] mm/zsmalloc: convert create_page_chain() and its user " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 10/21] mm/zsmalloc: convert obj_allocated() and related helpers " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 11/21] mm/zsmalloc: convert init_zspage() " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 12/21] mm/zsmalloc: convert obj_to_page() and zs_free() " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 13/21] mm/zsmalloc: convert reset_page() to reset_zsdesc() Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 14/21] mm/zsmalloc: convert zs_page_{isolate,migrate,putback} to use zsdesc Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 15/21] mm/zsmalloc: convert __free_zspage() " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 16/21] mm/zsmalloc: convert location_to_obj() " Hyeonggon Yoo
2023-07-20  7:49   ` Sergey Senozhatsky
2023-07-13  4:20 ` [RFC PATCH v2 17/21] mm/zsmalloc: convert migrate_zspage() " Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 18/21] mm/zsmalloc: convert get_zspage() to take zsdesc Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 19/21] mm/zsmalloc: convert SetZsPageMovable() to use zsdesc Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 20/21] mm/zsmalloc: remove now unused helper functions Hyeonggon Yoo
2023-07-13  4:20 ` [RFC PATCH v2 21/21] mm/zsmalloc: convert {get,set}_first_obj_offset() to use zsdesc Hyeonggon Yoo
2023-07-20  7:18 ` [RFC PATCH v2 00/21] mm/zsmalloc: Split zsdesc from struct page Sergey Senozhatsky
2023-07-20  7:54   ` Yosry Ahmed
2023-07-20 11:34     ` Hyeonggon Yoo
2023-07-20 18:31       ` Yosry Ahmed
2023-07-20 21:33         ` Hyeonggon Yoo [this message]
2023-07-20 21:38           ` Yosry Ahmed
2023-07-20 21:52             ` Hyeonggon Yoo
2023-07-20 21:57               ` Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAB=+i9R+BnePZWJGm-5xi+2HEiM=_5EBZtFGmaHye+iZvVR2Vw@mail.gmail.com' \
    --to=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=rppt@kernel.org \
    --cc=senozhatsky@chromium.org \
    --cc=willy@infradead.org \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox