linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosryahmed@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	 Chris Li <chrisl@kernel.org>, Linux-MM <linux-mm@kvack.org>
Subject: Re: [RFC] Storing same-filled pages without a zswap_entry
Date: Wed, 20 Mar 2024 14:14:16 -0700	[thread overview]
Message-ID: <CAJD7tkaLYN1DY8L2NE+037u+YxFMuDydpagXHoksA9iEvDL1yg@mail.gmail.com> (raw)
In-Reply-To: <20240320210716.GH294822@cmpxchg.org>

On Wed, Mar 20, 2024 at 2:07 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Wed, Mar 20, 2024 at 01:49:17PM -0700, Yosry Ahmed wrote:
> > Hey folks,
> >
> > I was looking at cleaning up the same-filled handling code in zswap,
> > when it hit me that after the xarray conversion, the only member of
> > struct zwap_entry that is relevant to same-filled pages is now the
> > objcg pointer.
> >
> > The xarray allows a pointer to be tagged by up to two tags (1 and 3),
> > so we can completely avoid allocating a zswap_entry for same-filled
> > pages by storing a tagged objcg pointer directly in the xarray
> > instead.
> >
> > Basically the xarray would then either have a pointer to struct
> > zswap_entry or struct obj_cgroup, where the latter is tagged as
> > SAME_FILLED_ONE or SAME_FILLED_ZERO.
> >
> > There are two benefits of this:
> > - Saving some memory (precisely 64 bytes per same-filled entry).
> > - Further separating handling of same-filled pages from compressed
> > pages, which results in some nice cleanups (especially in
> > zswap_store()). It also makes further improvements easier (e.g.
> > skipping limit checking for same-filled entries).
>
> This sounds interesting.
>
> Where would you store the byte value it's filled with? Or would you
> limit it to zero-filled only?

Oh yeah I should have explicitly stated this, it would be limited to
pages filled with the same bit (all 0s or all 1s), assuming that most
same-filled entries would be zero-filled anyway.

>
> > The disadvantage is obviously the complexity needed to handle two
> > different types of pointers in the xarray, although I think with the
> > correct abstractions this is not a big deal.
> >
> > I have some untested patches that implement this that I plan on
> > testing and sending out at some point, the reason I am sending this
> > RFC now is to gauge interest. I am not sure how common same-filled
> > pages are. Unfortunately, this data is not easy to collect from our
> > fleet (still working on it), so if someone has data from actual
> > workloads that would be helpful.
> >
> > Running the kernel build test only shows a small amount of same-filled
> > pages landing in zswap, but I am thinking maybe actual workloads have
> > more zerod pages lying around.
>
> In our fleet, same-filled pages seem to average pretty consistently at
> 10% of total stored entries.
>
> I'd assume they're mostly zero-filled instead of other patterns, but I
> don't have any data on that.

I think this would be key to know if this is doable. If most
same-filled pages are zero pages, we can limit zswap's same-filled
feature for those and do the optimization. Otherwise we cannot do it,
as we actually store an entire word now as the same-filled "value".

Alternatively, we can have a separate struct that only has objcg and
value for same-filled pages. This means we still use 16 bytes (on
x86_64) per same-filled entry. I think we can still get the code
cleanups by separating handling for same-filled pages though?

>
> The savings from the entry would be a few megabytes per host, probably
> not an argument by itself. But the code simplifications and shaving a
> few cycles of the fast paths do sound promising.


  reply	other threads:[~2024-03-20 21:14 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-20 20:49 Yosry Ahmed
2024-03-20 21:07 ` Johannes Weiner
2024-03-20 21:14   ` Yosry Ahmed [this message]
2024-03-20 21:19   ` Johannes Weiner
2024-03-20 21:31     ` Yosry Ahmed
2024-03-21  3:40       ` Chengming Zhou
2024-03-21 18:44         ` Yosry Ahmed
2024-03-21 19:29           ` Johannes Weiner
2024-03-21 19:57             ` Yosry Ahmed
2024-03-22  0:28               ` Chris Li
2024-03-22  0:37                 ` Yosry Ahmed
2024-03-20 21:42   ` Chris Li
2024-03-21  4:26 ` Chris Li
2024-03-21 18:50   ` Yosry Ahmed
2024-03-22  0:18     ` Chris Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJD7tkaLYN1DY8L2NE+037u+YxFMuDydpagXHoksA9iEvDL1yg@mail.gmail.com \
    --to=yosryahmed@google.com \
    --cc=chengming.zhou@linux.dev \
    --cc=chrisl@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox