linux-mm.kvack.org archive mirror
From: Yosry Ahmed <yosryahmed@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: 贺中坤 <hezhongkun.hzk@bytedance.com>, "Yu Zhao" <yuzhao@google.com>,
	minchan@kernel.org, senozhatsky@chromium.org, mhocko@suse.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Fabian Deutsch" <fdeutsch@redhat.com>
Subject: Re: [External] Re: [RFC PATCH 1/3] zram: charge the compressed RAM to the page's memcgroup
Date: Fri, 16 Jun 2023 01:39:53 -0700
Message-ID: <CAJD7tkY7CMLFS7Kv-DYnPwO9cGVWZmQZzkgOfVJMkH2pO8Kt9Q@mail.gmail.com>
In-Reply-To: <dede2f5b-2ae5-6fa3-c0d5-3ce7fba11694@redhat.com>

On Fri, Jun 16, 2023 at 1:37 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 16.06.23 10:04, Yosry Ahmed wrote:
> > On Fri, Jun 16, 2023 at 12:57 AM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 16.06.23 09:37, Yosry Ahmed wrote:
> >>> On Thu, Jun 15, 2023 at 9:41 PM 贺中坤 <hezhongkun.hzk@bytedance.com> wrote:
> >>>>
> >>>>> Thanks Fabian for tagging me.
> >>>>>
> > >>>>> I am not familiar with #1, so I will speak to #2. Zhongkun, there are
> >>>>> a few parts that I do not understand -- hopefully you can help me out
> >>>>> here:
> >>>>>
> >>>>> (1) If I understand correctly in this patch we set the active memcg
> >>>>> trying to charge any pages allocated in a zspage to the current memcg,
> >>>>> yet that zspage will contain multiple compressed object slots, not
> >>>>> just the one used by this memcg. Aren't we overcharging the memcg?
> >>>>> Basically the first memcg that happens to allocate the zspage will pay
> >>>>> for all the objects in this zspage, even after it stops using the
> >>>>> zspage completely?
> >>>>
> >>>> It will not overcharge.  As you said below, we are not using
> >>>> __GFP_ACCOUNT and charging the compressed slots to the memcgs.
> >>>>
> >>>>>
> >>>>> (2) Patch 3 seems to be charging the compressed slots to the memcgs,
> >>>>> yet this patch is trying to charge the entire zspage. Aren't we double
> >>>>> charging the zspage? I am guessing this isn't happening because (as
> >>>>> Michal pointed out) we are not using __GFP_ACCOUNT here anyway, so
> >>>>> this patch may be NOP, and the actual charging is coming from patch 3
> >>>>> only.
> >>>>
> > >>>> Yes, the actual charging is coming from patch 3. This patch just
> > >>>> delivers the BIO page's memcg to the current task, which is not the
> > >>>> consumer.
> >>>>
> >>>>>
> >>>>> (3) Zswap recently implemented per-memcg charging of compressed
> >>>>> objects in a much simpler way. If your main interest is #2 (which is
> >>>>> what I understand from the commit log), it seems like zswap might be
> >>>>> providing this already? Why can't you use zswap? Is it the fact that
> >>>>> zswap requires a backing swapfile?
> >>>>
> > >>>> Thanks for your reply and review. Yes, zswap requires a backing
> > >>>> swapfile. The I/O path is very complex, and it can sometimes throttle
> > >>>> the whole system when resources run short, so we hope to use zram.
> >>>
> >>> Is the only problem with zswap for you the requirement of a backing swapfile?
> >>>
> >>> If yes, I am in the early stages of developing a solution to make
> >>> zswap work without a backing swapfile. This was discussed in LSF/MM
> > >>> [1]. Would this make zswap usable for your use case?
> >>
> >> Out of curiosity, are there any other known pros/cons when using
> >> zswap-without-swap instead of zram?
> >>
> > >> I know that zram requires sizing (size of the virtual block device) and
> > >> consumes metadata; zswap doesn't.
> >
> > We don't use zram in our data centers so I am not an expert about
> > zram, but off the top of my head there are a few more advantages to
> > zswap:
>
> Thanks!
>
> > (1) Better memcg support (which this series is attempting to address
> > in zram, although in a much more complicated way).
>
> Right. I think this patch also fails to apply the charging in the recompress
> case (only triggered by user space, IIUC).
>
> >
> > (2) We internally have incompressible memory handling on top of zswap,
> > which is something that we would like to upstream when
> > zswap-without-swap is supported. Basically if a page does not compress
> > well enough to save memory we reject it from zswap and make it
> > unevictable (if there is no backing swapfile). The existence of zswap
> > in the MM layer helps with this. Since zram is a block device from the
> > MM perspective, it's more difficult to do something like this.
> > Incompressible pages just sit in zram AFAICT.
>
> I see. With ZRAM_HUGE we still have to store the uncompressed page
> (because it's a block device and has to hold that data).

Right.
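
The incompressible-page policy described in (2) above can be sketched in
user space as follows. This is only an illustration of the decision, not
kernel code: zlib stands in for whatever compressor is configured, and
PAGE_SIZE, MAX_COMPRESSED_SIZE, and both function names are assumptions
made up for this example.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes

# Hypothetical cutoff: a page must shrink to at most 3/4 of its size to be
# considered worth storing compressed. This is not a kernel constant.
MAX_COMPRESSED_SIZE = PAGE_SIZE * 3 // 4


def compresses_well(page: bytes, limit: int = MAX_COMPRESSED_SIZE) -> bool:
    """Return True if the compressed page fits within the size limit."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page of data")
    return len(zlib.compress(page)) <= limit


def store_decision(page: bytes, has_backing_swapfile: bool) -> str:
    """Decide what a zswap-like layer would do with one page.

    Pages that compress well are stored compressed. Pages that do not are
    rejected; without a backing swapfile they are additionally marked
    unevictable, as described in the thread. What happens to a rejected
    page when a swapfile exists is outside this sketch.
    """
    if compresses_well(page):
        return "store-compressed"
    if has_backing_swapfile:
        return "reject"
    return "reject-and-mark-unevictable"


if __name__ == "__main__":
    print(store_decision(bytes(PAGE_SIZE), has_backing_swapfile=False))
    print(store_decision(os.urandom(PAGE_SIZE), has_backing_swapfile=False))
```

A block device such as zram has no equivalent of the reject path: it has
to hold whatever data is written to it, which is why the ZRAM_HUGE case
keeps the page uncompressed.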

>
> >
> > (3) Writeback support. If you're running out of memory to store
> > compressed pages you can add a swapfile at runtime, and zswap will
> > start writing to it, freeing up space to compress more pages. This
> > wouldn't be possible in the same way in zram. Zram supports writing to
> > a backing device but in a more manual way (userspace has to write to
> > an interface to tell zram to write some pages).
>
> Right, that zram backing device stuff is really sub-optimal and only useful
> in corner cases (most probably not datacenters).
>
> What one can do with zram is to add a second swap device with lower priority.
> Looking at my Fedora machine:
>
>   $ cat /proc/swaps
> Filename                                Type            Size            Used            Priority
> /dev/dm-2                               partition       16588796        0               -2
> /dev/zram0                              partition       8388604         0               100
>
>
> Guess the difference here is that you won't be writing out the compressed
> data to the disk, but anything that gets swapped out afterwards will
> end up on the disk. I can see how the zswap behavior might be better in that case
> (instead of swapping out some additional pages you relocate the
> already-swapped-out-to-zswap pages to the disk).

Yeah, I am hoping we can enable the use of zswap without a backing
swapfile, and I keep seeing use cases that would benefit from that.
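
For the two-device setup quoted above, a small sketch of how the
/proc/swaps output can be read: the kernel fills swap devices in
descending priority order, so zram0 (priority 100) is used before the
disk partition (priority -2). The sample text is copied from the quoted
output; parse_swaps and fill_order are hypothetical helper names, and the
parser assumes the five-column layout shown there.

```python
# Sample copied from the /proc/swaps output quoted in this thread.
SAMPLE = """\
Filename                                Type            Size            Used            Priority
/dev/dm-2                               partition       16588796        0               -2
/dev/zram0                              partition       8388604         0               100
"""


def parse_swaps(text: str) -> list[dict]:
    """Parse /proc/swaps-style text into one dict per swap device."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or malformed lines
        name, kind, size, used, priority = fields
        rows.append({
            "name": name,
            "type": kind,
            "size_kib": int(size),   # /proc/swaps reports sizes in KiB
            "used_kib": int(used),
            "priority": int(priority),
        })
    return rows


def fill_order(rows: list[dict]) -> list[str]:
    """Return device names in the order they are filled (highest priority first)."""
    ordered = sorted(rows, key=lambda row: row["priority"], reverse=True)
    return [row["name"] for row in ordered]


if __name__ == "__main__":
    print(fill_order(parse_swaps(SAMPLE)))
```

On a live system the same functions could be fed the contents of
/proc/swaps instead of SAMPLE.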

>
> --
> Cheers,
>
> David / dhildenb
>

