linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Nhat Pham <nphamcs@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, ngupta@vflare.org,
	sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: Re: [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size_class's locks
Date: Thu, 3 Nov 2022 14:08:01 -0400	[thread overview]
Message-ID: <Y2QDgSAc7FcIaNBh@cmpxchg.org> (raw)
In-Reply-To: <Y2Pj+fuON8lTMcmn@google.com>

On Thu, Nov 03, 2022 at 08:53:29AM -0700, Minchan Kim wrote:
> On Thu, Nov 03, 2022 at 11:18:04AM -0400, Johannes Weiner wrote:
> > On Wed, Nov 02, 2022 at 02:36:35PM -0700, Minchan Kim wrote:
> > > On Wed, Nov 02, 2022 at 12:28:56PM +0900, Sergey Senozhatsky wrote:
> > > > On (22/10/26 13:06), Nhat Pham wrote:
> > > > >  struct size_class {
> > > > > -	spinlock_t lock;
> > > > >  	struct list_head fullness_list[NR_ZS_FULLNESS];
> > > > >  	/*
> > > > >  	 * Size of objects stored in this class. Must be multiple
> > > > > @@ -247,8 +245,7 @@ struct zs_pool {
> > > > >  #ifdef CONFIG_COMPACTION
> > > > >  	struct work_struct free_work;
> > > > >  #endif
> > > > > -	/* protect page/zspage migration */
> > > > > -	rwlock_t migrate_lock;
> > > > > +	spinlock_t lock;
> > > > >  };
> > > > 
> > > > I'm not in love with this, to be honest. One big pool lock instead
> > > > of 255 per-class locks doesn't look attractive, as one big pool lock
> > > > is going to be hammered quite a lot when zram is used, e.g. as a regular
> > > > block device with a file system and is under heavy parallel writes/reads.
> > 
> > TBH the class always struck me as an odd scope to split the lock. Lock
> > contention depends on how variable the compression rate is of the
> > hottest incoming data, which is unpredictable from a user POV.
> > 
> > My understanding is that the primary usecase for zram is swapping, and
> > the pool lock is the same granularity as the swap locking.
> 
> People uses the zram to store caching object files in build server.

Oh, interesting. We can try with a kernel build directory on zram.

> > Do you have a particular one in mind? (I'm thinking journaled ones are
> > not of much interest, since their IO tends to be fairly serialized.)
> > 
> > btrfs?
> 
> I am not sure what FSes others are using but at least for me, just
> plain ext4.

Okay, we can test with both.

> > > I am also worry about that LRU stuff should be part of allocator
> > > instead of higher level.
> > 
> > I'm sorry, but that's not a reasonable objection.
> > 
> > These patches implement a core feature of being a zswap backend, using
> > standard LRU and locking techniques established by the other backends.
> > 
> > I don't disagree that it would nicer if zswap had a strong abstraction
> > for backend pages and a generalized LRU. But that is major surgery on
> > a codebase of over 6,500 lines. It's not a reasonable ask to change
> > all that first before implementing a basic feature that's useful now.
> 
> With same logic, folks added the LRU logic into their allocators
> without the effort considering moving the LRU into upper layer.
> 
> And then trend is still going on since I have seen multiple times
> people are trying to add more allocators. So if it's not a reasonable
> ask to consier, we couldn't stop the trend in the end.

So there is actually an ongoing effort to do that. Yosry and I have
spent quite some time on coming up with an LRU design that's
independent from compression policy over email and at Plumbers.

My concern is more about the order of doing things:

1. The missing writeback support is a gaping hole in zsmalloc, which
   affects production systems. A generalized LRU list is a good idea,
   but it's a huge task that from a user pov really is not
   critical. Even from a kernel dev / maintainer POV, there are bigger
   fish to fry in the zswap code base and the backends than this.

2. Refactoring existing functionality is much easier than writing
   generalized code that simultaneously enables new behavior. zsmalloc
   is the most complex of our backends. To make its LRU writeback work
   we had to patch zswap's ->map ordering to accomodate it, e.g. Such
   tricky changes are easier to make and test incrementally.

   The generalized LRU project will hugely benefit from already having
   a proven writeback implementation in zsmalloc, because then all the
   requirements in zswap and zsmalloc will be in black and white.

> > I get that your main interest is zram, and so this feature isn't of
> > interest to you. But zram isn't the only user, nor is it the primary
> 
> I am interest to the feature but my interest is more of general swap
> layer to manage the LRU so that it could support any hierarchy among
> swap devices, not only zswap.

I think we're on the same page about the longer term goals.


  reply	other threads:[~2022-11-03 18:08 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-26 20:06 [PATCH 0/5] Implement writeback for zsmalloc Nhat Pham
2022-10-26 20:06 ` [PATCH 1/5] zswap: fix writeback lock ordering " Nhat Pham
2022-10-26 20:06 ` [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size_class's locks Nhat Pham
2022-10-28 14:46   ` Johannes Weiner
2022-11-02  3:28   ` Sergey Senozhatsky
2022-11-02 21:36     ` Minchan Kim
2022-11-03 15:18       ` Johannes Weiner
2022-11-03 15:53         ` Minchan Kim
2022-11-03 18:08           ` Johannes Weiner [this message]
2022-11-03 18:10             ` Yosry Ahmed
2022-11-03 20:37               ` Minchan Kim
2022-11-03 20:46                 ` Yosry Ahmed
2022-11-03 21:15                   ` Yu Zhao
2022-11-03 23:19                     ` Yosry Ahmed
2022-11-03 21:43                   ` Minchan Kim
2022-11-03 23:31                     ` Yosry Ahmed
2022-11-03 20:22             ` Minchan Kim
2022-11-04  3:58         ` Sergey Senozhatsky
2022-11-07 21:31           ` Nhat Pham
2022-11-07 22:35             ` Minchan Kim
2022-10-26 20:06 ` [PATCH 3/5] zsmalloc: Add a LRU to zs_pool to keep track of zspages in LRU order Nhat Pham
2022-10-28 14:47   ` Johannes Weiner
2022-10-26 20:06 ` [PATCH 4/5] zsmalloc: Add ops fields to zs_pool to store evict handlers Nhat Pham
2022-10-28 15:18   ` Johannes Weiner
2022-11-02  4:10   ` Sergey Senozhatsky
2022-11-07 21:36     ` Nhat Pham
2022-10-26 20:06 ` [PATCH 5/5] zsmalloc: Implement writeback mechanism for zsmalloc Nhat Pham
2022-10-27 13:53   ` kernel test robot
2022-10-27 18:27     ` [PATCH v2 " Nhat Pham
2022-10-28 15:19       ` Johannes Weiner
2022-11-02  3:42       ` Sergey Senozhatsky
2022-11-03 15:26         ` Johannes Weiner
2022-11-02  3:44       ` Sergey Senozhatsky
2022-11-02  4:13       ` Sergey Senozhatsky
2022-11-03 16:45         ` Johannes Weiner
2022-11-04  4:02           ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2QDgSAc7FcIaNBh@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=ddstreet@ieee.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=nphamcs@gmail.com \
    --cc=senozhatsky@chromium.org \
    --cc=sjenning@redhat.com \
    --cc=vitaly.wool@konsulko.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox