From: Yosry Ahmed <yosryahmed@google.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
hannes@cmpxchg.org, chengming.zhou@linux.dev,
usamaarif642@gmail.com, shakeel.butt@linux.dev,
ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com,
akpm@linux-foundation.org, nanhai.zou@intel.com,
wajdi.k.feghali@intel.com, vinodh.gopal@intel.com
Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store().
Date: Tue, 24 Sep 2024 14:38:00 -0700 [thread overview]
Message-ID: <CAJD7tkYCXexrP_2xjXqFDpJALCgi84aA7wGOo=6mfuUSpMO-ng@mail.gmail.com> (raw)
In-Reply-To: <CAKEwX=O4PJmLRLog3NGzy+r6+1XTXs_r9Nxs73CJeFeN0pcr+Q@mail.gmail.com>
On Tue, Sep 24, 2024 at 1:51 PM Nhat Pham <nphamcs@gmail.com> wrote:
>
> On Tue, Sep 24, 2024 at 12:39 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar
> > > + * The cgroup zswap limit check is done once at the beginning of an
> > > + * mTHP store, and not within zswap_store_page() for each page
> > > + * in the mTHP. We do however check the zswap pool limits at the
> > > + * start of zswap_store_page(). What this means is, the cgroup
> > > + * could go over the limits by at most (HPAGE_PMD_NR - 1) pages.
> > > + * However, the per-store-page zswap pool limits check should
> > > + * hopefully trigger the cgroup aware and zswap LRU aware global
> > > + * reclaim implemented in the shrinker. If this assumption holds,
> > > + * the cgroup exceeding the zswap limits could potentially be
> > > + * resolved before the next zswap_store, and if it is not, the next
> > > + * zswap_store would fail the cgroup zswap limit check at the start.
> > > + */
> >
> > I do not really like this. Allowing going one page above the limit is
> > one thing, but one THP above the limit seems too much. I also don't
>
> Hmm what if you have multiple concurrent zswap stores, from different
> tasks but the same cgroup? If none of them has charged, they would all
> get greenlit, and charge towards the cgroup...
>
> So technically the zswap limit checking is already best-effort only.
> But now, instead of one page per violation, it's 512 pages per
> violation :)
Yeah good point about concurrent operations, we can go 512 pages above
limit * number of concurrent swapouts. That can be a lot of memory.
>
> Yeah this can be bad. I think this is only safe if you only use
> zswap.max as a binary knob (0 or max)...
>
> > like relying on the repeated limit checking in zswap_store_page(), if
> > anything I think that should be batched too.
> >
> > Is it too unreasonable to maintain the average compression ratio and
> > use that to estimate limit checking for both memcg and global limits?
> > Johannes, Nhat, any thoughts on this?
>
> I remember asking about this, but past Nhat might have relented :)
>
> https://lore.kernel.org/linux-mm/CAKEwX=PfAMZ2qJtwKwJsVx3TZWxV5z2ZaU1Epk1UD=DBdMsjFA@mail.gmail.com/
>
> We can do limit checking and charging after compression is done, but
> that's a lot of code change (might not even be possible)... It will,
> however, allow us to do charging + checking in one go (rather than
> doing it 8, 16, or 512 times)
>
> Another thing we can do is to register a zswap writeback after the
> zswap store attempts to clean up excess capacity. Not sure what will
> happen if zswap writeback is disabled for the cgroup though :)
>
> If it's too hard, the average estimate could be a decent compromise,
> until we figure something smarter.
We can also do what we discussed before about double charging. The
pages that are being reclaimed are already charged, so technically we
don't need to charge them again. We can uncharge the difference
between compressed and uncompressed sizes after compression and call
it a day. This fixes the limit checking and the double charging in one
go.
I am a little bit nervous though about zswap uncharing the pages from
under reclaim, there are likely further accesses of the page memcg
after zswap. Maybe we can plumb the info back to reclaim or set a flag
on the page to avoid uncharging it when it's freed.
next prev parent reply other threads:[~2024-09-24 21:38 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-24 1:17 [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios Kanchana P Sridhar
2024-09-24 1:17 ` [PATCH v7 1/8] mm: Define obj_cgroup_get() if CONFIG_MEMCG is not defined Kanchana P Sridhar
2024-09-24 16:45 ` Nhat Pham
2024-09-24 1:17 ` [PATCH v7 2/8] mm: zswap: Modify zswap_compress() to accept a page instead of a folio Kanchana P Sridhar
2024-09-24 16:50 ` Nhat Pham
2024-09-24 1:17 ` [PATCH v7 3/8] mm: zswap: Refactor code to store an entry in zswap xarray Kanchana P Sridhar
2024-09-24 17:16 ` Nhat Pham
2024-09-24 20:40 ` Sridhar, Kanchana P
2024-09-24 19:14 ` Yosry Ahmed
2024-09-24 22:22 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 4/8] mm: zswap: Refactor code to delete stored offsets in case of errors Kanchana P Sridhar
2024-09-24 17:25 ` Nhat Pham
2024-09-24 20:41 ` Sridhar, Kanchana P
2024-09-24 19:20 ` Yosry Ahmed
2024-09-24 22:32 ` Sridhar, Kanchana P
2024-09-25 0:43 ` Yosry Ahmed
2024-09-25 1:18 ` Sridhar, Kanchana P
2024-09-25 14:11 ` Johannes Weiner
2024-09-25 18:45 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 5/8] mm: zswap: Compress and store a specific page in a folio Kanchana P Sridhar
2024-09-24 19:28 ` Yosry Ahmed
2024-09-24 22:45 ` Sridhar, Kanchana P
2024-09-25 0:47 ` Yosry Ahmed
2024-09-25 1:49 ` Sridhar, Kanchana P
2024-09-25 13:53 ` Johannes Weiner
2024-09-25 18:45 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store() Kanchana P Sridhar
2024-09-24 17:33 ` Nhat Pham
2024-09-24 20:51 ` Sridhar, Kanchana P
2024-09-24 21:08 ` Nhat Pham
2024-09-24 21:34 ` Yosry Ahmed
2024-09-24 22:16 ` Nhat Pham
2024-09-24 22:18 ` Sridhar, Kanchana P
2024-09-24 22:28 ` Yosry Ahmed
2024-09-24 22:17 ` Sridhar, Kanchana P
2024-09-24 19:38 ` Yosry Ahmed
2024-09-24 20:51 ` Nhat Pham
2024-09-24 21:38 ` Yosry Ahmed [this message]
2024-09-24 23:11 ` Nhat Pham
2024-09-25 0:05 ` Sridhar, Kanchana P
2024-09-25 0:52 ` Yosry Ahmed
2024-09-24 23:21 ` Sridhar, Kanchana P
2024-09-24 23:02 ` Sridhar, Kanchana P
2024-09-25 13:40 ` Johannes Weiner
2024-09-25 18:30 ` Yosry Ahmed
2024-09-25 19:10 ` Sridhar, Kanchana P
2024-09-25 19:49 ` Yosry Ahmed
2024-09-25 20:49 ` Johannes Weiner
2024-09-25 19:20 ` Johannes Weiner
2024-09-25 19:39 ` Yosry Ahmed
2024-09-25 20:13 ` Johannes Weiner
2024-09-25 21:06 ` Yosry Ahmed
2024-09-25 22:29 ` Sridhar, Kanchana P
2024-09-26 3:58 ` Sridhar, Kanchana P
2024-09-26 4:52 ` Yosry Ahmed
2024-09-26 16:40 ` Sridhar, Kanchana P
2024-09-26 17:19 ` Yosry Ahmed
2024-09-26 17:29 ` Sridhar, Kanchana P
2024-09-26 17:34 ` Yosry Ahmed
2024-09-26 19:36 ` Sridhar, Kanchana P
2024-09-26 18:43 ` Johannes Weiner
2024-09-26 18:45 ` Yosry Ahmed
2024-09-26 19:40 ` Sridhar, Kanchana P
2024-09-26 19:39 ` Sridhar, Kanchana P
2024-09-25 14:27 ` Johannes Weiner
2024-09-25 18:17 ` Yosry Ahmed
2024-09-25 18:48 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 7/8] mm: swap: Count successful mTHP ZSWAP stores in sysfs mTHP zswpout stats Kanchana P Sridhar
2024-09-24 1:17 ` [PATCH v7 8/8] mm: Document the newly added mTHP zswpout stats, clarify swpout semantics Kanchana P Sridhar
2024-09-24 17:36 ` Nhat Pham
2024-09-24 20:52 ` Sridhar, Kanchana P
2024-09-24 19:34 ` [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios Yosry Ahmed
2024-09-24 22:50 ` Sridhar, Kanchana P
2024-09-25 6:35 ` Huang, Ying
2024-09-25 18:39 ` Sridhar, Kanchana P
2024-09-26 0:44 ` Huang, Ying
2024-09-26 3:48 ` Sridhar, Kanchana P
2024-09-26 6:47 ` Huang, Ying
2024-09-26 21:44 ` Sridhar, Kanchana P
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJD7tkYCXexrP_2xjXqFDpJALCgi84aA7wGOo=6mfuUSpMO-ng@mail.gmail.com' \
--to=yosryahmed@google.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=chengming.zhou@linux.dev \
--cc=hannes@cmpxchg.org \
--cc=kanchana.p.sridhar@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nanhai.zou@intel.com \
--cc=nphamcs@gmail.com \
--cc=ryan.roberts@arm.com \
--cc=shakeel.butt@linux.dev \
--cc=usamaarif642@gmail.com \
--cc=vinodh.gopal@intel.com \
--cc=wajdi.k.feghali@intel.com \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox