From: Yosry Ahmed <yosryahmed@google.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"shakeel.butt@linux.dev" <shakeel.butt@linux.dev>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"Huang, Ying" <ying.huang@intel.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>
Subject: Re: [PATCH v7 5/8] mm: zswap: Compress and store a specific page in a folio.
Date: Tue, 24 Sep 2024 17:47:21 -0700 [thread overview]
Message-ID: <CAJD7tkbCB9iqRc1Y0287rZjQnJaV41DxN+GzZJFkqxM1FM8R4w@mail.gmail.com> (raw)
In-Reply-To: <SJ0PR11MB567887AF292EC178BEF1CF43C9682@SJ0PR11MB5678.namprd11.prod.outlook.com>
[..]
> >
> > > +{
> > > + swp_entry_t swp = folio->swap;
> > > + int type = swp_type(swp);
> > > + pgoff_t offset = swp_offset(swp) + index;
> > > + struct page *page = folio_page(folio, index);
> > > + struct xarray *tree = swap_zswap_tree(swp);
> > > + struct zswap_entry *entry;
> > > +
> > > + if (objcg)
> > > + obj_cgroup_get(objcg);
> > > +
> > > + if (zswap_check_limits())
> > > + goto reject;
> > > +
> > > + /* allocate entry */
> > > + entry = zswap_entry_cache_alloc(GFP_KERNEL, folio_nid(folio));
> > > + if (!entry) {
> > > + zswap_reject_kmemcache_fail++;
> > > + goto reject;
> > > + }
> > > +
> > > + /* if entry is successfully added, it keeps the reference */
> > > + if (!zswap_pool_get(pool))
> > > + goto freepage;
> >
> > I think we can batch this for all pages in zswap_store(), maybe first
> > add zswap_pool_get_many().
> >
> > I am also wondering if it would be better to batch the limit checking
> > and allocating the entries, to front load any failures before we start
> > compression. Not sure if that's overall better though.
> >
> > To batch allocate entries we will have to also allocate an array to
> > hold them. To batch the limit checking we will have to either allow
> > going further over limit for mTHPs, or check if there is enough
> > clearance to allow for compressing all the pages. Using the
> > uncompressed size will lead to false negatives though, so maybe we can
> > start tracking the average compression ratio for better limit
> > checking.
> >
> > Nhat, Johannes, any thoughts here? I need someone to tell me if I am
> > overthinking this :)
>
> These are all good points. I suppose I was thinking along the same lines
> of what Nhat mentioned in an earlier comment. I was trying the
> incremental zswap_pool_get() and limit checks and shrinker invocations
> in case of (all) error conditions to allow different concurrent stores to make
> progress, without favoring only one process's mTHP store. I was thinking
> this would have minimal impact on the process(es) that see the zswap
> limit being exceeded, and that this would be better than preemptively
> checking for the entire mTHP and failing (this could also complicate things
> where no one makes progress because multiple processes run the batch
> checks and fail, when realistically one/many could have triggered
> the shrinker before erroring out, and at least one could have made
> progress).
On the other hand, if we allow concurrent mTHP swapouts to do limit
checks incrementally, they may all fail at the last page. While if
they all do limit checks beforehand, one of them can proceed.
I think we need to agree on a higher-level strategy for limit
checking, both global and per-memcg. The per-memcg limit should be
stricter though, so we may end up having different policies.
>
> Would appreciate your perspectives on how this should be handled,
> and will implement a solution in v8 accordingly.
>
> Thanks,
> Kanchana
>
> >
> > > +
> > > + entry->pool = pool;
> > > +
> > > + if (!zswap_compress(page, entry))
> > > + goto put_pool;
> > > +
> > > + entry->swpentry = swp_entry(type, offset);
> > > + entry->objcg = objcg;
> > > + entry->referenced = true;
> > > +
> > > + if (!zswap_store_entry(tree, entry))
> > > + goto store_failed;
> > > +
> > > + if (objcg) {
> > > + obj_cgroup_charge_zswap(objcg, entry->length);
> > > + count_objcg_event(objcg, ZSWPOUT);
> > > + }
> > > +
> > > + /*
> > > + * We finish initializing the entry while it's already in xarray.
> > > + * This is safe because:
> > > + *
> > > + * 1. Concurrent stores and invalidations are excluded by folio lock.
> > > + *
> > > + * 2. Writeback is excluded by the entry not being on the LRU yet.
> > > + * The publishing order matters to prevent writeback from seeing
> > > + * an incoherent entry.
> > > + */
> > > + if (entry->length) {
> > > + INIT_LIST_HEAD(&entry->lru);
> > > + zswap_lru_add(&zswap_list_lru, entry);
> > > + }
> > > +
> > > + /* update stats */
> > > + atomic_inc(&zswap_stored_pages);
> > > + count_vm_event(ZSWPOUT);
> >
> > We should probably also batch updating the stats. It actually seems
> > like now we don't handle rolling them back upon failure.
>
> Good point! I assume you are referring only to the "ZSWPOUT" vm event stats
> updates and not the "zswap_stored_pages" (since latter is used in limit checking)?
I actually meant both. Do we rollback changes to zswap_stored_pages
when some stores succeed and some of them fail?
I think it's more correct and efficient to update the atomic once
after all the pages are successfully compressed and stored.
next prev parent reply other threads:[~2024-09-25 0:48 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-24 1:17 [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios Kanchana P Sridhar
2024-09-24 1:17 ` [PATCH v7 1/8] mm: Define obj_cgroup_get() if CONFIG_MEMCG is not defined Kanchana P Sridhar
2024-09-24 16:45 ` Nhat Pham
2024-09-24 1:17 ` [PATCH v7 2/8] mm: zswap: Modify zswap_compress() to accept a page instead of a folio Kanchana P Sridhar
2024-09-24 16:50 ` Nhat Pham
2024-09-24 1:17 ` [PATCH v7 3/8] mm: zswap: Refactor code to store an entry in zswap xarray Kanchana P Sridhar
2024-09-24 17:16 ` Nhat Pham
2024-09-24 20:40 ` Sridhar, Kanchana P
2024-09-24 19:14 ` Yosry Ahmed
2024-09-24 22:22 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 4/8] mm: zswap: Refactor code to delete stored offsets in case of errors Kanchana P Sridhar
2024-09-24 17:25 ` Nhat Pham
2024-09-24 20:41 ` Sridhar, Kanchana P
2024-09-24 19:20 ` Yosry Ahmed
2024-09-24 22:32 ` Sridhar, Kanchana P
2024-09-25 0:43 ` Yosry Ahmed
2024-09-25 1:18 ` Sridhar, Kanchana P
2024-09-25 14:11 ` Johannes Weiner
2024-09-25 18:45 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 5/8] mm: zswap: Compress and store a specific page in a folio Kanchana P Sridhar
2024-09-24 19:28 ` Yosry Ahmed
2024-09-24 22:45 ` Sridhar, Kanchana P
2024-09-25 0:47 ` Yosry Ahmed [this message]
2024-09-25 1:49 ` Sridhar, Kanchana P
2024-09-25 13:53 ` Johannes Weiner
2024-09-25 18:45 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store() Kanchana P Sridhar
2024-09-24 17:33 ` Nhat Pham
2024-09-24 20:51 ` Sridhar, Kanchana P
2024-09-24 21:08 ` Nhat Pham
2024-09-24 21:34 ` Yosry Ahmed
2024-09-24 22:16 ` Nhat Pham
2024-09-24 22:18 ` Sridhar, Kanchana P
2024-09-24 22:28 ` Yosry Ahmed
2024-09-24 22:17 ` Sridhar, Kanchana P
2024-09-24 19:38 ` Yosry Ahmed
2024-09-24 20:51 ` Nhat Pham
2024-09-24 21:38 ` Yosry Ahmed
2024-09-24 23:11 ` Nhat Pham
2024-09-25 0:05 ` Sridhar, Kanchana P
2024-09-25 0:52 ` Yosry Ahmed
2024-09-24 23:21 ` Sridhar, Kanchana P
2024-09-24 23:02 ` Sridhar, Kanchana P
2024-09-25 13:40 ` Johannes Weiner
2024-09-25 18:30 ` Yosry Ahmed
2024-09-25 19:10 ` Sridhar, Kanchana P
2024-09-25 19:49 ` Yosry Ahmed
2024-09-25 20:49 ` Johannes Weiner
2024-09-25 19:20 ` Johannes Weiner
2024-09-25 19:39 ` Yosry Ahmed
2024-09-25 20:13 ` Johannes Weiner
2024-09-25 21:06 ` Yosry Ahmed
2024-09-25 22:29 ` Sridhar, Kanchana P
2024-09-26 3:58 ` Sridhar, Kanchana P
2024-09-26 4:52 ` Yosry Ahmed
2024-09-26 16:40 ` Sridhar, Kanchana P
2024-09-26 17:19 ` Yosry Ahmed
2024-09-26 17:29 ` Sridhar, Kanchana P
2024-09-26 17:34 ` Yosry Ahmed
2024-09-26 19:36 ` Sridhar, Kanchana P
2024-09-26 18:43 ` Johannes Weiner
2024-09-26 18:45 ` Yosry Ahmed
2024-09-26 19:40 ` Sridhar, Kanchana P
2024-09-26 19:39 ` Sridhar, Kanchana P
2024-09-25 14:27 ` Johannes Weiner
2024-09-25 18:17 ` Yosry Ahmed
2024-09-25 18:48 ` Sridhar, Kanchana P
2024-09-24 1:17 ` [PATCH v7 7/8] mm: swap: Count successful mTHP ZSWAP stores in sysfs mTHP zswpout stats Kanchana P Sridhar
2024-09-24 1:17 ` [PATCH v7 8/8] mm: Document the newly added mTHP zswpout stats, clarify swpout semantics Kanchana P Sridhar
2024-09-24 17:36 ` Nhat Pham
2024-09-24 20:52 ` Sridhar, Kanchana P
2024-09-24 19:34 ` [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios Yosry Ahmed
2024-09-24 22:50 ` Sridhar, Kanchana P
2024-09-25 6:35 ` Huang, Ying
2024-09-25 18:39 ` Sridhar, Kanchana P
2024-09-26 0:44 ` Huang, Ying
2024-09-26 3:48 ` Sridhar, Kanchana P
2024-09-26 6:47 ` Huang, Ying
2024-09-26 21:44 ` Sridhar, Kanchana P
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJD7tkbCB9iqRc1Y0287rZjQnJaV41DxN+GzZJFkqxM1FM8R4w@mail.gmail.com \
--to=yosryahmed@google.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=chengming.zhou@linux.dev \
--cc=hannes@cmpxchg.org \
--cc=kanchana.p.sridhar@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nanhai.zou@intel.com \
--cc=nphamcs@gmail.com \
--cc=ryan.roberts@arm.com \
--cc=shakeel.butt@linux.dev \
--cc=usamaarif642@gmail.com \
--cc=vinodh.gopal@intel.com \
--cc=wajdi.k.feghali@intel.com \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox