From: Yosry Ahmed <yosryahmed@google.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"shakeel.butt@linux.dev" <shakeel.butt@linux.dev>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"Huang, Ying" <ying.huang@intel.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>
Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store().
Date: Thu, 26 Sep 2024 10:34:49 -0700
Message-ID: <CAJD7tkaU4pdGZ4yJrn2z+dECrsbpByrWSc0XcrE6zA_QjSZBSg@mail.gmail.com>
In-Reply-To: <SJ0PR11MB56781678BE55278052EB590CC96A2@SJ0PR11MB5678.namprd11.prod.outlook.com>
On Thu, Sep 26, 2024 at 10:29 AM Sridhar, Kanchana P
<kanchana.p.sridhar@intel.com> wrote:
>
> > -----Original Message-----
> > From: Yosry Ahmed <yosryahmed@google.com>
> > Sent: Thursday, September 26, 2024 10:20 AM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>; linux-kernel@vger.kernel.org;
> > linux-mm@kvack.org; nphamcs@gmail.com; chengming.zhou@linux.dev;
> > usamaarif642@gmail.com; shakeel.butt@linux.dev; ryan.roberts@arm.com;
> > Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com;
> > akpm@linux-foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K
> > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>
> > Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in
> > zswap_store().
> >
> > On Thu, Sep 26, 2024 at 9:40 AM Sridhar, Kanchana P
> > <kanchana.p.sridhar@intel.com> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Yosry Ahmed <yosryahmed@google.com>
> > > > Sent: Wednesday, September 25, 2024 9:52 PM
> > > > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> > > > Cc: Johannes Weiner <hannes@cmpxchg.org>; linux-kernel@vger.kernel.org;
> > > > linux-mm@kvack.org; nphamcs@gmail.com; chengming.zhou@linux.dev;
> > > > usamaarif642@gmail.com; shakeel.butt@linux.dev; ryan.roberts@arm.com;
> > > > Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com;
> > > > akpm@linux-foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K
> > > > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>
> > > > Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in
> > > > zswap_store().
> > > >
> > > > [..]
> > > > >
> > > > > > One thing I realized while reworking the patches for the batched checks is:
> > > > > > within zswap_store_page(), we set the entry->objcg and entry->pool before
> > > > > > adding it to the xarray. Given this, wouldn't it be safer to get the objcg
> > > > > > and pool reference per sub-page, locally in zswap_store_page(), rather than
> > > > > > obtaining batched references at the end if the store is successful? If we want
> > > > > > zswap_store_page() to be self-contained and correct as far as the entry
> > > > > > being created and added to the xarray, it seems like the right thing to do?
> > > > > > I am a bit apprehensive about the entry being added to the xarray without
> > > > > > a reference obtained on the objcg and pool, because any page-faults/writeback
> > > > > > that occur on sub-pages added to the xarray before the entire folio has been
> > > > > > stored, would run into issues.
> > > >
> > > > We definitely should not obtain references to the pool and objcg after
> > > > initializing the entries with them. We can obtain all references in
> > > > zswap_store() before zswap_store_page(). IOW, the batching in this
> > > > case should be done before the per-page operations, not after.
> > >
> > > Thanks Yosry. IIUC, we should obtain all references to the objcg and to the
> > > zswap_pool at the start of zswap_store.
> > >
> > > In the case of error on any sub-page, we will unwind state for potentially
> > > only the stored pages or the entire folio if it happened to already be in zswap
> > > and is being re-written. We might need some additional book-keeping to
> > > keep track of which sub-pages were found in the xarray and zswap_entry_free()
> > > got called (nr_sb). Assuming I define a new "obj_cgroup_put_many()", I would need
> > > to call this with (folio_nr_pages() - nr_sb).
> > >
> > > As far as zswap_pool_get(), there is some added complexity if we want to
> > > keep the existing implementation that calls "percpu_ref_tryget()", and assuming
> > > this is extended to provide a new "zswap_pool_get_many()" that calls
> > > "percpu_ref_tryget_many()". Is there a reason we use percpu_ref_tryget() instead
> > > of percpu_ref_get()? Reason I ask is, with tryget(), if for some reason the pool->ref
> > > is 0, no further increments will be made. If so, upon unwinding state in
> > > zswap_store(), I would need to special-case to catch this before calling a new
> > > "zswap_pool_put_many()".
> > >
> > > Things could be a little simpler if zswap_pool_get() can use "percpu_ref_get()"
> > > which will always increment the refcount. Since the zswap pool->ref is initialized
> > > to "1", this seems Ok, but I don't know if there will be unintended consequences.
> > >
> > > Can you please advise on what is the simplest/cleanest approach:
> > >
> > > 1) Proceed with the above changes without changing percpu_ref_tryget in
> > > zswap_pool_get. Needs special-casing in zswap_store to detect pool->ref
> > > being "0" before calling zswap_pool_put[_many].
> >
> > My assumption is that we can reorder the code such that if
> > zswap_pool_get_many() fails we don't call zswap_pool_put_many() to
> > begin with (e.g. jump to a label after zswap_pool_put_many()).
>
> However, the pool refcount could change between the start and end of
> zswap_store.
I am not sure what you mean. If zswap_pool_get_many() fails then we
just do not call zswap_pool_put_many() at all and abort.
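
To illustrate the ordering, here is a minimal userspace sketch, not kernel
code. All names (model_pool, model_pool_get_many(), model_pool_put_many(),
model_store()) are hypothetical stand-ins, and the model assumes that on a
failed store every reference taken up front is dropped again. The only point
is the label placement: a failed get jumps past the put.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pool whose lifetime is governed by a refcount. */
struct model_pool {
	long refs;	/* outstanding references; 0 means the pool is going away */
};

/* Models tryget_many semantics: fail, with no side effects, once refs hit 0. */
static bool model_pool_get_many(struct model_pool *pool, long nr)
{
	if (pool->refs == 0)
		return false;
	pool->refs += nr;
	return true;
}

static void model_pool_put_many(struct model_pool *pool, long nr)
{
	assert(pool->refs >= nr);
	pool->refs -= nr;
}

/*
 * Models a batched store: take nr_pages references before any per-page
 * work. fail_at is the index of the page whose store fails, or a negative
 * value if every page stores successfully.
 */
static bool model_store(struct model_pool *pool, long nr_pages, long fail_at)
{
	if (!model_pool_get_many(pool, nr_pages))
		goto out;	/* nothing was taken, so skip the put entirely */

	for (long i = 0; i < nr_pages; i++) {
		if (i == fail_at)
			goto put_pool;
	}
	return true;		/* the stored entries keep their references */

put_pool:
	/* Assumption: the whole store is unwound, so all references go back. */
	model_pool_put_many(pool, nr_pages);
out:
	return false;
}
```

With this shape there is no need to inspect the refcount before the put: the
put label is only reachable after a successful get.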
>
> >
> > > 2) Modify zswap_pool_get/zswap_pool_get_many to use percpu_ref_get_many
> > > and avoid special-casing to detect pool->ref being "0" before calling
> > > zswap_pool_put[_many].
> >
> > I don't think we can simply switch the tryget to a get, as I believe
> > we can race with the pool being destroyed.
>
> That was my initial thought as well, but I figured this couldn't happen
> since the pool->ref is initialized to "1", and based on the existing
> implementation. In any case, I can understand the intent of the use
> of "tryget"; it is just that it adds to the considerations for reference
> batching.
The initial ref can be dropped in __zswap_param_set() if a new pool is
created (see the call to percpu_ref_kill()).
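
A rough userspace model of why an unconditional get is unsafe once the
initial ref can be killed. This is a sketch with hypothetical names
(model_ref and its helpers), not the percpu_ref implementation; it only
mirrors the rule that kill drops the initial reference and that release
runs when the count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a refcount with an initial ref. */
struct model_ref {
	long count;
	bool released;	/* set when the count reaches 0 (release callback ran) */
};

static void model_ref_init(struct model_ref *ref)
{
	ref->count = 1;	/* the initial reference */
	ref->released = false;
}

static void model_ref_put(struct model_ref *ref)
{
	assert(ref->count > 0);
	if (--ref->count == 0)
		ref->released = true;
}

/* Models kill: the owner gives up the initial reference. */
static void model_ref_kill(struct model_ref *ref)
{
	model_ref_put(ref);
}

/* Models tryget: refuses to take a reference once the count is 0. */
static bool model_ref_tryget(struct model_ref *ref)
{
	if (ref->count == 0)
		return false;
	ref->count++;
	return true;
}

/* Models an unconditional get: increments even after release. */
static void model_ref_get(struct model_ref *ref)
{
	ref->count++;
}
```

In this model, after a kill followed by the last put, model_ref_tryget()
returns false while model_ref_get() would leave a nonzero count on an
object that has already been released. That is the race the tryget guards
against.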
>
> >
> > > 3) Keep the approach in v7 where obj_cgroup_get/put is localized to
> > > zswap_store_page for both success and error conditions, and any unwinding
> > > state in zswap_store will take care of dropping references obtained from
> > > prior successful writes (from this or prior invocations of zswap_store).
> >
> > I am also fine with doing that and doing the reference batching as a follow up.
>
> I think so too! We could try and improve upon (3) with reference batching
> in a follow-up patch.
SGTM.