linux-mm.kvack.org archive mirror
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
To: Yosry Ahmed <yosryahmed@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"nphamcs@gmail.com" <nphamcs@gmail.com>,
	"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
	"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
	"shakeel.butt@linux.dev" <shakeel.butt@linux.dev>,
	"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	"21cnbao@gmail.com" <21cnbao@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"Zou, Nanhai" <nanhai.zou@intel.com>,
	"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
	"Gopal, Vinodh" <vinodh.gopal@intel.com>,
	"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Subject: RE: [PATCH v8 6/8] mm: zswap: Support large folios in zswap_store().
Date: Mon, 30 Sep 2024 17:55:44 +0000	[thread overview]
Message-ID: <SJ0PR11MB567870784D380DE5EDB29AEBC9762@SJ0PR11MB5678.namprd11.prod.outlook.com> (raw)
In-Reply-To: <SJ0PR11MB56780EABB1E37C98A0EDE4EDC9752@SJ0PR11MB5678.namprd11.prod.outlook.com>

> -----Original Message-----
> From: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Sent: Sunday, September 29, 2024 2:15 PM
> To: Yosry Ahmed <yosryahmed@google.com>; Johannes Weiner
> <hannes@cmpxchg.org>
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> nphamcs@gmail.com; chengming.zhou@linux.dev;
> usamaarif642@gmail.com; shakeel.butt@linux.dev; ryan.roberts@arm.com;
> Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-
> foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K
> <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>;
> Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Subject: RE: [PATCH v8 6/8] mm: zswap: Support large folios in zswap_store().
> 
> > -----Original Message-----
> > From: Yosry Ahmed <yosryahmed@google.com>
> > Sent: Saturday, September 28, 2024 11:11 AM
> > To: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>; linux-
> > kernel@vger.kernel.org; linux-mm@kvack.org; nphamcs@gmail.com;
> > chengming.zhou@linux.dev; usamaarif642@gmail.com;
> > shakeel.butt@linux.dev; ryan.roberts@arm.com; Huang, Ying
> > <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-
> foundation.org;
> > Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K
> > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>
> > Subject: Re: [PATCH v8 6/8] mm: zswap: Support large folios in
> zswap_store().
> >
> > On Sat, Sep 28, 2024 at 7:15 AM Johannes Weiner <hannes@cmpxchg.org>
> > wrote:
> > >
> > > On Fri, Sep 27, 2024 at 08:42:16PM -0700, Yosry Ahmed wrote:
> > > > On Fri, Sep 27, 2024 at 7:16 PM Kanchana P Sridhar
> > > > >  {
> > > > > +       struct page *page = folio_page(folio, index);
> > > > >         swp_entry_t swp = folio->swap;
> > > > > -       pgoff_t offset = swp_offset(swp);
> > > > >         struct xarray *tree = swap_zswap_tree(swp);
> > > > > +       pgoff_t offset = swp_offset(swp) + index;
> > > > >         struct zswap_entry *entry, *old;
> > > > > -       struct obj_cgroup *objcg = NULL;
> > > > > -       struct mem_cgroup *memcg = NULL;
> > > > > -
> > > > > -       VM_WARN_ON_ONCE(!folio_test_locked(folio));
> > > > > -       VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
> > > > > +       int type = swp_type(swp);
> > > >
> > > > Why do we need type? We use it when initializing entry->swpentry to
> > > > reconstruct the swp_entry_t we already have.
> > >
> > > It's not the same entry. folio->swap points to the head entry, this
> > > function has to store swap entries with the offsets of each subpage.
> >
> > Duh, yeah, thanks.
> >
> > >
> > > Given the name of this function, it might be better to actually pass a
> > > page pointer to it; do the folio_page() inside zswap_store().
> > >
> > > Then do
> > >
> > >                 entry->swpentry = page_swap_entry(page);
> > >
> > > below.
> >
> > That is indeed clearer.
> >
> > Although this will be adding yet another caller of page_swap_entry()
> > that already has the folio, yet it calls page_swap_entry() for each
> > page in the folio, which calls page_folio() inside.
> >
> > I wonder if we should add (or replace page_swap_entry()) with a
> > folio_swap_entry(folio, index) helper. This can also be done as a
> > follow up.
> 
> Thanks Johannes and Yosry for these comments. I was thinking about this
> some more. In its current form, zswap_store_page() is called in the
> context of the folio by passing in a [folio, index] pair. This encodes a
> key assumption of the existing zswap_store() large-folio support: the
> per-page store is done for the page at offset "index * PAGE_SIZE" within
> the folio, not for an arbitrary page. Further, we need the folio for
> folio_nid(), although this can also be computed from the page. Another
> reason the existing signature might be preferable is that it obtains the
> entry's swp_entry_t with fewer computations. Calling page_swap_entry()
> could add computations per page, which could add up (say, 512 times for
> a 2M folio).

I went ahead and quantified this with the v8 signature of zswap_store_page()
and the suggested changes for this function to take a page and use
page_swap_entry(). I ran usemem with 2M pmd-mappable folios enabled.
The results indicate that the page_swap_entry() implementation is slightly
better in throughput and latency:

v8:                             run1       run2       run3    average
---------------------------------------------------------------------
Total throughput (KB/s):   6,483,835  6,396,760  6,349,532  6,410,042
Average throughput (KB/s):   216,127    213,225    211,651    213,889
elapsed time (sec):           107.75     107.06     109.99     108.87
sys time (sec):             2,476.43   2,453.99   2,551.52   2,513.98
---------------------------------------------------------------------


page_swap_entry():              run1       run2       run3    average
---------------------------------------------------------------------
Total throughput (KB/s):   6,462,954  6,396,134  6,418,076  6,425,721
Average throughput (KB/s):   215,431    213,204    213,935    214,683
elapsed time (sec):           108.67     109.46     107.91     108.29
sys time (sec):             2,473.65   2,493.33   2,507.82   2,490.74
---------------------------------------------------------------------

Based on this, I will go ahead and implement the change suggested
by Johannes and submit a v9.

Thanks,
Kanchana

> 
> I would appreciate your thoughts on whether these are valid
> considerations, so that I can proceed accordingly.
> 
> >
> > >
> > > > >         obj_cgroup_put(objcg);
> > > > > -       if (zswap_pool_reached_full)
> > > > > -               queue_work(shrink_wq, &zswap_shrink_work);
> > > > > -check_old:
> > > > > +       return false;
> > > > > +}
> > > > > +
> > > > > +bool zswap_store(struct folio *folio)
> > > > > +{
> > > > > +       long nr_pages = folio_nr_pages(folio);
> > > > > +       swp_entry_t swp = folio->swap;
> > > > > +       struct xarray *tree = swap_zswap_tree(swp);
> > > > > +       pgoff_t offset = swp_offset(swp);
> > > > > +       struct obj_cgroup *objcg = NULL;
> > > > > +       struct mem_cgroup *memcg = NULL;
> > > > > +       struct zswap_pool *pool;
> > > > > +       size_t compressed_bytes = 0;
> > > >
> > > > Why size_t? entry->length is int.
> > >
> > > In light of Willy's comment, I think size_t is a good idea.
> >
> > Agreed.
> 
> Thanks Yosry, Matthew and Johannes for the resolution on this!
> 
> Thanks,
> Kanchana


Thread overview: 43+ messages
2024-09-28  2:16 [PATCH v8 0/8] mm: zswap swap-out of large folios Kanchana P Sridhar
2024-09-28  2:16 ` [PATCH v8 1/8] mm: Define obj_cgroup_get() if CONFIG_MEMCG is not defined Kanchana P Sridhar
2024-09-28  2:30   ` Yosry Ahmed
2024-09-28  5:39   ` Chengming Zhou
2024-09-28 13:46   ` Johannes Weiner
2024-09-28  2:16 ` [PATCH v8 2/8] mm: zswap: Modify zswap_compress() to accept a page instead of a folio Kanchana P Sridhar
2024-09-28  5:41   ` Chengming Zhou
2024-09-28 13:46   ` Johannes Weiner
2024-09-28  2:16 ` [PATCH v8 3/8] mm: zswap: Rename zswap_pool_get() to zswap_pool_tryget() Kanchana P Sridhar
2024-09-28  2:29   ` Yosry Ahmed
2024-09-28  5:43   ` Chengming Zhou
2024-09-29 21:01     ` Sridhar, Kanchana P
2024-09-28 13:47   ` Johannes Weiner
2024-09-28 23:26   ` Nhat Pham
2024-09-29 21:04     ` Sridhar, Kanchana P
2024-09-28  2:16 ` [PATCH v8 4/8] mm: Provide a new count_objcg_events() API for batch event updates Kanchana P Sridhar
2024-09-28  3:02   ` Yosry Ahmed
2024-09-28  5:46     ` Chengming Zhou
2024-09-29 21:00     ` Sridhar, Kanchana P
2024-09-28  2:16 ` [PATCH v8 5/8] mm: zswap: Modify zswap_stored_pages to be atomic_long_t Kanchana P Sridhar
2024-09-28  2:57   ` Yosry Ahmed
2024-09-28  4:50     ` Matthew Wilcox
2024-09-28  8:12       ` Yosry Ahmed
2024-09-28  8:13   ` Yosry Ahmed
2024-09-29 21:04     ` Sridhar, Kanchana P
2024-09-28 13:53   ` Johannes Weiner
2024-09-29 21:03     ` Sridhar, Kanchana P
2024-09-28 23:27   ` Nhat Pham
2024-09-28  2:16 ` [PATCH v8 6/8] mm: zswap: Support large folios in zswap_store() Kanchana P Sridhar
2024-09-28  3:42   ` Yosry Ahmed
2024-09-28 14:15     ` Johannes Weiner
2024-09-28 18:11       ` Yosry Ahmed
2024-09-29 21:15         ` Sridhar, Kanchana P
2024-09-30 17:55           ` Sridhar, Kanchana P [this message]
2024-09-29 21:24     ` Sridhar, Kanchana P
2024-09-28  6:05   ` Chengming Zhou
2024-09-28  2:16 ` [PATCH v8 7/8] mm: swap: Count successful large folio zswap stores in hugepage zswpout stats Kanchana P Sridhar
2024-09-28  2:16 ` [PATCH v8 8/8] mm: Document the newly added sysfs large folios " Kanchana P Sridhar
2024-09-29 22:34   ` Nhat Pham
2024-09-30  0:56     ` Sridhar, Kanchana P
2024-09-28  2:25 ` [PATCH v8 0/8] mm: zswap swap-out of large folios Yosry Ahmed
2024-09-28  2:36   ` Sridhar, Kanchana P
2024-09-28  3:00     ` Yosry Ahmed
