linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosryahmed@google.com>
To: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	hannes@cmpxchg.org,  nphamcs@gmail.com, chengming.zhou@linux.dev,
	usamaarif642@gmail.com,  shakeel.butt@linux.dev,
	ryan.roberts@arm.com, ying.huang@intel.com,  21cnbao@gmail.com,
	akpm@linux-foundation.org, nanhai.zou@intel.com,
	 wajdi.k.feghali@intel.com, vinodh.gopal@intel.com
Subject: Re: [PATCH v7 5/8] mm: zswap: Compress and store a specific page in a folio.
Date: Tue, 24 Sep 2024 12:28:57 -0700	[thread overview]
Message-ID: <CAJD7tkacF3hCXiZHy7_+E7xmdojnsUewDeP=BsamcZReHxCTtg@mail.gmail.com> (raw)
In-Reply-To: <20240924011709.7037-6-kanchana.p.sridhar@intel.com>

On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar
<kanchana.p.sridhar@intel.com> wrote:
>
> For zswap_store() to handle mTHP folios, we need to iterate through each
> page in the mTHP, compress it and store it in the zswap pool. This patch
> introduces an auxiliary function zswap_store_page() that provides this
> functionality.
>
> The function signature reflects the design intent, namely, for it
> to be invoked by zswap_store() per-page in an mTHP. Hence, the folio's
> objcg and the zswap_pool to use are input parameters for sake of
> efficiency and consistency.
>
> The functionality in zswap_store_page() is reused and adapted from
> Ryan Roberts' RFC patch [1]:
>
>   "[RFC,v1] mm: zswap: Store large folios without splitting"
>
>   [1] https://lore.kernel.org/linux-mm/20231019110543.3284654-1-ryan.roberts@arm.com/T/#u
>
> Co-developed-by: Ryan Roberts
> Signed-off-by:
> Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
> ---
>  mm/zswap.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 88 insertions(+)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 9bea948d653e..8f2e0ab34c84 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1463,6 +1463,94 @@ static void zswap_delete_stored_offsets(struct xarray *tree,
>         }
>  }
>
> +/*
> + * Stores the page at specified "index" in a folio.
> + *
> + * @folio: The folio to store in zswap.
> + * @index: Index into the page in the folio that this function will store.
> + * @objcg: The folio's objcg.
> + * @pool:  The zswap_pool to store the compressed data for the page.
> + */
> +static bool __maybe_unused zswap_store_page(struct folio *folio, long index,
> +                                           struct obj_cgroup *objcg,
> +                                           struct zswap_pool *pool)

Why are we adding an unused function that duplicates code in
zswap_store(), then using it in the following patch? This makes it
difficult to see that the function does the same thing. This patch
should be refactoring the per-page code out of zswap_store() into
zswap_store_page(), and directly calling zswap_store_page() from
zswap_store().

> +{
> +       swp_entry_t swp = folio->swap;
> +       int type = swp_type(swp);
> +       pgoff_t offset = swp_offset(swp) + index;
> +       struct page *page = folio_page(folio, index);
> +       struct xarray *tree = swap_zswap_tree(swp);
> +       struct zswap_entry *entry;
> +
> +       if (objcg)
> +               obj_cgroup_get(objcg);
> +
> +       if (zswap_check_limits())
> +               goto reject;
> +
> +       /* allocate entry */
> +       entry = zswap_entry_cache_alloc(GFP_KERNEL, folio_nid(folio));
> +       if (!entry) {
> +               zswap_reject_kmemcache_fail++;
> +               goto reject;
> +       }
> +
> +       /* if entry is successfully added, it keeps the reference */
> +       if (!zswap_pool_get(pool))
> +               goto freepage;

I think we can batch this for all pages in zswap_store(), maybe first
add zswap_pool_get_many().

I am also wondering if it would be better to batch the limit checking
and allocating the entries, to front load any failures before we start
compression. Not sure if that's overall better though.

To batch allocate entries we will have to also allocate an array to
hold them. To batch the limit checking we will have to either allow
going further over limit for mTHPs, or check if there is enough
clearance to allow for compressing all the pages. Using the
uncompressed size will lead to false negatives though, so maybe we can
start tracking the average compression ratio for better limit
checking.

Nhat, Johannes, any thoughts here? I need someone to tell me if I am
overthinking this :)

> +
> +       entry->pool = pool;
> +
> +       if (!zswap_compress(page, entry))
> +               goto put_pool;
> +
> +       entry->swpentry = swp_entry(type, offset);
> +       entry->objcg = objcg;
> +       entry->referenced = true;
> +
> +       if (!zswap_store_entry(tree, entry))
> +               goto store_failed;
> +
> +       if (objcg) {
> +               obj_cgroup_charge_zswap(objcg, entry->length);
> +               count_objcg_event(objcg, ZSWPOUT);
> +       }
> +
> +       /*
> +        * We finish initializing the entry while it's already in xarray.
> +        * This is safe because:
> +        *
> +        * 1. Concurrent stores and invalidations are excluded by folio lock.
> +        *
> +        * 2. Writeback is excluded by the entry not being on the LRU yet.
> +        *    The publishing order matters to prevent writeback from seeing
> +        *    an incoherent entry.
> +        */
> +       if (entry->length) {
> +               INIT_LIST_HEAD(&entry->lru);
> +               zswap_lru_add(&zswap_list_lru, entry);
> +       }
> +
> +       /* update stats */
> +       atomic_inc(&zswap_stored_pages);
> +       count_vm_event(ZSWPOUT);

We should probably also batch updating the stats. It actually seems
like now we don't handle rolling them back upon failure.


> +
> +       return true;
> +
> +store_failed:
> +       zpool_free(entry->pool->zpool, entry->handle);
> +put_pool:
> +       zswap_pool_put(pool);
> +freepage:
> +       zswap_entry_cache_free(entry);
> +reject:
> +       obj_cgroup_put(objcg);
> +       if (zswap_pool_reached_full)
> +               queue_work(shrink_wq, &zswap_shrink_work);
> +
> +       return false;
> +}
> +
>  bool zswap_store(struct folio *folio)
>  {
>         long nr_pages = folio_nr_pages(folio);
> --
> 2.27.0
>


  reply	other threads:[~2024-09-24 19:29 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-24  1:17 [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios Kanchana P Sridhar
2024-09-24  1:17 ` [PATCH v7 1/8] mm: Define obj_cgroup_get() if CONFIG_MEMCG is not defined Kanchana P Sridhar
2024-09-24 16:45   ` Nhat Pham
2024-09-24  1:17 ` [PATCH v7 2/8] mm: zswap: Modify zswap_compress() to accept a page instead of a folio Kanchana P Sridhar
2024-09-24 16:50   ` Nhat Pham
2024-09-24  1:17 ` [PATCH v7 3/8] mm: zswap: Refactor code to store an entry in zswap xarray Kanchana P Sridhar
2024-09-24 17:16   ` Nhat Pham
2024-09-24 20:40     ` Sridhar, Kanchana P
2024-09-24 19:14   ` Yosry Ahmed
2024-09-24 22:22     ` Sridhar, Kanchana P
2024-09-24  1:17 ` [PATCH v7 4/8] mm: zswap: Refactor code to delete stored offsets in case of errors Kanchana P Sridhar
2024-09-24 17:25   ` Nhat Pham
2024-09-24 20:41     ` Sridhar, Kanchana P
2024-09-24 19:20   ` Yosry Ahmed
2024-09-24 22:32     ` Sridhar, Kanchana P
2024-09-25  0:43       ` Yosry Ahmed
2024-09-25  1:18         ` Sridhar, Kanchana P
2024-09-25 14:11         ` Johannes Weiner
2024-09-25 18:45           ` Sridhar, Kanchana P
2024-09-24  1:17 ` [PATCH v7 5/8] mm: zswap: Compress and store a specific page in a folio Kanchana P Sridhar
2024-09-24 19:28   ` Yosry Ahmed [this message]
2024-09-24 22:45     ` Sridhar, Kanchana P
2024-09-25  0:47       ` Yosry Ahmed
2024-09-25  1:49         ` Sridhar, Kanchana P
2024-09-25 13:53           ` Johannes Weiner
2024-09-25 18:45             ` Sridhar, Kanchana P
2024-09-24  1:17 ` [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store() Kanchana P Sridhar
2024-09-24 17:33   ` Nhat Pham
2024-09-24 20:51     ` Sridhar, Kanchana P
2024-09-24 21:08       ` Nhat Pham
2024-09-24 21:34         ` Yosry Ahmed
2024-09-24 22:16           ` Nhat Pham
2024-09-24 22:18             ` Sridhar, Kanchana P
2024-09-24 22:28             ` Yosry Ahmed
2024-09-24 22:17           ` Sridhar, Kanchana P
2024-09-24 19:38   ` Yosry Ahmed
2024-09-24 20:51     ` Nhat Pham
2024-09-24 21:38       ` Yosry Ahmed
2024-09-24 23:11         ` Nhat Pham
2024-09-25  0:05           ` Sridhar, Kanchana P
2024-09-25  0:52           ` Yosry Ahmed
2024-09-24 23:21       ` Sridhar, Kanchana P
2024-09-24 23:02     ` Sridhar, Kanchana P
2024-09-25 13:40     ` Johannes Weiner
2024-09-25 18:30       ` Yosry Ahmed
2024-09-25 19:10         ` Sridhar, Kanchana P
2024-09-25 19:49           ` Yosry Ahmed
2024-09-25 20:49             ` Johannes Weiner
2024-09-25 19:20         ` Johannes Weiner
2024-09-25 19:39           ` Yosry Ahmed
2024-09-25 20:13             ` Johannes Weiner
2024-09-25 21:06               ` Yosry Ahmed
2024-09-25 22:29                 ` Sridhar, Kanchana P
2024-09-26  3:58                   ` Sridhar, Kanchana P
2024-09-26  4:52                     ` Yosry Ahmed
2024-09-26 16:40                       ` Sridhar, Kanchana P
2024-09-26 17:19                         ` Yosry Ahmed
2024-09-26 17:29                           ` Sridhar, Kanchana P
2024-09-26 17:34                             ` Yosry Ahmed
2024-09-26 19:36                               ` Sridhar, Kanchana P
2024-09-26 18:43                             ` Johannes Weiner
2024-09-26 18:45                               ` Yosry Ahmed
2024-09-26 19:40                                 ` Sridhar, Kanchana P
2024-09-26 19:39                               ` Sridhar, Kanchana P
2024-09-25 14:27   ` Johannes Weiner
2024-09-25 18:17     ` Yosry Ahmed
2024-09-25 18:48     ` Sridhar, Kanchana P
2024-09-24  1:17 ` [PATCH v7 7/8] mm: swap: Count successful mTHP ZSWAP stores in sysfs mTHP zswpout stats Kanchana P Sridhar
2024-09-24  1:17 ` [PATCH v7 8/8] mm: Document the newly added mTHP zswpout stats, clarify swpout semantics Kanchana P Sridhar
2024-09-24 17:36   ` Nhat Pham
2024-09-24 20:52     ` Sridhar, Kanchana P
2024-09-24 19:34 ` [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios Yosry Ahmed
2024-09-24 22:50   ` Sridhar, Kanchana P
2024-09-25  6:35 ` Huang, Ying
2024-09-25 18:39   ` Sridhar, Kanchana P
2024-09-26  0:44     ` Huang, Ying
2024-09-26  3:48       ` Sridhar, Kanchana P
2024-09-26  6:47         ` Huang, Ying
2024-09-26 21:44           ` Sridhar, Kanchana P

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJD7tkacF3hCXiZHy7_+E7xmdojnsUewDeP=BsamcZReHxCTtg@mail.gmail.com' \
    --to=yosryahmed@google.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=chengming.zhou@linux.dev \
    --cc=hannes@cmpxchg.org \
    --cc=kanchana.p.sridhar@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nanhai.zou@intel.com \
    --cc=nphamcs@gmail.com \
    --cc=ryan.roberts@arm.com \
    --cc=shakeel.butt@linux.dev \
    --cc=usamaarif642@gmail.com \
    --cc=vinodh.gopal@intel.com \
    --cc=wajdi.k.feghali@intel.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox