From: Chris Li <chrisl@kernel.org>
To: chengming.zhou@linux.dev
Cc: hannes@cmpxchg.org, yosryahmed@google.com, nphamcs@gmail.com,
	 akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	 Chengming Zhou <zhouchengming@bytedance.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v3] mm/zswap: invalidate old entry when store fail or !zswap_enabled
Date: Tue, 6 Feb 2024 21:41:53 -0800
Message-ID: <CAF8kJuOCbuFemoFNUYeNGYzYJ7eGLka6Y6OvSg8h61vXUfYdLw@mail.gmail.com>
In-Reply-To: <20240207033857.3820921-1-chengming.zhou@linux.dev>

On Tue, Feb 6, 2024 at 7:39 PM <chengming.zhou@linux.dev> wrote:
>
> From: Chengming Zhou <zhouchengming@bytedance.com>
>
> We may encounter a duplicate entry in zswap_store():
>
> 1. A swap slot that is freed to the per-cpu swap cache doesn't
>    invalidate its zswap entry, then gets reused. This has been fixed.
>
> 2. In !exclusive load mode, a swapped-in folio leaves its zswap entry
>    on the tree, then gets swapped out again. This has been removed.
>
> 3. A folio can be dirtied again after zswap_store(), so zswap_store()
>    needs to run on it again. This should be handled correctly.

Thanks, I have been wondering about the cause of that for a while.

>
> So we must invalidate the old duplicate entry before inserting the
> new one, which actually doesn't have to be done at the beginning
> of zswap_store(). This is a normal situation, so we shouldn't
> WARN_ON(1) in this case; delete it. (The WARN_ON(1) seems to be meant
> to detect a swap entry UAF problem, but it isn't really necessary here.)
>
> The good point is that we don't need to lock the tree twice in the
> store success path.
>
> Note we still need to invalidate the old duplicate entry in the
> store failure path, otherwise the new data in the swapfile could be
> overwritten by the old data in the zswap pool during LRU writeback.
>
> We have to do this even when !zswap_enabled since zswap can be
> disabled at any time. If the folio was stored successfully before,
> then got dirtied again while zswap was disabled, we wouldn't
> invalidate the old duplicate entry in zswap_store(). A later LRU
> writeback could then overwrite the new data in the swapfile.
>
> Fixes: 42c06a0e8ebe ("mm: kill frontswap")
> Cc: <stable@vger.kernel.org>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Yosry Ahmed <yosryahmed@google.com>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
> v3:
>  - Fix a few grammatical problems in comments, per Yosry.
>
> v2:
>  - Change the duplicate entry invalidation loop to if, since we hold
>    the lock, we won't find it once we invalidate it, per Yosry.
>  - Add Fixes tag.
> ---
>  mm/zswap.c | 33 ++++++++++++++++-----------------
>  1 file changed, 16 insertions(+), 17 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index cd67f7f6b302..d9d8947d6761 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1518,18 +1518,8 @@ bool zswap_store(struct folio *folio)
>                 return false;
>
>         if (!zswap_enabled)
> -               return false;
> +               goto check_old;
>
> -       /*
> -        * If this is a duplicate, it must be removed before attempting to store
> -        * it, otherwise, if the store fails the old page won't be removed from
> -        * the tree, and it might be written back overriding the new data.
> -        */
> -       spin_lock(&tree->lock);
> -       entry = zswap_rb_search(&tree->rbroot, offset);
> -       if (entry)
> -               zswap_invalidate_entry(tree, entry);
> -       spin_unlock(&tree->lock);
>         objcg = get_obj_cgroup_from_folio(folio);
>         if (objcg && !obj_cgroup_may_zswap(objcg)) {
>                 memcg = get_mem_cgroup_from_objcg(objcg);
> @@ -1608,14 +1598,12 @@ bool zswap_store(struct folio *folio)
>         /* map */
>         spin_lock(&tree->lock);
>         /*
> -        * A duplicate entry should have been removed at the beginning of this
> -        * function. Since the swap entry should be pinned, if a duplicate is
> -        * found again here it means that something went wrong in the swap
> -        * cache.
> +        * The folio may have been dirtied again, invalidate the
> +        * possibly stale entry before inserting the new entry.
>          */
> -       while (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
> -               WARN_ON(1);
> +       if (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
>                 zswap_invalidate_entry(tree, dupentry);
> +               VM_WARN_ON(zswap_rb_insert(&tree->rbroot, entry, &dupentry));

It seems zswap_rb_insert() now has only one caller, and there is no
longer a loop that retries the insert. Can we have zswap_rb_insert()
install the new entry and return the dupentry? We can still just call
zswap_invalidate_entry() on the duplicate. The dupentry's mapping would
already have been removed by the time zswap_rb_insert() returns, which
saves a repeated lookup in the duplicate case.
After this change, zswap_rb_insert() would map onto the xarray's
xa_store() pretty nicely.
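
A minimal userspace sketch of the "install the new entry and return the
old one" semantics described above. It uses a plain unbalanced binary
search tree, not the kernel rbtree API; struct entry, tree_store() and
tree_lookup() are hypothetical names chosen for illustration and do not
exist in mm/zswap.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a zswap entry, keyed by swap offset. */
struct entry {
	unsigned long offset;
	int data;
	struct entry *left, *right;
};

/*
 * Install @new in the tree at @root, keyed by new->offset.
 *
 * If an entry with the same offset is already present, @new takes over
 * its position in the tree and the old entry is returned, already
 * unlinked. Returns NULL when there was no duplicate.
 */
static struct entry *tree_store(struct entry **root, struct entry *new)
{
	struct entry **link = root;

	while (*link) {
		struct entry *cur = *link;

		if (new->offset < cur->offset) {
			link = &cur->left;
		} else if (new->offset > cur->offset) {
			link = &cur->right;
		} else {
			/* Duplicate: splice @new in where @cur was. */
			new->left = cur->left;
			new->right = cur->right;
			*link = new;
			cur->left = cur->right = NULL;
			return cur;
		}
	}

	/* No duplicate: attach @new as a leaf. */
	new->left = new->right = NULL;
	*link = new;
	return NULL;
}

/* Plain lookup by offset; returns NULL if nothing is stored there. */
static struct entry *tree_lookup(struct entry *root, unsigned long offset)
{
	while (root && root->offset != offset)
		root = offset < root->offset ? root->left : root->right;
	return root;
}
```

With this shape the caller does a single tree walk per store: a non-NULL
return value is the stale entry, which is no longer reachable from the
tree and only needs to be released by the caller.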

Chris

>         }
>         if (entry->length) {
>                 INIT_LIST_HEAD(&entry->lru);
> @@ -1638,6 +1626,17 @@ bool zswap_store(struct folio *folio)
>  reject:
>         if (objcg)
>                 obj_cgroup_put(objcg);
> +check_old:
> +       /*
> +        * If the zswap store fails or zswap is disabled, we must invalidate the
> +        * possibly stale entry which was previously stored at this offset.
> +        * Otherwise, writeback could overwrite the new data in the swapfile.
> +        */
> +       spin_lock(&tree->lock);
> +       entry = zswap_rb_search(&tree->rbroot, offset);
> +       if (entry)
> +               zswap_invalidate_entry(tree, entry);
> +       spin_unlock(&tree->lock);
>         return false;
>
>  shrink:
> --
> 2.40.1
>
>

