From: Kairui Song <ryncsn@gmail.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: akpm@linux-foundation.org, alex_y_xu@yahoo.ca, baohua@kernel.org,
	da.gomez@samsung.com, david@redhat.com, hughd@google.com,
	ioworker0@gmail.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, ryan.roberts@arm.com,
	wangkefeng.wang@huawei.com, willy@infradead.org, ziy@nvidia.com
Subject: Re: [PATCH] mm: shmem: fix potential data corruption during shmem swapin
Date: Tue, 25 Feb 2025 01:50:23 +0800
Message-ID: <CAMgjq7D=TKC68PoMhLsJd24_sH5eyJ=o6PsDe6Ne4tAMOi49gw@mail.gmail.com>
In-Reply-To: <53e610af72302667475821e5b3c84c382da4efbc.1740386576.git.baolin.wang@linux.alibaba.com>

On Mon, Feb 24, 2025 at 4:47 PM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:
>
> Alex and Kairui reported some issues (system hang or data corruption) when
> swapping out or swapping in large shmem folios. This is especially easy to
> reproduce when the tmpfs is mounted with the 'huge=within_size' parameter.
> Thanks to Kairui's reproducer, the issue can be easily replicated.
>
> The root cause of the problem is that swap readahead may asynchronously
> swap in order 0 folios into the swap cache, while the shmem mapping can
> still store large swap entries. Then an order 0 folio is inserted into
> the shmem mapping without splitting the large swap entry, which overwrites
> the original large swap entry, leading to data corruption.
>
> When getting a folio from the swap cache, we should split the large swap
> entry stored in the shmem mapping if the orders do not match, to fix this
> issue.
>
> Fixes: 809bc86517cc ("mm: shmem: support large folio swap out")
> Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
> Reported-by: Kairui Song <ryncsn@gmail.com>

Maybe you can add a Closes: tag pointing at the original report?

> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/shmem.c | 31 +++++++++++++++++++++++++++----
>  1 file changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 4ea6109a8043..cebbac97a221 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2253,7 +2253,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>         struct folio *folio = NULL;
>         bool skip_swapcache = false;
>         swp_entry_t swap;
> -       int error, nr_pages;
> +       int error, nr_pages, order, split_order;
>
>         VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
>         swap = radix_to_swp_entry(*foliop);
> @@ -2272,10 +2272,9 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>
>         /* Look it up and read it in.. */
>         folio = swap_cache_get_folio(swap, NULL, 0);
> +       order = xa_get_order(&mapping->i_pages, index);
>         if (!folio) {
> -               int order = xa_get_order(&mapping->i_pages, index);
>                 bool fallback_order0 = false;
> -               int split_order;
>
>                 /* Or update major stats only when swapin succeeds?? */
>                 if (fault_type) {
> @@ -2339,6 +2338,29 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>                         error = -ENOMEM;
>                         goto failed;
>                 }
> +       } else if (order != folio_order(folio)) {
> +               /*
> +                * Swap readahead may swap in order 0 folios into swapcache
> +                * asynchronously, while the shmem mapping can still store
> +                * large swap entries. In such cases, we should split the
> +                * large swap entry to prevent possible data corruption.
> +                */
> +               split_order = shmem_split_large_entry(inode, index, swap, gfp);
> +               if (split_order < 0) {
> +                       error = split_order;
> +                       goto failed;
> +               }
> +
> +               /*
> +                * If the large swap entry has already been split, it is
> +                * necessary to recalculate the new swap entry based on
> +                * the old order alignment.
> +                */
> +               if (split_order > 0) {
> +                       pgoff_t offset = index - round_down(index, 1 << split_order);
> +
> +                       swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
> +               }
>         }
>
>  alloced:
> @@ -2346,7 +2368,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>         folio_lock(folio);
>         if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
>             folio->swap.val != swap.val ||
> -           !shmem_confirm_swap(mapping, index, swap)) {
> +           !shmem_confirm_swap(mapping, index, swap) ||
> +           xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
>                 error = -EEXIST;
>                 goto unlock;
>         }
> --
> 2.43.5
>

Thanks for the fix, it works for me.

Tested-by: Kairui Song <kasong@tencent.com>
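
For anyone reading the thread later, the swap entry recalculation in the
new "split_order > 0" branch can be sketched in userspace as below. This
is only an illustration, not the kernel code: round_down_pow2() and
adjusted_swap_offset() are hypothetical helpers, plain unsigned long
stands in for pgoff_t and the swap offset, and it assumes the offset
passed in is the one the large entry recorded before the split.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for the kernel's round_down() macro.
 * align must be a power of two.
 */
static unsigned long round_down_pow2(unsigned long x, unsigned long align)
{
	return x & ~(align - 1);
}

/*
 * Hypothetical helper mirroring the arithmetic in the patch:
 *
 *   offset = index - round_down(index, 1 << split_order);
 *   swap   = swp_entry(swp_type(swap), swp_offset(swap) + offset);
 *
 * base_offset is assumed to be the swap offset recorded by the large
 * entry before the split; index is the page index being faulted in.
 */
static unsigned long adjusted_swap_offset(unsigned long base_offset,
					  unsigned long index,
					  int split_order)
{
	unsigned long offset;

	/* Guard the shift below against out-of-range orders. */
	assert(split_order >= 0 &&
	       split_order < (int)(sizeof(unsigned long) * CHAR_BIT));

	/* Position of index inside its naturally aligned 2^order block. */
	offset = index - round_down_pow2(index, 1UL << split_order);

	return base_offset + offset;
}
```

With split_order = 4 and index = 37, the aligned start of the block is
32, so the in-block offset is 5 and a base swap offset of 1024 becomes
1029. With split_order = 0 the base offset comes back unchanged, which
matches the patch skipping the recalculation in that case.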


Thread overview: 11+ messages
2025-02-05  1:23 ` Hang when swapping huge=within_size tmpfs from zram Alex Xu (Hello71)
2025-02-05  1:55   ` Baolin Wang
2025-02-05  6:38     ` Baolin Wang
2025-02-05 14:39       ` Lance Yang
2025-02-07  7:23         ` Baolin Wang
2025-02-23 17:53           ` Kairui Song
2025-02-23 18:22             ` Kairui Song
2025-02-24  3:21               ` Baolin Wang
2025-02-24  8:47                 ` [PATCH] mm: shmem: fix potential data corruption during shmem swapin Baolin Wang
2025-02-24 17:50                   ` Kairui Song [this message]
2025-02-25  1:07                     ` Baolin Wang
