From: Kairui Song <ryncsn@gmail.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: linux-mm <linux-mm@kvack.org>
Subject: Re: cluster_alloc_swap_entry() may change per-cpu cluster of another CPU randomly
Date: Thu, 24 Oct 2024 11:50:59 +0800 [thread overview]
Message-ID: <CAMgjq7ArBOZjTquDr5BhN+xtqKLbE3dV-7rXB-mzz_UL_m3HNQ@mail.gmail.com> (raw)
In-Reply-To: <871q0641xd.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Thu, Oct 24, 2024 at 11:29 AM Huang, Ying <ying.huang@intel.com> wrote:
>
> Hi, Kairui,
>
> I noticed this when reading the new swap cluster allocation code.
> Sorry, I didn't find time to review your patch in the first place. I
> found that in cluster_alloc_swap_entry(), the following code path is
> possible,
>
> cluster = this_cpu_ptr(si->percpu_cluster);
> offset = alloc_swap_scan_cluster();
> ...
> spin_unlock(&ci->lock);
> spin_unlock(&si->lock);
> /* migrate to another cpu */
> spin_lock(&ci->lock);
> spin_lock(&si->lock);
> cluster->next[order] = offset;
>
> That is, the per-CPU cluster of one CPU may be changed on another CPU.
> I guess this will not cause a functionality issue. However, it makes
> the code harder to reason about. Is it possible to avoid unlocking
> before changing the per-CPU cluster? Or, give up updating the per-CPU
> cluster if we need to unlock?
Hi Ying,

Yes, I noticed this while working on the new si->lock series. It can
cause fragmentation, but only if the CPU the current allocation
migrated from requests another swap allocation meanwhile; in that case
one per-CPU cluster may be wasted.

There is no functionality or leak issue, but it does make things
harder to reason about, as you mentioned.
In the new series I use a local lock to prevent migration, so the CPU
stays stable during the whole process. However, that is not doable here
unless si->lock is removed from the allocation path; otherwise there
would be lock dependency issues.
Do we need to fix this separately in the current tree? Fragmentation
is overall better with these patches, so I don't think this is causing
any real issues. Before these patches, a per-CPU cluster could also be
stolen by other CPUs, so it was never guaranteed to be accurate either.
One easy fix would be to set cluster->next[order] to 0 before scanning
for a new cluster to use, and check that it is still 0 before setting
next[order] again.
That may still cause fragmentation, as migration will cause extra
clusters to be scanned and used, which is hard to avoid without
changing too much code (like the things being done in the new si->lock
series).
Thread overview: 8+ messages
2024-10-24 3:25 Huang, Ying
2024-10-24 3:50 ` Kairui Song [this message]
2024-10-24 3:54 ` Kairui Song
2024-10-25 5:52 ` Huang, Ying
2024-11-03 17:11 ` Kairui Song
2024-11-04 3:02 ` Huang, Ying
2024-11-05 18:04 ` Kairui Song
2024-11-06 5:24 ` Huang, Ying