From: David Hildenbrand <david@redhat.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
"Huang, Ying" <ying.huang@intel.com>,
Chris Li <chrisl@kernel.org>, Minchan Kim <minchan@kernel.org>,
Yu Zhao <yuzhao@google.com>, Barry Song <v-songbaohua@oppo.com>,
SeongJae Park <sj@kernel.org>, Hugh Dickins <hughd@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@suse.com>,
Yosry Ahmed <yosryahmed@google.com>,
stable@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/swap: fix race when skipping swapcache
Date: Fri, 16 Feb 2024 17:16:31 +0100 [thread overview]
Message-ID: <6ad329f6-6e1b-411b-a5d7-be1c8fd89d96@redhat.com> (raw)
In-Reply-To: <CAMgjq7CtLrzkO0kBEsqRDyu+GoGbzdgii3_dj7pfo-3-maQU8A@mail.gmail.com>
>> (4) relock the folio. (we do that already, might not want to fail)
>>
>> (5) take the PTE lock. If the PTE did not change, turn it into a present
>> PTE entry. Otherwise, cleanup.
>
> Very interesting idea!
>
> I'm just not sure what actual benefit it brings. The only concern
> about reusing swapcache_prepare so far is repeated page faults that
> may hurt performance or statistics, this issue is basically gone after
> adding a schedule().
I think you know that slapping in random schedule() calls is not a
proper way to wait for an event to happen :) It's pretty much
unpredictable how long the schedule() will take, and whether there is
even anything to schedule to!
With what I propose, just like with page migration, you really do wait
for the event (page read and unlocked, only the PTE has to be updated)
to happen before you try anything else.
Now, the difference is most likely that the case here happens much less
frequently than page migration. Still, you could have all threads
faulting on the same page, and all would do the same dance here.
>
> We can't drop all the operations around swap cache and map anyway. It
> doesn't know if it should skip the swapcache until swapcache lookup
> and swap count look up are all done. So I think it can be done more
> naturally here with a special value, making things simpler, robust,
> and improving performance a bit more.
>
The issue will always be that someone can zap the PTE concurrently,
which would free up the swap cache. With what I propose, that cannot
happen in the sync swapin case we are discussing here.
If someone were to zap the PTE in the meantime, it would only remove
the special non-swap entry, indicating to the swapin code that concurrent
zapping happened. The swapin code would then handle the swapcache freeing
itself, after finishing reading the content.
So the swapcache would not get freed concurrently anymore, if I am not
wrong. That's the theory, at least; I didn't prototype it yet.
> And in another series [1] I'm working on making shmem make use of
> cache bypassed swapin as well, following this approach I'll have to
> implement another shmem map based synchronization too.
>
I'd have to look further into whether something like that could
similarly apply to shmem. But yes, it's not using PTEs, so a PTE-based
sync mechanism definitely does not apply.
> After all it's only a rare race, I think a simpler solution might be better.
I'm not sure that simpler means better here. Simpler might be better for
a backport, though.
The "use schedule() to wait" approach looks odd; maybe it's a common
primitive that I simply haven't stumbled over yet. (I doubt it, but it's
possible.)
--
Cheers,
David / dhildenb