From: "Huang, Ying" <ying.huang@intel.com>
To: Chris Li <chrisl@kernel.org>
Cc: Kairui Song <ryncsn@gmail.com>, Minchan Kim <minchan@kernel.org>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@suse.com>,
Yosry Ahmed <yosryahmed@google.com>,
David Hildenbrand <david@redhat.com>,
linux-kernel@vger.kernel.org, Yu Zhao <yuzhao@google.com>
Subject: Re: Whether is the race for SWP_SYNCHRONOUS_IO possible? (was Re: [PATCH v3 6/7] mm/swap, shmem: use unified swapin helper for shmem)
Date: Thu, 01 Feb 2024 08:52:46 +0800 [thread overview]
Message-ID: <87plxhf1k1.fsf@yhuang6-desk2.ccr.corp.intel.com> (raw)
In-Reply-To: <CAF8kJuN6G578RWQ5R6eW=FQWg3mphywgnusQYCRMJA4TzhR4jg@mail.gmail.com> (Chris Li's message of "Wed, 31 Jan 2024 15:45:18 -0800")
Chris Li <chrisl@kernel.org> writes:
> On Tue, Jan 30, 2024 at 7:58 PM Kairui Song <ryncsn@gmail.com> wrote:
>>
>> Hi Ying,
>>
>> On Wed, Jan 31, 2024 at 10:53 AM Huang, Ying <ying.huang@intel.com> wrote:
>> >
>> > Hi, Minchan,
>> >
>> > When I review the patchset from Kairui, I checked the code to skip swap
>> > cache in do_swap_page() for swap device with SWP_SYNCHRONOUS_IO. Is the
>> > following race possible? Where a page is swapped out to a swap device
>> > with SWP_SYNCHRONOUS_IO and the swap count is 1. Then 2 threads of the
>> > process runs on CPU0 and CPU1 as below. CPU0 is running do_swap_page().
>>
>> Chris raised a similar issue about the shmem path, and I was worrying
>> about the same issue in previous discussions about do_swap_page:
>> https://lore.kernel.org/linux-mm/CAMgjq7AwFiDb7cAMkWMWb3vkccie1-tocmZfT7m4WRb_UKPghg@mail.gmail.com/
>
> Ha thanks for remembering that.
>
>>
>> """
>> In do_swap_page path, multiple process could swapin the page at the
>> same time (a mapped once page can still be shared by sub threads),
>> they could get different folios. The later pte lock and pte_same check
>> is not enough, because while one process is not holding the pte lock,
>> another process could read-in, swap_free the entry, then swap-out the
>> page again, using same entry, an ABA problem. The race is not likely
>> to happen in reality but in theory possible.
>> """
>>
>> >
>> > CPU0 CPU1
>> > ---- ----
>> > swap_cache_get_folio()
>> > check sync io and swap count
>> > alloc folio
>> > swap_readpage()
>> > folio_lock_or_retry()
>> > swap in the swap entry
>> > write page
>> > swap out to same swap entry
>> > pte_offset_map_lock()
>> > check pte_same()
>> > swap_free() <-- new content lost!
>> > set_pte_at() <-- stale page!
>> > folio_unlock()
>> > pte_unmap_unlock()
>>
>> Thank you very much for highlighting this!
>>
>> My concern previously is the same as yours (swapping out using the
>> same entry is like an ABA issue, where pte_same failed to detect the
>> page table change), later when working on V3, I mistakenly thought
>> that's impossible as entry should be pinned until swap_free on CPU0,
>> and I'm wrong. CPU1 can also just call swap_free, then swap count is
>> dropped to 0 and it can just swap out using the same entry. Now I
>> think my patch 6/7 is also affected by this potential race. Seems
>> nothing can stop it from doing this.
>>
>> Actually I was trying to make a reproducer locally, due to swap slot
>> cache, swap allocation algorithm, and the short race window, this is
>> very unlikely to happen though.
>
> You can put some sleep in some of the CPU0 where expect the other race
> to happen to manual help triggering it. Yes, it sounds hard to trigger
> in real life due to reclaim swap out.
>
>>
>> How about we just increase the swap count temporarily in the direct
>> swap in path (after alloc folio), then drop the count after pte_same
>> (or shmem_add_to_page_cache in shmem path)? That seems enough to
>> prevent the entry reuse issue.
>
> Sounds like a good solution.
Yes. It seems that this can solve the race.
--
Best Regards,
Huang, Ying
next prev parent reply other threads:[~2024-02-01 0:54 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-29 17:54 [PATCH v3 0/7] swapin refactor for optimization and unified readahead Kairui Song
2024-01-29 17:54 ` [PATCH v3 1/7] mm/swapfile.c: add back some comment Kairui Song
2024-01-29 17:54 ` [PATCH v3 2/7] mm/swap: move no readahead swapin code to a stand-alone helper Kairui Song
2024-01-30 5:38 ` Huang, Ying
2024-01-30 5:55 ` Kairui Song
2024-01-29 17:54 ` [PATCH v3 3/7] mm/swap: always account swapped in page into current memcg Kairui Song
2024-01-30 6:12 ` Huang, Ying
2024-01-30 7:01 ` Kairui Song
2024-01-30 7:03 ` Kairui Song
2024-01-29 17:54 ` [PATCH v3 4/7] mm/swap: introduce swapin_entry for unified readahead policy Kairui Song
2024-01-30 6:29 ` Huang, Ying
2024-01-29 17:54 ` [PATCH v3 5/7] mm/swap: avoid a duplicated swap cache lookup for SWP_SYNCHRONOUS_IO Kairui Song
2024-01-30 6:51 ` Huang, Ying
2024-01-29 17:54 ` [PATCH v3 6/7] mm/swap, shmem: use unified swapin helper for shmem Kairui Song
2024-01-31 2:51 ` Whether is the race for SWP_SYNCHRONOUS_IO possible? (was Re: [PATCH v3 6/7] mm/swap, shmem: use unified swapin helper for shmem) Huang, Ying
2024-01-31 3:58 ` Kairui Song
2024-01-31 23:45 ` Chris Li
2024-02-01 0:52 ` Huang, Ying [this message]
2024-01-31 23:38 ` Chris Li
2024-01-29 17:54 ` [PATCH v3 7/7] mm/swap: refactor swap_cache_get_folio Kairui Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87plxhf1k1.fsf@yhuang6-desk2.ccr.corp.intel.com \
--to=ying.huang@intel.com \
--cc=akpm@linux-foundation.org \
--cc=chrisl@kernel.org \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=minchan@kernel.org \
--cc=ryncsn@gmail.com \
--cc=willy@infradead.org \
--cc=yosryahmed@google.com \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox