From: Miaohe Lin <linmiaohe@huawei.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Linux-MM <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>, Yu Zhao <yuzhao@google.com>,
"Shakeel Butt" <shakeelb@google.com>,
Alex Shi <alex.shi@linux.alibaba.com>,
"Minchan Kim" <minchan@kernel.org>
Subject: Re: [Question] Is there a race window between swapoff vs synchronous swap_readpage
Date: Tue, 30 Mar 2021 19:21:54 +0800 [thread overview]
Message-ID: <7d2126a2-e67e-cadb-d732-77f8d54a2f0c@huawei.com> (raw)
In-Reply-To: <87h7kt9ufw.fsf@yhuang6-desk1.ccr.corp.intel.com>
On 2021/3/30 11:44, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@huawei.com> writes:
>
>> On 2021/3/30 9:57, Huang, Ying wrote:
>>> Hi, Miaohe,
>>>
>>> Miaohe Lin <linmiaohe@huawei.com> writes:
>>>
>>>> Hi all,
>>>> I am investigating the swap code, and I found the below possible race window:
>>>>
>>>> CPU 1                                      CPU 2
>>>> -----                                      -----
>>>> do_swap_page
>>>>   skip swapcache case (synchronous swap_readpage)
>>>>     alloc_page_vma
>>>>                                            swapoff
>>>>                                              release swap_file, bdev, or ...
>>>>   swap_readpage
>>>>     check sis->flags is ok
>>>>       access swap_file, bdev or ...[oops!]
>>>>                                              si->flags = 0
>>>>
>>>> The swapcache case is OK because swapoff will wait on the page lock of the
>>>> swapcache page. Will this really happen, or am I missing something?
>>>> Any reply would be greatly appreciated. Thanks! :)
>>>
>>> This appears possible. Even for swapcache case, we can't guarantee the
>>
>> Many thanks for your reply!
>>
>>> swap entry gotten from the page table is always valid too. The
>>
>> The page table may change at any time, so we may do some useless work.
>> But the pte_same() check handles those races correctly as long as they do not
>> result in an oops.
>>
>>> underlying swap device can be swapped off at the same time. So we use
>>> get/put_swap_device() for that. Maybe we need similar stuff here.
>>
>> Using get/put_swap_device() to guard swap_readpage() against swapoff sounds
>> problematic, as swap_readpage() may take a really long time. Also, such a race
>> may not be very harmful in practice, because swapoff is usually done only at
>> system shutdown.
>> I cannot figure out a simple and robust way to fix this. Any suggestions, or
>> could anyone help get rid of such a race?
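For reference, the swapcache lookup path already pins the device with
get_swap_device()/put_swap_device() before dereferencing it. Applying the same
pattern to the synchronous path might look like the sketch below (kernel-style
pseudocode, untested; the exact placement inside do_swap_page() is my
assumption):

```c
/* do_swap_page(), swapcache-bypass path -- untested sketch */
si = get_swap_device(entry);     /* fails if swapoff() already started */
if (!si)
	goto out;                /* treat the entry as if it were stale */
page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
...
swap_readpage(page, true);       /* swap_file/bdev pinned by the reference */
...
put_swap_device(si);             /* let swapoff() make progress again */
```

Note this holds the reference across the whole read, which is exactly the
overhead concern raised above.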
>
> Some reference counting on the swap device can prevent it from being swapped
> off. To reduce the performance overhead on the hot path as much as possible,
> it appears we can use a percpu_ref.
>
Sounds like a good idea. Many thanks for your suggestion. :)
> Best Regards,
> Huang, Ying
>