From: "Huang\, Ying" <ying.huang@intel.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, Hugh Dickins <hughd@google.com>,
"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
Minchan Kim <minchan@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Tim Chen <tim.c.chen@linux.intel.com>,
Mel Gorman <mgorman@techsingularity.net>,
Jérôme Glisse <jglisse@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
David Rientjes <rientjes@google.com>,
Rik van Riel <riel@redhat.com>, Jan Kara <jack@suse.cz>,
Dave Jiang <dave.jiang@intel.com>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
Andrea Parri <andrea.parri@amarulasolutions.com>
Subject: Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations
Date: Fri, 15 Feb 2019 15:08:36 +0800 [thread overview]
Message-ID: <871s49bkaz.fsf@yhuang-dev.intel.com> (raw)
In-Reply-To: <20190214143318.GJ4525@dhcp22.suse.cz> (Michal Hocko's message of "Thu, 14 Feb 2019 15:33:18 +0100")
Michal Hocko <mhocko@kernel.org> writes:
> On Mon 11-02-19 16:38:46, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>>
>> When swapin is performed, after getting the swap entry information from
>> the page table, system will swap in the swap entry, without any lock held
>> to prevent the swap device from being swapoff. This may cause the race
>> like below,
>>
>> CPU 1 CPU 2
>> ----- -----
>> do_swap_page
>> swapin_readahead
>> __read_swap_cache_async
>> swapoff swapcache_prepare
>> p->swap_map = NULL __swap_duplicate
>> p->swap_map[?] /* !!! NULL pointer access */
>>
>> Because swapoff is usually done when system shutdown only, the race may
>> not hit many people in practice. But it is still a race need to be fixed.
>>
>> To fix the race, get_swap_device() is added to check whether the specified
>> swap entry is valid in its swap device. If so, it will keep the swap
>> entry valid via preventing the swap device from being swapoff, until
>> put_swap_device() is called.
>>
>> Because swapoff() is very rare code path, to make the normal path runs as
>> fast as possible, disabling preemption + stop_machine() instead of
>> reference count is used to implement get/put_swap_device(). From
>> get_swap_device() to put_swap_device(), the preemption is disabled, so
>> stop_machine() in swapoff() will wait until put_swap_device() is called.
>>
>> In addition to swap_map, cluster_info, etc. data structure in the struct
>> swap_info_struct, the swap cache radix tree will be freed after swapoff,
>> so this patch fixes the race between swap cache looking up and swapoff
>> too.
>>
>> Races between some other swap cache usages protected via disabling
>> preemption and swapoff are fixed too via calling stop_machine() between
>> clearing PageSwapCache() and freeing swap cache data structure.
>>
>> Alternative implementation could be replacing disable preemption with
>> rcu_read_lock_sched and stop_machine() with synchronize_sched().
>
> using stop_machine is generally discouraged. It is a gross
> synchronization.
>
> Besides that, since when do we have this problem?
For problem, you mean the race between swapoff and the page fault
handler? The problem is introduced in v4.11 when we avoid to replace
swap_info_struct->lock with swap_cluster_info->lock in
__swap_duplicate() if possible to improve the scalability of swap
operations. But because swapoff is a really rare operation, I don't
think it's necessary to backport the fix.
Best Regards,
Huang, Ying
next prev parent reply other threads:[~2019-02-15 7:09 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-11 8:38 Huang, Ying
2019-02-11 19:06 ` Daniel Jordan
2019-02-12 3:21 ` Andrea Parri
2019-02-12 6:47 ` Huang, Ying
2019-02-12 17:58 ` Tim Chen
2019-02-13 3:23 ` Huang, Ying
2019-02-12 20:06 ` Daniel Jordan
2019-02-12 6:40 ` Huang, Ying
2019-02-12 10:13 ` Andrea Parri
2019-02-15 6:34 ` Huang, Ying
2019-02-14 2:38 ` Andrea Arcangeli
2019-02-14 8:07 ` Huang, Ying
2019-02-14 21:47 ` Andrea Arcangeli
2019-02-15 7:50 ` Huang, Ying
2019-02-14 14:33 ` Michal Hocko
2019-02-14 20:30 ` Andrew Morton
2019-02-14 21:22 ` Andrea Arcangeli
2019-02-15 7:08 ` Huang, Ying [this message]
2019-02-15 13:11 ` Michal Hocko
2019-02-18 0:51 ` Huang, Ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871s49bkaz.fsf@yhuang-dev.intel.com \
--to=ying.huang@intel.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=andrea.parri@amarulasolutions.com \
--cc=daniel.m.jordan@oracle.com \
--cc=dave.jiang@intel.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=jglisse@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mhocko@kernel.org \
--cc=minchan@kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=riel@redhat.com \
--cc=rientjes@google.com \
--cc=tim.c.chen@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox