Re: [PATCH] mm: fix the race between swapin_readahead and SWP_SYNCHRONOUS_IO path

From: Minchan Kim <minchan@kernel.org>
To: Vinayak Menon <vinmenon@codeaurora.org>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix the race between swapin_readahead and SWP_SYNCHRONOUS_IO path
Date: Mon, 9 Sep 2019 16:26:13 -0700
Message-ID: <20190909232613.GA39783@google.com>
In-Reply-To: <1567169011-4748-1-git-send-email-vinmenon@codeaurora.org>

Hi Vinayak,

On Fri, Aug 30, 2019 at 06:13:31PM +0530, Vinayak Menon wrote:
> The following race is observed, due to which a process faulting
> on a swap entry finds the page neither in swapcache nor in swap. This
> causes zram to give a zero-filled page that gets mapped into the
> process, resulting in a user-space crash later.
> 
> Consider parent and child processes Pa and Pb sharing the same swap
> slot with swap_count 2. Swap is on zram with SWP_SYNCHRONOUS_IO set.
> Virtual address 'VA' of Pa and Pb points to the shared swap entry.
> 
> Pa                                       Pb
> 
> fault on VA                              fault on VA
> do_swap_page                             do_swap_page
> lookup_swap_cache fails                  lookup_swap_cache fails
>                                          Pb scheduled out
> swapin_readahead (deletes zram entry)
> swap_free (makes swap_count 1)
>                                          Pb scheduled in
>                                          swap_readpage (swap_count == 1)
>                                          Takes SWP_SYNCHRONOUS_IO path
>                                          zram entry absent
>                                          zram gives a zero filled page
> 
> Fix this by reading the swap_count before lookup_swap_cache, which conforms
> with the order in which the page is added to the swap cache and the swap
> count is decremented in do_swap_page. In the race case above, this will let
> Pb take the readahead path and thus pick the proper page from swapcache.

Thanks for the report, Vinayak.

It's a zram-specific issue, because zram deallocates the block
unconditionally once the read IO is done. The expectation was that the
dirty page stays in the swap cache, but with SWP_SYNCHRONOUS_IO that is
no longer true, so I want to resolve the issue in zram-specific code
rather than in the general path.

One idea I have in mind is that swap_slot_free_notify should check the
slot's reference count and, if it is higher than 1, it shouldn't free the
slot yet. What do you think?
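
To make the suggestion concrete, here is a minimal user-space sketch of
the interleaving from the report. It is not kernel code: slot_model,
model_slot_free_notify and model_race are hypothetical names, and the
model assumes the free notification fires when Pa's readahead read
completes, before swap_free drops the count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one swap slot backed by zram (not a kernel struct). */
struct slot_model {
	int  swap_count;    /* PTEs still referencing the swap entry */
	bool zram_has_data; /* false once the zram block has been freed */
	bool in_swapcache;  /* true once readahead has added the page */
};

/*
 * Model of the free notification issued after a read completes.
 * check_count == false: free unconditionally, as described in the report.
 * check_count == true:  keep the block while other references remain.
 */
static void model_slot_free_notify(struct slot_model *s, bool check_count)
{
	if (check_count && s->swap_count > 1)
		return;
	s->zram_has_data = false;
}

/* Replay the reported interleaving; return true if Pb reads valid data. */
static bool model_race(bool check_count)
{
	struct slot_model s = {
		.swap_count = 2, .zram_has_data = true, .in_swapcache = false,
	};

	/* Pa and Pb both miss in the swap cache; Pb is then scheduled out. */
	bool pb_lookup_hit = s.in_swapcache;

	/* Pa: readahead fills the swap cache, then the read completes. */
	s.in_swapcache = true;
	model_slot_free_notify(&s, check_count);

	/* Pa: swap_free drops its reference. */
	assert(s.swap_count > 0);
	s.swap_count--;

	/* Pb resumes with its stale lookup result. */
	if (pb_lookup_hit)
		return true;
	if (s.swap_count == 1)
		return s.zram_has_data; /* synchronous path reads zram directly */
	return s.in_swapcache;          /* readahead path would use the cache */
}
```

In this model, model_race(false) leaves Pb reading a freed block, while
model_race(true) keeps the block alive because the count is still 2 when
the notification arrives.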

> 
> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
> ---
>  mm/memory.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index e0c232f..22643aa 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2744,6 +2744,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	struct page *page = NULL, *swapcache;
>  	struct mem_cgroup *memcg;
>  	swp_entry_t entry;
> +	struct swap_info_struct *si;
> +	bool skip_swapcache = false;
>  	pte_t pte;
>  	int locked;
>  	int exclusive = 0;
> @@ -2771,15 +2773,24 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  
>  
>  	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> +
> +	/*
> +	 * lookup_swap_cache below can fail and before the SWP_SYNCHRONOUS_IO
> +	 * check is made, another process can populate the swapcache, delete
> +	 * the swap entry and decrement the swap count. So decide on taking
> +	 * the SWP_SYNCHRONOUS_IO path before the lookup. In the event of the
> +	 * race described, the victim process will find a swap_count > 1
> +	 * and can then take the readahead path instead of SWP_SYNCHRONOUS_IO.
> +	 */
> +	si = swp_swap_info(entry);
> +	if (si->flags & SWP_SYNCHRONOUS_IO && __swap_count(entry) == 1)
> +		skip_swapcache = true;
> +
>  	page = lookup_swap_cache(entry, vma, vmf->address);
>  	swapcache = page;
>  
>  	if (!page) {
> -		struct swap_info_struct *si = swp_swap_info(entry);
> -
> -		if (si->flags & SWP_SYNCHRONOUS_IO &&
> -				__swap_count(entry) == 1) {
> -			/* skip swapcache */
> +		if (skip_swapcache) {
>  			page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
>  							vmf->address);
>  			if (page) {
> -- 
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
> member of the Code Aurora Forum, hosted by The Linux Foundation
>
