linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Shaohua Li <shli@kernel.org>
Cc: Rafael Aquini <aquini@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	riel@redhat.com, minchan@kernel.org, kmpark@infradead.org,
	linux-mm@kvack.org
Subject: Re: [patch 2/4 v3]swap: __swap_duplicate check bad swap entry
Date: Tue, 19 Mar 2013 14:34:30 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LNX.2.00.1303191352320.5966@eggly.anvils> (raw)
In-Reply-To: <20130221021738.GB32580@kernel.org>

On Thu, 21 Feb 2013, Shaohua Li wrote:

> In swapin_readahead(), read_swap_cache_async() can read a bad swap entry,
> because we don't check if readahead swap entry is bad. This doesn't break
> anything but such swapin page is wasteful and can only be freed at page
> reclaim. We avoid read such swap entry.
> 
> And next patch will mark a swap entry bad temporarily for discard. Without this
> patch, swap entry count will be messed.
> 
> Thanks Hugh to inspire swapin_readahead could use bad swap entry.
> 
> Signed-off-by: Shaohua Li <shli@fusionio.com>

Acked-by: Hugh Dickins <hughd@google.com>

Personally, I'd have merged this one into the next, or added it just
after the next - it can be easier to explain a bug once it's already
there, and okay to fix up after, just so long as it's very unlikely to
interfere with anybody's future bisection.  But maybe you and I just
prefer to tell stories in a different way: no need for you to reorder.

I applied your v3 series shortly before 3.9-rc1, and ran it under load
for a week.  With this addition to your previous, I no longer saw any
"Unused swap" or VM_BUG_ON(error == -EEXIST) issues, it all ran fine.

With one exception: swapoff occasionally failed, and it was another
such SWAP_MAP_BAD issue.  At the bottom I've appended the patch I was
using to fix that (which of course only makes sense after your next).

Maybe you'd like to merge that into this 2/4, and exchange 2/4 and 3/4,
or maybe you'd just like to merge it into your 3/4, or maybe you'd
prefer to keep it as a separate fixup following 3/4: up to you, I'm
not hung up the the ownership of it.  (I never experimented without
the SWP_WRITEOK part of it: I'm not sure how necessary that part is,
but I feel safer with it in.)

> ---
>  mm/swapfile.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> Index: linux/mm/swapfile.c
> ===================================================================
> --- linux.orig/mm/swapfile.c	2013-02-18 15:21:09.285317914 +0800
> +++ linux/mm/swapfile.c	2013-02-18 15:21:34.545004083 +0800
> @@ -2374,6 +2374,11 @@ static int __swap_duplicate(swp_entry_t
>  		goto unlock_out;
>  
>  	count = p->swap_map[offset];
> +	if (unlikely(swap_count(count) == SWAP_MAP_BAD)) {
> +		err = -ENOENT;
> +		goto unlock_out;
> +	}
> +
>  	has_cache = count & SWAP_HAS_CACHE;
>  	count &= ~SWAP_HAS_CACHE;
>  	err = 0;

[PATCH] swap: fix swapoff ENOMEMs from discard

swapoff was sometimes failing with "Cannot allocate memory", coming
from try_to_unuse()'s -ENOMEM: it needs to allow for swap_duplicate()
failing on a free entry temporarily SWAP_MAP_BAD while being discarded.

We should use ACCESS_ONCE() there, and whenever accessing swap_map
locklessly; but rather than peppering it throughout try_to_unuse(),
just declare *swap_map with volatile.

try_to_unuse() is accustomed to *swap_map going down racily, but not
necessarily to it jumping up from 0 to SWAP_MAP_BAD: we'll be safer to
prevent that transition once SWP_WRITEOK is switched off, when it's a
waste of time to issue discards anyway (swapon can do a whole discard).

Signed-off-by: Hugh Dickins <hughd@google.com>
---

 mm/swapfile.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- 3.9-rc1+shli/mm/swapfile.c	2013-02-24 12:14:25.799160751 -0800
+++ linux/mm/swapfile.c	2013-02-25 21:43:04.112050626 -0800
@@ -199,8 +199,8 @@ static void discard_swap_cluster(struct
 static int swap_cluster_check_discard(struct swap_info_struct *si,
 		unsigned int idx)
 {
-
-	if (!(si->flags & SWP_DISCARDABLE))
+	if ((si->flags & (SWP_WRITEOK | SWP_DISCARDABLE)) !=
+			 (SWP_WRITEOK | SWP_DISCARDABLE))
 		return 0;
 	/*
 	 * If scan_swap_map() can't find a free cluster, it will check
@@ -1223,7 +1223,7 @@ static unsigned int find_next_to_unuse(s
 			else
 				continue;
 		}
-		count = si->swap_map[i];
+		count = ACCESS_ONCE(si->swap_map[i]);
 		if (count && swap_count(count) != SWAP_MAP_BAD)
 			break;
 	}
@@ -1243,7 +1243,7 @@ int try_to_unuse(unsigned int type, bool
 {
 	struct swap_info_struct *si = swap_info[type];
 	struct mm_struct *start_mm;
-	unsigned char *swap_map;
+	volatile unsigned char *swap_map;	/* ACCESS_ONCE throughout */
 	unsigned char swcount;
 	struct page *page;
 	swp_entry_t entry;
@@ -1294,7 +1294,8 @@ int try_to_unuse(unsigned int type, bool
 			 * reused since sys_swapoff() already disabled
 			 * allocation from here, or alloc_page() failed.
 			 */
-			if (!*swap_map)
+			swcount = *swap_map;
+			if (!swcount || swcount == SWAP_MAP_BAD)
 				continue;
 			retval = -ENOMEM;
 			break;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

       reply	other threads:[~2013-03-19 21:34 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20130221021738.GB32580@kernel.org>
2013-03-19 21:34 ` Hugh Dickins [this message]
2013-02-21  2:22 Shaohua Li
2013-03-12  1:45 ` Rafael Aquini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LNX.2.00.1303191352320.5966@eggly.anvils \
    --to=hughd@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=aquini@redhat.com \
    --cc=kmpark@infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=riel@redhat.com \
    --cc=shli@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox