From: Andrew Morton <akpm@linux-foundation.org>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Johannes Weiner <hannes@cmpxchg.org>,
Minchan Kim <minchan@kernel.org>, Rik van Riel <riel@redhat.com>,
Shaohua Li <shli@kernel.org>, Hugh Dickins <hughd@google.com>,
Fengguang Wu <fengguang.wu@intel.com>,
Tim Chen <tim.c.chen@intel.com>,
Dave Hansen <dave.hansen@intel.com>
Subject: Re: [PATCH -mm -v3 6/6] mm, swap: Don't use VMA based swap readahead if HDD is used as swap
Date: Tue, 25 Jul 2017 13:50:59 -0700
Message-ID: <20170725135059.11d65c1f6f17101e977f2b59@linux-foundation.org>
In-Reply-To: <20170725015151.19502-7-ying.huang@intel.com>
On Tue, 25 Jul 2017 09:51:51 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> VMA based swap readahead will readahead the virtual pages that are
> contiguous in the virtual address space, while the original swap
> readahead will readahead the swap slots that are contiguous in the swap
> device. Although VMA based swap readahead is more accurate in choosing
> the swap slots to read ahead, it will trigger more small random
> reads, which may degrade the performance of an HDD (hard disk)
> heavily enough to outweigh the benefit.
>
> To avoid the issue, in this patch, if an HDD is used as swap, the VMA
> based swap readahead will be disabled, and the original swap readahead
> will be used instead.
>
> ...
>
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -399,16 +399,17 @@ extern struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
> struct vm_fault *vmf,
> struct vma_swap_readahead *swap_ra);
>
> -static inline bool swap_use_vma_readahead(void)
> -{
> -	return READ_ONCE(swap_vma_readahead);
> -}
> -
> /* linux/mm/swapfile.c */
> extern atomic_long_t nr_swap_pages;
> extern long total_swap_pages;
> +extern atomic_t nr_rotate_swap;
This is rather ugly. If the system is swapping to both an SSD and to a
spinning disk, we'll treat the SSD as a spinning disk.
Surely this decision can be made in a per-device fashion?
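
Roughly what a per-device decision could look like, as a userspace sketch only. The struct, the global flag and the function name below are made up for illustration and are not kernel interfaces; `nonrot` stands in for whatever blk_queue_nonrot() would report, and a swap area with no block device is assumed to count as rotational, as in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one swap device (not swap_info_struct). */
struct swap_dev_model {
	bool has_bdev;	/* false when there is no backing block device */
	bool nonrot;	/* what blk_queue_nonrot() would report */
};

/* Stand-in for the global swap_vma_readahead knob. */
static bool vma_readahead_enabled = true;

/*
 * Per-device choice: only a device known to be non-rotational gets VMA
 * based readahead. Other devices fall back to the original readahead,
 * and neither kind of device affects the other.
 */
static bool dev_use_vma_readahead(const struct swap_dev_model *dev)
{
	assert(dev != NULL);
	return vma_readahead_enabled && dev->has_bdev && dev->nonrot;
}
```

With this shape, an SSD and a spinning disk configured side by side each get the readahead mode that suits them.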
> extern bool has_usable_swap(void);
>
> +static inline bool swap_use_vma_readahead(void)
> +{
> +	return READ_ONCE(swap_vma_readahead) && !atomic_read(&nr_rotate_swap);
> +}
> +
> /* Swap 50% full? Release swapcache more aggressively.. */
> static inline bool vm_swap_full(void)
> {
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 6ba4aab2db0b..2685b9951cc1 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -96,6 +96,8 @@ static DECLARE_WAIT_QUEUE_HEAD(proc_poll_wait);
> /* Activity counter to indicate that a swapon or swapoff has occurred */
> static atomic_t proc_poll_event = ATOMIC_INIT(0);
>
> +atomic_t nr_rotate_swap = ATOMIC_INIT(0);
> +
> static inline unsigned char swap_count(unsigned char ent)
> {
> 	return ent & ~SWAP_HAS_CACHE;	/* may include SWAP_HAS_CONT flag */
> @@ -2387,6 +2389,9 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> 	if (p->flags & SWP_CONTINUED)
> 		free_swap_count_continuations(p);
>
> +	if (!p->bdev || !blk_queue_nonrot(bdev_get_queue(p->bdev)))
> +		atomic_dec(&nr_rotate_swap);
What's that p->bdev test for? It's not symmetrical with the
sys_swapon() change and one wonders if the counter can get out of sync.
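
One way to make the pairing obvious would be a single predicate shared by both paths. A userspace sketch follows; the struct, the plain int counter and all function names are hypothetical, and this models only the accounting, not the real swapon/swapoff code or its locking:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one swap area. */
struct swap_area_model {
	bool has_bdev;	/* false when there is no backing block device */
	bool nonrot;	/* what blk_queue_nonrot() would report */
};

/* Plain int standing in for the atomic_t nr_rotate_swap. */
static int nr_rotate_swap_model;

/* The one test used by both the swapon and the swapoff side. */
static bool counts_as_rotational(const struct swap_area_model *p)
{
	return !p->has_bdev || !p->nonrot;
}

static void model_swapon(const struct swap_area_model *p)
{
	if (counts_as_rotational(p))
		nr_rotate_swap_model++;
}

static void model_swapoff(const struct swap_area_model *p)
{
	if (counts_as_rotational(p)) {
		/* An underflow here would mean the two paths disagreed. */
		assert(nr_rotate_swap_model > 0);
		nr_rotate_swap_model--;
	}
}
```

Because both sides call the same helper, every increment has a matching decrement as long as the device's properties do not change between swapon and swapoff.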
> 	mutex_lock(&swapon_mutex);
> 	spin_lock(&swap_lock);
> 	spin_lock(&p->lock);
> @@ -2963,7 +2968,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> 			cluster = per_cpu_ptr(p->percpu_cluster, cpu);
> 			cluster_set_null(&cluster->index);
> 		}
> -	}
> +	} else
> +		atomic_inc(&nr_rotate_swap);
>
> 	error = swap_cgroup_swapon(p->type, maxpages);
> 	if (error)
> --
> 2.13.2