From: Mel Gorman <mgorman@suse.de>
To: Dan Streetman <ddstreet@ieee.org>
Cc: Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
Weijie Yang <weijieut@gmail.com>, Linux-MM <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] swap: use separate priority list for available swap_infos
Date: Fri, 25 Apr 2014 09:49:02 +0100 [thread overview]
Message-ID: <20140425084902.GZ23991@suse.de> (raw)
In-Reply-To: <CALZtONCpnrFTzZkoKx75Ev-NutACD2SAnTA0xBf6JAdoeFx9jQ@mail.gmail.com>
On Thu, Apr 24, 2014 at 01:52:00PM -0400, Dan Streetman wrote:
> On Wed, Apr 23, 2014 at 9:14 AM, Mel Gorman <mgorman@suse.de> wrote:
> > On Sat, Apr 12, 2014 at 05:00:54PM -0400, Dan Streetman wrote:
> >> Originally get_swap_page() started iterating through the singly-linked
> >> list of swap_info_structs using swap_list.next or highest_priority_index,
> >> which both were intended to point to the highest priority active swap
> >> target that was not full. The previous patch in this series changed the
> >> singly-linked list to a doubly-linked list, and removed the logic to start
> >> at the highest priority non-full entry; it starts scanning at the highest
> >> priority entry each time, even if the entry is full.
> >>
> >> Add a new list, also priority ordered, to track only swap_info_structs
> >> that are available, i.e. active and not full. Use a new spinlock so that
> >> entries can be added/removed outside of get_swap_page; that wasn't possible
> >> previously because the main list is protected by swap_lock, which can't be
> >> taken when holding a swap_info_struct->lock because of locking order.
> >> The get_swap_page() logic now does not need to hold the swap_lock, and it
> >> iterates only through swap_info_structs that are available.
> >>
> >> Signed-off-by: Dan Streetman <ddstreet@ieee.org>
> >> ---
> >> include/linux/swap.h | 1 +
> >> mm/swapfile.c | 128 ++++++++++++++++++++++++++++++++++-----------------
> >> 2 files changed, 87 insertions(+), 42 deletions(-)
> >>
> >> diff --git a/include/linux/swap.h b/include/linux/swap.h
> >> index 96662d8..d9263db 100644
> >> --- a/include/linux/swap.h
> >> +++ b/include/linux/swap.h
> >> @@ -214,6 +214,7 @@ struct percpu_cluster {
> >> struct swap_info_struct {
> >> unsigned long flags; /* SWP_USED etc: see above */
> >> signed short prio; /* swap priority of this type */
> >> + struct list_head prio_list; /* entry in priority list */
> >> struct list_head list; /* entry in swap list */
> >> signed char type; /* strange name for an index */
> >> unsigned int max; /* extent of the swap_map */
> >> diff --git a/mm/swapfile.c b/mm/swapfile.c
> >> index b958645..3c38461 100644
> >> --- a/mm/swapfile.c
> >> +++ b/mm/swapfile.c
> >> @@ -57,9 +57,13 @@ static const char Unused_file[] = "Unused swap file entry ";
> >> static const char Bad_offset[] = "Bad swap offset entry ";
> >> static const char Unused_offset[] = "Unused swap offset entry ";
> >>
> >> -/* all active swap_info */
> >> +/* all active swap_info; protected with swap_lock */
> >> LIST_HEAD(swap_list_head);
> >>
> >> +/* all available (active, not full) swap_info, priority ordered */
> >> +static LIST_HEAD(prio_head);
> >> +static DEFINE_SPINLOCK(prio_lock);
> >> +
> >
> > I get why you maintain two lists with separate locking but it's code that
> > is specific to swap and in many respects, it's very similar to a plist. Is
> > there a reason why plist was not used at least for prio_head? They're used
> > for futex's so presumably the performance is reasonable. It might reduce
> > the size of swapfile.c further.
> >
> > It is the case that plist does not have the equivalent of rotate which
> > you need to recycle the entries of equal priority but you could add a
> > plist_shuffle helper that "rotates the list left if the next entry is of
> > equal priority".
>
> I did look at plist, but as you said there's no plist_rotate_left() so
> that would either need to be implemented in plist or a helper specific
> to swap added.
>
Which in itself should not be impossible and improves an existing structure
that might be usable elsewhere again.
> The plist sort order is also reverse from swap sort order; plist
> orders low->high while swap orders high->low. So either the prio
> would need to be negated when storing in the plist to correct the
> order, or plist_for_each_entry_reverse() would need to be added (or
> another helper specific for swap).
>
Or add a new plist iterator that wraps around list_for_each_entry_reverse
instead of list_for_each_entry? I admit I didn't actually check if this
would work.
> I think the main (only?) benefit of plist is during adds; insertion
> for lists is O(N) while insertion for plists is O(K) where K is the
> number of different priorities. And for swap, the add only happens in
> two places: on swapon (or if swapoff fails and reinserts it) and in
> swap_entry_free() when the swap_info_struct changes from full to
> not-full. The swapon and swapoff failure cases don't matter, and a
> swap_info_struct changing from full to not-full is (probably?) going
> to be a relatively rare occurrence. And even then, unless there are a
> significant number of not-full same-priority swap_info_structs that
> are higher prio than the one being added, there should (?) be no
> difference between plist and normal list add speed.
>
I'm not angling towards plist for performance reasons but for maintenance
reasons. It would just be preferable to use existing structures and
iterators instead of adding new swap-specific code.
> Finally, using a plist further increases the size of each swap_info_struct.
>
Considering how many of them there are in the system I would not worry
too much about the memory footprint in this case. IT's not like we are
increasing the size of struct page :)
> So to me, it seemed list plists were best used in situations where
> list entries were constantly being added/removed, and the add speed
> needed to be as fast as possible. That isn't typically the case with
> the swap_info_struct priority list, where adding/removing is a
> relatively rare occurrence. However, in the case where there are
> multiple groups of many same-priority swap devices/files, using plists
> may reduce the add insertion time when one of the lower-priority
> swap_info_struct changes from full to not-full after some of the
> higher prio ones also have changed from full to not-full. If you
> think it would be good to change to using a plist, I can update the
> patch.
>
The full to not-full case is a concern but I also think it's a corner case
that's only going to be commonly hit on stress tests and rarely on "normal"
workloads. For maintenance reasons I would prefer if plist was used to
reduce the amount of swap-specific code and move to swap-specific code only
if there is a measurable gain from doing that. If that is not possible
or the performance would suck in the normal case then just update the
changelog accordingly so the next reviewer does not bring up the same topic.
Thanks!
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-04-25 8:49 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-13 10:42 [PATCH] mm: swap: Use swapfiles in priority order Mel Gorman
2014-02-13 15:58 ` Weijie Yang
2014-02-14 10:17 ` Mel Gorman
2014-02-14 13:33 ` Weijie Yang
2014-02-14 13:10 ` Christian Ehrhardt
2014-02-16 2:59 ` Weijie Yang
2014-02-24 8:28 ` Hugh Dickins
2014-04-12 21:00 ` [PATCH 0/2] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-04-12 21:00 ` [PATCH 1/2] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-04-23 10:34 ` Mel Gorman
2014-04-24 0:17 ` Shaohua Li
2014-04-24 8:30 ` Mel Gorman
2014-04-24 18:48 ` Dan Streetman
2014-04-25 4:15 ` Weijie Yang
2014-05-02 20:00 ` Dan Streetman
2014-05-04 9:39 ` Bob Liu
2014-05-04 20:16 ` Dan Streetman
2014-04-25 8:38 ` Mel Gorman
2014-04-12 21:00 ` [PATCH 2/2] swap: use separate priority list for available swap_infos Dan Streetman
2014-04-23 13:14 ` Mel Gorman
2014-04-24 17:52 ` Dan Streetman
2014-04-25 8:49 ` Mel Gorman [this message]
2014-05-02 19:02 ` [PATCHv2 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-02 19:02 ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-02 19:02 ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 10:35 ` Mel Gorman
2014-05-02 19:02 ` [PATCH 3/4] plist: add plist_rotate Dan Streetman
2014-05-06 2:18 ` Steven Rostedt
2014-05-06 20:12 ` Dan Streetman
2014-05-06 20:39 ` Steven Rostedt
2014-05-06 21:47 ` Dan Streetman
2014-05-06 22:43 ` Steven Rostedt
2014-05-02 19:02 ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-05 15:51 ` Dan Streetman
2014-05-05 19:13 ` Steven Rostedt
2014-05-05 19:38 ` Peter Zijlstra
2014-05-09 20:42 ` [PATCH] plist: make CONFIG_DEBUG_PI_LIST selectable Dan Streetman
2014-05-09 21:17 ` Steven Rostedt
2014-05-12 11:11 ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Mel Gorman
2014-05-12 13:00 ` Dan Streetman
2014-05-12 16:38 ` [PATCHv3 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-12 16:38 ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-12 16:38 ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 16:38 ` [PATCHv2 3/4] plist: add plist_requeue Dan Streetman
2014-05-13 10:33 ` Mel Gorman
2014-05-12 16:38 ` [PATCHv2 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-13 10:34 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140425084902.GZ23991@suse.de \
--to=mgorman@suse.de \
--cc=akpm@linux-foundation.org \
--cc=ddstreet@ieee.org \
--cc=ehrhardt@linux.vnet.ibm.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
--cc=weijieut@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox