From: Michal Hocko <mhocko@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave@sr71.net>, Hugh Dickins <hughd@google.com>,
Dave Hansen <dave.hansen@intel.com>, Tejun Heo <tj@kernel.org>,
Linux-MM <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Vladimir Davydov <vdavydov@parallels.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: regression caused by cgroups optimization in 3.17-rc2
Date: Fri, 5 Sep 2014 17:39:48 +0200 [thread overview]
Message-ID: <20140905153948.GH26243@dhcp22.suse.cz> (raw)
In-Reply-To: <20140905144723.GB13392@cmpxchg.org>
On Fri 05-09-14 10:47:23, Johannes Weiner wrote:
> On Fri, Sep 05, 2014 at 11:25:37AM +0200, Michal Hocko wrote:
> > @@ -900,10 +900,10 @@ void lru_add_drain_all(void)
> > * grabbed the page via the LRU. If it did, give up: shrink_inactive_list()
> > * will free it.
> > */
> > -void release_pages(struct page **pages, int nr, bool cold)
> > +static void release_lru_pages(struct page **pages, int nr,
> > + struct list_head *pages_to_free)
> > {
> > int i;
> > - LIST_HEAD(pages_to_free);
> > struct zone *zone = NULL;
> > struct lruvec *lruvec;
> > unsigned long uninitialized_var(flags);
> > @@ -943,11 +943,26 @@ void release_pages(struct page **pages, int nr, bool cold)
> > /* Clear Active bit in case of parallel mark_page_accessed */
> > __ClearPageActive(page);
> >
> > - list_add(&page->lru, &pages_to_free);
> > + list_add(&page->lru, pages_to_free);
> > }
> > if (zone)
> > spin_unlock_irqrestore(&zone->lru_lock, flags);
> > +}
> > +/*
> > + * Batched page_cache_release(). Frees and uncharges all given pages
> > + * for which the reference count drops to 0.
> > + */
> > +void release_pages(struct page **pages, int nr, bool cold)
> > +{
> > + LIST_HEAD(pages_to_free);
> >
> > + while (nr) {
> > + int batch = min(nr, PAGEVEC_SIZE);
> > +
> > + release_lru_pages(pages, batch, &pages_to_free);
> > + pages += batch;
> > + nr -= batch;
> > + }
>
> We might be able to process a lot more pages in one go if nobody else
> needs the lock or the CPU. Can't we just cycle the lock or reschedule
> if necessary?
Is it safe to cond_resched here for all callers? I hope it is but there
are way too many callers to check so I am not 100% sure.
Besides that spin_needbreak doesn't seem to be available for all architectures.
git grep "arch_spin_is_contended(" -- arch/
arch/arm/include/asm/spinlock.h:static inline int arch_spin_is_contended(arch_spinlock_t *lock)
arch/arm64/include/asm/spinlock.h:static inline int arch_spin_is_contended(arch_spinlock_t *lock)
arch/ia64/include/asm/spinlock.h:static inline int arch_spin_is_contended(arch_spinlock_t *lock)
arch/mips/include/asm/spinlock.h:static inline int arch_spin_is_contended(arch_spinlock_t *lock)
arch/x86/include/asm/spinlock.h:static inline int arch_spin_is_contended(arch_spinlock_t *lock)
Moreover it doesn't seem to do anything for !CONFIG_PREEMPT but this
should be trivial to fix.
I am also not sure this will work well in all cases. If we have a heavy
reclaim activity on other CPUs then this path might be interrupted too
often resulting in too much lock bouncing. So I guess we want at least
few pages to be processed in one run. On the other hand if the lock is
not contended then doing batches and retake the lock shouldn't add too
much overhead, no?
> diff --git a/mm/swap.c b/mm/swap.c
> index 6b2dc3897cd5..ee0cf21dd521 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -944,6 +944,15 @@ void release_pages(struct page **pages, int nr, bool cold)
> __ClearPageActive(page);
>
> list_add(&page->lru, &pages_to_free);
> +
> + if (should_resched() ||
> + (zone && spin_needbreak(&zone->lru_lock))) {
> + if (zone) {
> + spin_unlock_irqrestore(&zone->lru_lock, flags);
> + zone = NULL;
> + }
> + cond_resched();
> + }
> }
> if (zone)
> spin_unlock_irqrestore(&zone->lru_lock, flags);
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index 3e0ec83d000c..c487ca4682a4 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -262,19 +262,12 @@ void free_page_and_swap_cache(struct page *page)
> */
> void free_pages_and_swap_cache(struct page **pages, int nr)
> {
> - struct page **pagep = pages;
> + int i;
>
> lru_add_drain();
> - while (nr) {
> - int todo = min(nr, PAGEVEC_SIZE);
> - int i;
> -
> - for (i = 0; i < todo; i++)
> - free_swap_cache(pagep[i]);
> - release_pages(pagep, todo, false);
> - pagep += todo;
> - nr -= todo;
> - }
> + for (i = 0; i < nr; i++)
> + free_swap_cache(pages[i]);
> + release_pages(pages, nr, false);
> }
>
> /*
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-09-05 15:39 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-02 19:05 Dave Hansen
2014-09-02 20:18 ` Dave Hansen
2014-09-02 20:57 ` Dave Hansen
2014-09-04 14:27 ` Michal Hocko
2014-09-04 20:27 ` Dave Hansen
2014-09-04 22:53 ` Dave Hansen
2014-09-05 9:28 ` Michal Hocko
2014-09-05 9:25 ` Michal Hocko
2014-09-05 14:47 ` Johannes Weiner
2014-09-05 15:39 ` Michal Hocko [this message]
2014-09-10 16:29 ` Michal Hocko
2014-09-10 16:57 ` Dave Hansen
2014-09-10 17:05 ` Michal Hocko
2014-09-05 12:35 ` Johannes Weiner
2014-09-08 15:47 ` Dave Hansen
2014-09-09 14:50 ` Johannes Weiner
2014-09-09 18:23 ` Dave Hansen
2014-09-02 22:18 ` Johannes Weiner
2014-09-02 22:36 ` Dave Hansen
2014-09-03 0:10 ` Johannes Weiner
2014-09-03 0:20 ` Linus Torvalds
2014-09-03 1:33 ` Johannes Weiner
2014-09-03 3:15 ` Dave Hansen
2014-09-03 0:30 ` Dave Hansen
2014-09-04 15:08 ` Johannes Weiner
2014-09-04 20:50 ` Dave Hansen
2014-09-05 8:04 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140905153948.GH26243@dhcp22.suse.cz \
--to=mhocko@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=dave.hansen@intel.com \
--cc=dave@sr71.net \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vdavydov@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox