From: Chris Wilson <chris@chris-wilson.co.uk>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, intel-gfx@lists.freedesktop.org,
Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@suse.cz>,
Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Dave Chinner <dchinner@redhat.com>,
Glauber Costa <glommer@openvz.org>,
Hugh Dickins <hughd@google.com>,
David Rientjes <rientjes@google.com>
Subject: Re: [PATCH] mm: Throttle shrinkers harder
Date: Sat, 26 Apr 2014 14:10:26 +0100 [thread overview]
Message-ID: <20140426131026.GA4418@nuc-i3427.alporthouse.com> (raw)
In-Reply-To: <535A9901.6090607@intel.com>

On Fri, Apr 25, 2014 at 10:18:57AM -0700, Dave Hansen wrote:
> On 04/25/2014 12:23 AM, Chris Wilson wrote:
> > On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote:
> >> On 04/24/2014 08:39 AM, Chris Wilson wrote:
> >>> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote:
> >>>> Is it possible that there's still a get_page() reference that's holding
> >>>> those pages in place from the graphics code?
> >>>
> >>> Not from i915.ko. The last resort of our shrinker is to drop all page
> >>> refs held by the GPU, which is invoked if we are asked to free memory
> >>> and we have no inactive objects left.
> >>
> >> How sure are we that this was performed before the OOM?
> >
> > Only by virtue of how shrink_slabs() works.
>
> Could we try to raise the level of assurance there, please? :)
>
> So this "last resort" is i915_gem_shrink_all()? It seems like we might
> have some problems getting down to that part of the code if we have
> problems getting the mutex.
In general, but not in this example where the load is tightly controlled.
> We have tracepoints for the shrinkers in here (it says slab, but it's
> all the shrinkers, I checked):
>
> /sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_*/enable
> and another for OOMs:
> /sys/kernel/debug/tracing/events/oom/enable
>
> Could you collect a trace during one of these OOM events and see what
> the i915 shrinker is doing? Just enable those two and then collect a
> copy of:
>
> /sys/kernel/debug/tracing/trace
>
> That'll give us some insight about how well the shrinker is working. If
> the VM gave up on calling in to it, it might reveal why we didn't get
> all the way down into i915_gem_shrink_all().
I'll add it to the list for QA to try.
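For reference, I expect the collection to boil down to something like the
following (assuming debugfs is mounted at /sys/kernel/debug and running as
root; the output filename is arbitrary):

```shell
# Enable the shrinker tracepoints (start/end pairs under vmscan)
echo 1 > /sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_start/enable
echo 1 > /sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_end/enable

# Enable the OOM tracepoints
echo 1 > /sys/kernel/debug/tracing/events/oom/enable

# ... reproduce the OOM ...

# Save the trace buffer for analysis
cat /sys/kernel/debug/tracing/trace > oom-trace.txt
```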
> > Thanks for the pointer to
> > register_oom_notifier(), I can use that to make sure that we do purge
> > everything from the GPU, and do a sanity check at the same time, before
> > we start killing processes.
>
> Actually, that one doesn't get called until we're *SURE* we are going to
> OOM. Any action taken in there won't be taken into account.
	blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
	if (freed > 0)
		/* Got some memory back in the last second. */
		return;
That looks like it should abort the OOM and let the allocation attempt be
repeated? Or is that too hopeful?
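Roughly what I have in mind for the notifier is below. This is a sketch
only: i915_oom_notify() and the oom_notifier field are hypothetical names,
and it assumes i915_gem_shrink_all() can report how many pages it released
so we can feed that back into the freed count the chain passes around.

```c
/* Sketch: hypothetical names, assuming i915_gem_shrink_all() reports
 * the number of pages it released.
 */
static int i915_oom_notify(struct notifier_block *nb,
			   unsigned long event, void *ptr)
{
	struct drm_i915_private *dev_priv =
		container_of(nb, struct drm_i915_private, oom_notifier);
	unsigned long *freed = ptr;

	/* Last resort: drop every page reference the GPU still holds. */
	*freed += i915_gem_shrink_all(dev_priv);

	return NOTIFY_DONE;
}

static void i915_register_oom(struct drm_i915_private *dev_priv)
{
	dev_priv->oom_notifier.notifier_call = i915_oom_notify;
	register_oom_notifier(&dev_priv->oom_notifier);
}
```

The callback receives &freed from blocking_notifier_call_chain(), so adding
to *freed is what lets out_of_memory() bail and retry the allocation.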
> >> Also, forgive me for being an idiot wrt the way graphics work, but are
> >> there any good candidates that you can think of that could be holding a
> >> reference? I've honestly never seen an OOM like this.
> >
> > Here the only place that we take a page reference is in
> > i915_gem_object_get_pages(). We do this when we first bind the pages
> > into the GPU's translation table, but we only release the pages once the
> > object is destroyed or the system experiences memory pressure. (Once the
> > GPU touches the pages, we no longer consider them to be cache coherent
> > with the CPU and so migrating them between the GPU and CPU requires
> > clflushing, which is expensive.)
> >
> > Aside from CPU mmaps of the shmemfs filp, all operations on our
> > graphical objects should lead to i915_gem_object_get_pages(). However
> > not all objects are recoverable as some may be pinned due to hardware
> > access.
>
> In that oom callback, could you dump out the aggregate number of
> obj->pages_pin_count across all the objects? That would be a very
> interesting piece of information to have. It would also be very
> insightful for folks who see OOMs in practice with i915 in their systems.
Indeed.
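Something along these lines inside that callback, I think. Treat it as a
sketch: the list and field names match what i915 has today, but the helper
name is made up and it ignores locking for the purposes of a debug dump.

```c
/* Sketch: aggregate pinned pages across all bound objects (debug only;
 * i915_gem_report_pinned() is a hypothetical name).
 */
static void i915_gem_report_pinned(struct drm_i915_private *dev_priv)
{
	struct drm_i915_gem_object *obj;
	unsigned long pinned = 0, total = 0;

	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
		total += obj->base.size >> PAGE_SHIFT;
		if (obj->pages_pin_count)
			pinned += obj->base.size >> PAGE_SHIFT;
	}

	pr_info("i915: %lu of %lu object pages pinned at OOM\n",
		pinned, total);
}
```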
-Chris
--
Chris Wilson, Intel Open Source Technology Centre