Re: [PATCH] drm/ttm: stop warning on TT shrinker failure


From: Daniel Vetter <daniel@ffwll.ch>
To: "Christian König" <christian.koenig@amd.com>
Cc: "Thomas Hellström (Intel)" <thomas_os@shipmail.org>,
	"Michal Hocko" <mhocko@suse.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
	"Dave Chinner" <dchinner@redhat.com>, "Leo Liu" <Leo.Liu@amd.com>
Subject: Re: [PATCH] drm/ttm: stop warning on TT shrinker failure
Date: Wed, 24 Mar 2021 13:01:59 +0100
Message-ID: <YFsqN7068vUL8rAM@phenom.ffwll.local>
In-Reply-To: <488c8996-1dd2-4928-a98a-4e72f3e0af64@amd.com>

On Wed, Mar 24, 2021 at 01:00:28PM +0100, Christian König wrote:
> On 24.03.21 at 12:55, Daniel Vetter wrote:
> > On Wed, Mar 24, 2021 at 11:19:13AM +0100, Thomas Hellström (Intel) wrote:
> > > On 3/23/21 4:45 PM, Christian König wrote:
> > > > On 23.03.21 at 16:13, Michal Hocko wrote:
> > > > > On Tue 23-03-21 14:56:54, Christian König wrote:
> > > > > > On 23.03.21 at 14:41, Michal Hocko wrote:
> > > > > [...]
> > > > > > > > Anyway, I am wondering whether the overall approach is sound. Why
> > > > > > > > don't you simply use shmem as your backing storage from the
> > > > > > > > beginning and pin those pages if they are used by the device?
> > > > > > > Yeah, that is exactly what the Intel guys are doing for their
> > > > > > > integrated GPUs :)
> > > > > > 
> > > > > > > Problem is for TTM I need to be able to handle dGPUs and those have all
> > > > > > > kinds of funny allocation restrictions. In other words I need to
> > > > > > > guarantee that the allocated memory is coherently accessible to the GPU
> > > > > > > without using SWIOTLB.
> > > > > > > 
> > > > > > > The simple case is that the device can only do DMA32, but you also have
> > > > > > > devices which can only do 40 bits or 48 bits.
> > > > > > > 
> > > > > > > On top of that you also have AGP, CMA and stuff like CPU cache behavior
> > > > > > > changes (write back vs. write through vs. uncached).
> > > > > > OK, so the underlying problem seems to be that the gfp mask (thus
> > > > > > mapping_gfp_mask) cannot really reflect your requirements, right? Would
> > > > > > it help if shmem allowed providing an allocation callback to
> > > > > > override alloc_page_vma, which is used currently? I am pretty sure there
> > > > > > will be more to handle, but going through shmem for the whole lifetime
> > > > > > is just so much easier to reason about than some tricks to abuse shmem
> > > > > > just for the swapout path.
> > > > Well it's a start, but the pages can have special CPU cache settings. So
> > > > direct IO from/to them usually doesn't work as expected.
> > > > 
> > > > > In addition to that, for AGP and CMA I need to make sure that I give those
> > > > > pages back to the relevant subsystems instead of just dropping the page
> > > > > reference.
> > > > > 
> > > > > So I would need to block until the swap I/O has completed.
> > > > 
> > > > Anyway I probably need to revert those patches for now since this isn't
> > > > working as we hoped it would.
> > > > 
> > > > > Thanks for the explanation of how stuff works here.
> > > Another alternative here that I've tried before without being successful
> > > would perhaps be to drop shmem completely and, if it's a normal page (no dma
> > > or funny caching attributes) just use add_to_swap_cache()? If it's something
> > > > else, try to allocate a page with the relevant gfp attributes, copy and
> > > add_to_swap_cache()? Or perhaps that doesn't work well from a shrinker
> > > either?
> > So before we toss everything and go on a great rewrite-the-world tour,
> > what if we just try to split up big objects? So for objects which are
> > bigger than e.g. 10 MB:
> > 
> > - move them to a special "under eviction" list
> > - keep a note of how far we have evicted
> > - interleave allocating shmem pages, copying data and releasing the ttm
> >    backing store on a chunk basis (maybe 10 MB or whatever, tuning tbh)
> > 
> > If that's not enough, occasionally break out of the shrinker entirely so
> > other parts of reclaim can reclaim the shmem stuff. But just releasing our
> > own pages as we go should help a lot I think.
> 
> Yeah, the latter is exactly what I was currently prototyping.
> 
> I just didn't use a limit, but rather a list of only partially evicted BOs
> which is used when we fail to allocate a page.
> 
> For the 5.12 cycle I think we should just go back to a hard 50% limit for
> now and then resurrect this when we have solved the issues.

Can we do the 50% limit without tossing out all the code we've done thus
far? Just so this doesn't get too disruptive.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
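
For readers following the thread, the chunked eviction idea quoted above
(keep a note of how far an object has been evicted, and interleave
allocating, copying and releasing in bounded steps) can be sketched as a
small userspace simulation. This is only an illustration under stated
assumptions, not TTM code: struct sim_bo, sim_bo_create(),
sim_bo_evict_chunk() and sim_bo_destroy() are hypothetical names, malloc()
stands in for both the TTM backing store and the shmem pages, and no real
kernel API is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096u

/* Hypothetical stand-in for a buffer object that is under eviction. */
struct sim_bo {
	size_t num_pages;
	size_t evicted;       /* progress note: pages moved so far */
	unsigned char **src;  /* stand-in for the ttm backing store */
	unsigned char **dst;  /* stand-in for the shmem pages */
};

static void sim_bo_destroy(struct sim_bo *bo)
{
	size_t i;

	if (!bo)
		return;
	for (i = 0; i < bo->num_pages; i++) {
		if (bo->src)
			free(bo->src[i]);
		if (bo->dst)
			free(bo->dst[i]);
	}
	free(bo->src);
	free(bo->dst);
	free(bo);
}

/* Create an object whose page i is filled with the byte value (i & 0xff). */
static struct sim_bo *sim_bo_create(size_t num_pages)
{
	struct sim_bo *bo;
	size_t i;

	assert(num_pages > 0);
	bo = calloc(1, sizeof(*bo));
	if (!bo)
		return NULL;
	bo->num_pages = num_pages;
	bo->src = calloc(num_pages, sizeof(*bo->src));
	bo->dst = calloc(num_pages, sizeof(*bo->dst));
	if (!bo->src || !bo->dst)
		goto fail;
	for (i = 0; i < num_pages; i++) {
		bo->src[i] = malloc(PAGE_SZ);
		if (!bo->src[i])
			goto fail;
		memset(bo->src[i], (int)(i & 0xff), PAGE_SZ);
	}
	return bo;
fail:
	sim_bo_destroy(bo);
	return NULL;
}

/*
 * Move at most chunk_pages pages and return how many were moved.
 * Each source page is released right after its copy, so memory is
 * handed back as we go. If an allocation fails, the loop stops and
 * the progress note is kept, so a later call resumes where this one
 * left off. A return value of 0 means nothing more could be done.
 */
static size_t sim_bo_evict_chunk(struct sim_bo *bo, size_t chunk_pages)
{
	size_t moved = 0;

	while (moved < chunk_pages && bo->evicted < bo->num_pages) {
		size_t i = bo->evicted;
		unsigned char *page = malloc(PAGE_SZ);

		if (!page)
			break;
		memcpy(page, bo->src[i], PAGE_SZ);
		bo->dst[i] = page;
		free(bo->src[i]);
		bo->src[i] = NULL;
		bo->evicted++;
		moved++;
	}
	return moved;
}
```

A caller would invoke sim_bo_evict_chunk() repeatedly and could return to
its own caller between chunks, which corresponds to the "occasionally break
out of the shrinker" suggestion. The chunk size here is a free parameter;
the thread itself only mentions 10 MB as a starting point for tuning.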
