Subject: Re: [RFC] Per file OOM badness
From: Michel Dänzer
Date: Fri, 19 Jan 2018 11:02:42 +0100
Message-ID: <8939a03e-8204-940b-dd69-be28f75a2492@daenzer.net>
In-Reply-To: <11153f4f-8b9a-5780-6087-bc1e85459584@gmail.com>
To: christian.koenig@amd.com, Michal Hocko, Eric Anholt
Cc: amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-mm@kvack.org

On 2018-01-19 10:58 AM, Christian König wrote:
> On 19.01.2018 at 10:32, Michel Dänzer wrote:
>> On 2018-01-19 09:39 AM, Christian König wrote:
>>> On 19.01.2018 at 09:20, Michal Hocko wrote:
>>>> On Thu 18-01-18 12:01:32, Eric Anholt wrote:
>>>>> Michal Hocko writes:
>>>>>
>>>>>> On Thu 18-01-18 18:00:06, Michal Hocko wrote:
>>>>>>> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
>>>>>>>> Hi, this series is a revised version of an RFC sent by Christian König
>>>>>>>> a few years ago. The original RFC can be found at
>>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
>>>>>>>>
>>>>>>>> This is the same idea; I've just addressed his concern from the original RFC
>>>>>>>> and switched to a callback in file_ops instead of a new member in struct file.
>>>>>>> Please add the full description to the cover letter and do not make
>>>>>>> people hunt for links.
>>>>>>>
>>>>>>> Here is the original cover letter text:
>>>>>>> : I'm currently working on the issue that when device drivers allocate
>>>>>>> : memory on behalf of an application, the OOM killer usually doesn't know
>>>>>>> : about that unless the application also gets this memory mapped into its
>>>>>>> : address space.
>>>>>>> :
>>>>>>> : This is especially annoying for graphics drivers, where a lot of the VRAM
>>>>>>> : usually isn't CPU accessible and so doesn't make sense to map into the
>>>>>>> : address space of the process using it.
>>>>>>> :
>>>>>>> : The problem now is that when an application starts to use a lot of VRAM,
>>>>>>> : those buffer objects sooner or later get swapped out to system memory, but
>>>>>>> : when we now run into an out-of-memory situation the OOM killer obviously
>>>>>>> : doesn't know anything about that memory and so usually kills the wrong
>>>>>>> : process.
>>>>>> OK, but how do you attribute that memory to a particular OOM-killable
>>>>>> entity? And how do you actually enforce that those resources get freed
>>>>>> on the OOM killer action?
>>>>>>
>>>>>>> : The following set of patches tries to address this problem by introducing
>>>>>>> : a per-file OOM badness score, which device drivers can use to give the
>>>>>>> : OOM killer a hint how many resources are bound to a file descriptor so
>>>>>>> : that it can make better decisions about which process to kill.
>>>>>> But files are not killable, they can be shared... In other words this
>>>>>> doesn't help the OOM killer to make an educated guess at all.
>>>>> Maybe some more context would help the discussion?
>>>>>
>>>>> The struct file in patch 3 is the DRM fd. That's effectively "my
>>>>> process's interface to talking to the GPU", not "a single GPU resource".
>>>>> Once that file is closed, all of the process's private, idle GPU buffers
>>>>> will be immediately freed (this will be most of their allocations), and
>>>>> some will be freed once the GPU completes some work (this will be most
>>>>> of the rest of their allocations).
>>>>>
>>>>> Some GEM BOs won't be freed just by closing the fd, if they've been
>>>>> shared between processes. Those are usually about 8-24MB total in a
>>>>> process, rather than the GBs that modern apps use (or that our testcases
>>>>> like to allocate and thus trigger OOM killing of the test harness instead
>>>>> of the offending testcase...)
>>>>>
>>>>> Even if we just had the private+idle buffers being accounted in OOM
>>>>> badness, that would be a huge step forward in system reliability.
>>>> OK, in that case I would propose a different approach. We already have
>>>> rss_stat. So why don't we simply add a new counter there, MM_KERNELPAGES,
>>>> and consider it in oom_badness? The rule would be that such memory is
>>>> bound to the process lifetime. I guess we will find more users for this
>>>> later.
>>> I already tried that, and the problem with that approach is that some
>>> buffers are not created by the application which actually uses them.
>>>
>>> For example, X/Wayland is creating and handing out render buffers to
>>> applications which want to use OpenGL.
>>>
>>> So the result is that when you always account the application which
>>> created the buffer, the OOM killer will certainly reap X/Wayland first.
>>> And that is exactly what we want to avoid here.
>> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
>> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
>> then shared with the X / Wayland server.
>
> Good point, when I initially looked at that problem DRI3 wasn't widely
> used yet.
>
>> Also, in all cases, the amount of memory allocated for buffers shared
>> between DRI/Wayland clients and the server should be relatively small
>> compared to the amount of memory allocated for buffers used only locally
>> in the client, particularly for clients which create significant memory
>> pressure.
>
> That is unfortunately only partially true. When you have a single
> runaway application which tries to allocate everything, it would indeed
> work as you described.
>
> But when I tested this a few years ago with an X-based desktop, the
> applications which actually used most of the memory were Firefox and
> Thunderbird. Unfortunately they never got accounted for it.
>
> Now, on my current Wayland-based desktop it actually doesn't look much
> better. Looking at radeon_gem_info/amdgpu_gem_info, the majority of all
> memory was allocated either by gnome-shell or Xwayland.

My guess would be this is due to pixmaps, which allow X clients to cause
the X server to allocate essentially unlimited amounts of memory. It's a
separate issue, which would require a different solution than what we're
discussing in this thread. Maybe something that would allow the X server
to tell the kernel that some of the memory it allocates is for the client
process.
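
For anyone reading along without the patches at hand, here is roughly the
shape of the idea being discussed in this series. This is only a sketch
with made-up names (sketch_*, oom_file_badness), not code taken from the
actual patches:

/*
 * Sketch only: file_operations grows a callback that reports how many
 * pages of driver memory are bound to a file descriptor, so the OOM
 * killer can add them to the badness of the tasks holding that fd.
 */
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/atomic.h>

/* Imagined new member in struct file_operations:
 *
 *	long (*oom_file_badness)(struct file *file);
 */

/* Made-up per-open-file state of a GPU driver. */
struct sketch_drm_file {
	atomic64_t bo_bytes;	/* bytes of buffer objects owned by this fd */
};

/* The driver's callback would simply report its allocations, in pages: */
static long sketch_drm_oom_file_badness(struct file *file)
{
	struct sketch_drm_file *priv = file->private_data;

	return atomic64_read(&priv->bo_bytes) >> PAGE_SHIFT;
}

/*
 * oom_badness() would then walk a task's open files and add whatever the
 * callbacks report on top of the usual RSS/swap accounting.
 */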
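
Michal's counter-proposal quoted above would look roughly like this.
Again just a sketch under assumed naming: MM_KERNELPAGES does not exist
today, a real patch would add it next to the existing rss_stat counters
(MM_FILEPAGES, MM_ANONPAGES, MM_SWAPENTS, MM_SHMEMPAGES):

/*
 * Sketch only: a new per-mm rss_stat counter which drivers charge for
 * memory allocated on behalf of a process, folded into the OOM badness
 * score so that killing the process actually frees that memory.
 */
#include <linux/mm.h>
#include <linux/mm_types.h>

/* A driver charges its allocations against the mm that asked for them
 * (MM_KERNELPAGES is the proposed new counter):
 */
static void sketch_driver_charge(struct mm_struct *mm, long npages)
{
	add_mm_counter(mm, MM_KERNELPAGES, npages);
}

/* ...and oom_badness() would fold the counter into its score, roughly: */
static unsigned long sketch_badness_pages(struct mm_struct *mm)
{
	return get_mm_rss(mm) +
	       get_mm_counter(mm, MM_SWAPENTS) +
	       get_mm_counter(mm, MM_KERNELPAGES);
}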
-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer