From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf1-f200.google.com (mail-pf1-f200.google.com [209.85.210.200]) by kanga.kvack.org (Postfix) with ESMTP id A94776B6ECD for ; Tue, 4 Sep 2018 14:06:36 -0400 (EDT) Received: by mail-pf1-f200.google.com with SMTP id i68-v6so2385156pfb.9 for ; Tue, 04 Sep 2018 11:06:36 -0700 (PDT) Received: from mx1.suse.de (mx2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id s16-v6si20015139plp.317.2018.09.04.11.06.35 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Sep 2018 11:06:35 -0700 (PDT) Date: Tue, 4 Sep 2018 20:06:31 +0200 From: Michal Hocko Subject: Re: [PATCH] mm: slowly shrink slabs with a relatively small number of objects Message-ID: <20180904180631.GR14951@dhcp22.suse.cz> References: <20180831203450.2536-1-guro@fb.com> <3b05579f964cca1d44551913f1a9ee79d96f198e.camel@surriel.com> <20180831213138.GA9159@tower.DHCP.thefacebook.com> <20180903182956.GE15074@dhcp22.suse.cz> <20180903202803.GA6227@castle.DHCP.thefacebook.com> <20180904070005.GG14951@dhcp22.suse.cz> <20180904153445.GA22328@tower.DHCP.thefacebook.com> <20180904161431.GP14951@dhcp22.suse.cz> <20180904175243.GA4889@tower.DHCP.thefacebook.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180904175243.GA4889@tower.DHCP.thefacebook.com> Sender: owner-linux-mm@kvack.org List-ID: To: Roman Gushchin Cc: Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, Josef Bacik , Johannes Weiner , Andrew Morton On Tue 04-09-18 10:52:46, Roman Gushchin wrote: > On Tue, Sep 04, 2018 at 06:14:31PM +0200, Michal Hocko wrote: [...] > > I am not opposing your patch but I am trying to figure out whether that > > is the best approach. > > I don't think the current logic does make sense. Why should cgroups > with less than 4k kernel objects be excluded from being scanned? How is it any different from the the LRU reclaim? Maybe it is just not that visible because there usually more pages there. But in principle it is the same issue AFAICS. > Reparenting of all pages is definitely an option to consider, > but it's not free in any case, so if there is no problem, > why should we? Let's keep it as a last measure. In my case, > the proposed patch works perfectly: the number of dying cgroups > jumps around 100, where it grew steadily to 2k and more before. Let me emphasise that I am not opposing the patch. I just think that we have made some decisions which are not ideal but I would really like to prevent from building workarounds on top. If we have to reconsider some of those decisions then let's do it. Maybe the priority scaling is just too coarse and what seem to work work for normal LRUs doesn't work for shrinkers. > I believe that reparenting of LRU lists is required to minimize > the number of LRU lists to scan, but I'm not sure. Well, we do have more lists to scan for normal LRUs. It is true that shrinkers add multiplining factor to that but in principle I guess we really want to distinguish dead memcgs because we do want to reclaim those much more than the rest. Those objects are basically of no use just eating resources. The pagecache has some chance to be reused at least but I fail to see why we should keep kernel objects around. Sure, some of them might be harder to reclaim due to different life time and internal object management but this doesn't change the fact that we should try hard to reclaim those. So my gut feeling tells me that we should have a way to distinguish them. Btw. I do not see Vladimir on the CC list. Added (the thread starts here http://lkml.kernel.org/r/20180831203450.2536-1-guro@fb.com) -- Michal Hocko SUSE Labs