Date: Thu, 24 Oct 2019 10:24:40 +0200
From: Michal Hocko
To: Johannes Weiner
Cc: Andrew Morton, linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 2/2] mm: memcontrol: try harder to set a new memory.high
Message-ID: <20191024082440.GT17610@dhcp22.suse.cz>
References: <20191022201518.341216-1-hannes@cmpxchg.org>
 <20191022201518.341216-2-hannes@cmpxchg.org>
 <20191023065949.GD754@dhcp22.suse.cz>
 <20191023175724.GD366316@cmpxchg.org>
In-Reply-To: <20191023175724.GD366316@cmpxchg.org>

On Wed 23-10-19 13:57:24, Johannes Weiner wrote:
> On Wed, Oct 23, 2019 at 08:59:49AM +0200, Michal Hocko wrote:
> > On Tue 22-10-19 16:15:18, Johannes Weiner wrote:
> > > Setting a memory.high limit below the usage makes almost no effort to
> > > shrink the cgroup to the new target size.
> > >
> > > While memory.high is a "soft" limit that isn't supposed to cause OOM
> > > situations, we should still try harder to meet a user request through
> > > persistent reclaim.
> > >
> > > For example, after setting a 10M memory.high on an 800M cgroup full of
> > > file cache, the usage shrinks to about 350M:
> > >
> > >   + cat /cgroup/workingset/memory.current
> > >   841568256
> > >   + echo 10M
> > >   + cat /cgroup/workingset/memory.current
> > >   355729408
> > >
> > > This isn't exactly what the user would expect to happen. Setting the
> > > value a few more times eventually whittles the usage down to what we
> > > are asking for:
> > >
> > >   + echo 10M
> > >   + cat /cgroup/workingset/memory.current
> > >   104181760
> > >   + echo 10M
> > >   + cat /cgroup/workingset/memory.current
> > >   31801344
> > >   + echo 10M
> > >   + cat /cgroup/workingset/memory.current
> > >   10440704
> > >
> > > To improve this, add reclaim retry loops to the memory.high write()
> > > callback, similar to what we do for memory.max, to make a reasonable
> > > effort that the usage meets the requested size after the call returns.
> >
> > That suggests that the reclaim couldn't meet the given reclaim target
> > but later attempts just made it through. Is this due to the amount of
> > dirty pages, or what prevented the reclaim from doing its job?
> >
> > While I am not against the reclaim retry loop, I would like to understand
> > the underlying issue. Because if this is really about dirty memory, then
> > we should probably be more pro-active in flushing it. Otherwise the
> > retry might not be of any help.
>
> All the pages in my test case are clean cache. But they are active,
> and they need to go through the inactive list before reclaiming. The
> inactive list size is designed to pre-age just enough pages for
> regular reclaim targets, i.e. pages in the SWAP_CLUSTER_MAX ballpark.
>
> In this case, the reclaim goal for a single invocation is 790M and the
> inactive list is a small funnel to put all that through, and we need
> several iterations to accomplish that.

Thanks for the clarification.

> But 790M is not a reasonable reclaim target to ask of a single reclaim
> invocation. And it wouldn't be reasonable to optimize the reclaim code
> for it. So asking for the full size but retrying is not a bad choice
> here: we express our intent, and benefit if reclaim becomes better at
> handling larger requests, but we also acknowledge that some of the
> deltas we can encounter in memory_high_write() are just too
> ridiculously big for a single reclaim invocation to manage.

Yes, that makes sense and I think it should be a part of the changelog.

Acked-by: Michal Hocko

Thanks!
-- 
Michal Hocko
SUSE Labs
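
For context, here is a minimal sketch of what such a reclaim retry loop in the
memory.high write handler could look like, loosely modeled on the memory.max
handling mentioned in the changelog. This is an illustration under assumptions,
not the actual patch: the retry bound, the specific parsing and reclaim helpers,
and the give-up conditions are stand-ins for whatever the real implementation
settles on.

/*
 * Sketch for mm/memcontrol.c -- illustrative only, not the actual patch.
 * After updating memcg->high, keep calling direct reclaim until the usage
 * fits under the new limit, the writer is signalled, or a bounded number
 * of fruitless attempts is exhausted (mirroring the memory.max approach).
 */
static ssize_t memory_high_write(struct kernfs_open_file *of,
				 char *buf, size_t nbytes, loff_t off)
{
	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
	unsigned int nr_retries = MEM_CGROUP_RECLAIM_RETRIES; /* assumed bound */
	unsigned long high;
	int err;

	buf = strstrip(buf);
	err = page_counter_memparse(buf, "max", &high);
	if (err)
		return err;

	memcg->high = high;

	for (;;) {
		unsigned long nr_pages = page_counter_read(&memcg->memory);

		/* Done once the usage has been pushed below the new limit. */
		if (nr_pages <= high)
			break;

		/* Don't trap the writer indefinitely; bail on pending signals. */
		if (signal_pending(current))
			break;

		/*
		 * Ask for the full delta each time; a single invocation only
		 * reclaims a slice of it, so iterate. Give up after a few
		 * rounds that reclaim nothing at all.
		 */
		if (!try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
						  GFP_KERNEL, true)) {
			if (!nr_retries--)
				break;
		}
	}

	memcg_wb_domain_size_changed(memcg);
	return nbytes;
}

The design point from the discussion above shows up in the loop: the requested
delta (nr_pages - high) can be far larger than what one reclaim invocation can
push through the inactive-list funnel, so it is the retry loop, not a single
oversized request, that lets the usage converge on the written value.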