From: Yafang Shao
Date: Wed, 20 Nov 2019 20:23:54 +0800
Subject: Re: [PATCH] mm, memcg: show memcg min setting in oom messages
To: Michal Hocko
Cc: Johannes Weiner, Vladimir Davydov, Andrew Morton, Linux MM

On Wed, Nov 20, 2019 at 7:40 PM Michal Hocko wrote:
>
> On Wed 20-11-19 18:53:44, Yafang Shao wrote:
> > On Wed, Nov 20, 2019 at 6:22 PM Michal Hocko wrote:
> > >
> > > On Wed 20-11-19 03:53:05, Yafang Shao wrote:
> > > > A task running in a memcg may OOM because of the memory.min settings of its
> > > > sibling and parent. If this happens, the current oom messages can't show
> > > > why file page cache can't be reclaimed.
> > >
> > > min limit is not the only way to protect memory from being reclaimed. The
> > > memory might be pinned or unreclaimable for other reasons (e.g. swap
> > > quota exceeded for memcg).
> >
> > Both swap and unreclaimable (unevictable) are printed in OOM messages.
>
> Not really. Consider a memcg which has reached its swap limit. The
> anonymous memory is not really reclaimable even when there is a lot of
> swap space available.
>

The memcg swap limit is already printed in oom messages, see below:

[ 141.721625] memory: usage 1228800kB, limit 1228800kB, failcnt 18337
[ 141.721958] swap: usage 0kB, limit 9007199254740988kB, failcnt 0

> > If something else can prevent the file cache being reclaimed, we'd
> > better show them as well.
>
> How are you going to do that? How do you track pins on pages?
>

Actually I don't have a clear idea how to track pins on pages yet.

> > > Besides that, there is the very same problem
> > > with the global OOM killer, right? And I do not expect we want to print
> > > all memcgs in the system (this might be hundreds).
> > >
> >
> > I forgot the global oom...
> >
> > Why not just print the memcgs which are under memory.min protection or
> > something like a total number of min protected memory?
>
> Yes, this would likely help. But the main question really remains, is
> this really worth it?
>

If it doesn't cost too much, I think it is worth doing.
As the oom path is not a critical path, adding some print info
should not add much overhead.

> > > > So it is better to show the memcg
> > > > min settings.
> > > > Let's take an example.
> > > >       bar        bar/memory.max = 1200M memory.min=800M
> > > >      /   \
> > > >   barA   barB    barA/memory.min = 800M memory.current=1G (file page cache)
> > > >                  barB/memory.min = 0 (process in this memcg is allocating page)
> > > >
> > > > The process will do memcg reclaim if the bar/memory.max is reached. Once
> > > > the barA/memory.min is reached it will stop reclaiming file page caches in
> > > > barA, and if there are no reclaimable pages in bar and bar/barB it will
> > > > enter memcg OOM then.
> > > > After this patch, the messages below will be shown (only including the
> > > > relevant messages here). The lines beginning with '#' are newly added info (the
> > > > '#' symbol is not in the original messages).
> > > > memory: usage 1228800kB, limit 1228800kB, failcnt 18337
> > > > ...
> > > > # Memory cgroup min setting:
> > > > # /bar: min 819200KB emin 0KB
> > > > # /bar/barA: min 819200KB emin 819200KB
> > > > # /bar/barB: min 0KB emin 0KB
> > > > ...
> > > > Memory cgroup stats for /bar:
> > > > anon 418328576
> > > > file 835756032
> > > > ...
> > > > unevictable 0
> > > > ...
> > > > oom-kill:constraint=CONSTRAINT_MEMCG..oom_memcg=/bar,task_memcg=/bar/barB
> > > >
> > > > With the newly added information, we can find the memory.min in bar/barA is
> > > > reached and the processes in bar/barB can't reclaim file page cache from
> > > > bar/barA any more. While without this newly added information we don't know
> > > > why the file page cache in bar can't be reclaimed.
> > >
> > > Well, I am not sure this is really useful enough TBH. It doesn't give
> > > you the whole picture and it potentially generates a lot of output in
> > > the oom report. FYI we used to have a more precise break down of
> > > counters in memcg hierarchy, see 58cf188ed649 ("memcg, oom: provide more
> > > precise dump info while memcg oom happening") which later got rewritten
> > > by c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup OOM")
> > >
> >
> > At least we'd better print a total protected memory in the oom messages.
> >
> > > Could you be more specific why do you really need this piece of
> > > information?
> >
> > I have said in the commit log, that we don't know why the file cache
> > can't be reclaimed (when evictable is 0 and dirty is 0 as well.)
>
> And the counter argument is that this will not help you there much in
> many large and much more common cases.
>
> I argue, and I might be wrong here so feel free to correct me, that the
> reclaim protection guarantee (min) is something to be under the admin's
> control. It shouldn't really happen willy-nilly because it has really
> large consequences, the OOM including. So if there is a suspicious
> amount of memory that could be reclaimed normally then the reclaim
> protection is really the first suspect to go after.
> --

I don't know whether it happens willy-nilly or not.
But if we all know that it may cause OOMs and it doesn't take too much
effort to show it in the OOM messages, we'd better show it.
That can help the admins, and the admins don't need to suspect it any more.

Thanks
Yafang
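
To make the bar/barA/barB example from the quoted commit log concrete, here is a rough Python sketch of the arithmetic. It is only an illustration under stated assumptions, not the kernel's reclaim logic: the Memcg class and both helper functions are hypothetical names, the model simply treats a cgroup at or below its effective min (emin) as skipped by reclaim, and the per-cgroup split of the /bar totals (all file cache in barA, all anon in barB) is assumed from the example rather than taken from the report.

```python
from dataclasses import dataclass


@dataclass
class Memcg:
    """Hypothetical, simplified view of one memory cgroup."""
    path: str
    usage: int  # bytes currently charged to the cgroup
    file: int   # bytes of file page cache within usage
    emin: int   # effective memory.min in bytes


def reclaimable_file(cg: Memcg) -> int:
    """File cache that reclaim may take in this simplified model.

    Assumption: a cgroup whose usage is at or below its effective min is
    skipped entirely; otherwise at most the excess over emin is
    reclaimable, capped by the amount of file cache.
    """
    if cg.usage <= cg.emin:
        return 0
    return min(cg.file, cg.usage - cg.emin)


def total_protected(cgroups) -> int:
    """Sum of memory currently covered by min protection."""
    return sum(min(cg.usage, cg.emin) for cg in cgroups)


KB = 1024
# Values from the quoted OOM report; the split between barA and barB is assumed.
bar_a = Memcg("/bar/barA", usage=835756032, file=835756032, emin=819200 * KB)
bar_b = Memcg("/bar/barB", usage=418328576, file=0, emin=0)

for cg in (bar_a, bar_b):
    print(f"{cg.path}: reclaimable file cache {reclaimable_file(cg)} bytes")
print(f"total min-protected: {total_protected([bar_a, bar_b]) // KB} kB")
```

Under these assumptions barA sits just below its 819200 kB effective min, so none of its file cache is reclaimable, and barB has no file cache at all. That is the situation the existing report cannot explain on its own, and a single "total min-protected" figure of this kind would be enough to point at memory.min.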