Date: Thu, 16 Jul 2020 00:04:18 -0700 (PDT)
From: David Rientjes
To: Yafang Shao
cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Johannes Weiner, Linux MM
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
References: <1594735034-19190-1-git-send-email-laoar.shao@gmail.com>

On Thu, 16 Jul 2020, Yafang Shao wrote:

> > But yes, we store vital
> > information about the memcg at the time of the first oom event when the
> > oom killer is disabled (to allow userspace to determine what the best
> > course of action is).
> >
>
> It would be better if you could upstream the features in your kernel,
> and I think it could also help the other users.
>

Everything we've discussed so far has been proposed in the past, actually.
I think we stress the oom killer and use it at a scale that others do not,
so only a subset of users would find it interesting.  You are very likely
one of that subset of users.

We should certainly talk about other issues that we have run into that
make the upstream oom killer unusable.  Are there other areas that you're
currently focused on or having trouble with?  I'd be happy to have a
discussion on how we have resolved a lot of its issues.

> I understand what you mean "point of no return", but that seems a
> workaround rather than a fix.
> If you don't want to kill unnecessary processes, then checking the
> memcg margin before sending sigkill is better, because as I said
> before the race will be most likely happening in dump_header().
> If you don't want to show strange OOM information like "your process
> was oom killed and it shows usage is 60MB in a memcg limited
> to 100MB", it is better to get the snapshot of the OOM when it is
> triggered and then show it later, and I think it could also apply to
> the global OOM.
>

It's likely a misunderstanding: I wasn't necessarily concerned about
showing 60MB in a memcg limited to 100MB, that part we can deal with, the
concern was after dumping all of that great information that instead of
getting a "Killed process..." we get an "Oh, there's memory now, just
kidding about everything we just dumped" ;)  We could likely enlighten
userspace about that so that we don't consider that to be an actual oom
kill.

But I also very much agree that after dump_header() would be appropriate
as well since the goal is to prevent unnecessary oom killing.  Would you
mind sending a patch to check mem_cgroup_margin() on is_memcg_oom() prior
to sending the SIGKILL to the victim and printing the "Killed process..."
line?  We'd need a line that says "xKB of memory now available --
suppressing oom kill" or something along those lines so userspace
understands what happened.  But the memory info that it emits both for the
state of the memcg and system RAM may also be helpful to understand why we
got to the oom kill in the first place, which is also a good thing.

I'd happily ack that patch since it would be a comprehensive solution that
avoids oom kill of user processes at all costs, which is a goal I think we
can all rally behind.
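
To make the ordering being asked for concrete, here is a rough userspace
sketch in Python.  It is a toy model, not kernel code and not the patch
itself: Memcg, oom_kill_process(), the during_dump hook and the 4kB page
size are all hypothetical names and values chosen for illustration, and
the threshold used (enough margin to cover the 2^order pages of the
charge that triggered the oom) is an assumption rather than something
settled in this thread.  It only shows the idea of dumping the state
first, re-checking the margin, and then either suppressing the kill with
an explanatory line or printing "Killed process...".

```python
from dataclasses import dataclass

PAGE_KB = 4  # assumed page size in kB, for this model only


@dataclass
class Memcg:
    """Toy stand-in for a memory cgroup: limit and usage, both in pages."""
    limit_pages: int
    usage_pages: int

    def margin(self) -> int:
        """Pages that could still be charged (loosely modelled on mem_cgroup_margin())."""
        return max(self.limit_pages - self.usage_pages, 0)


def oom_kill_process(memcg, victim, order, log, during_dump=None):
    """Return True if the victim was killed, False if the kill was suppressed."""
    # Stand-in for dump_header(): it records the memcg state at oom time.
    log.append(
        f"memory: usage {memcg.usage_pages * PAGE_KB}kB, "
        f"limit {memcg.limit_pages * PAGE_KB}kB"
    )
    # The dump is slow, so memory may be uncharged in parallel while it runs.
    if during_dump is not None:
        during_dump(memcg)

    # Re-check the margin just before the point of no return.
    margin = memcg.margin()
    if margin >= (1 << order):
        log.append(
            f"{margin * PAGE_KB}kB of memory now available -- suppressing oom kill"
        )
        return False

    # Stand-in for sending SIGKILL and printing the final line.
    log.append(f"Killed process {victim}")
    return True


def uncharge_40mb(memcg):
    """Hypothetical parallel event: another task exits and frees 10240 pages."""
    memcg.usage_pages -= 10240


# Demo: a memcg at its 100MB limit where memory is freed during the dump.
demo_log = []
demo_cg = Memcg(limit_pages=25600, usage_pages=25600)
oom_kill_process(demo_cg, "1234 (worker)", 0, demo_log, during_dump=uncharge_40mb)
for line in demo_log:
    print(line)
```

In this model the dump output is kept even when the kill is suppressed,
which matches the point above that the memory info is still useful for
understanding how the memcg reached oom in the first place.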