Date: Wed, 15 Jul 2020 10:30:51 -0700 (PDT)
From: David Rientjes
To: Yafang Shao
cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Johannes Weiner, Linux MM
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
References: <1594735034-19190-1-git-send-email-laoar.shao@gmail.com>

On Wed, 15 Jul 2020, Yafang Shao wrote:

> > > If it is the race which causes this issue and we want to reduce the
> > > race window, I don't know whether it is proper to check the memcg
> > > margin in out_of_memory() or do it before calling do_send_sig_info().
> > > Because per my understanding, dump_header() always takes much more
> > > time than select_bad_process() especially if there're slow consoles.
> > > So the race might easily happen when doing dump_header() or dumping
> > > other information, but if we check the memcg margin after dumping this
> > > oom info, it would be strange to dump so much oom logs without killing
> > > a process.
> > >
> >
> > Absolutely correct :)  In my proposed patch, we declare dump_header() as
> > the "point of no return" since we don't want to dump oom kill information
> > to the kernel log when nothing is actually killed.
> > We could abort at the very last minute, as you mention, but I think that
> > may have an adverse impact on anything that cares about that log message.
>
> How about storing the memcg information in oom_control when the memcg
> oom is triggered, and then show this information in dump_header() ?
> IOW, the OOM info really shows the memcg status when oom occurs,
> rather than the memcg status when this info is printed.
>

We actually do that too in our kernel, but for slightly different reasons :)
It's pretty interesting how many of our previous concerns with memcg oom
killing have been echoed by you in this thread. But yes, we store vital
information about the memcg at the time of the first oom event when the
oom killer is disabled (to allow userspace to determine what the best
course of action is).

But regardless of whether we present previous data to the user in the
kernel log or not, we've determined that oom killing a process is a
serious matter and we go to any lengths possible to avoid having to do it.
For us, that means waiting until the "point of no return" to either go
ahead with oom killing a process or abort and retry the charge.

I don't think moving the mem_cgroup_margin() check to out_of_memory(),
right before printing the oom info and killing the process, is a very
invasive patch. Any strong preference against doing it that way? I think
moving the check as late as possible, to save a process from being killed
when racing with an exiter or killed process (including perhaps current),
has a pretty clear motivation.
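
The ordering argued for above can be sketched in user-space C. This is an illustration only, not the proposed patch and not kernel code: struct memcg_model, struct oom_model, model_margin() and model_out_of_memory() are hypothetical stand-ins for the memcg, oom_control, mem_cgroup_margin() and out_of_memory(), and the freed_meanwhile parameter is an assumption used to simulate an exiting or killed task uncharging pages while a victim is being selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a memcg: just a limit and a usage, in pages. */
struct memcg_model {
	unsigned long limit;
	unsigned long usage;
};

/* Hypothetical model of the oom context for one failed charge. */
struct oom_model {
	struct memcg_model *memcg;
	unsigned long nr_pages;	/* size of the charge that failed */
	bool header_dumped;	/* did we print the oom report? */
	bool victim_killed;	/* did we kill anything? */
};

/* Models mem_cgroup_margin(): how many pages can still be charged. */
static unsigned long model_margin(const struct memcg_model *memcg)
{
	return memcg->limit > memcg->usage ? memcg->limit - memcg->usage : 0;
}

/*
 * Returns true if a victim was killed, false if the kill was aborted
 * and the caller should retry the charge instead.
 */
static bool model_out_of_memory(struct oom_model *oc,
				unsigned long freed_meanwhile)
{
	assert(oc != NULL && oc->memcg != NULL);

	/* Victim selection would happen here (select_bad_process()). */

	/* Simulated race: another task exits and uncharges its pages. */
	if (freed_meanwhile > oc->memcg->usage)
		oc->memcg->usage = 0;
	else
		oc->memcg->usage -= freed_meanwhile;

	/* Last check before the point of no return. */
	if (model_margin(oc->memcg) >= oc->nr_pages)
		return false;	/* nothing logged, nothing killed */

	/* Point of no return: report first, then kill. */
	oc->header_dumped = true;	/* stands in for dump_header() */
	printf("oom-kill: margin=%lu needed=%lu\n",
	       model_margin(oc->memcg), oc->nr_pages);
	oc->victim_killed = true;
	return true;
}
```

Calling model_out_of_memory() with freed_meanwhile set to 0 on a full memcg logs and kills; calling it with enough freed pages to cover nr_pages returns false with nothing logged, which is the property being discussed: the log and the kill always happen together or not at all.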