Date: Thu, 16 Jul 2020 13:04:43 -0700 (PDT)
From: David Rientjes
To: Michal Hocko
Cc: Yafang Shao, Tetsuo Handa, Andrew Morton, Johannes Weiner, Linux MM
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
In-Reply-To: <20200716071240.GD31089@dhcp22.suse.cz>
References: <1594735034-19190-1-git-send-email-laoar.shao@gmail.com> <20200716060814.GA31089@dhcp22.suse.cz> <20200716071240.GD31089@dhcp22.suse.cz>

On Thu, 16 Jul 2020, Michal Hocko wrote:

> > It's not possible to present data because we've had such a check for years
> > in our fleet so I can't say that it has prevented X unnecessary oom kills
> > compared to doing the check prior to calling out_of_memory(). I'm hoping
> > that can be understood.
> >
> > Since Yafang is facing the same issue, and there is no significant
> > downside to doing the mem_cgroup_margin() check prior to
> > oom_kill_process() (or checking task_will_free_mem(current)), and it's
> > acknowledged that it *can* prevent unnecessary oom killing, which is a
> > very good thing, I'd like to understand why such resistance to it.
>
> Because exactly this kind of argument has led to quite a few "should be
> fine" heuristics which kicked back: do not kill an exiting task, sacrifice
> a child instead of the victim, just to name a few. All of them make some
> sense at a glance, but they can seriously kick back, as experience has
> taught us.
>
> Really, I do not see what is so hard to understand: each heuristic,
> especially one in an area as subtle as oom definitely is, needs data to
> justify it. "We are running this for years" is really not an argument.
> Sure, arguing that your workload leads to x amount of false positives
> and just shifting the check to later saves y amount of them sounds like
> a relevant argument to me.
>

Deferring the go/no-go decision on the oom kill to the very last moment
doesn't seem like a heuristic to me. I think it's an inherent
responsibility of the kernel to do whatever is necessary to prevent a
userspace process from being oom killed (and it is the way to solve
Yafang's issue, which we solved years ago). That can be done by closing
the window as much as possible (including within out_of_memory()) to
reduce the likelihood of unnecessary oom killing. It's intuitive and
seems rather trivial.

I would understand an argument against such an approach if it added
elaborate complexity, but this isn't doing so. If the decision were
already made in oom_kill_process(), I don't think anybody would advocate
for moving it below out_of_memory() to its current state. We aren't
losing anything here; we are only preventing unnecessary oom killing
that has caused issues for Yafang as well as us.

Any solution that does a mem_cgroup_margin() check before out_of_memory()
in the memcg path is closing that window a little bit, but I think we can
do better.
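
To make the "last moment" check concrete, here is a minimal userspace
sketch of the idea. It is not kernel code: struct toy_memcg and the
toy_*() helpers are hypothetical names invented for this illustration,
and they only model the comparison of remaining margin against the size
of the pending charge, which is what I assume a late mem_cgroup_margin()
check would boil down to.

```c
#include <stdbool.h>

/* Toy stand-in for a memcg (hypothetical, page counts only). */
struct toy_memcg {
	unsigned long limit;	/* hard limit, in pages */
	unsigned long usage;	/* pages currently charged */
};

/* Pages that can still be charged before the limit is hit. */
static unsigned long toy_margin(const struct toy_memcg *cg)
{
	return cg->limit > cg->usage ? cg->limit - cg->usage : 0;
}

/* Models a parallel oom kill or task exit uncharging pages. */
static void toy_uncharge(struct toy_memcg *cg, unsigned long nr_pages)
{
	cg->usage -= nr_pages < cg->usage ? nr_pages : cg->usage;
}

/*
 * Final go/no-go decision, meant to run right before the kill and
 * after victim selection. Returns true only if the pending charge
 * still does not fit into the remaining margin.
 */
static bool toy_should_kill(const struct toy_memcg *cg,
			    unsigned long nr_pages)
{
	return toy_margin(cg) < nr_pages;
}
```

In this model, a caller that found the group full would still call
toy_should_kill() once more immediately before killing. If a parallel
kill has uncharged enough pages in the meantime, the check returns false
and the kill is skipped. The later the check runs, the smaller the
window in which that can be missed.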