Date: Wed, 18 Mar 2020 10:57:10 +0100
From: Michal Hocko
To: Ami Fischman
Cc: Robert Kolchmeyer, David Rientjes, Andrew Morton, Vlastimil Babka,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, oom: make a last minute check to prevent unnecessary memcg oom kills
Message-ID: <20200318095710.GG21362@dhcp22.suse.cz>
References: <20200310221938.GF8447@dhcp22.suse.cz>

On Tue 17-03-20 12:00:45, Ami Fischman wrote:
> On Tue, Mar 17, 2020 at 11:26 AM Robert Kolchmeyer
> wrote:
> >
> > On Tue, Mar 10, 2020 at 3:54 PM David Rientjes wrote:
> > >
> > > Robert, could you elaborate on the user-visible effects of this issue that
> > > caused it to initially get reported?
> >
> > Ami (now cc'ed) knows more, but here is my understanding.
>
> Robert's description of the mechanics we observed is accurate.
>
> We discovered this regression in the oom-killer's behavior when
> attempting to upgrade our system. The fraction of the system that
> went unhealthy due to this issue was approximately equal to the
> _sum_ of all other causes of unhealth, which are many and varied,
> but each of which contribute only a small amount of
> unhealth. This issue forced a rollback to the previous kernel
> where we ~never see this behavior, returning our unhealth levels
> to the previous background levels.

Could you be more specific on the good vs. bad kernel versions? Because
I do not remember any oom changes that would affect the
time-to-check-time-to-kill race. The timing might be slightly different
in each kernel version of course.

-- 
Michal Hocko
SUSE Labs