Date: Wed, 11 Mar 2020 09:27:36 +0100
From: Michal Hocko
To: David Rientjes
Cc: Andrew Morton, Vlastimil Babka, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems
Message-ID: <20200311082736.GA23944@dhcp22.suse.cz>
References: <20200310221019.GE8447@dhcp22.suse.cz>

On Tue 10-03-20 16:02:23, David Rientjes wrote:
> On Tue, 10 Mar 2020, Michal Hocko wrote:
> 
> > > When a process is oom killed as a result of memcg limits and the victim
> > > is waiting to exit, nothing ends up actually yielding the processor back
> > > to the victim on UP systems with preemption disabled. Instead, the
> > > charging process simply loops in memcg reclaim and eventually soft
> > > lockups.
> > > 
> > > Memory cgroup out of memory: Killed process 808 (repro) total-vm:41944kB, anon-rss:35344kB, file-rss:504kB, shmem-rss:0kB, UID:0 pgtables:108kB oom_score_adj:0
> > > watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [repro:806]
> > > CPU: 0 PID: 806 Comm: repro Not tainted 5.6.0-rc5+ #136
> > > RIP: 0010:shrink_lruvec+0x4e9/0xa40
> > > ...
> > > Call Trace:
> > >  shrink_node+0x40d/0x7d0
> > >  do_try_to_free_pages+0x13f/0x470
> > >  try_to_free_mem_cgroup_pages+0x16d/0x230
> > >  try_charge+0x247/0xac0
> > >  mem_cgroup_try_charge+0x10a/0x220
> > >  mem_cgroup_try_charge_delay+0x1e/0x40
> > >  handle_mm_fault+0xdf2/0x15f0
> > >  do_user_addr_fault+0x21f/0x420
> > >  page_fault+0x2f/0x40
> > > 
> > > Make sure that something ends up actually yielding the processor back to
> > > the victim to allow for memory freeing. The most appropriate place appears
> > > to be shrink_node_memcgs(), where the iteration over all descendant memcgs
> > > could be particularly lengthy.
> > 
> > There is a cond_resched in shrink_lruvec and another one in
> > shrink_page_list. Why doesn't any of them hit? Is it because there are
> > no pages on the LRU list? Because the rss data suggests there should be
> > enough pages to go that path. Or maybe it is the shrink_slab path that
> > takes too long?
> 
> I think it can be a number of cases, most notably the
> mem_cgroup_protected() checks, which is why the cond_resched() is added
> above them. Rather than add cond_resched() only for MEMCG_PROT_MIN and for
> certain MEMCG_PROT_LOW cases, the cond_resched() is added above the switch
> clause because the iteration itself may be potentially very lengthy.

Was any of the above the case for your soft lockup? How have you
managed to trigger it? As I've said, I am not against the patch, but I
would really like to see an actual explanation of what happened rather
than speculation about what might have happened. If for nothing else
then for future reference.

If this is really about the whole hierarchy being MEMCG_PROT_MIN
protected, and that resulting in a very expensive and pointless reclaim
walk that can trigger a soft lockup, then it should be explicitly
mentioned in the changelog.
-- 
Michal Hocko
SUSE Labs
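
[Editor's sketch of the placement being discussed, not the submitted patch:
a minimal outline of the memcg walk in shrink_node_memcgs(), assuming the
v5.6-era layout of mm/vmscan.c, with the proposed cond_resched() above the
mem_cgroup_protected() switch. Details of the per-memcg reclaim work are
elided.]

	memcg = mem_cgroup_iter(target_memcg, NULL, NULL);
	do {
		/*
		 * Proposed yield point: runs on every child, including those
		 * the switch below skips with "continue", so a long walk over
		 * MEMCG_PROT_MIN-protected memcgs cannot monopolize the CPU
		 * on a non-preemptible UP kernel.
		 */
		cond_resched();

		switch (mem_cgroup_protected(target_memcg, memcg)) {
		case MEMCG_PROT_MIN:
			/* hard protection, nothing to reclaim here */
			continue;
		case MEMCG_PROT_LOW:
		case MEMCG_PROT_NONE:
			break;
		}

		/* ... per-memcg shrink_lruvec()/shrink_slab() work ... */
	} while ((memcg = mem_cgroup_iter(target_memcg, memcg, NULL)));

Placing the yield before the protection check matters because iterations
that immediately "continue" for fully protected children never reach the
cond_resched() calls further down in shrink_lruvec()/shrink_page_list().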