Date: Tue, 10 Mar 2020 17:18:02 -0700
From: Andrew Morton
To: David Rientjes
Cc: Vlastimil Babka, Michal Hocko, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems
Message-Id: <20200310171802.128129f6817ef3f77d230ccd@linux-foundation.org>

On Tue, 10 Mar 2020 14:39:48 -0700 (PDT) David Rientjes wrote:

> When a process is oom killed as a result of memcg limits and the victim
> is waiting to exit, nothing ends up actually yielding the processor back
> to the victim on UP systems with preemption disabled.  Instead, the
> charging process simply loops in memcg reclaim and eventually soft
> lockups.
>
>  Memory cgroup out of memory: Killed process 808 (repro) total-vm:41944kB, anon-rss:35344kB, file-rss:504kB, shmem-rss:0kB, UID:0 pgtables:108kB oom_score_adj:0
>  watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [repro:806]
>  CPU: 0 PID: 806 Comm: repro Not tainted 5.6.0-rc5+ #136
>  RIP: 0010:shrink_lruvec+0x4e9/0xa40
>  ...
>  Call Trace:
>   shrink_node+0x40d/0x7d0
>   do_try_to_free_pages+0x13f/0x470
>   try_to_free_mem_cgroup_pages+0x16d/0x230
>   try_charge+0x247/0xac0
>   mem_cgroup_try_charge+0x10a/0x220
>   mem_cgroup_try_charge_delay+0x1e/0x40
>   handle_mm_fault+0xdf2/0x15f0
>   do_user_addr_fault+0x21f/0x420
>   page_fault+0x2f/0x40
>
> Make sure that something ends up actually yielding the processor back to
> the victim to allow for memory freeing.  Most appropriate place appears to
> be shrink_node_memcgs() where the iteration of all descendant memcgs could
> be particularly lengthy.
>

That's a bit sad.

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2637,6 +2637,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
>  		unsigned long reclaimed;
>  		unsigned long scanned;
>
> +		cond_resched();
> +
>  		switch (mem_cgroup_protected(target_memcg, memcg)) {
>  		case MEMCG_PROT_MIN:
>  			/*

Obviously better, but this will still spin wheels until this task's
timeslice expires, and we might want to do something to help ensure
that the victim runs next (or soon)?

(And why is shrink_node_memcgs compiled in when CONFIG_MEMCG=n?)
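
To make the timeslice point concrete, here is a minimal userspace sketch of
the situation, not kernel code. It assumes a toy single-CPU, non-preemptive
model in which the charging task loops until the victim has run once. Every
name in it (toy_cpu, toy_cond_resched, toy_reclaim_loop, the timeslice and
watchdog_limit parameters) is hypothetical and exists only for illustration.
The yield point is modelled as doing something only once a reschedule is
pending, which in this sketch happens when the timeslice has been used up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of one CPU without preemption; not kernel code. */
struct toy_cpu {
	bool victim_exited;	/* victim has run, exited and freed its memory */
	bool need_resched;	/* a reschedule is pending for the looping task */
};

/* Stand-in for a conditional yield: acts only if a reschedule is pending. */
static void toy_cond_resched(struct toy_cpu *cpu)
{
	if (cpu->need_resched) {
		cpu->need_resched = false;
		cpu->victim_exited = true;	/* the victim gets the CPU and exits */
	}
}

/*
 * Charging task's reclaim loop.  Returns the number of loop passes made
 * before the victim has exited, or -1 if watchdog_limit passes go by
 * first (standing in for the soft lockup report).
 */
int toy_reclaim_loop(bool yield_in_loop, int timeslice, int watchdog_limit)
{
	struct toy_cpu cpu = { .victim_exited = false, .need_resched = false };
	int passes = 0;

	assert(timeslice >= 0 && watchdog_limit > 0);

	while (!cpu.victim_exited) {
		if (passes >= watchdog_limit)
			return -1;		/* "soft lockup" in this model */
		if (passes >= timeslice)
			cpu.need_resched = true; /* timeslice used up */
		if (yield_in_loop)
			toy_cond_resched(&cpu);
		passes++;
	}
	return passes;
}
```

In this model, toy_reclaim_loop(false, 5, 100) returns -1 because nothing
ever hands the CPU to the victim, while toy_reclaim_loop(true, 5, 100)
returns 6: the loop still burns its whole timeslice before the yield point
has any effect, which is the residual cost raised in the reply.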