Date: Tue, 8 Jun 2021 15:58:17 +0200
From: Michal Hocko
To: Aaron Tomlin
Cc: Waiman Long, Shakeel Butt, Linux MM, Andrew Morton, Vlastimil Babka, LKML
Subject: Re: [RFC PATCH] mm/oom_kill: allow oom kill allocating task for non-global case
References: <6d23ce58-4c4b-116a-6d74-c2cf4947492b@redhat.com> <353d012f-e8d4-c54c-b33e-54737e1a0115@redhat.com> <20210608100022.pzuwa6aiiffnoikx@ava.usersys.com>
In-Reply-To: <20210608100022.pzuwa6aiiffnoikx@ava.usersys.com>

On Tue 08-06-21 11:00:22, Aaron Tomlin wrote:
> On Tue 2021-06-08 08:22 +0200, Michal Hocko wrote:
> > OK. A full report (including the backtrace) would tell us more what is
> > the source of the charge. I thought that most #PF charging paths use the
> > same gfp mask as the allocation (which would include other flags on top
> > of GFP_KERNEL) but it seems we just use GFP_KERNEL at many places.
>
> The following is what I can provide for now:
>

Let me add what we have from previous email

> [ 8221.433608] memory: usage 21280kB, limit 204800kB, failcnt 49116
>   :
> [ 8227.239769] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
> [ 8227.242495] [1611298]     0 1611298    35869      635 167936        0         -1000 conmon
> [ 8227.242518] [1702509]     0 1702509    35869      701 176128        0         -1000 conmon
> [ 8227.242522] [1703345] 1001050000 1703294   183440        0 2125824        0           999 node
> [ 8227.242706] Out of memory and no killable processes...

I do not see this message to be ever printed on 4.18 for memcg oom:
	/* Found nothing?!?! Either we hang forever, or we panic. */
	if (!oc->chosen && !is_sysrq_oom(oc) && !is_memcg_oom(oc)) {
		dump_header(oc, NULL);
		panic("Out of memory and no killable processes...\n");
	}

So how come it got triggered here? Is it possible that there is a global
oom killer somehow going on along with the memcg OOM? Because the below
stack clearly points to a memcg OOM and a new one AFAICS.

That being said, a full chain of oom events would be definitely useful
to get a better idea.

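For illustration, the condition quoted above can be modelled in userspace.
This is only a sketch under stated assumptions: struct oom_model,
would_panic() and demo() are hypothetical names, not kernel code, and the
three flags merely stand in for oc->chosen, is_sysrq_oom(oc) and
is_memcg_oom(oc).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct oom_control. */
struct oom_model {
	bool chosen;    /* a victim task was selected */
	bool sysrq_oom; /* OOM kill was forced via sysrq */
	bool memcg_oom; /* OOM is scoped to a memory cgroup */
};

/* Mirrors the quoted condition: true only when the panic branch is taken. */
bool would_panic(const struct oom_model *oc)
{
	return !oc->chosen && !oc->sysrq_oom && !oc->memcg_oom;
}

/* Checks the two cases discussed in this thread. */
void demo(void)
{
	struct oom_model memcg = { .chosen = false, .sysrq_oom = false, .memcg_oom = true };
	struct oom_model global = { .chosen = false, .sysrq_oom = false, .memcg_oom = false };

	assert(!would_panic(&memcg)); /* memcg OOM never reaches the panic */
	assert(would_panic(&global)); /* global OOM with no victim does */
}
```

Under this model a memcg OOM with no victim can never take the panic
branch, which is why the message in the report suggests that a global OOM
may have been involved as well.
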
> [ 8227.242731] node invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=999
> [ 8227.242732] node cpuset=XXXX mems_allowed=0-1
> [ 8227.242736] CPU: 12 PID: 1703347 Comm: node Kdump: loaded Not tainted 4.18.0-193.51.1.el8_2.x86_64 #1
> [ 8227.242737] Hardware name: XXXX
> [ 8227.242738] Call Trace:
> [ 8227.242746] dump_stack+0x5c/0x80
> [ 8227.242751] dump_header+0x6e/0x27a
> [ 8227.242753] out_of_memory.cold.31+0x39/0x8d
> [ 8227.242756] mem_cgroup_out_of_memory+0x49/0x80
> [ 8227.242758] try_charge+0x58c/0x780
> [ 8227.242761] ? __alloc_pages_nodemask+0xef/0x280
> [ 8227.242763] mem_cgroup_try_charge+0x8b/0x1a0
> [ 8227.242764] mem_cgroup_try_charge_delay+0x1c/0x40
> [ 8227.242767] do_anonymous_page+0xb5/0x360
> [ 8227.242770] ? __switch_to_asm+0x35/0x70
> [ 8227.242772] __handle_mm_fault+0x662/0x6a0
> [ 8227.242774] handle_mm_fault+0xda/0x200
> [ 8227.242778] __do_page_fault+0x22d/0x4e0
> [ 8227.242780] do_page_fault+0x32/0x110
> [ 8227.242782] ? page_fault+0x8/0x30
> [ 8227.242783] page_fault+0x1e/0x30
>
> -- 
> Aaron Tomlin

-- 
Michal Hocko
SUSE Labs