From: Michal Hocko
To: Yongqiang Liu
Cc: rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, penguin-kernel@i-love.sakura.ne.jp, "Wangkefeng (OS Kernel Lab)"
Subject: Re: [QUESTION] oom killed the key system process triggered by a bad process alloc memory with MAP_LOCKED
Date: Mon, 1 Nov 2021 09:24:18 +0100

Hi,

On Mon 01-11-21 16:05:50, Yongqiang Liu wrote:
[...]
> And we found that when the oom_reaper is done but the memory is still high:
>
> [   45.115685] Out of memory: Killed process 2553 (oom) total-vm:953404kB,
> anon-rss:947748kB, file-rss:388kB, shmem-rss:0kB, UID:0 pgtables:1896kB
> oom_score_adj:1000
> [   45.115739] oom_reaper: reaped process 2553 (oom), now anon-rss:947708kB,
> file-rss:0kB, shmem-rss:0kB
>
> This is because the bad process which received SIGKILL is unlocking the mem
> to exit which needs more time. And the next oom is triggered to kill the
> other system process.

Yes, this is a known limitation of the oom_reaper based OOM killing.
__oom_reap_task_mm has to skip over mlocked memory areas because
munlocking requires some locking (or at least that was the case when the
oom reaper was introduced) and the primary purpose of the oom_reaper is
to guarantee forward progress.

Addressing that limitation would require the munlock operation to not
depend on any locking. I am not sure how much work that would be with
the current code. Until now this was not a high priority because
processes with a high mlock limit should really be trusted with their
memory consumption, so they shouldn't really be the primary oom killer
target.

Are you seeing this problem happening with a real workload or is this
only triggered with some artificial tests? E.g. LTP oom tests are known
to trigger this situation but they do not represent any real workload.

-- 
Michal Hocko
SUSE Labs
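
For reference, the two log lines quoted above can be compared
mechanically to see how little the oom_reaper actually freed. The
snippet below is only a rough sketch: the helper names (rss_kb,
reaped_kb) are made up for illustration, and the log lines are the
quoted ones rejoined from their wrapped form. It parses the *-rss
counters and subtracts them; it does not inspect any kernel state.

```python
import re

# Log lines from the quoted report, rejoined from their wrapped form.
KILL_LINE = ("[   45.115685] Out of memory: Killed process 2553 (oom) "
             "total-vm:953404kB, anon-rss:947748kB, file-rss:388kB, "
             "shmem-rss:0kB, UID:0 pgtables:1896kB oom_score_adj:1000")
REAP_LINE = ("[   45.115739] oom_reaper: reaped process 2553 (oom), "
             "now anon-rss:947708kB, file-rss:0kB, shmem-rss:0kB")

# Matches the three RSS counters; "total-vm" is deliberately not matched.
RSS_FIELD = re.compile(r"(anon|file|shmem)-rss:(\d+)kB")


def rss_kb(line):
    """Hypothetical helper: map 'anon'/'file'/'shmem' to kB for one log line."""
    return {kind: int(kb) for kind, kb in RSS_FIELD.findall(line)}


def reaped_kb(kill_line, reap_line):
    """Hypothetical helper: per-counter kB difference between the two lines."""
    before = rss_kb(kill_line)
    after = rss_kb(reap_line)
    return {kind: before[kind] - after[kind] for kind in before}


if __name__ == "__main__":
    before = rss_kb(KILL_LINE)
    freed = reaped_kb(KILL_LINE, REAP_LINE)
    print("rss before reaping (kB):", before)
    print("freed by oom_reaper (kB):", freed)
    print("total freed (kB):", sum(freed.values()),
          "of", sum(before.values()))
```

With the numbers from the report, anon-rss drops by only 40kB out of
947748kB, which is what one would expect if nearly all of the anonymous
memory sits in mlocked areas that the reaper skips.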