Date: Wed, 24 Nov 2021 10:35:50 +0000
From: Mel Gorman
To: Alexey Avramov
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, mhocko@suse.com,
	vbabka@suse.cz, neilb@suse.de, akpm@linux-foundation.org,
	corbet@lwn.net, riel@surriel.com, hannes@cmpxchg.org,
	david@fromorbit.com, willy@infradead.org, hdanton@sina.com,
	penguin-kernel@i-love.sakura.ne.jp, oleksandr@natalenko.name,
	kernel@xanmod.org, michael@michaellarabel.com, aros@gmx.com,
	hakavlad@gmail.com
Subject: Re: mm: 5.16 regression: reclaim_throttle leads to stall in near-OOM conditions
Message-ID: <20211124103550.GE3366@techsingularity.net>
References: <20211124011954.7cab9bb4@mail.inbox.lv>
In-Reply-To: <20211124011954.7cab9bb4@mail.inbox.lv>

On Wed, Nov 24, 2021 at 01:19:54AM +0900, Alexey Avramov wrote:
> I found stalls in near-OOM conditions with Linux 5.16. This is not the
> hang-up that was reported by Artem S. Tashkinov in 2019 [1]. It's a *new*
> regression. I will demonstrate this with one simple experiment, which I
> will reproduce with different kernels or settings.
>
> With older versions of the kernel, running the `tail /dev/zero` command
> usually quickly leads to OOM condition.
>
> I will run the command `for i in {1..3}; do tail /dev/zero; done` and log
> PSI metrics (using psi2log script from nohang v0.2.0 [2]) and some values
> from `/proc/meminfo` (using mem2log v0.1.0 [3]) while this command is
> running. During the experiment a single tab browser will be kept opened in
> which some video will be playing.
>

Ok, I can reproduce this.
However, it does eventually get OOM-killed, so the system makes progress.
Maybe the throttling should be for very short intervals when reclaim is
failing to make progress and there have been multiple reclaim failures
recently. Disabling the throttling entirely just results in cases where
100% CPU is used spinning through the LRU lists.

Thanks for the report.

-- 
Mel Gorman
SUSE Labs
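
The idea floated above (throttle only briefly when reclaim is failing to make progress and has failed several times recently) can be modelled in a few lines. This is a rough userspace sketch, not kernel code and not a proposed patch: `throttle_timeout_ms`, `BASE_TIMEOUT_MS`, `MIN_TIMEOUT_MS` and `FAILURE_THRESHOLD` are hypothetical names and values picked purely for illustration.

```python
# Illustrative model only; every name and constant here is hypothetical
# and does not correspond to anything in the kernel sources.
BASE_TIMEOUT_MS = 100    # assumed normal throttle interval
MIN_TIMEOUT_MS = 1       # assumed floor; never zero, so callers do not spin
FAILURE_THRESHOLD = 3    # assumed count that qualifies as "multiple failures"


def throttle_timeout_ms(made_progress: bool, recent_failures: int) -> int:
    """Pick how long a reclaimer sleeps before retrying."""
    # Normal case: reclaim is working, or has not failed often enough yet.
    if made_progress or recent_failures < FAILURE_THRESHOLD:
        return BASE_TIMEOUT_MS
    # No progress and repeated failures: halve the interval for each
    # failure at or beyond the threshold, but keep a non-zero floor.
    shift = recent_failures - FAILURE_THRESHOLD + 1
    return max(MIN_TIMEOUT_MS, BASE_TIMEOUT_MS >> shift)
```

The non-zero floor matters in this model: dropping the sleep to zero would be equivalent to disabling throttling, which is the case where 100% CPU goes to spinning through the LRU lists.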