Date: Tue, 31 Oct 2023 09:14:51 +0100
From: Michal Hocko
To: Charan Teja Kalla
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net, david@redhat.com, vbabka@suse.cz, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: page_alloc: unreserve highatomic page blocks before oom
References: <1698669590-3193-1-git-send-email-quic_charante@quicinc.com>
In-Reply-To: <1698669590-3193-1-git-send-email-quic_charante@quicinc.com>

On Mon 30-10-23 18:09:50, Charan Teja Kalla wrote:
> __alloc_pages_direct_reclaim() is called from slowpath allocation where
> high atomic reserves can be unreserved after there is a progress in
> reclaim and yet no suitable page is found. Later should_reclaim_retry()
> gets called from slow path allocation to decide if the reclaim needs to
> be retried before OOM kill path is taken.
>
> should_reclaim_retry() checks the available(reclaimable + free pages)
> memory against the min wmark levels of a zone and returns:
> a) true, if it is above the min wmark so that slow path allocation will
> do the reclaim retries.
> b) false, thus slowpath allocation takes oom kill path.
>
> should_reclaim_retry() can also unreserves the high atomic reserves
> **but only after all the reclaim retries are exhausted.**
>
> In a case where there are almost none reclaimable memory and free pages
> contains mostly the high atomic reserves but allocation context can't
> use these high atomic reserves, makes the available memory below min
> wmark levels hence false is returned from should_reclaim_retry() leading
> the allocation request to take OOM kill path. This is an early oom kill
> because high atomic reserves are holding lot of free memory and
> unreserving of them is not attempted.

OK, I see. So we do not release those reserved pages because OOM hits
too early.

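For readers following along, the early-OOM condition described above can be sketched as a small userspace model. This is not kernel code: struct zone_model, worth_retrying() and the kB units are assumptions made for illustration. The kernel works in pages and also takes lowmem_reserve and the allocation order into account, and treating "reclaimable" as a single number is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one zone; all values are in kB. */
struct zone_model {
	long free_kb;                /* free memory, reserve included */
	long reclaimable_kb;         /* memory reclaim could still free */
	long reserved_highatomic_kb; /* free memory parked in the reserve */
	long min_wmark_kb;           /* min watermark of the zone */
};

/*
 * Models the "is it worth retrying reclaim" decision: memory sitting in
 * the highatomic reserve does not count for a caller that cannot use it.
 */
static bool worth_retrying(const struct zone_model *z, bool can_use_reserve)
{
	long available;

	assert(z != NULL);

	available = z->free_kb + z->reclaimable_kb;
	if (!can_use_reserve)
		available -= z->reserved_highatomic_kb;

	/* false here means the allocation heads for the OOM killer */
	return available > z->min_wmark_kb;
}
```

Plugging in the Normal zone figures from the OOM log quoted later in this message (free 7728 kB, reserved_highatomic 8192 kB, min 804 kB, and roughly 52 kB on the anon and file LRUs, assuming all of it counts as reclaimable) gives 7728 + 52 - 8192 = -412 kB, so the model returns false. With the reserve set to 0 the same zone has 7780 kB available and the model returns true.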
> (early)OOM is encountered on a machine in the below state(excerpt from
> the oom kill logs):
> [  295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB
> high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB
> active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB
> present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB
> local_pcp:492kB free_cma:0kB
> [  295.998656] lowmem_reserve[]: 0 32
> [  295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH)
> 33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
> 0*4096kB = 7752kB

OK, this is quite interesting as well. The system is really tiny and 8MB
of reserved memory is indeed really high. How come those reservations
have grown that high?

>
> Per above log, the free memory of ~7MB exist in the high atomic
> reserves is not freed up before falling back to oom kill path.
>
> This fix includes unreserving these atomic reserves in the OOM path
> before going for a kill. The side effect of unreserving in oom kill path
> is that these free pages are checked against the high wmark. If
> unreserved from should_reclaim_retry()/__alloc_pages_direct_reclaim(),
> they are checked against the min wmark levels.

I do not like the fix much TBH. I think the logic should live in
should_reclaim_retry. One way to approach it is to unreserve at the end
of the function, something like this:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 95546f376302..d04e14adf2c5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3813,10 +3813,8 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 	 * Make sure we converge to OOM if we cannot make any progress
 	 * several times in the row.
 	 */
-	if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
-		/* Before OOM, exhaust highatomic_reserve */
-		return unreserve_highatomic_pageblock(ac, true);
-	}
+	if (*no_progress_loops > MAX_RECLAIM_RETRIES)
+		goto out;
 
 	/*
 	 * Keep reclaiming pages while there is a chance this will lead
@@ -3859,6 +3857,12 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		schedule_timeout_uninterruptible(1);
 	else
 		cond_resched();
+
+out:
+	/* Before OOM, exhaust highatomic_reserve */
+	if (!ret)
+		return unreserve_highatomic_pageblock(ac, true);
+
 	return ret;
 }

-- 
Michal Hocko
SUSE Labs
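
The control flow the diff above aims for can be tried out in userspace with a rough model. Everything here is a stand-in and an assumption: struct zone_sim, unreserve_model() and should_retry_model() are hypothetical names, the watermark check is reduced to a single comparison, and unreserve_model() only imitates the "returns true if something was released" behaviour that the diff relies on from unreserve_highatomic_pageblock(ac, true).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the zone state; not the kernel's struct zone. */
struct zone_sim {
	long usable_pages;        /* free + reclaimable, reserve excluded */
	long reserved_highatomic; /* pages parked in the highatomic reserve */
	long min_wmark;
};

/*
 * Imitates a forced unreserve: hands the whole reserve back and reports
 * whether anything was actually released.
 */
static bool unreserve_model(struct zone_sim *z)
{
	bool released = z->reserved_highatomic > 0;

	z->usable_pages += z->reserved_highatomic;
	z->reserved_highatomic = 0;
	return released;
}

static bool should_retry_model(struct zone_sim *z, int no_progress_loops)
{
	bool ret = false;

	assert(z != NULL);

	/* Retries exhausted: skip the watermark check entirely. */
	if (no_progress_loops > MAX_RECLAIM_RETRIES)
		goto out;

	ret = z->usable_pages > z->min_wmark;

out:
	/* Before OOM, exhaust the reserve; retry only if that freed memory. */
	if (!ret)
		return unreserve_model(z);

	return ret;
}
```

In this model a zone that fails the watermark check on the very first pass (no_progress_loops == 0) still gets its reserve released and one more retry, which is the case the early OOM report is about. A second failing call with an empty reserve returns false, so the path to OOM still converges.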