Date: Tue, 23 Nov 2021 15:27:13 +0100
From: Michal Hocko
To: NeilBrown
Cc: Matthew Wilcox, Andrew Morton, Thierry Reding, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] MM: discard __GFP_ATOMIC
In-Reply-To: <163764092051.7248.17895085691664185172@noble.neil.brown.name>

On Tue 23-11-21 15:15:20, Neil Brown wrote:
> On Tue, 23 Nov 2021, Michal Hocko wrote:
[...]
> > Both __GFP_DIRECT_RECLAIM and __GFP_KSWAPD_RECLAIM are way too low-level
> > but historically we've had requests to inhibit kswapd for particular
> > requests because that has led to problems - fun reading caf491916b1c1.
>
> Unfortunately that commit doesn't provide any reasoning, just an
> assertion.
> The best reasoning I could find was in caf491916b1c1 which was the initial
> revert.
> There the primary reasoning was "there is a bug that we don't
> have time for a proper fix before the next release, so let's just use
> this quick fix".
> ... and maybe "the quick fix" was "the right fix", but I cannot tell from
> the commit logs :-(

Yeah, that was not entirely fair of me but I just found it a nice
example of how fun our process around gfp has been historically. A
fairer pointer would be 32dba98e085f ("thp: _GFP_NO_KSWAPD"), which
introduced the flag for THP use. It was mostly a workaround for existing
reclaim problems: THPs had been enabled by default for everybody and
that had backfired. Rik tried to remove the flag in c654345924f7 ("mm:
remove __GFP_NO_KSWAPD") because most problems had been fixed - he
believed. But that turned out not to be the case, see 82b212f40059
("Revert "mm: remove __GFP_NO_KSWAPD""), as swap storms triggered by THP
peak loads were still observed. THP still seems to be the biggest user
of the flag (read: the only one to care to not have the flag). Maybe it
is time for another round of checking whether we still need it...

> > __GFP_ALLOW_BLOCKING would make a lot of sense but I am not sure it
> > would be a good match to __GFP_KSWAPD_RECLAIM.
>
> So? __GFP_ALLOW_BLOCKING makes it clear what is, or is not, acceptable
> to the caller. How much reclaim, or other activity, alloc_page()
> engages in is largely irrelevant to the caller as long as it doesn't
> block if asked not to (and doesn't enter an FS if asked not to, etc).

Hmm, maybe you are right.

> > > Actually ... I take it back about __GFP_NOWARN. That probably shouldn't
> > > exist at all. Warnings should be based on how stressed the mm system is,
> > > not on whether the caller thinks failure is manageable.
> >
> > Unless we change the way when allocation warnings are triggered then we
> > really need this. There are many opportunistic allocations with a
> > fallback behavior which do not want to swamp kernel logs with failures
> > that are of no use.
> > Think of a THP allocation that really wants to be
> > just very quick and falls back to normal base pages otherwise. Deducing
> > which contexts are fine to not report failures in is quite tricky and it
> > can easily go wrong. Callers should know whether a warning can be of any
> > use in many cases.
>
> "Unless" being the key word.
> It makes sense to warn when a __GFP_HIGH or __GFP_MEMALLOC allocation
> fails, because they are clearly important.
>
> It makes sense to warn if direct reclaim and retrying were enabled,
> as then alloc_page() has tried really hard, but failed anyway. Though
> maybe if COSTLY_ORDER is exceeded, then the warning is unlikely to be
> interesting.

For "normal" small allocations we usually get an OOM report if the
memory is depleted. That will provide quite a lot of potentially useful
context to debug memory usage. Non-reclaiming allocations can be just
opportunistic ones that choose not to reclaim and use another approach
as a fallback, but there are others that really cannot reclaim because
they are in an atomic context. I do not see an easy way to tell one from
the other. Similarly, for higher order allocations it can be useful to
see whether the memory is depleted or just fragmented.

> But does it ever make sense to warn if either of
> __GFP_RETRY_MAYFAIL or __GFP_NORETRY is present?
> If we always suppressed warnings when those flags were present, then many
> (most?) uses for __GFP_NOWARN can be discarded.

Yes, __GFP_NORETRY is mostly (maybe always) used with __GFP_NOWARN.
Coccinelle would be a good way to check. I do remember MAYFAIL is used
for page migration to allocate target memory. It is often useful to see
that the migration is failing because of lack of memory.

> I can see that some of the __GFP flags are designed to each perform a
> single well-defined function and internally to mm/ that makes sense.
> But exposing those flags to all users appears to be a recipe for
> trouble.
> Hiding them all behind "__" doesn't stop people from using and
> misusing them. Others are externally meaningful. Making them visually
> similar to the ones we want to hide isn't helping anyone.

I do agree here.
-- 
Michal Hocko
SUSE Labs
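
For illustration only, the warning heuristic floated in the quoted text
could be sketched roughly as below. This is a userspace sketch under
stated assumptions, not kernel code: the SK_* names and bit values are
hypothetical stand-ins for the real __GFP_* flags, sk_should_warn() is
an invented helper, and the precedence between the rules (retry-limiting
flags win over "important" flags) is an assumption the thread does not
settle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for __GFP_* bits; NOT the kernel's real values. */
#define SK_GFP_HIGH            (1u << 0)
#define SK_GFP_MEMALLOC        (1u << 1)
#define SK_GFP_DIRECT_RECLAIM  (1u << 2)
#define SK_GFP_NORETRY         (1u << 3)
#define SK_GFP_RETRY_MAYFAIL   (1u << 4)

/* Stand-in for the "COSTLY_ORDER" threshold mentioned in the thread. */
#define SK_COSTLY_ORDER 3u

/*
 * Decide whether a failed allocation would warn under the proposed
 * policy, without any explicit "no warn" flag.
 */
bool sk_should_warn(unsigned int flags, unsigned int order)
{
	/* Assumption: a caller that limits retries has a fallback. */
	if (flags & (SK_GFP_NORETRY | SK_GFP_RETRY_MAYFAIL))
		return false;

	/* Allocations marked as important always warn on failure. */
	if (flags & (SK_GFP_HIGH | SK_GFP_MEMALLOC))
		return true;

	/* The allocator tried hard; costly orders are less interesting. */
	if (flags & SK_GFP_DIRECT_RECLAIM)
		return order <= SK_COSTLY_ORDER;

	/* Non-reclaiming request: opportunistic or atomic, cannot tell. */
	return false;
}

/* Small usage example exercising each rule once. */
void sk_self_check(void)
{
	assert(sk_should_warn(SK_GFP_HIGH, 0));
	assert(!sk_should_warn(SK_GFP_HIGH | SK_GFP_NORETRY, 0));
	assert(sk_should_warn(SK_GFP_DIRECT_RECLAIM, SK_COSTLY_ORDER));
	assert(!sk_should_warn(SK_GFP_DIRECT_RECLAIM, SK_COSTLY_ORDER + 1));
}
```

The sketch also makes the two objections above visible: a
RETRY_MAYFAIL request such as the page migration case would lose its
warning, and the final "return false" treats opportunistic and atomic
non-reclaiming requests alike because the flags alone do not
distinguish them.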