Date: Thu, 29 Aug 2024 13:08:53 +0200
From: Michal Hocko
To: Kent Overstreet
Cc: Matthew Wilcox, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Dave Chinner
Subject: Re: [PATCH] bcachefs: Switch to memalloc_flags_do() for vmalloc allocations
References: <20240828140638.3204253-1-kent.overstreet@linux.dev>

On Wed 28-08-24 18:58:43, Kent Overstreet wrote:
> On Wed, Aug 28, 2024 at 09:26:44PM GMT, Michal Hocko wrote:
> > On Wed 28-08-24 15:11:19, Kent Overstreet wrote:
[...]
> > > It was decided _years_ ago that PF_MEMALLOC flags were how this was
> > > going to be addressed.
> >
> > Nope! It has been decided that _some_ gfp flags are acceptable to be used
> > by scoped APIs.
> > Most notably NOFS and NOIO are compatible with reclaim
> > modifiers and other flags so these are indeed safe to be used that way.
>
> Decided by who?

The semantic of the respective GFP flags decides, together with their
compatibility with other flags that could be nested in the scope.

Zone modifiers __GFP_DMA, __GFP_HIGHMEM, __GFP_DMA32 and __GFP_MOVABLE
would allow only __GFP_DMA to have scoped semantic because it is the
most restrictive of all of them (i.e. __GFP_DMA32 can be served from
__GFP_DMA but not the other way around) but nobody really requested
that.

__GFP_RECLAIMABLE is slab allocator specific and nested allocations
cannot be assumed to have shrinkers so this cannot really have scoped
semantic.

__GFP_WRITE only implies node spreading. Likely OK for scope interface,
nobody requested that.

__GFP_HARDWALL only to be used for user space allocations. Wouldn't
break anything if it had scoped interface but nobody requested that.

__GFP_THISNODE only to be used by allocators internally to define NUMA
placement strategy. Not safe for scoped interface as it changes the
failure semantic.

__GFP_ACCOUNT defines memcg accounting. Generally usable from user
context and safe for scope interface in that context as it doesn't
change the failure nor reclaim semantic.

__GFP_NO_OBJ_EXT internal flag not to be used outside of mm.

__GFP_HIGH gives access to memory reserves. It could be used for scope
interface but nobody requested that.

__GFP_MEMALLOC - already has a scope interface PF_MEMALLOC. This is not
really great though because it grants unbounded access to memory
reserves and that means that it is really tricky to see how many
allocations really can use reserves. It has been added because swap
over NFS had to guarantee forward progress and networking layer was not
prepared for that. Fundamentally this doesn't change the allocation nor
reclaim semantic so it is safe for a scope API.

__GFP_NOMEMALLOC used to override PF_MEMALLOC so a scoped interface
doesn't make much sense.

__GFP_IO already has scope interface to drop this flag. It is safe
because it doesn't change failure semantic and it makes the reclaim
context more constrained so it is compatible with other reclaim
modifiers. Conversely, it would be unsafe to have a scope interface to
add this flag because all GFP_NOIO nested allocations could deadlock.

__GFP_FS. Similar to __GFP_IO.

__GFP_DIRECT_RECLAIM allows allocation to sleep. Scoped interface to set
the flag is unsafe for any nested GFP_NOWAIT/GFP_ATOMIC requests which
might be called from within atomic contexts. Scoped interface to clear
the flag is unsafe because __GFP_NOFAIL allocation mode doesn't support
requests without this flag so any nested NOFAIL allocation would break
and see unexpected and potentially unhandled failure mode.

__GFP_KSWAPD_RECLAIM controls whether kswapd is woken up. Doesn't change
the failure nor direct reclaim behavior. Scoped interface to set the
flag seems rather pointless and one to clear the bit dangerous because
it could put MM into unbalanced state as kswapd wouldn't wake up.

__GFP_RETRY_MAYFAIL - changes the failure mode so it is fundamentally
incompatible with nested __GFP_NOFAIL allocations. Scoped interface to
clear the flag would be safe but probably pointless.

__GFP_NORETRY - same as above.

__GFP_NOFAIL - incompatible with any nested GFP_NOWAIT/GFP_ATOMIC
allocations. One could argue that those are fine to see allocation
failure so this will not create any unexpected failure mode, which is a
fair argument, but what would be the actual usecase for setting all
nested allocations to NOFAIL mode when they likely have a failure mode?
Interface to clear the flag for the scope would be unsafe because all
nested NOFAIL allocations would get an unexpected failure mode.

__GFP_NOWARN safe to have scope interface both to set and clear the
flag.

__GFP_COMP only to be used for high order allocations and changes the
tail pages tracking which would break any nested high order request
without the flag. So unsafe for the scope interface both to set and
clear the flag.

__GFP_ZERO changes the initialization and safe for scope interface. We
even have a global switch to do that for all allocations
(init_on_alloc).

__GFP_NOLOCKDEP disables lockdep reclaim recursion detection. Safe for
scope interface AFAICS.

-- 
Michal Hocko
SUSE Labs
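
To illustrate the drop-only semantic described above for __GFP_IO and
__GFP_FS, here is a minimal userspace sketch. It is not kernel code: the
M_* bit values, the task_flags variable and the model_* helpers are all
made up for illustration, only loosely modelled on the shape of
memalloc_nofs_save()/memalloc_nofs_restore() and current_gfp_context().
The point it tries to show is that a scope which only clears reclaim
bits nests cleanly and never adds a capability to a nested request.

```c
/* Userspace model only. Bit values and names are hypothetical. */

#define M_GFP_IO             0x1u  /* stands in for __GFP_IO */
#define M_GFP_FS             0x2u  /* stands in for __GFP_FS */
#define M_GFP_DIRECT_RECLAIM 0x4u  /* stands in for __GFP_DIRECT_RECLAIM */

#define M_PF_NOIO 0x1u  /* stands in for PF_MEMALLOC_NOIO */
#define M_PF_NOFS 0x2u  /* stands in for PF_MEMALLOC_NOFS */

/* Stands in for current->flags of a single task. */
static unsigned int task_flags;

/* Enter a NOFS scope; return the previous state of the scope bit. */
static inline unsigned int model_nofs_save(void)
{
	unsigned int old = task_flags & M_PF_NOFS;

	task_flags |= M_PF_NOFS;
	return old;
}

/* Leave a NOFS scope by putting back exactly what save returned. */
static inline void model_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~M_PF_NOFS) | old;
}

/* Enter a NOIO scope; return the previous state of the scope bit. */
static inline unsigned int model_noio_save(void)
{
	unsigned int old = task_flags & M_PF_NOIO;

	task_flags |= M_PF_NOIO;
	return old;
}

/* Leave a NOIO scope by putting back exactly what save returned. */
static inline void model_noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~M_PF_NOIO) | old;
}

/*
 * Apply the active scope to a requested mask. Bits are only ever
 * cleared here, so a nested request that did not ask for IO, FS or
 * direct reclaim can never gain them from the scope.
 */
static inline unsigned int model_gfp_context(unsigned int gfp)
{
	if (task_flags & M_PF_NOIO)
		gfp &= ~(M_GFP_IO | M_GFP_FS);
	else if (task_flags & M_PF_NOFS)
		gfp &= ~M_GFP_FS;
	return gfp;
}
```

Inside a model_nofs_save()/model_nofs_restore() pair, a request for
M_GFP_IO | M_GFP_FS | M_GFP_DIRECT_RECLAIM is seen without M_GFP_FS, and
a request with an empty mask stays empty. A hypothetical scope that set
bits instead would hand M_GFP_DIRECT_RECLAIM to that empty-mask request,
which is the nested GFP_NOWAIT/GFP_ATOMIC problem described above.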