From: Kent Overstreet
To: Theodore Ts'o
Cc: Michal Hocko, Andrew Morton, Christoph Hellwig, Yafang Shao, jack@suse.cz, Vlastimil Babka, Dave Chinner, Christian Brauner, Alexander Viro, Paul Moore, James Morris, "Serge E. Hallyn", linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-bcachefs@vger.kernel.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2 v2] remove PF_MEMALLOC_NORECLAIM
Date: Thu, 5 Sep 2024 10:05:15 -0400
Message-ID: <4ty2psn26sergqax6yhcs3htt2tsg3wuvrfyvfdvseom22zhqk@yppva6vxpmjz>
In-Reply-To: <20240905135326.GU9627@mit.edu>
References: <20240902145252.1d2590dbed417d223b896a00@linux-foundation.org> <20240905135326.GU9627@mit.edu>

On Thu, Sep 05, 2024 at 09:53:26AM GMT, Theodore Ts'o wrote:
> On Thu, Sep 05, 2024 at 01:26:50PM +0200, Michal Hocko wrote:
> > > > > > This is exactly GFP_KERNEL semantic for low order allocations or
> > > > > > kvmalloc for that matter. They simply never fail unless couple of corner
> > > > > > cases - e.g. the allocating task is an oom victim and all of the oom
> > > > > > memory reserves have been consumed. This is where we call "not possible
> > > > > > to allocate".
> > > > >
> > > > > Which does beg the question of why GFP_NOFAIL exists.
> > > >
> > > > Exactly for the reason that even rare failure is not acceptable and
> > > > there is no way to handle it other than keep retrying. Typical code was
> > > > 	while (!(ptr = kmalloc()))
> > > > 		;
> > >
> > > But is it _rare_ failure, or _no_ failure?
> > >
> > > You seem to be saying (and I just reviewed the code, it looks like
> > > you're right) that there is essentially no difference in behaviour
> > > between GFP_KERNEL and GFP_NOFAIL.
>
> That may be the current state of affairs; but is it
> ****guaranteed**** forever and ever, amen, that GFP_KERNEL will never
> fail if the amount of memory allocated was lower than a particular
> multiple of the page size?  If so, what is that size?  I've checked,
> and this is not documented in the formal interface.

Yeah, and I think we really need to make that happen, in order to head
off a lot more silliness in the future.

We'd also be documenting at the same time _exactly_ when it is required
to check for errors:
 - small, fixed sized allocation in a known sleepable context, safe to
   skip
 - anything else, i.e.
   variable sized allocation or library code that can be called from
   different contexts: you check for errors (and probably that's just
   "something crazy has happened, emergency shutdown" for the xfs/ext4
   paths)

> > The fundamental difference is that (apart from unsupported allocation
> > mode/size) the latter never returns NULL and you can rely on that fact.
> > Our documentation says:
> >  * %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
> >  * cannot handle allocation failures. The allocation could block
> >  * indefinitely but will never return with failure. Testing for
> >  * failure is pointless.
>
> So if the documentation is going to give similar guarantees, as
> opposed to it being an accident of the current implementation that is
> subject to change at any time, then sure, we can probably get away
> with all or most of ext4's uses of __GFP_NOFAIL.  But I don't want to
> do that and then have a "Lucy and Charlie Brown" moment from the
> Peanuts comic strip where the football suddenly gets snatched away
> from us[1] (and many file system users will be very, very sad and/or
> angry).

yeah absolutely, and the "what is a small allocation" limit needs to be
nailed down as well
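
To make the proposed rule concrete, here is a rough userspace sketch of
the "when do you have to check for errors" decision described above. It
is only an illustration, not kernel code: struct alloc_site,
alloc_needs_error_check() and SMALL_ALLOC_LIMIT are all made-up names,
and the 4096 byte value is a pure placeholder, since the actual "small
allocation" limit is exactly the thing that has not been decided yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * HYPOTHETICAL placeholder: the thread has not settled what counts as
 * a "small" allocation, so this value is an assumption for illustration.
 */
#define SMALL_ALLOC_LIMIT 4096u

/* Hypothetical description of one allocation call site. */
struct alloc_site {
	size_t size;       /* bytes requested (worst case if variable) */
	bool fixed_size;   /* size is a compile-time constant */
	bool may_sleep;    /* caller is in a known sleepable context */
};

/*
 * Returns true when the caller must handle allocation failure.
 *
 * Only a small, fixed sized allocation from a known sleepable context
 * may skip the check; variable sized requests and code reachable from
 * unknown contexts must always check.
 */
static bool alloc_needs_error_check(const struct alloc_site *site)
{
	assert(site != NULL);

	bool safe_to_skip = site->fixed_size &&
			    site->may_sleep &&
			    site->size <= SMALL_ALLOC_LIMIT;

	return !safe_to_skip;
}
```

With a rule like this written down, a call site that is fixed size,
sleepable and under the limit could rely on the allocation succeeding,
and everything else would keep its error path (even if that path is just
an emergency shutdown).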