From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36A2DC433F5 for ; Fri, 26 Nov 2021 14:50:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 883616B0074; Fri, 26 Nov 2021 09:50:30 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 833CE6B0075; Fri, 26 Nov 2021 09:50:30 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6FAE36B007B; Fri, 26 Nov 2021 09:50:30 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0038.hostedemail.com [216.40.44.38]) by kanga.kvack.org (Postfix) with ESMTP id 608B76B0074 for ; Fri, 26 Nov 2021 09:50:30 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 0AA968B323 for ; Fri, 26 Nov 2021 14:50:20 +0000 (UTC) X-FDA: 78851367000.03.8DBCC89 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by imf30.hostedemail.com (Postfix) with ESMTP id 7E302E001990 for ; Fri, 26 Nov 2021 14:50:11 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 8C7851FD38; Fri, 26 Nov 2021 14:50:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1637938217; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vIFiAUSKYA6iDyHQMizk9Jjiw/pwaTCNzr86pTCJre4=; b=SHfMGLegEv0b2KByE3H2YJ6kJ4MIoSMhnvigyKETU7+4Tl840wTAL4Z6bVaAc2/fXo/wyh 8TfIXodqhbwHM4hDgnW3ZTXLhBnwyBAPBOshFHHoLDemXpIzLsXzOWzTFtmJLk+UiJmUux C9u7AVBBbmgeMKkbDk+8jkd7gOP1UxY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1637938217; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vIFiAUSKYA6iDyHQMizk9Jjiw/pwaTCNzr86pTCJre4=; b=lZU3tzKj1bkLYYq31czyyQv7U8vwHyfHRSeSVFUFClnmaasMryMs4TuivbeAnN+4DutXxb j8UERmDMCCucCTCQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 5708B13C60; Fri, 26 Nov 2021 14:50:17 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 9RuTFCn0oGEXXAAAMHmgww (envelope-from ); Fri, 26 Nov 2021 14:50:17 +0000 Message-ID: <919f547e-beb7-34b7-7835-9e1625600323@suse.cz> Date: Fri, 26 Nov 2021 15:50:15 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.3.2 Content-Language: en-US To: NeilBrown , Andrew Morton Cc: Uladzislau Rezki , Michal Hocko , Dave Chinner , Christoph Hellwig , linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, LKML , Ilya Dryomov , Jeff Layton , Michal Hocko References: <20211122153233.9924-1-mhocko@kernel.org> <20211122153233.9924-3-mhocko@kernel.org> <20211123170238.f0f780ddb800f1316397f97c@linux-foundation.org> <163772381628.1891.9102201563412921921@noble.neil.brown.name> <20211123194833.4711add38351d561f8a1ae3e@linux-foundation.org> <163773141164.1891.1440920123016055540@noble.neil.brown.name> From: Vlastimil Babka Subject: Re: [PATCH v2 2/4] mm/vmalloc: add support for __GFP_NOFAIL In-Reply-To: <163773141164.1891.1440920123016055540@noble.neil.brown.name> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Stat-Signature: g37mduj8qsw4kgyntwozhb9qewjj85xc Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=SHfMGLeg; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=lZU3tzKj; dmarc=none; spf=pass (imf30.hostedemail.com: domain of vbabka@suse.cz designates 195.135.220.29 as permitted sender) smtp.mailfrom=vbabka@suse.cz X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 7E302E001990 X-HE-Tag: 1637938211-381686 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 11/24/21 06:23, NeilBrown wrote: >> >> I forget why radix_tree_preload used a cpu-local store rather than a >> per-task one. >> >> Plus "what order pages would you like" and "on which node" and "in >> which zone", etc... > > "what order" - only order-0 I hope. I'd hazard a guess that 90% of > current NOFAIL allocations only need one page (providing slub is used - > slab seems to insist on high-order pages sometimes). Yeah AFAIK SLUB can prefer higher orders than SLAB, but also allows fallback to smallest order that's enough (thus 0 unless the objects are larger than a page). > "which node" - whichever. Unless __GFP_HARDWALL is set, alloc_page() > will fall-back to "whichever" anyway, and NOFAIL with HARDWALL is > probably a poor choice. > "which zone" - NORMAL. I cannot find any NOFAIL allocations that want > DMA. fs/ntfs asks for __GFP_HIGHMEM with NOFAIL, but that that doesn't > *requre* highmem. > > Of course, before designing this interface too precisely we should check > if anyone can use it. From a quick through the some of the 100-ish > users of __GFP_NOFAIL I'd guess that mempools would help - the > preallocation should happen at init-time, not request-time. Maybe if we > made mempools even more light weight .... though that risks allocating a > lot of memory that will never get used. > > This brings me back to the idea that > alloc_page(wait and reclaim allowed) > should only fail on OOM_KILL. That way kernel threads are safe, and > user-threads are free to return ENOMEM knowing it won't get to Hm I thought that's already pretty much the case of the "too small to fail" of today. IIRC there's exactly that gotcha that OOM KILL can result in such allocation failure. But I believe that approach is rather fragile. If you encounter such an allocation not checking the resulting page != NULL, you can only guess which one is true: - the author simply forgot to check at all - the author relied on "too small to fail" without realizing the gotcha - at the time of writing the code was verified that it can be only run in kernel thread context, not user and - it is still true - it stopped being true at some later point - might be hard to even decide which is the case IIRC at some point we tried to abolish the "too small to fail" rule because of this, but Linus denied that. But the opposite - make it hard guarantee in all cases - also didn't happen, so... > user-space. If user-thread code really needs NOFAIL, it punts to a > workqueue and waits - aborting the wait if it is killed, while the work > item still runs eventually. > > NeilBrown >