From: Vlastimil Babka
Date: Tue, 24 May 2022 22:37:15 +0200
Subject: Re: Memory allocation on speculative fastpaths
To: Johannes Weiner, Suren Baghdasaryan
Cc: Matthew Wilcox, "Paul E. McKenney", Michal Hocko, "Liam R. Howlett",
 Michel Lespinasse, linux-mm, LKML, David Hildenbrand, Davidlohr Bueso
Message-ID: <1a0a859b-1f25-5136-bb86-9efe68aabbb8@suse.cz>
References: <20220503155913.GA1187610@paulmck-ThinkPad-P17-Gen-1>
 <20220503163905.GM1790663@paulmck-ThinkPad-P17-Gen-1>

On 5/4/22 18:23, Johannes Weiner wrote:
> On Tue, May 03, 2022 at 04:15:46PM -0700, Suren Baghdasaryan wrote:
>> On Tue, May 3, 2022 at 11:28 AM Matthew Wilcox wrote:
>>>
>>> On Tue, May 03, 2022 at 09:39:05AM -0700, Paul E. McKenney wrote:
>>>> On Tue, May 03, 2022 at 06:04:13PM +0200, Michal Hocko wrote:
>>>>> On Tue 03-05-22 08:59:13, Paul E. McKenney wrote:
>>>>>> Hello!
>>>>>>
>>>>>> Just following up from off-list discussions yesterday.
>>>>>>
>>>>>> The requirements to allocate on an RCU-protected speculative fastpath
>>>>>> seem to be as follows:
>>>>>>
>>>>>> 1. Never sleep.
>>>>>> 2. Never reclaim.
>>>>>> 3. Leave emergency pools alone.
>>>>>>
>>>>>> Any others?
>>>>>>
>>>>>> If those rules suffice, and if my understanding of the GFP flags is
>>>>>> correct (ha!!!), then the following GFP flags should cover this:
>>>>>>
>>>>>>     __GFP_NOMEMALLOC | __GFP_NOWARN
>>>>>
>>>>> GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN
>>>>
>>>> Ah, good point on GFP_NOWAIT, thank you!
>>>
>>> Johannes (I think it was?) made the point to me that if we have another
>>> task very slowly freeing memory, a task in this path can take advantage
>>> of that other task's hard work and never go into reclaim. So the
>>> approach we should take is:
>
> Right, GFP_NOWAIT can starve out other allocations. It can clear out
> the freelists without the burden of having to do reclaim like
> everybody else wanting memory during a shortage. Including GFP_KERNEL.

FTR, I wonder if this is really true, given the suggested fallback. With
GFP_NOWAIT, you can see memory (in all applicable zones) in one of three
states:

a) above the low watermark: just go ahead and allocate, as GFP_KERNEL would
b) between the min and low watermarks: wake up kswapd and allocate, as
   GFP_KERNEL would
c) below the min watermark: the most interesting case. GFP_KERNEL falls back
   to reclaim. If the GFP_NOWAIT path's fallback also includes reclaim, as
   suggested in this thread, how is it really different from GFP_KERNEL?

So am I missing something, or is a GFP_NOWAIT fastpath with an immediate
fallback that includes reclaim (and not just a retry loop) fundamentally no
different from GFP_KERNEL, regardless of how often we attempt it?

> In smaller doses and/or for privileged purposes (e.g. single-argument
> kfree_rcu ;)), those allocations are fine. But because the context is
> page tables specifically, it would mean that userspace could trigger a
> large number of those and DOS other applications and the kernel.
>
>>>     p4d_alloc(GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);
>>>     pud_alloc(GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);
>>>     pmd_alloc(GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);
>>>
>>>     if (failure) {
>>>         rcu_read_unlock();
>>>         do_reclaim();
>>>         return FAULT_FLAG_RETRY;
>>>     }
>>>
>>> ... but all this is now moot since the approach we agreed to yesterday
>>> is:
>>
>> I think the discussion was about the above approach and Johannes
>> suggested to fallback to the normal pagefault handling with mmap_lock
>> locked if PMD does not exist. Please correct me if I misunderstood
>> here.
>
> Yeah. Either way works, as long as the task is held accountable.
>
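
The three watermark cases in the reply can be sketched as a toy userspace
model in C. This is only an illustration of the argument, not the kernel's
page allocator: every type and function name below (zone_state, try_alloc,
fastpath_ends_in_reclaim and so on) is hypothetical, the watermark values are
made up, and the model assumes that a failed GFP_NOWAIT attempt is always
followed by a fallback that reclaims, as proposed in the thread.

```c
#include <stdbool.h>

/* Toy model only. None of these names are kernel APIs. */
enum zone_state { ABOVE_LOW, BETWEEN_MIN_AND_LOW, BELOW_MIN };
enum alloc_mode { MODE_NOWAIT, MODE_KERNEL };

enum alloc_outcome {
	ALLOC_OK,                   /* case a): plenty of free memory */
	ALLOC_OK_WAKE_KSWAPD,       /* case b): allocate and wake kswapd */
	ALLOC_AFTER_DIRECT_RECLAIM, /* case c) with GFP_KERNEL-like behaviour */
	ALLOC_FAILED                /* case c) with GFP_NOWAIT-like behaviour */
};

struct watermarks {
	unsigned long min;
	unsigned long low;
};

/* Map a free-page count onto the three cases from the email. */
static enum zone_state classify(unsigned long free_pages, struct watermarks wm)
{
	if (free_pages > wm.low)
		return ABOVE_LOW;
	if (free_pages >= wm.min)
		return BETWEEN_MIN_AND_LOW;
	return BELOW_MIN;
}

/* The two modes differ only below the min watermark. */
static enum alloc_outcome try_alloc(enum alloc_mode mode,
				    unsigned long free_pages,
				    struct watermarks wm)
{
	switch (classify(free_pages, wm)) {
	case ABOVE_LOW:
		return ALLOC_OK;
	case BETWEEN_MIN_AND_LOW:
		return ALLOC_OK_WAKE_KSWAPD;
	default: /* BELOW_MIN */
		return mode == MODE_KERNEL ? ALLOC_AFTER_DIRECT_RECLAIM
					   : ALLOC_FAILED;
	}
}

/*
 * Speculative fastpath as discussed: try NOWAIT first. Assumption: on
 * failure the caller drops the RCU read lock and takes a fallback path
 * that reclaims, so a failed attempt means the task reclaims.
 */
static bool fastpath_ends_in_reclaim(unsigned long free_pages,
				     struct watermarks wm)
{
	return try_alloc(MODE_NOWAIT, free_pages, wm) == ALLOC_FAILED;
}

/* Plain GFP_KERNEL-like allocation: reclaims only below min. */
static bool kernel_ends_in_reclaim(unsigned long free_pages,
				   struct watermarks wm)
{
	return try_alloc(MODE_KERNEL, free_pages, wm) ==
	       ALLOC_AFTER_DIRECT_RECLAIM;
}
```

Under these assumptions the two predicates agree for every free-page count,
which is the point of the question in the reply: if the fallback always
reclaims, the task does reclaim work in exactly the cases where GFP_KERNEL
would. The model says nothing about a fallback that only retries without
reclaiming, which is the case Johannes's starvation concern applies to.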