From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4D72C4332F for ; Wed, 8 Nov 2023 21:57:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3826A8D00C6; Wed, 8 Nov 2023 16:57:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 332E88D0073; Wed, 8 Nov 2023 16:57:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 221298D00C6; Wed, 8 Nov 2023 16:57:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 0F3E28D0073 for ; Wed, 8 Nov 2023 16:57:40 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id DB3441CB955 for ; Wed, 8 Nov 2023 21:57:39 +0000 (UTC) X-FDA: 81436149438.04.0E0D5A7 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf20.hostedemail.com (Postfix) with ESMTP id D40EE1C000C for ; Wed, 8 Nov 2023 21:57:37 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="GrxrYWf/"; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1699480658; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=fncfKP0bgPMoFXw1eqNZCA36EmRuUbrfkMQb0sCdUcA=; b=xcf7q1non+ehL0Tdzx8BXec9Qw0ztSihfj0FdlXQT7qFBGpMyVqJqveX8jKaOIrT9El24j VBRVTx0Nm3FtVm9h4KX1U1l6LZkzOJPFiP2j10PcFtg+ewU6Ga0O26AR/qL9wgAwzov6os ce4tsAlGrEnLvX9pmfiej6tZ1A21mQc= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="GrxrYWf/"; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1699480658; a=rsa-sha256; cv=none; b=qQwPqbDK40RZn9BO10nwpvWSfSQbvb7jsRZt3CJP+yRFt81ZELSxQgBBIekxAkAUTBXa+f zJwxndECdbGC7Jo0o1L35WX8esrNPYTG+GPVgeu0UEVUYoNjZB/6DioZHJ93WRjYLnkPHz u674K7qjM9zR41qT7s9r7WlLPeeX/xA= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by ams.source.kernel.org (Postfix) with ESMTP id B4D8DB81EA2; Wed, 8 Nov 2023 21:57:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EB69CC433C7; Wed, 8 Nov 2023 21:57:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1699480655; bh=IyTyvj5NUZ9Cs+Ftzhz/qOsaiUiCPmY1i2ONdxmDhpk=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=GrxrYWf/Qj5KaDWqd/kk6YawuamBl0xBXC0R6PDngW6aqplnJhv5aM31yDtyWi2J2 RFhyUjgD6tqkdq7OncASPA0Jrw8r5HmKs3ISAnjIReCdqW271tyiDA34wxLZpYeBR8 nfH0X6XdRHd6xIdHvSvPvNyx8vupobnabhcc83Xg= Date: Wed, 8 Nov 2023 13:57:34 -0800 From: Andrew Morton To: Zhiguo Jiang Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Johannes Weiner , opensource.kernel@vivo.com Subject: Re: [PATCH] mm:ALLOC_HIGHATOMIC flag allocation issuse Message-Id: <20231108135734.7b3820985c288ff98f41d2c4@linux-foundation.org> In-Reply-To: <20231108065408.1861-1-justinjiang@vivo.com> References: <20231108065408.1861-1-justinjiang@vivo.com> X-Mailer: Sylpheed 3.8.0beta1 (GTK+ 2.24.33; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: D40EE1C000C X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: qg3oinz4qdd7y9mkfhxmmzrjhqfhda57 X-HE-Tag: 1699480657-137004 X-HE-Meta: U2FsdGVkX1/DMTsPR7H0fcvuA1er4api73yymKFl+ypKZ56saDLcPo4wY3tubElzggOvxS/fr8xd3AcnXk5HxzjEhwVOqAmScIG9+2OytmxfAKrN2pPtwyfaPgz1w3xnwQQyFIT6LbAmIGPKaWNBOzjuBMLyfuat/bDQeyxen58GzbR4s4lvZRELoYD//RCiXkysmz7x65kqa3vQbAQQpQM+7gMP5Zj6TzaQ7Q46WJS3oNNVnt/ZvkGBMl6y3DQVSNAhMK6Hv79m/3B8c/hEXz54+W36t7tH6OgkU2NLbqh75eqaGGfRi3tmEzmdX94CVsrxQAtYdVHypyHWJSlcqWkmmZSNN/7TZQMLbmCH0th3mUSTXwHWnzqL0wtxwYv53OyLG0M3f1x8X7I4Gt0L8SRBU7wnqzAftCE+f3VA2CMCboZpo2iYf2WB4hgtHSmEvwPGsHCOyNsBpWdEOdXx4bUP+63/2y1rIsBaQujBgxO0FWCa3xvtit1XS8FjPGlDJtM3TZCV9E67vV0kP4fV9eOEZ8pLfcHo7nfLuvZsoY1wvCozlscxqvI9+LVUEDwIeVg4V/t/ZpPysKao7bCnFGK6/tyhHlJiHwRnt7ioTb1VJqyLzxvG/Q7KKl6h7jvepeygpCpFXVi+RjWXbaHNdhY42H3AHVzLXboDY3Sc19UINBRZ7OXJ1ymimSw/AAwD+stG28skJva0comq/D2aaaVO7sqAHhqOGGMHsK4s/psDooax5eo0gE7Ab9OWCHi3lW6xYsDklXSE+LLbjfoR5xIixQKYBElJCs1w9CyQRo2UAw8X0VuX0mGErPB/PcONJ6Rxh5UcVr8bHFL86EhjUzPMN+ThMhCCz5QlX8y91+PyN/HLkfIVQVBfR3DFVZ/R8BpC1RSU/uL7pQPbRoWuuuIPfcSMaoHtFpYIk0I8vu4wL3X0HfqFj9IeFPH0/a/p90P4vXZ1l5y/bNH5iTt bewwAAvF 8qE53u4ddGVHpb+YKEtbFFvlZpidbgXZQrGKn98usio+WtBSKhgRgqSAXnsOsIQpAd2DGdRi4raxmAlsa0t1DQBm6YcyGK+xjHwkQYyCSBYyxli0TiORO1rt0qUrsgZcxcDteJaCneJ/QR3oYAFu3xao95R2iGCbvwkHbwdzfGnWs0LwuT+SzlwlxsuTYjJxzoAsHf0PiJI96vlGti8So0v8QwiIS6VpW1B6Vb5W/jwRFDCraf9le2HVgwnL3qvIYs5wXNG1MtLvXLphjx+mhNO5Gp05feOySI0XMTL7/0gifPWsz2vpI67dqKDpfRICNAJSMhcpj9iNlnM4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, 8 Nov 2023 14:54:07 +0800 Zhiguo Jiang wrote: > In case that alloc_flags contains ALLOC_HIGHATOMIC and alloc order > is order1/2/3/10 in rmqueue(), if pages are alloced successfully > from pcplist, a free pageblock will be also moved from the alloced > migratetype freelist to MIGRATE_HIGHATOMIC freelist, rather than > alloc from MIGRATE_HIGHATOMIC freelist firstly, so this will result > in an increasing number of pages on the MIGRATE_HIGHATOMIC freelist, > pages in other migratetype freelist are reduced and more likely to > allocation failure. > > Currently the sequence of ALLOC_HIGHATOMIC allocation is: > pcplist --> rmqueue_bulk() --> rmqueue_buddy() MIGRATE_HIGHATOMIC > --> rmqueue_buddy() allocation migratetype. > > Due to the fact that requesting pages from the pcplist is faster than > buddy, the sequence of modifying the ALLOC_HIGHATOMIC allocation is: > pcplist --> rmqueue_buddy() MIGRATE_HIGHATOMIC --> rmqueue_buddy() > allocation migratetype. > > This patch can solve the failure problem of allocating other types of > pages due to excessive MIGRATE_HIGHATOMIC freelist reservations. > > In comparative testing, cat /proc/pagetypeinfo and the HighAtomic > freelist size is: > Test without this patch: > Node 0, zone Normal, type HighAtomic 2369 771 138 15 0 0 0 0 0 0 0 > Test with this patch: > Node 0, zone Normal, type HighAtomic 206 82 4 2 1 0 0 0 0 0 0 Hopefully hannes can check this for us, but I have a stylistic concern... > +#define ALLOC_PCPLIST 0x1000 /* Allocations from pcplist */ > > /* Flags that allow allocations below the min watermark. */ > #define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM) > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index b8544f08cc36..67cec88164b1 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -2854,11 +2854,15 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order, > int batch = nr_pcp_alloc(pcp, zone, order); > int alloced; > > + if (alloc_flags & ALLOC_HIGHATOMIC) > + goto out; > + A comment here explaining why we're doing this would help readers. > alloced = rmqueue_bulk(zone, order, > batch, list, > migratetype, alloc_flags); > > pcp->count += alloced << order; > +out: > if (unlikely(list_empty(list))) > return NULL; > } > @@ -2921,7 +2925,7 @@ __no_sanitize_memory > static inline > struct page *rmqueue(struct zone *preferred_zone, > struct zone *zone, unsigned int order, > - gfp_t gfp_flags, unsigned int alloc_flags, > + gfp_t gfp_flags, unsigned int *alloc_flags, > int migratetype) > { > struct page *page; > @@ -2934,17 +2938,19 @@ struct page *rmqueue(struct zone *preferred_zone, > > if (likely(pcp_allowed_order(order))) { > page = rmqueue_pcplist(preferred_zone, zone, order, > - migratetype, alloc_flags); > - if (likely(page)) > + migratetype, *alloc_flags); > + if (likely(page)) { > + *alloc_flags |= ALLOC_PCPLIST; > goto out; > + } > } So we're effectively returning a boolean to the caller via *alloc_flags. This isn't a great way of doing it. Wouldn't it be cleaner to pass a new bool* argument to rmqueue() for this? Make it explicit? rmqueue() will be inlined into its sole caller, so this approach shouldn't add overhead. > - page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags, > + page = rmqueue_buddy(preferred_zone, zone, order, *alloc_flags, > migratetype); > > out: > /* Separate test+clear to avoid unnecessary atomics */ > - if ((alloc_flags & ALLOC_KSWAPD) && > + if ((*alloc_flags & ALLOC_KSWAPD) && > unlikely(test_bit(ZONE_BOOSTED_WATERMARK, &zone->flags))) { > clear_bit(ZONE_BOOSTED_WATERMARK, &zone->flags); > wakeup_kswapd(zone, 0, 0, zone_idx(zone)); > @@ -3343,7 +3349,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags, > > try_this_zone: > page = rmqueue(ac->preferred_zoneref->zone, zone, order, > - gfp_mask, alloc_flags, ac->migratetype); > + gfp_mask, &alloc_flags, ac->migratetype); > if (page) { > prep_new_page(page, order, gfp_mask, alloc_flags); > > @@ -3351,7 +3357,8 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags, > * If this is a high-order atomic allocation then check > * if the pageblock should be reserved for the future > */ > - if (unlikely(alloc_flags & ALLOC_HIGHATOMIC)) > + if (unlikely(alloc_flags & ALLOC_HIGHATOMIC) && > + unlikely(!(alloc_flags & ALLOC_PCPLIST))) Again, a comment explaining the reason for the test would be good. > reserve_highatomic_pageblock(page, zone); > > return page; > -- > 2.39.0 >