Date: Tue, 23 Mar 2021 16:29:49 +0000
From: Mel Gorman
To: Jesper Dangaard Brouer
Cc: Chuck Lever, Vlastimil Babka, Andrew Morton, Christoph Hellwig,
    Alexander Duyck, Matthew Wilcox, LKML, Linux-Net, Linux-MM, Linux-NFS
Subject: Re: [PATCH 0/3 v5] Introduce a bulk order-0 page allocator
Message-ID: <20210323162949.GM3697@techsingularity.net>
References: <20210322091845.16437-1-mgorman@techsingularity.net>
    <20210323104421.GK3697@techsingularity.net>
    <20210323160814.62a248fb@carbon>
In-Reply-To: <20210323160814.62a248fb@carbon>

On Tue, Mar 23, 2021 at 04:08:14PM +0100, Jesper Dangaard Brouer wrote:
> On Tue, 23 Mar 2021 10:44:21 +0000
> Mel Gorman wrote:
>
> > On Mon, Mar 22, 2021 at 09:18:42AM +0000, Mel Gorman wrote:
> > > This series is based on top of Matthew Wilcox's series "Rationalise
> > > __alloc_pages wrapper" and does not apply to 5.12-rc2.
> > > If you want to test and are not using Andrew's tree as a baseline,
> > > I suggest using the following git tree
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-bulk-rebase-v5r9
> > >
> >
> > Jesper and Chuck, would you mind rebasing on top of the following branch
> > please?
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-bulk-rebase-v6r2
> >
> > The interface is the same so the rebase should be trivial.
> >
> > Jesper, I'm hoping you see no differences in performance but it's best
> > to check.
>
> I will rebase and check again.
>
> The current performance tests that I'm running, I observe that the
> compiler layout the code in unfortunate ways, which cause I-cache
> performance issues. I wonder if you could integrate below patch with
> your patchset? (just squash it)
>

Yes but I'll keep it as a separate patch that is modified slightly.
Otherwise it might get "fixed" as likely/unlikely has been used
inappropriately in the past. If there is pushback, I'll squash them
together.

> From: Jesper Dangaard Brouer
>
> Looking at perf-report and ASM-code for __alloc_pages_bulk() then the code
> activated is suboptimal. The compiler guess wrong and place unlikely code in
> the beginning. Due to the use of WARN_ON_ONCE() macro the UD2 asm
> instruction is added to the code, which confuse the I-cache prefetcher in
> the CPU
>
> Signed-off-by: Jesper Dangaard Brouer
> ---
>  mm/page_alloc.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f60f51a97a7b..88a5c1ce5b87 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5003,10 +5003,10 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  	unsigned int alloc_flags;
>  	int nr_populated = 0, prep_index = 0;
>
> -	if (WARN_ON_ONCE(nr_pages <= 0))
> +	if (unlikely(nr_pages <= 0))
>  		return 0;
>

Ok, I can make this change.
It was a defensive check for the new callers in case insane values were
being passed in.

> -	if (WARN_ON_ONCE(page_list && !list_empty(page_list)))
> +	if (unlikely(page_list && !list_empty(page_list)))
>  		return 0;
>
>  	/* Skip populated array elements. */

FWIW, this check is now gone. The list only had to be empty if
prep_new_page() was deferred until IRQs were enabled, to avoid
accidentally calling prep_new_page() on a page that was already on the
list when alloc_pages_bulk was called.

> @@ -5018,7 +5018,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  		prep_index = nr_populated;
>  	}
>
> -	if (nr_pages == 1)
> +	if (unlikely(nr_pages == 1))
>  		goto failed;
>
>  	/* May set ALLOC_NOFRAGMENT, fragmentation will return 1 page. */

I'm dropping this because nr_pages == 1 is common for the sunrpc user.

> @@ -5054,7 +5054,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  	 * If there are no allowed local zones that meets the watermarks then
>  	 * try to allocate a single page and reclaim if necessary.
>  	 */
> -	if (!zone)
> +	if (unlikely(!zone))
>  		goto failed;
>
>  	/* Attempt the batch allocation */

Ok.

> @@ -5075,7 +5075,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>
>  		page = __rmqueue_pcplist(zone, ac.migratetype, alloc_flags,
>  								pcp, pcp_list);
> -		if (!page) {
> +		if (unlikely(!page)) {
>  			/* Try and get at least one page */
>  			if (!nr_populated)
>  				goto failed_irq;

Hmmm, ok. It depends on memory pressure but I agree !page is unlikely.

Current version applied is

--8<--
mm/page_alloc: optimize code layout for __alloc_pages_bulk

From: Jesper Dangaard Brouer

Looking at perf-report and ASM-code for __alloc_pages_bulk() it is clear
that the code activated is suboptimal. The compiler guesses wrong and
places unlikely code at the beginning. Due to the use of WARN_ON_ONCE()
macro the UD2 asm instruction is added to the code, which confuses the
I-cache prefetcher in the CPU.

[mgorman: Minor changes and rebasing]
Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: Mel Gorman

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index be1e33a4df39..1ec18121268b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5001,7 +5001,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	unsigned int alloc_flags;
 	int nr_populated = 0;
 
-	if (WARN_ON_ONCE(nr_pages <= 0))
+	if (unlikely(nr_pages <= 0))
 		return 0;
 
 	/*
@@ -5048,7 +5048,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	 * If there are no allowed local zones that meets the watermarks then
 	 * try to allocate a single page and reclaim if necessary.
 	 */
-	if (!zone)
+	if (unlikely(!zone))
 		goto failed;
 
 	/* Attempt the batch allocation */
@@ -5066,7 +5066,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 		page = __rmqueue_pcplist(zone, ac.migratetype, alloc_flags,
 								pcp, pcp_list);
-		if (!page) {
+		if (unlikely(!page)) {
 			/* Try and get at least one page */
 			if (!nr_populated)
 				goto failed_irq;
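
For anyone following along outside the kernel tree, here is a rough
userspace sketch of the pattern the patch applies: a plain unlikely()
branch hint on the guard and on the "ran dry" check, with no warning
machinery on the entry path. It assumes GCC or Clang for
__builtin_expect(). struct pool, pool_get() and bulk_get() are made-up
names for illustration only; they are not the allocator code and not
part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-ins for the kernel's likely()/unlikely() hints.
 * Assumption: the compiler is GCC or Clang, which provide
 * __builtin_expect().
 */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical object source that hands out at most 8 objects. */
struct pool {
	int remaining;
	char storage[8];
};

/* Return one object, or NULL once the pool is exhausted. */
static void *pool_get(struct pool *p)
{
	if (p->remaining <= 0)
		return NULL;
	p->remaining--;
	return &p->storage[p->remaining];
}

/* Fill up to nr slots of out[]; return how many were filled. */
int bulk_get(struct pool *p, int nr, void **out)
{
	int filled = 0;

	/* Cheap guard, hinted as cold, instead of a warning macro. */
	if (unlikely(nr <= 0))
		return 0;

	while (filled < nr) {
		void *obj = pool_get(p);

		/* Running dry is hinted as the cold path. */
		if (unlikely(!obj))
			break;
		out[filled++] = obj;
	}

	return filled;
}

/* Small usage check with a pool that holds three objects. */
void bulk_get_selfcheck(void)
{
	struct pool p = { .remaining = 3 };
	void *out[8];

	assert(bulk_get(&p, 0, out) == 0);	/* rejected by the guard */
	assert(bulk_get(&p, 2, out) == 2);	/* fully satisfied */
	assert(bulk_get(&p, 5, out) == 1);	/* partial, pool ran dry */
}
```

The hint is only a hint. Whether it changes the generated layout depends
on the compiler and flags, so it is worth checking the disassembly or
perf output rather than assuming an effect. It also only makes sense
where the condition really is rare, which is why the nr_pages == 1 case
above is left unannotated.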