From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 55B3BC43215 for ; Fri, 22 Nov 2019 05:52:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 05E5A20717 for ; Fri, 22 Nov 2019 05:52:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="LLobdi53" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 05E5A20717 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id AE4736B04A1; Fri, 22 Nov 2019 00:52:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9FAD46B04A2; Fri, 22 Nov 2019 00:52:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7FD196B04A3; Fri, 22 Nov 2019 00:52:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0188.hostedemail.com [216.40.44.188]) by kanga.kvack.org (Postfix) with ESMTP id 69DA76B04A1 for ; Fri, 22 Nov 2019 00:52:08 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 19B4C5851 for ; Fri, 22 Nov 2019 05:52:08 +0000 (UTC) X-FDA: 76182842736.22.shirt61_86fd4955ac242 X-HE-Tag: shirt61_86fd4955ac242 X-Filterd-Recvd-Size: 6077 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP for ; Fri, 22 Nov 2019 05:52:07 +0000 (UTC) Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2C3B22075E; Fri, 22 Nov 2019 05:52:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1574401926; bh=7h6dwNVHY+1AQQph4vopgpi5NMKwVpbXI1hUGCpEHyY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LLobdi53F2bwPNJXQhWW+/PkqPuqwHCld5KxoMw+prl5t0Umfh8jXyi1xjIXZ5ydc 9T8oktvS9iVHdj+HfJF5xadu5lO/M5gN+dNhGm3A44ofaLm/FM/fo8/FqTCkJNm8xe VuxrRnQ9tmtfnCOySA0MFCIBKFHI77l4fDBj18+g= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Aaron Lu , Pawel Staszewski , Jesper Dangaard Brouer , Vlastimil Babka , Mel Gorman , Ilias Apalodimas , Alexander Duyck , Tariq Toukan , Pankaj gupta , Andrew Morton , Linus Torvalds , Sasha Levin , linux-mm@kvack.org Subject: [PATCH AUTOSEL 4.19 154/219] mm/page_alloc.c: free order-0 pages through PCP in page_frag_free() Date: Fri, 22 Nov 2019 00:48:06 -0500 Message-Id: <20191122054911.1750-147-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191122054911.1750-1-sashal@kernel.org> References: <20191122054911.1750-1-sashal@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Aaron Lu [ Upstream commit 65895b67ad27df0f62bfaf82dd5622f95ea29196 ] page_frag_free() calls __free_pages_ok() to free the page back to Buddy. This is OK for high order page, but for order-0 pages, it misses the optimization opportunity of using Per-Cpu-Pages and can cause zone lock contention when called frequently. Pawel Staszewski recently shared his result of 'how Linux kernel handles normal traffic'[1] and from perf data, Jesper Dangaard Brouer found the lock contention comes from page allocator: mlx5e_poll_tx_cq | --16.34%--napi_consume_skb | |--12.65%--__free_pages_ok | | | --11.86%--free_one_page | | | |--10.10%--queued_spin_lock_slowpath | | | --0.65%--_raw_spin_lock | |--1.55%--page_frag_free | --1.44%--skb_release_data Jesper explained how it happened: mlx5 driver RX-page recycle mechanism i= s not effective in this workload and pages have to go through the page allocator. The lock contention happens during mlx5 DMA TX completion cycle. And the page allocator cannot keep up at these speeds.[2] I thought that __free_pages_ok() are mostly freeing high order pages and thought this is an lock contention for high order pages but Jesper explained in detail that __free_pages_ok() here are actually freeing order-0 pages because mlx5 is using order-0 pages to satisfy its page poo= l allocation request.[3] The free path as pointed out by Jesper is: skb_free_head() -> skb_free_frag() -> page_frag_free() And the pages being freed on this path are order-0 pages. Fix this by doing similar things as in __page_frag_cache_drain() - send the being freed page to PCP if it's an order-0 page, or directly to Buddy if it is a high order page. With this change, Pawe=C5=82 hasn't noticed lock contention yet in his workload and Jesper has noticed a 7% performance improvement using a micr= o benchmark and lock contention is gone. Ilias' test on a 'low' speed 1Gbi= t interface on an cortex-a53 shows ~11% performance boost testing with 64byte packets and __free_pages_ok() disappeared from perf top. [1]: https://www.spinics.net/lists/netdev/msg531362.html [2]: https://www.spinics.net/lists/netdev/msg531421.html [3]: https://www.spinics.net/lists/netdev/msg531556.html [akpm@linux-foundation.org: add comment] Link: http://lkml.kernel.org/r/20181120014544.GB10657@intel.com Signed-off-by: Aaron Lu Reported-by: Pawel Staszewski Analysed-by: Jesper Dangaard Brouer Acked-by: Vlastimil Babka Acked-by: Mel Gorman Acked-by: Jesper Dangaard Brouer Acked-by: Ilias Apalodimas Tested-by: Ilias Apalodimas Acked-by: Alexander Duyck Acked-by: Tariq Toukan Acked-by: Pankaj gupta Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/page_alloc.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index b34348a41bfed..dcc46d955df2e 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4581,8 +4581,14 @@ void page_frag_free(void *addr) { struct page *page =3D virt_to_head_page(addr); =20 - if (unlikely(put_page_testzero(page))) - __free_pages_ok(page, compound_order(page)); + if (unlikely(put_page_testzero(page))) { + unsigned int order =3D compound_order(page); + + if (order =3D=3D 0) /* Via pcp? */ + free_unref_page(page); + else + __free_pages_ok(page, order); + } } EXPORT_SYMBOL(page_frag_free); =20 --=20 2.20.1