Subject: high IRQ latency due to pcp draining
From: Lucas Stach
To: Mel Gorman, linux-mm@kvack.org
Cc: kernel@pengutronix.de, dri-devel@lists.freedesktop.org
Date: Mon, 09 Oct 2023 15:25:54 +0200

Hi all,

recently I've been looking at inconsistent frame times in one of our
graphics workloads and it seems the culprit lies within the MM subsystem.

During workload execution, graphics buffers, which are typically
single-digit megabytes in size, are sporadically freed. The pages are
freed via __folio_batch_release from drm_gem_put_pages, which means they
are put on the pcp and drained back into the zone via free_pcppages_bulk.

As the buffers are quite large, even a single free triggers the batching
optimization added in 3b12e7e97938 ("mm/page_alloc: scale the number of
pages that are batch freed"), as there is a huge number of pages that
get freed without any intervening allocations.
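
To make the scaling concrete, here is a rough userspace model of how the
batch size grows when frees arrive back to back. It is a sketch only, not
the kernel code: struct pcp_model and the model_* helpers are hypothetical
names, the halving on allocation is an assumption about how the factor
decays, and the 614/63 and 0.7µs figures are the ones quoted in this mail.

```c
/* Hypothetical userspace model of the pcp state; not the kernel struct. */
struct pcp_model {
	int high;        /* pcp high watermark, in pages */
	int batch;       /* base batch size, in pages */
	int free_factor; /* grows with each free that had no allocation in between */
};

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Pages drained in one go in this model: the base batch is doubled for
 * every consecutive free and capped at high - batch.
 */
static int model_nr_pcp_free(struct pcp_model *pcp)
{
	int min_nr_free = pcp->batch;
	int max_nr_free = pcp->high - pcp->batch;
	int nr = pcp->batch << pcp->free_factor;

	/* Keep doubling until the cap is reached. */
	if (nr < max_nr_free)
		pcp->free_factor++;

	return clamp_int(nr, min_nr_free, max_nr_free);
}

/* Assumption: an allocation halves the factor, shrinking later batches. */
static void model_alloc(struct pcp_model *pcp)
{
	pcp->free_factor >>= 1;
}

/* Rough IRQs-off estimate: pages in the batch times the per-page cost. */
static double model_irq_off_us(int nr_pages, double us_per_page)
{
	return nr_pages * us_per_page;
}
```

With high = 614 and batch = 63, successive calls to model_nr_pcp_free()
return 63, 126, 252, 504 and then 551 pages. At 0.7µs per page a 551-page
batch works out to roughly 386µs with tracing enabled.
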
The pcp for the normal zone on this system has a high watermark of 614
pages and a batch of 63, which means that the batching optimization will
drive up the number of pages freed per batch to 551 pages (614 - 63).

As the cost per page free (including tracing overhead, which isn't
negligible on this small ARM system) is around 0.7µs and the batch free
is done with the zone spinlock held and IRQs disabled, this leads to
significant IRQ-disabled times of upwards of 250µs even in the
production system without tracing. Those long IRQ-disabled sections do
interfere with the workload of the system.

As the larger free batching was added on purpose, I don't want to rip it
out altogether. But then there are also no tunables aside from the pcp
high watermark, which may have other unintended side effects.

I'm hoping to get some ideas on how to proceed here. Should we consider
a more conservative maximum of pages for the batching optimization?
Should another tunable be added? Or are there any other clever ideas to
minimize those critical sections?

Regards,
Lucas