From: Maximilian Heyne
To: Greg KH
CC: Linus Torvalds, Michael Larabel, Matthieu Baerts, Dave Chinner,
    Matthew Wilcox, Chris Mason, Jan Kara, Amir Goldstein, Andrew Morton,
    Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Luis Chamberlain, Kees Cook
Date: Mon, 28 Aug 2023 10:14:20 +0000
Subject: Re: [PATCH] mm: allow a controlled amount of unfairness in the page lock
Message-ID: <20230828101420.GA54787@dev-dsk-mheyne-1b-c1362c4d.eu-west-1.amazon.com>
References: <20230823061642.76949-1-mheyne@amazon.de> <2023082731-crunching-second-ad89@gregkh>
In-Reply-To: <2023082731-crunching-second-ad89@gregkh>

On Sun, Aug 27, 2023 at 10:54:03AM +0200, Greg KH wrote:
> On Wed, Aug 23, 2023 at 06:16:42AM +0000, Maximilian Heyne wrote:
> > From: Linus Torvalds
> >
> > [ upstream commit 5ef64cc8987a9211d3f3667331ba3411a94ddc79 ]
> >
> > Commit 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic") made
> > the page locking entirely fair, in that if a waiter came in while the
> > lock was held, the lock would be transferred to the lockers strictly in
> > order.
> >
> > That was intended to finally get rid of the long-reported watchdog
> > failures that involved the page lock under extreme load, where a process
> > could end up waiting essentially forever, as other page lockers stole
> > the lock from under it.
> >
> > It also improved some benchmarks, but it ended up causing huge
> > performance regressions on others, simply because fair lock behavior
> > doesn't end up giving out the lock as aggressively, causing better
> > worst-case latency, but potentially much worse average latencies and
> > throughput.
> >
> > Instead of reverting that change entirely, this introduces a controlled
> > amount of unfairness, with a sysctl knob to tune it if somebody needs
> > to. But the default value should hopefully be good for any normal load,
> > allowing a few rounds of lock stealing, but enforcing the strict
> > ordering before the lock has been stolen too many times.
> >
> > There is also a hint from Matthieu Baerts that the fair page coloring
> > may end up exposing an ABBA deadlock that is hidden by the usual
> > optimistic lock stealing, and while the unfairness doesn't fix the
> > fundamental issue (and I'm still looking at that), it avoids it in
> > practice.
> >
> > The amount of unfairness can be modified by writing a new value to the
> > 'sysctl_page_lock_unfairness' variable (default value of 5, exposed
> > through /proc/sys/vm/page_lock_unfairness), but that is hopefully
> > something we'd use mainly for debugging rather than being necessary for
> > any deep system tuning.
> >
> > This whole issue has exposed just how critical the page lock can be, and
> > how contended it gets under certain locks. And the main contention
> > doesn't really seem to be anything related to IO (which was the origin
> > of this lock), but for things like just verifying that the page file
> > mapping is stable while faulting in the page into a page table.
> >
> > Link: https://lore.kernel.org/linux-fsdevel/ed8442fd-6f54-dd84-cd4a-941e8b7ee603@MichaelLarabel.com/
> > Link: https://www.phoronix.com/scan.php?page=article&item=linux-50-59&num=1
> > Link: https://lore.kernel.org/linux-fsdevel/c560a38d-8313-51fb-b1ec-e904bd8836bc@tessares.net/
> > Reported-and-tested-by: Michael Larabel
> > Tested-by: Matthieu Baerts
> > Cc: Dave Chinner
> > Cc: Matthew Wilcox
> > Cc: Chris Mason
> > Cc: Jan Kara
> > Cc: Amir Goldstein
> > Signed-off-by: Linus Torvalds
> > CC: # 5.4
> > [ mheyne: fixed contextual conflict in mm/filemap.c due to missing
> > commit c7510ab2cf5c ("mm: abstract out wake_page_match() from
> > wake_page_function()"). Added WQ_FLAG_CUSTOM due to missing commit
> > 7f26482a872c ("locking/percpu-rwsem: Remove the embedded rwsem") ]
> > Signed-off-by: Maximilian Heyne
> > ---
> >  include/linux/mm.h   |   2 +
> >  include/linux/wait.h |   2 +
> >  kernel/sysctl.c      |   8 +++
> >  mm/filemap.c         | 160 ++++++++++++++++++++++++++++++++++---------
> >  4 files changed, 141 insertions(+), 31 deletions(-)
>
> This was also backported here:
> https://lore.kernel.org/r/20230821222547.483583-1-saeed.mirzamohammadi@oracle.com
> before yours.
>
> I took that one, can you verify that it is identical to yours and works
> properly as well?

Yes, it's identical and fixes the performance regression seen. Therefore,

Tested-by: Maximilian Heyne

for the other patch.

Thanks,
Max

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Managing Directors: Christian Schlaeger, Jonathan Weiss
Registered at the Charlottenburg District Court under HRB 149173 B
Registered office: Berlin
VAT ID: DE 289 237 879
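
For readers following the thread, the "controlled amount of unfairness"
described in the quoted commit message can be illustrated with a toy
userspace model. This is only a sketch under stated assumptions, not the
code in mm/filemap.c: the names toy_lock, toy_waiter, toy_trylock,
toy_unlock and toy_simulate are hypothetical, the model is single-threaded,
and the only detail taken from the thread is the idea of a steal budget
(default 5) after which strict handoff to the waiter is enforced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a lock with bounded unfairness (not kernel code). */
struct toy_lock {
	bool held;
	int owner;		/* id of the current holder, -1 if free */
};

struct toy_waiter {
	int id;
	int steals_left;	/* how many more times the lock may be stolen */
};

/* A newcomer opportunistically grabs the lock if it happens to be free. */
static bool toy_trylock(struct toy_lock *l, int id)
{
	if (l->held)
		return false;
	l->held = true;
	l->owner = id;
	return true;
}

/*
 * Release the lock. If the head waiter has used up its steal budget, the
 * lock is handed to it directly and never becomes free, so no newcomer can
 * steal it. Returns true when such a handoff happened.
 */
static bool toy_unlock(struct toy_lock *l, struct toy_waiter *head)
{
	assert(l->held);
	if (head && head->steals_left <= 0) {
		l->owner = head->id;	/* lock stays held, ownership moves */
		return true;
	}
	l->held = false;
	l->owner = -1;
	return false;
}

/*
 * Worst case for the waiter: after every release a newcomer wins the race.
 * Returns how many times the lock was stolen before the waiter owned it,
 * or -1 if the model reaches an inconsistent state.
 */
static int toy_simulate(int unfairness)
{
	struct toy_lock lock = { .held = true, .owner = 100 };
	struct toy_waiter w = { .id = 1, .steals_left = unfairness };
	int thief = 200;
	int stolen = 0;

	for (;;) {
		if (toy_unlock(&lock, &w))
			return stolen;		/* strict handoff kicked in */
		if (!toy_trylock(&lock, thief++))
			return -1;		/* cannot happen: lock was just freed */
		stolen++;
		w.steals_left--;
	}
}
```

In this model toy_simulate(5) returns 5 and toy_simulate(0) returns 0, so a
budget of 0 corresponds to the strictly fair handoff and larger budgets
trade worst-case latency for more opportunistic acquisition.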