Subject: Re: darktable performance regression on AMD systems caused by "mm: align larger anonymous mappings on THP boundaries"
From: Matthias Bodenbinder
Reply-To: matthias@bodenbinder.de
To: Vlastimil Babka, Thorsten Leemhuis, Rik van Riel
Cc: Andrew Morton, Linux kernel regressions list, LKML, Linux-MM, Yang Shi, Petr Tesarik
Date: Thu, 24 Oct 2024 13:20:47 +0200
Message-ID: <8f1e0a5a10df71a8b4e8856feefd256bb150c38a.camel@bodenbinder.de>
References: <2050f0d4-57b0-481d-bab8-05e8d48fed0c@leemhuis.info>

On Thursday, 24.10.2024 at 12:23 +0200, Vlastimil Babka wrote:
> On 10/24/24 11:58, Vlastimil Babka wrote:
> > On 10/24/24 09:45, Thorsten Leemhuis wrote:
> > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > >
> > > Rik, I noticed a report about a regression in bugzilla.kernel.org that
> > > appears to be caused by the following change of yours:
> > >
> > > efa7df3e3bb5da ("mm: align larger anonymous mappings on THP boundaries")
> > > [v6.7]
> > >
> > > It might be one of those "some things got faster, a few things became
> > > slower" situations. Not sure. Felt odd that the reporter was able to
> > > reproduce it on two AMD systems, but not on an Intel system. Maybe there
> > > is a bug somewhere else that was exposed by this.
> >
> > It seems very similar to what we've seen with spec benchmarks such as cactus
> > and bisected to the same commit:
> >
> > https://bugzilla.suse.com/show_bug.cgi?id=1229012
> >
> > The exact regression varies per system. Intel regresses too but relatively
> > less. The theory is that there are many large-ish allocations that don't
> > have individual sizes aligned to 2MB and would have been merged, commit
> > efa7df3e3bb5da causes them to become separate areas where each aligns its
> > start at 2MB boundary and there are gaps between. This (gaps and vma
> > fragmentation) itself is not great, but most of the problem seemed to be
> > from the start alignment, which together with the access pattern causes
> > more TLB or cache misses due to limited associativity.
> >
> > So maybe darktable has a similar problem. A simple candidate fix could
> > change commit efa7df3e3bb5da so that the mapping size has to be a multiple
> > of THP size (2MB) in order to become aligned, right now it's enough if it's
> > THP sized or larger.
>
> Maybe this could be enough to fix the issue? (on 6.12-rc4)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9c0fb43064b5..a5297cfb1dfc 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -900,7 +900,8 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  
>  	if (get_area) {
>  		addr = get_area(file, addr, len, pgoff, flags);
> -	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> +	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
> +		   && IS_ALIGNED(len, PMD_SIZE)) {
>  		/* Ensures that larger anonymous mappings are THP aligned. */
>  		addr = thp_get_unmapped_area_vmflags(file, addr, len,
>  						     pgoff, flags, vm_flags);
>

Hi,

this is Matthias, the reporter of the darktable issue
https://bugzilla.kernel.org/show_bug.cgi?id=219366

I applied your patch to kernel 6.11.5 and it works. The darktable pixel
pipeline time goes down to 3.8 s with this patch, the same performance as
with kernel 6.6.x. It was 4.7 s without the patch.
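
For illustration only, the effect of the added IS_ALIGNED(len, PMD_SIZE)
check on individual mapping lengths can be sketched in userspace C. This is
not kernel code. It assumes a PMD size of 2 MiB (as on x86-64 with 4 KiB base
pages), and every name in it (SKETCH_PMD_SIZE, SKETCH_IS_ALIGNED,
thp_align_before, thp_align_after, sketch_examples) is hypothetical. The
"before" rule is taken from the description quoted above ("THP sized or
larger"); the "after" rule adds the multiple-of-2MB requirement from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2 MiB PMD size, as on x86-64 with 4 KiB base pages. */
#define SKETCH_PMD_SIZE (2ULL * 1024 * 1024)

/* Userspace stand-in for the kernel's IS_ALIGNED(); 'a' must be a power of two. */
#define SKETCH_IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Rule as described for efa7df3e3bb5da: THP sized or larger is enough. */
static inline bool thp_align_before(uint64_t len)
{
	return len >= SKETCH_PMD_SIZE;
}

/* Rule with the candidate fix: the length must also be a multiple of PMD size. */
static inline bool thp_align_after(uint64_t len)
{
	return len >= SKETCH_PMD_SIZE && SKETCH_IS_ALIGNED(len, SKETCH_PMD_SIZE);
}

/* A few worked cases; call this from main() to check them. */
static inline void sketch_examples(void)
{
	const uint64_t mib = 1024 * 1024;

	/* 3 MiB: aligned before the fix, left alone after it. */
	assert(thp_align_before(3 * mib) && !thp_align_after(3 * mib));
	/* 4 MiB: an exact multiple of 2 MiB, aligned in both cases. */
	assert(thp_align_before(4 * mib) && thp_align_after(4 * mib));
	/* 1 MiB: smaller than a THP, never aligned. */
	assert(!thp_align_before(1 * mib) && !thp_align_after(1 * mib));
}
```

Under these assumptions, a workload that makes many 3 MiB-style allocations
would no longer have each one start on its own 2MB boundary, which matches
the theory quoted above about gaps and start alignment.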