Date: Thu, 24 Oct 2024 12:49:53 +0200
From: Petr Tesarik
To: Vlastimil Babka
Cc: Thorsten Leemhuis, Rik van Riel, Matthias, Andrew Morton,
 Linux kernel regressions list, LKML, Linux-MM, Yang Shi
Subject: Re: darktable performance regression on AMD systems caused by
 "mm: align larger anonymous mappings on THP boundaries"
Message-ID: <20241024124953.5d77c0b3@mordecai.tesarici.cz>
References: <2050f0d4-57b0-481d-bab8-05e8d48fed0c@leemhuis.info>

On Thu, 24 Oct 2024 12:23:48 +0200
Vlastimil Babka wrote:

> On 10/24/24 11:58, Vlastimil Babka wrote:
> > On 10/24/24 09:45, Thorsten Leemhuis wrote:
> >> Hi, Thorsten here, the Linux kernel's regression tracker.
> >>
> >> Rik, I noticed a report about a regression in bugzilla.kernel.org that
> >> appears to be caused by the following change of yours:
> >>
> >> efa7df3e3bb5da ("mm: align larger anonymous mappings on THP boundaries")
> >> [v6.7]
> >>
> >> It might be one of those "some things got faster, a few things became
> >> slower" situations. Not sure. Felt odd that the reporter was able to
> >> reproduce it on two AMD systems, but not on an Intel system. Maybe there
> >> is a bug somewhere else that was exposed by this.
> >
> > It seems very similar to what we've seen with spec benchmarks such as cactus
> > and bisected to the same commit:
> >
> > https://bugzilla.suse.com/show_bug.cgi?id=1229012
> >
> > The exact regression varies per system. Intel regresses too but relatively
> > less. The theory is that there are many large-ish allocations that don't
> > have individual sizes aligned to 2MB and would have been merged, commit
> > efa7df3e3bb5da causes them to become separate areas where each aligns its
> > start at 2MB boundary and there are gaps between. This (gaps and vma
> > fragmentation) itself is not great, but most of the problem seemed to be
> > from the start alignment, which together with the access pattern causes
> > more TLB or cache misses due to limited associativity.
> >
> > So maybe darktable has a similar problem. A simple candidate fix could
> > change commit efa7df3e3bb5da so that the mapping size has to be a multiple
> > of THP size (2MB) in order to become aligned, right now it's enough if it's
> > THP sized or larger.
>
> Maybe this could be enough to fix the issue? (on 6.12-rc4)

Yes, this should work. I was unsure if thp_get_unmapped_area_vmflags()
differs in other ways from mm_get_unmapped_area_vmflags(), which might
still be relevant. I mean, does mm_get_unmapped_area_vmflags() also
prefer to allocate THPs if the virtual memory block is large enough?

Petr T

>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9c0fb43064b5..a5297cfb1dfc 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -900,7 +900,8 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>
>  	if (get_area) {
>  		addr = get_area(file, addr, len, pgoff, flags);
> -	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> +	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
> +		   && IS_ALIGNED(len, PMD_SIZE)) {
>  		/* Ensures that larger anonymous mappings are THP aligned. */
>  		addr = thp_get_unmapped_area_vmflags(file, addr, len,
>  						     pgoff, flags, vm_flags);
>