Date: Sat, 20 Jan 2024 12:13:21 +0000
Subject: Re: [RESEND PATCH] mm: align larger anonymous mappings on THP boundaries
From: Ryan Roberts
To: Yang Shi, riel@surriel.com, shy828301@gmail.com, willy@infradead.org,
 cl@linux.com, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20231214223423.1133074-1-yang@os.amperecomputing.com>
 <1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com>
In-Reply-To: <1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com>

On 20/01/2024 12:04, Ryan Roberts wrote:
> On 14/12/2023 22:34, Yang Shi wrote:
>> From: Rik van Riel
>>
>> Align larger anonymous memory mappings on THP boundaries by going through
>> thp_get_unmapped_area if THPs are enabled for the current process.
>>
>> With this patch, larger anonymous mappings are now THP aligned.
>> When a malloc library allocates a 2MB or larger arena, that arena can now be
>> mapped with THPs right from the start, which can result in better TLB hit
>> rates and execution time.
>>
>> Link: https://lkml.kernel.org/r/20220809142457.4751229f@imladris.surriel.com
>> Signed-off-by: Rik van Riel
>> Reviewed-by: Yang Shi
>> Cc: Matthew Wilcox
>> Cc: Christopher Lameter
>> Signed-off-by: Andrew Morton
>> ---
>> This patch was applied to v6.1, but was reverted due to a regression
>> report. However it turned out the regression was not due to this patch.
>> I ping'ed Andrew to reapply this patch, Andrew may forget it. This
>> patch helps promote THP, so I rebased it onto the latest mm-unstable.
>
> Hi Yang,
>
> I'm not sure what regression you are referring to above, but I'm seeing a
> performance regression in the virtual_address_range mm selftest on arm64, caused
> by this patch (which is now in v6.7).
>
> I see 2 problems when running the test; 1) it takes much longer to execute, and
> 2) the test fails. Both are related:
>
> The (first part of the) test allocates as many 1GB anonymous blocks as it can in
> the low 256TB of address space, passing NULL as the addr hint to mmap. Before
> this patch, all allocations were abutted and contained in a single, merged VMA.
> However, after this patch, each allocation is in its own VMA, and there is a 2M
> gap between each VMA. This causes 2 problems: 1) mmap becomes MUCH slower
> because there are so many VMAs to check to find a new 1G gap. 2) It fails once
> it hits the VMA limit (/proc/sys/vm/max_map_count). Hitting this limit then
> causes a subsequent calloc() to fail, which causes the test to fail.
>
> Looking at the code, I think the problem is that arm64 selects
> ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT. But __thp_get_unmapped_area() allocates
> len+2M then always aligns to the bottom of the discovered gap. That causes the
> 2M hole. As far as I can see, x86 allocates bottom up, so you don't get a hole.
>
> I'm not quite sure what the fix is - perhaps __thp_get_unmapped_area() should be
> implemented around vm_unmapped_area(), which can manage the alignment more
> intelligently?
>
> But until/unless someone comes along with a fix, I think this patch should be
> reverted.

Looks like this patch is also the cause of `ksm_tests -H -s 100` starting to
fail on arm64. I haven't looked in detail, but it passes without the change and
fails with. So this should definitely be reverted, I think.

>
> Thanks,
> Ryan
>
>
>>
>>
>>  mm/mmap.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 9d780f415be3..dd25a2aa94f7 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2232,6 +2232,9 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>>  		 */
>>  		pgoff = 0;
>>  		get_area = shmem_get_unmapped_area;
>> +	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>> +		/* Ensures that larger anonymous mappings are THP aligned. */
>> +		get_area = thp_get_unmapped_area;
>>  	}
>>  
>>  	addr = get_area(file, addr, len, pgoff, flags);
>
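
For anyone following the thread, the 2M hole described in the quoted analysis
can be modelled with a few lines of user-space arithmetic. This is only a
sketch under stated assumptions, not the kernel code: it assumes the padded
search (len+2M) always returns the gap directly adjacent to the previous
mapping, that the file offset is 0, and that the returned address is rounded
up to the next 2M boundary from the bottom of that gap. The helper names
(align_in_gap, hole_topdown, hole_bottomup) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (2ULL << 20)
#define SZ_1G (1ULL << 30)

/*
 * Hypothetical helper: round the start of the padded gap up to the next
 * `size` boundary. This models "always aligns to the bottom of the
 * discovered gap" for an anonymous mapping (offset 0).
 */
static uint64_t align_in_gap(uint64_t gap_start, uint64_t size)
{
	assert(size && (size & (size - 1)) == 0);	/* power of two */
	return gap_start + ((0 - gap_start) & (size - 1));
}

/*
 * Top-down layout (assumed model): the padded gap sits directly below the
 * previous mapping. Returns the size of the hole left between the new
 * mapping's end and the previous mapping's start.
 */
static uint64_t hole_topdown(uint64_t prev_start, uint64_t len)
{
	uint64_t gap_start = prev_start - (len + SZ_2M);
	uint64_t addr = align_in_gap(gap_start, SZ_2M);

	return prev_start - (addr + len);
}

/*
 * Bottom-up layout (assumed model): the padded gap sits directly above the
 * previous mapping. Returns the size of the hole left between the previous
 * mapping's end and the new mapping's start.
 */
static uint64_t hole_bottomup(uint64_t prev_end, uint64_t len)
{
	uint64_t gap_start = prev_end;
	uint64_t addr = align_in_gap(gap_start, SZ_2M);

	return addr - prev_end;
}
```

With a 2M-aligned neighbour at 1TB and a 1G request, hole_topdown(1ULL << 40,
SZ_1G) gives 2M while hole_bottomup(1ULL << 40, SZ_1G) gives 0, which matches
the behaviour reported above: in the top-down case the padding ends up between
the two mappings, so they cannot merge into one VMA.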