From: Ryan Roberts
To: Yang Shi, riel@surriel.com, shy828301@gmail.com, willy@infradead.org, cl@linux.com, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RESEND PATCH] mm: align larger anonymous mappings on THP boundaries
Date: Sat, 20 Jan 2024 12:04:27 +0000
Message-ID: <1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com>
In-Reply-To: <20231214223423.1133074-1-yang@os.amperecomputing.com>
References: <20231214223423.1133074-1-yang@os.amperecomputing.com>

On 14/12/2023 22:34, Yang Shi wrote:
> From: Rik van Riel
>
> Align larger anonymous memory mappings on THP boundaries by going through
> thp_get_unmapped_area if THPs are enabled for the current process.
>
> With this patch, larger anonymous mappings are now THP aligned.
> When a malloc library allocates a 2MB or larger arena, that arena can now be
> mapped with THPs right from the start, which can result in better TLB hit
> rates and execution time.
>
> Link: https://lkml.kernel.org/r/20220809142457.4751229f@imladris.surriel.com
> Signed-off-by: Rik van Riel
> Reviewed-by: Yang Shi
> Cc: Matthew Wilcox
> Cc: Christopher Lameter
> Signed-off-by: Andrew Morton
> ---
> This patch was applied to v6.1, but was reverted due to a regression
> report. However it turned out the regression was not due to this patch.
> I ping'ed Andrew to reapply this patch, Andrew may forget it. This
> patch helps promote THP, so I rebased it onto the latest mm-unstable.

Hi Yang,

I'm not sure what regression you are referring to above, but I'm seeing a
performance regression in the virtual_address_range mm selftest on arm64, caused
by this patch (which is now in v6.7).

I see 2 problems when running the test; 1) it takes much longer to execute, and
2) the test fails. Both are related:

The (first part of the) test allocates as many 1GB anonymous blocks as it can in
the low 256TB of address space, passing NULL as the addr hint to mmap. Before
this patch, all allocations were abutted and contained in a single, merged VMA.
However, after this patch, each allocation is in its own VMA, and there is a 2M
gap between each VMA. This causes 2 problems: 1) mmap becomes MUCH slower
because there are so many VMAs to check to find a new 1G gap. 2) It fails once
it hits the VMA limit (/proc/sys/vm/max_map_count). Hitting this limit then
causes a subsequent calloc() to fail, which causes the test to fail.

Looking at the code, I think the problem is that arm64 selects
ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT. But __thp_get_unmapped_area() allocates
len+2M then always aligns to the bottom of the discovered gap. That causes the
2M hole. As far as I can see, x86 allocates bottom up, so you don't get a hole.
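
To make the arithmetic concrete, here is a minimal user-space sketch of the
behaviour described above. It is not kernel code: place_topdown(),
place_bottomup() and the hole_*() helpers are hypothetical names, and the model
assumes an otherwise empty address space in which the gap search simply returns
the highest (top-down) or lowest (bottom-up) free range of len+2M.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (2ULL << 20)
#define SZ_1G (1ULL << 30)

/* Round addr up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Simplified top-down model (assumption): the padded request of len+2M is
 * placed directly below 'ceiling', then the start is aligned up inside it.
 * The unused padding ends up above the mapping, next to the previous one.
 */
uint64_t place_topdown(uint64_t ceiling, uint64_t len)
{
	uint64_t gap_start = ceiling - (len + SZ_2M);

	return align_up(gap_start, SZ_2M);
}

/*
 * Simplified bottom-up model (assumption): the padded gap starts at 'floor'
 * and the start is aligned up inside it. The unused padding ends up above the
 * new mapping, where the next request can reuse it.
 */
uint64_t place_bottomup(uint64_t floor)
{
	return align_up(floor, SZ_2M);
}

/* Hole between two consecutive top-down mappings of the same length. */
uint64_t hole_topdown(uint64_t ceiling, uint64_t len)
{
	uint64_t first = place_topdown(ceiling, len);
	uint64_t second = place_topdown(first, len);

	return first - (second + len);
}

/* Hole between two consecutive bottom-up mappings of the same length. */
uint64_t hole_bottomup(uint64_t floor, uint64_t len)
{
	uint64_t first = place_bottomup(floor);
	uint64_t second = place_bottomup(first + len);

	return second - (first + len);
}
```

In this model, hole_topdown(1ULL << 40, SZ_1G) evaluates to SZ_2M while
hole_bottomup(1ULL << 30, SZ_1G) evaluates to 0, which matches the 2M gap seen
between VMAs on arm64 and its absence with a bottom-up layout.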

I'm not quite sure what the fix is - perhaps __thp_get_unmapped_area() should be
implemented around vm_unmapped_area(), which can manage the alignment more
intelligently?

But until/unless someone comes along with a fix, I think this patch should be
reverted.

Thanks,
Ryan

>
>
>  mm/mmap.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9d780f415be3..dd25a2aa94f7 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2232,6 +2232,9 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  		 */
>  		pgoff = 0;
>  		get_area = shmem_get_unmapped_area;
> +	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> +		/* Ensures that larger anonymous mappings are THP aligned. */
> +		get_area = thp_get_unmapped_area;
>  	}
>
>  	addr = get_area(file, addr, len, pgoff, flags);