From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47364d76-4eb9-42fe-bbbf-dec483cda2af@arm.com>
Date: Tue, 23 Jan 2024 18:06:17 +0000
From: Ryan Roberts <ryan.roberts@arm.com>
To: Yang Shi
Cc: Andrew Morton, Rik van Riel, Matthew Wilcox, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH v1] mm: thp_get_unmapped_area must honour topdown preference
References: <20240123171420.3970220-1-ryan.roberts@arm.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 23/01/2024 17:52, Yang Shi wrote:
> On Tue, Jan 23, 2024 at 9:14 AM Ryan Roberts wrote:
>>
>> The addition of commit efa7df3e3bb5 ("mm: align larger anonymous
>> mappings on THP boundaries") caused the "virtual_address_range" mm
>> selftest to start failing on arm64. Let's fix that regression.
>>
>> There were 2 visible problems when running the test; 1) it takes much
>> longer to execute, and 2) the test fails.
>> Both are related:
>>
>> The (first part of the) test allocates as many 1GB anonymous blocks as
>> it can in the low 256TB of address space, passing NULL as the addr hint
>> to mmap. Before the faulty patch, all allocations were abutted and
>> contained in a single, merged VMA. However, after this patch, each
>> allocation is in its own VMA, and there is a 2M gap between each VMA.
>> This causes the 2 problems in the test: 1) mmap becomes MUCH slower
>> because there are so many VMAs to check to find a new 1G gap. 2) mmap
>> fails once it hits the VMA limit (/proc/sys/vm/max_map_count). Hitting
>> this limit then causes a subsequent calloc() to fail, which causes the
>> test to fail.
>>
>> The problem is that arm64 (unlike x86) selects
>> ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT. But __thp_get_unmapped_area()
>> allocates len+2M then always aligns to the bottom of the discovered gap.
>> That causes the 2M hole.
>>
>> Fix this by detecting cases where we can still achieve the alignment
>> goal when moved to the top of the allocated area, if configured to
>> prefer top-down allocation.
>>
>> While we are at it, fix thp_get_unmapped_area's use of pgoff, which
>> should always be zero for anonymous mappings. Prior to the faulty
>> change, while it was possible for user space to pass in pgoff!=0, the
>> old mm->get_unmapped_area() handler would not use it.
>> thp_get_unmapped_area() does use it, so let's explicitly zero it before
>> calling the handler. This should also be the correct behavior for arches
>> that define their own get_unmapped_area() handler.
>>
>> Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
>> Closes: https://lore.kernel.org/linux-mm/1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com/
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Ryan Roberts
>
> Thanks for debugging this. Looks good to me. Reviewed-by: Yang Shi

Thanks!

>
>> ---
>>
>> Applies on top of v6.8-rc1. Would be good to get this into the next -rc.
>
> This may have a conflict with my fix ("mm: huge_memory: don't force
> huge page alignment on 32 bit") which is on mm-unstable now.

It applies cleanly to mm-unstable. Your change modifies the top part of
__thp_get_unmapped_area() and mine modifies the bottom :)

>
>>
>> Thanks,
>> Ryan
>>
>>  mm/huge_memory.c | 10 ++++++++--
>>  mm/mmap.c        |  6 ++++--
>>  2 files changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 94ef5c02b459..8c66f88e71e9 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -809,7 +809,7 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>>  {
>>  	loff_t off_end = off + len;
>>  	loff_t off_align = round_up(off, size);
>> -	unsigned long len_pad, ret;
>> +	unsigned long len_pad, ret, off_sub;
>>
>>  	if (off_end <= off_align || (off_end - off_align) < size)
>>  		return 0;
>>
>> @@ -835,7 +835,13 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>>  	if (ret == addr)
>>  		return addr;
>>
>> -	ret += (off - ret) & (size - 1);
>> +	off_sub = (off - ret) & (size - 1);
>> +
>> +	if (current->mm->get_unmapped_area == arch_get_unmapped_area_topdown &&
>> +	    !off_sub)
>> +		return ret + size;
>> +
>> +	ret += off_sub;
>>  	return ret;
>>  }
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index b78e83d351d2..d89770eaab6b 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1825,15 +1825,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>>  		/*
>>  		 * mmap_region() will call shmem_zero_setup() to create a file,
>>  		 * so use shmem's get_unmapped_area in case it can be huge.
>> -		 * do_mmap() will clear pgoff, so match alignment.
>>  		 */
>> -		pgoff = 0;
>>  		get_area = shmem_get_unmapped_area;
>>  	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>>  		/* Ensures that larger anonymous mappings are THP aligned. */
>>  		get_area = thp_get_unmapped_area;
>>  	}
>>
>> +	/* Always treat pgoff as zero for anonymous memory. */
>> +	if (!file)
>> +		pgoff = 0;
>> +
>>  	addr = get_area(file, addr, len, pgoff, flags);
>>  	if (IS_ERR_VALUE(addr))
>>  		return addr;
>> --
>> 2.25.1
>>