From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 6 Dec 2024 10:49:48 +0800
From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: Re: [PATCH v2] mm: Respect mmap hint address when aligning for THP
To: Kalesh Singh
Cc: Andrew Morton, Vlastimil Babka, Yang Shi, Rik van Riel, Ryan Roberts,
 Suren Baghdasaryan, Minchan Kim, Hans Boehm, Lokesh Gidra,
 "Liam R. Howlett", Lorenzo Stoakes, Jann Horn, linux-mm@kvack.org
In-Reply-To: <20241118214650.3667577-1-kaleshsingh@google.com>
References: <20241118214650.3667577-1-kaleshsingh@google.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
On 2024/11/19 5:46, Kalesh Singh wrote:
> Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
> boundaries") updated __get_unmapped_area() to align the start address
> for the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=y.
>
> It does this by effectively looking up a region of size
> request_size + PMD_SIZE and aligning the start up to a PMD boundary.
>
> Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment
> on 32 bit") opted out of this for 32-bit due to regressions in mmap
> base randomization.
>
> Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous
> mappings to PMD-aligned sizes") restricted this to mmap sizes that are
> multiples of PMD_SIZE, due to reported regressions in some performance
> benchmarks -- which seemed mostly due to the reduced spatial locality
> of related mappings caused by the forced PMD alignment.
>
> Another unintended side effect has emerged: when a user specifies an
> mmap hint address, the THP alignment logic modifies the behavior,
> potentially ignoring the hint even if a sufficiently large gap exists
> at the requested hint location.
>
> Example scenario:
>
> Consider the following simplified virtual address (VA) space:
>
> ...
>
> 0x200000-0x400000 --- VMA A
> 0x400000-0x600000 --- Hole
> 0x600000-0x800000 --- VMA B
>
> ...
>
> A call to mmap() with hint=0x400000 and len=0x200000 behaves
> differently:
>
> - Before THP alignment: the requested region (size 0x200000) fits into
>   the gap at 0x400000, so the hint is respected.
>
> - After alignment: the logic searches for a region of size
>   0x400000 (len + PMD_SIZE) starting at 0x400000. This search fails
>   due to the mapping at 0x600000 (VMA B), and the hint is ignored,
>   falling back to arch_get_unmapped_area[_topdown]().
>
> In general, the hint is effectively ignored if there is any existing
> mapping in the range:
>
> [mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE)
>
> This changes the semantics of the mmap hint from "respect the hint if
> a sufficiently large gap exists at the requested location" to "respect
> the hint only if an additional PMD-sized gap exists beyond the
> requested size".
>
> This has performance implications for allocators that allocate their
> heap using mmap but try to keep it "as contiguous as possible" by
> using the end of the existing heap as the address hint. With the new
> behavior it's more likely to get a much less contiguous heap, adding
> extra fragmentation and performance overhead.
>
> To restore the expected behavior, don't use
> thp_get_unmapped_area_vmflags() when the user provides a hint address,
> for anonymous mappings.
>
> Note: as Yang Shi pointed out, the issue still remains for filesystems
> which are using thp_get_unmapped_area() for their get_unmapped_area()
> op. It is unclear which workloads would regress if we ignore THP
> alignment when the hint address is provided for such file-backed
> mappings -- so this fix will be handled separately.
>
> Cc: Andrew Morton
> Cc: Vlastimil Babka
> Cc: Yang Shi
> Cc: Rik van Riel
> Cc: Ryan Roberts
> Cc: Suren Baghdasaryan
> Cc: Minchan Kim
> Cc: Hans Boehm
> Cc: Lokesh Gidra
> Cc:
> Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
> Signed-off-by: Kalesh Singh
> Reviewed-by: Rik van Riel
> Reviewed-by: Vlastimil Babka
> ---
>
> Changes in v2:
> - Clarify the handling of file backed mappings, as highlighted by Yang
> - Collect Vlastimil's and Rik's Reviewed-by's
>
>  mm/mmap.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 79d541f1502b..2f01f1a8e304 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -901,6 +901,7 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>  	if (get_area) {
>  		addr = get_area(file, addr, len, pgoff, flags);
>  	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
> +		   && !addr /* no hint */

Hello, any update on this patch?

And another question about efa7df3e3bb5 ("mm: align larger anonymous
mappings on THP boundaries"): it says it aligns anonymous mappings, but
the code takes effect for file mappings too. For a filesystem without a
get_unmapped_area op, with CONFIG_TRANSPARENT_HUGEPAGE enabled, we
always try thp_get_unmapped_area_vmflags(), right?

	if (file) {
		if (file->f_op->get_unmapped_area)
			get_area = file->f_op->get_unmapped_area;
	}
	...
	if (get_area) {
		addr = get_area(file, addr, len, pgoff, flags);
	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
		   && !addr /* no hint */
		   && IS_ALIGNED(len, PMD_SIZE)) {
		/* Ensures that larger anonymous mappings are THP aligned. */
		addr = thp_get_unmapped_area_vmflags(...);
	} else {
		addr = mm_get_unmapped_area_vmflags(...);
	}

Should we limit it to not call thp_get_unmapped_area_vmflags() for
file mappings?
diff --git a/mm/mmap.c b/mm/mmap.c
index 16f8e8be01f8..854fe240d27d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -894,6 +894,7 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		addr = get_area(file, addr, len, pgoff, flags);
 	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
 		   && !addr /* no hint */
+		   && !file
 		   && IS_ALIGNED(len, PMD_SIZE)) {
 		/* Ensures that larger anonymous mappings are THP aligned. */
 		addr = thp_get_unmapped_area_vmflags(file, addr, len,

Thanks

>  		   && IS_ALIGNED(len, PMD_SIZE)) {
>  		/* Ensures that larger anonymous mappings are THP aligned. */
>  		addr = thp_get_unmapped_area_vmflags(file, addr, len,
>
> base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623
> --
> 2.47.0.338.g60cca15819-goog
>