From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A4D3D591A3 for ; Mon, 18 Nov 2024 17:05:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AFB086B007B; Mon, 18 Nov 2024 12:05:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A5D2F6B0082; Mon, 18 Nov 2024 12:05:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8AE5D6B0085; Mon, 18 Nov 2024 12:05:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 61E8B6B007B for ; Mon, 18 Nov 2024 12:05:53 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 1892CACB76 for ; Mon, 18 Nov 2024 17:05:53 +0000 (UTC) X-FDA: 82799841474.30.DBC148B Received: from mail-ed1-f46.google.com (mail-ed1-f46.google.com [209.85.208.46]) by imf03.hostedemail.com (Postfix) with ESMTP id 88E4920013 for ; Mon, 18 Nov 2024 17:05:28 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=blPs1NZu; spf=pass (imf03.hostedemail.com: domain of shy828301@gmail.com designates 209.85.208.46 as permitted sender) smtp.mailfrom=shy828301@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731949350; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=RAAcT/OPi/xTEIu4o5P9ht6vtv0FiG41VgA9+8+Ke2o=; b=1Immmj/ld/S2Y0CqiOt9sR7/PmTMEhbCos268qIe53goufyKGWHlG2tCe7mxuPNFX+ZXTi azd88Oc4Xk2cLkT4Eetm/tzCroPHNmlj1JgWCCELR0fDamfLmleLM0JVluzQ2ysR6xo/gw m1VZn8l57hCf6w+kmWRbA5+Gb4eim1w= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=blPs1NZu; spf=pass (imf03.hostedemail.com: domain of shy828301@gmail.com designates 209.85.208.46 as permitted sender) smtp.mailfrom=shy828301@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731949350; a=rsa-sha256; cv=none; b=T9AOZtG7Z7InzfWWNg8bvgv+DtLfpaukvP+Bq60Ok7L+zA1MuzOWKiuCx1CxlW1HhTRmV5 1+FW1CrgCsPVCxDUn48zvVXOJQ5xGl1ZJXt1HDA/e3hiZ7AhUM7sIXGEtBjSLetxuy/38l 74MYWPXPJYdDvVZvB24KKxyvFPvZwJU= Received: by mail-ed1-f46.google.com with SMTP id 4fb4d7f45d1cf-5cb15b84544so3055119a12.2 for ; Mon, 18 Nov 2024 09:05:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731949549; x=1732554349; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=RAAcT/OPi/xTEIu4o5P9ht6vtv0FiG41VgA9+8+Ke2o=; b=blPs1NZu+SPYk48Tdix2CbgOxVqVFJW5ayisdUsHnkDLCV08gIGKoVp0N1hiFEHxNz oZBwW4sc/uNVYrnIXIwrko52qS3LYCjl7qyQONOMuijW9ggjGoe/Uo3PmXPKskfoFMV+ DJ7AwNo5mT7Zf1cL4NxmeUafhhdlvGuZ9oKcGNXwUUA4bTkjX35wq0ELq0hD3b5WwOlU Lnx+zh3cdmY6wdYT8VXkXIKlqXojBi7f4VEwGn6u1mqlaqlHubp7ys9lBOk+qo7TF6+k o+UeYG9M/M92fgW0Nxq08wfPz7wbkOX+3imQlW5xzIe9bc0TcPaAUvsqeB/SZFK8jjmD O74A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731949549; x=1732554349; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RAAcT/OPi/xTEIu4o5P9ht6vtv0FiG41VgA9+8+Ke2o=; b=ilEmnrz+fq5brF1A1IJe3HtB24nJBPShtjUcAdLGjBpEeda9k8kpzXeYDA0C2Tpmg5 FODCS8UVNGQ1Eybjcq3DGZ4SEfKPOxONuk4xZkejEHCIa6MvoqIiU0085RtLYaCzaoCX fbyTY5wznn+VepMErwCaZeyoAuGNEDD7ZFFijVEBh/s0Yak3UEfBwbCcGjK6yiHM+q4v rvo4H1pPKC8AaAiVXNrTPQlCpd5LSwBU8RnditspctTLZPiWcoUZh9sjTWU1gIXX5u2v WgCNe0JdIpVhxfHy5vIxUMv9XVGOtAi02sRnkgLNJwq+9UWA7BCoDC6OpNj96IqJYvtV Igjw== X-Forwarded-Encrypted: i=1; AJvYcCVv/pM1ey4KrsyMg5R05Rw3hCDU0TVd4+eVvCnu24XGZ2wmlwXdl+NEKWUAOBIKsaC0+Xur7iP1/w==@kvack.org X-Gm-Message-State: AOJu0YzNrrzTPOcbwzcIA2y6N5WIk8hI1OLqia0BWMRgkSljmA8rh1Tp gsuHU89A974a6P4lGncCv3bjQfp/JomIXmsjOFOmxi/fybW34ojNbuUO0q/okBHNToy6YmDEeVc 6XrUaOgELG8/X2FfZ3KSB4TtAN8U= X-Google-Smtp-Source: AGHT+IFT1a0ahsVwQr/ntY5jBpOYCzQ75ZgH6KFnj0LUci0TFmKYQgqclum4n/5C7Xi5p6jFSBhTk8VTOpuyk1gKTg8= X-Received: by 2002:a05:6402:35cd:b0:5cf:9517:e3e3 with SMTP id 4fb4d7f45d1cf-5cf9517e555mr10913313a12.25.1731949547556; Mon, 18 Nov 2024 09:05:47 -0800 (PST) MIME-Version: 1.0 References: <20241115215256.578125-1-kaleshsingh@google.com> In-Reply-To: From: Yang Shi Date: Mon, 18 Nov 2024 09:05:36 -0800 Message-ID: Subject: Re: [PATCH] mm: Respect mmap hint address when aligning for THP To: Vlastimil Babka Cc: Kalesh Singh , kernel-team@android.com, android-mm@google.com, Andrew Morton , Yang Shi , Rik van Riel , Ryan Roberts , Suren Baghdasaryan , Minchan Kim , Hans Boehm , Lokesh Gidra , stable@vger.kernel.org, "Liam R. Howlett" , Lorenzo Stoakes , Jann Horn , linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 88E4920013 X-Stat-Signature: dfs3a798urwyqe65bkdrk3bhishadgim X-Rspam-User: X-HE-Tag: 1731949528-744325 X-HE-Meta: U2FsdGVkX18sbe8u4JncGaZSOGnLejTG9iFg0e8Ol08hT/jyu5wYb6vLq0uGLkQFx6+6bZFINt4JKEXkaoVob21VucVtttDzg2k93lI1U5vmkOLSNDgLssotv2rl6896oCZlG4Pa0zFNo0R/Qq3GYfxPuOR34QlsIUz9UXdZQLkfzFciuEAhOd0cooSK4a+br6ZjKe2Zy7KqeRSXIt7KGUlpe1Ye8RvAuv3A8NLrDihA0eEP0Rd+dbyyHJAl/iH6gTHP5uE6ybFCZcIrMYONI6BDrpKp4QlFEx1ynWqdibMFdbLA38+rXqZ5JUGWasSmT06zGHAto9bVa+cQWlnu/w3Dm61FK6QUi4UoQTbDBLZ7weHa7wreRMwfZJU2AsKBlq4OdIgmydC7VEKllPkavyMfl/P+pMKBFbjv4/HZ+9eGeQReOaYGDuEkJxEfybjekdtmZorWeVuTEar1VTd+S8ag0ddVbMYzKURfxqJKiGpC5hnaD5TEtNge6Zk64ni/AY7JSoAIRtWsqFriW5/6AUF+fPAdlvEpIb7CSbc486mEt2ROfOSQJ6n8zKB8XtmjuLk0ju0kTzwyIao+JQhdc/VBYeZV744uhRDsxbc3BIfCT+LlYGe9X2a7w9+Oszp9VArOd0RurWCpMkUglDsD+c1oqOJSH14bYCWf8iRLZrSvp+8MJ512JLTfMqnxfiTDFXCxcuZCr9N+x/JbzZHQnbLblItmyZoe4ZuNwus2wSY5Nw+u2tRwmwqEDs5i9I+R9G3iTwW+O4B+pd/ONJJ2+VMReWCZ6qLog3Fwz0c3/QMg73DLyB5pFZ1c1PcfleCmHPnOcAT+NpCocmQQ6tWT5BYJTbzHRHfc9Ya3vGvJ54BFU5HzfypzKCpWl2eYFLBkvpE5CKqrIG5iLwFYbv0RbaXe4VF41kaocd4eui1ayWUEVb3Q98Ml9IkrHA+Id6h78D36Z7pfS4vah9xvEXX oYKw8+Zl SQf2ujavUYZw6f8Ovz6U1mHEkDsy8O4FovZHOr2RW43PxCLH8Xhp3rfoL6s+Fnt/z1+gYk5tPZZMShdoemf+fq5SZ/LS2ZKgirGeLEKleoqTK5nbQoh9/1voSLA6fSGOfkdrAVf0lcaLALa8pH5Q6R5+LSZW35vy3zD2Ke5X1p29TIr8nA6xoghtEr6Y74RggP3cKAkq7QwRcTwnpYzj8hwp99/N88yIVNmFjv8msMUIPq7cPPXAl1zLyB/16OyITgSGRDGgyDZpfm53MWxYgpcfqFv6DBo1CrWWzwVv8nYh+GEpKol2Siv3yLClCfbRN1kBxcxWSR4QBD/JU0Uo7URAQXhegFfHuHa1Wo7U18b1KGOmwKiCtRtUrIJUA84C4lPyc203cAmOPzidXfXk5RnWg9x+qQuzS7nSwM/csuPb/Lpees6ZlYg2AtSqx6cIIAvafzrUZ+MoRsItEw/Z50/rFcNwo8zK2ISW/ZqcVTEX7Jk7N4ArwquttPqQVHtsm8x0Se9+3/7D0k5QOVJLeGrfYFxGNfWgBdXEpuCokT78uwH2L4ulketQqjDGI35SDcMyb X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Sun, Nov 17, 2024 at 3:12=E2=80=AFAM Vlastimil Babka wr= ote: > > On 11/15/24 23:15, Yang Shi wrote: > > On Fri, Nov 15, 2024 at 1:52=E2=80=AFPM Kalesh Singh wrote: > >> > >> Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP > >> boundaries") updated __get_unmapped_area() to align the start address > >> for the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=3Dy. > >> > >> It does this by effectively looking up a region that is of size, > >> request_size + PMD_SIZE, and aligning up the start to a PMD boundary. > >> > >> Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment > >> on 32 bit") opted out of this for 32bit due to regressions in mmap bas= e > >> randomization. > >> > >> Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous > >> mappings to PMD-aligned sizes") restricted this to only mmap sizes tha= t > >> are multiples of the PMD_SIZE due to reported regressions in some > >> performance benchmarks -- which seemed mostly due to the reduced spati= al > >> locality of related mappings due to the forced PMD-alignment. > >> > >> Another unintended side effect has emerged: When a user specifies an m= map > >> hint address, the THP alignment logic modifies the behavior, potential= ly > >> ignoring the hint even if a sufficiently large gap exists at the reque= sted > >> hint location. > >> > >> Example Scenario: > >> > >> Consider the following simplified virtual address (VA) space: > >> > >> ... > >> > >> 0x200000-0x400000 --- VMA A > >> 0x400000-0x600000 --- Hole > >> 0x600000-0x800000 --- VMA B > >> > >> ... > >> > >> A call to mmap() with hint=3D0x400000 and len=3D0x200000 behaves diffe= rently: > >> > >> - Before THP alignment: The requested region (size 0x200000) fits in= to > >> the gap at 0x400000, so the hint is respected. > >> > >> - After alignment: The logic searches for a region of size > >> 0x400000 (len + PMD_SIZE) starting at 0x400000. > >> This search fails due to the mapping at 0x600000 (VMA B), and the = hint > >> is ignored, falling back to arch_get_unmapped_area[_topdown](). > > Hmm looks like the search is not done in the optimal way regardless of > whether or not it ignores a hint - it should be able to find the hole, no= ? > > >> In general the hint is effectively ignored, if there is any > >> existing mapping in the below range: > >> > >> [mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE) > >> > >> This changes the semantics of mmap hint; from ""Respect the hint if a > >> sufficiently large gap exists at the requested location" to "Respect t= he > >> hint only if an additional PMD-sized gap exists beyond the requested s= ize". > >> > >> This has performance implications for allocators that allocate their h= eap > >> using mmap but try to keep it "as contiguous as possible" by using the > >> end of the exisiting heap as the address hint. With the new behavior > >> it's more likely to get a much less contiguous heap, adding extra > >> fragmentation and performance overhead. > >> > >> To restore the expected behavior; don't use thp_get_unmapped_area_vmfl= ags() > >> when the user provided a hint address. > > Agreed, the hint should take precendence. > > > Thanks for fixing it. I agree we should respect the hint address. But > > this patch actually just fixed anonymous mapping and the file mappings > > which don't support thp_get_unmapped_area(). So I think you should > > move the hint check to __thp_get_unmapped_area(). > > > > And Vlastimil's fix d4148aeab412 ("mm, mmap: limit THP alignment of > > anonymous mappings to PMD-aligned sizes") should be moved to there too > > IMHO. > > This was brought up, but I didn't want to do it as part of the stable fix= as > that would change even situations that Rik's change didn't. > If the mmap hint change is another stable hotfix, I wouldn't conflate it > either. But we can try it for further development. But careful about just > moving the code as-is, the file-based mappings are different than anonymo= us > memory and I believe file offsets matter: > > https://lore.kernel.org/all/9d7c73f6-1e1a-458b-93c6-3b44959022e0@suse.cz/ > > https://lore.kernel.org/all/5f7a49e8-0416-4648-a704-a7a67e8cd894@suse.cz/ Did some research about the history of the code, I found this commit: 97d3d0f9a1cf ("mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD alignment"), it tried to fix "the function would not try to allocate PMD-aligned area if *any* hint address specified." It was for file mapping back then since anonymous mapping THP alignment was not supported yet. But it seems like this patch somehow tried to do something reverse. It may not be correct either. With Vlastimis's fix, we just try to make the address THP aligned for anonymous mapping when the size is PMD aligned. So we don't need to take into account the padding for anonymous mapping anymore. So IIUC we should do something like: @@ -1085,7 +1085,11 @@ static unsigned long __thp_get_unmapped_area(struct file *filp, if (off_end <=3D off_align || (off_end - off_align) < size) return 0; - len_pad =3D len + size; + if (filp) + len_pad =3D len + size; + else + len_pad =3D len; + if (len_pad < len || (off + len_pad) < off) return 0; > > >> Cc: Andrew Morton > >> Cc: Vlastimil Babka > >> Cc: Yang Shi > >> Cc: Rik van Riel > >> Cc: Ryan Roberts > >> Cc: Suren Baghdasaryan > >> Cc: Minchan Kim > >> Cc: Hans Boehm > >> Cc: Lokesh Gidra > >> Cc: > >> Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP bound= aries") > >> Signed-off-by: Kalesh Singh > > Reviewed-by: Vlastimil Babka > > >> --- > >> mm/mmap.c | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/mm/mmap.c b/mm/mmap.c > >> index 79d541f1502b..2f01f1a8e304 100644 > >> --- a/mm/mmap.c > >> +++ b/mm/mmap.c > >> @@ -901,6 +901,7 @@ __get_unmapped_area(struct file *file, unsigned lo= ng addr, unsigned long len, > >> if (get_area) { > >> addr =3D get_area(file, addr, len, pgoff, flags); > >> } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) > >> + && !addr /* no hint */ > >> && IS_ALIGNED(len, PMD_SIZE)) { > >> /* Ensures that larger anonymous mappings are THP alig= ned. */ > >> addr =3D thp_get_unmapped_area_vmflags(file, addr, len= , > >> > >> base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623 > >> -- > >> 2.47.0.338.g60cca15819-goog > >> >