From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yang Shi <shy828301@gmail.com>
Date: Tue, 23 Jan 2024 09:52:26 -0800
Subject: Re: [PATCH v1] mm: thp_get_unmapped_area must honour topdown preference
To: Ryan Roberts
Cc: Andrew Morton, Rik van Riel, Matthew Wilcox, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <20240123171420.3970220-1-ryan.roberts@arm.com>
References: <20240123171420.3970220-1-ryan.roberts@arm.com>
Content-Type: text/plain; charset="UTF-8"
On Tue, Jan 23, 2024 at 9:14 AM Ryan Roberts wrote:
>
> The addition of commit efa7df3e3bb5 ("mm: align larger anonymous
> mappings on THP boundaries") caused the "virtual_address_range" mm
> selftest to start failing on arm64. Let's fix that regression.
>
> There were 2 visible problems when running the test; 1) it takes much
> longer to execute, and 2) the test fails. Both are related:
>
> The (first part of the) test allocates as many 1GB anonymous blocks as
> it can in the low 256TB of address space, passing NULL as the addr hint
> to mmap. Before the faulty patch, all allocations were abutted and
> contained in a single, merged VMA. However, after this patch, each
> allocation is in its own VMA, and there is a 2M gap between each VMA.
> This causes the 2 problems in the test: 1) mmap becomes MUCH slower
> because there are so many VMAs to check to find a new 1G gap. 2) mmap
> fails once it hits the VMA limit (/proc/sys/vm/max_map_count). Hitting
> this limit then causes a subsequent calloc() to fail, which causes the
> test to fail.
>
> The problem is that arm64 (unlike x86) selects
> ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT. But __thp_get_unmapped_area()
> allocates len+2M then always aligns to the bottom of the discovered gap.
> That causes the 2M hole.
>
> Fix this by detecting cases where we can still achieve the alignment goal
> when moved to the top of the allocated area, if configured to prefer
> top-down allocation.
>
> While we are at it, fix thp_get_unmapped_area's use of pgoff, which
> should always be zero for anonymous mappings. Prior to the faulty
> change, while it was possible for user space to pass in pgoff!=0, the
> old mm->get_unmapped_area() handler would not use it.
> thp_get_unmapped_area() does use it, so let's explicitly zero it before
> calling the handler. This should also be the correct behavior for arches
> that define their own get_unmapped_area() handler.
>
> Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
> Closes: https://lore.kernel.org/linux-mm/1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com/
> Cc: stable@vger.kernel.org
> Signed-off-by: Ryan Roberts

Thanks for debugging this. Looks good to me.
Reviewed-by: Yang Shi <shy828301@gmail.com>

> ---
>
> Applies on top of v6.8-rc1. Would be good to get this into the next -rc.

This may have a conflict with my fix ("mm: huge_memory: don't force huge
page alignment on 32 bit") which is on mm-unstable now.

>
> Thanks,
> Ryan
>
>  mm/huge_memory.c | 10 ++++++++--
>  mm/mmap.c        |  6 ++++--
>  2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 94ef5c02b459..8c66f88e71e9 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -809,7 +809,7 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>  {
>         loff_t off_end = off + len;
>         loff_t off_align = round_up(off, size);
> -       unsigned long len_pad, ret;
> +       unsigned long len_pad, ret, off_sub;
>
>         if (off_end <= off_align || (off_end - off_align) < size)
>                 return 0;
> @@ -835,7 +835,13 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>         if (ret == addr)
>                 return addr;
>
> -       ret += (off - ret) & (size - 1);
> +       off_sub = (off - ret) & (size - 1);
> +
> +       if (current->mm->get_unmapped_area == arch_get_unmapped_area_topdown &&
> +           !off_sub)
> +               return ret + size;
> +
> +       ret += off_sub;
>         return ret;
>  }
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b78e83d351d2..d89770eaab6b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1825,15 +1825,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>                 /*
>                  * mmap_region() will call shmem_zero_setup() to create a file,
>                  * so use shmem's get_unmapped_area in case it can be huge.
> -                * do_mmap() will clear pgoff, so match alignment.
>                  */
> -               pgoff = 0;
>                 get_area = shmem_get_unmapped_area;
>         } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>                 /* Ensures that larger anonymous mappings are THP aligned. */
>                 get_area = thp_get_unmapped_area;
>         }
>
> +       /* Always treat pgoff as zero for anonymous memory. */
> +       if (!file)
> +               pgoff = 0;
> +
>         addr = get_area(file, addr, len, pgoff, flags);
>         if (IS_ERR_VALUE(addr))
>                 return addr;
> --
> 2.25.1
>