From: Yang Shi
Date: Mon, 22 Jan 2024 12:20:25 -0800
Subject: Re: [RESEND PATCH] mm: align larger anonymous mappings on THP boundaries
To: Ryan Roberts
Cc: Matthew Wilcox, Yang Shi, riel@surriel.com, cl@linux.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20231214223423.1133074-1-yang@os.amperecomputing.com> <1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com>

On Mon, Jan 22, 2024 at 3:37 AM Ryan Roberts wrote:
>
> On 20/01/2024 16:39, Matthew Wilcox wrote:
> > On Sat, Jan 20, 2024 at 12:04:27PM +0000, Ryan Roberts wrote:
> >> However, after this patch, each allocation is in its own VMA, and there is a 2M
> >> gap between each VMA.
> >> This causes 2 problems: 1) mmap becomes MUCH slower
> >> because there are so many VMAs to check to find a new 1G gap. 2) It fails once
> >> it hits the VMA limit (/proc/sys/vm/max_map_count). Hitting this limit then
> >> causes a subsequent calloc() to fail, which causes the test to fail.
> >>
> >> Looking at the code, I think the problem is that arm64 selects
> >> ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT. But __thp_get_unmapped_area() allocates
> >> len+2M then always aligns to the bottom of the discovered gap. That causes the
> >> 2M hole. As far as I can see, x86 allocates bottom up, so you don't get a hole.
> >
> > As a quick hack, perhaps
> > #ifdef ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
> > take-the-top-half
> > #else
> > current-take-bottom-half-code
> > #endif
> >
> > ?
>
> There is a general problem though that there is a trade-off between abutting
> VMAs, and aligning them to PMD boundaries. This patch has decided that in
> general the latter is preferable. The case I'm hitting is special though, in
> that both requirements could be achieved but currently are not.
>
> The below fixes it, but I feel like there should be some bitwise magic that
> would give the correct answer without the conditional - but my head is gone and
> I can't see it. Any thoughts?
>
> Beyond this, though, there is also a latent bug where the offset provided to
> mmap() is carried all the way through to the get_unmapped_area()
> implementation, even for MAP_ANONYMOUS - I'm pretty sure we should be
> force-zeroing it for MAP_ANONYMOUS? Certainly before this change, for arches
> that use the default get_unmapped_area(), any non-zero offset would not have
> been used. But this change starts using it, which is incorrect. That said, there
> are some arches that override the default get_unmapped_area() and do use the
> offset. So I'm not sure if this is a bug or a feature that user space can pass
> an arbitrary value to the implementation for anon memory??
>
> Finally, the second test failure I reported (ksm_tests) is actually caused by a
> bug in the test code, but provoked by this change. So I'll send out a fix for
> the test code separately.

Thanks for figuring this out.

>
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 4f542444a91f..68ac54117c77 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -632,7 +632,7 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>  {
>  	loff_t off_end = off + len;
>  	loff_t off_align = round_up(off, size);
> -	unsigned long len_pad, ret;
> +	unsigned long len_pad, ret, off_sub;
>
>  	if (off_end <= off_align || (off_end - off_align) < size)
>  		return 0;
> @@ -658,7 +658,13 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>  	if (ret == addr)
>  		return addr;
>
> -	ret += (off - ret) & (size - 1);
> +	off_sub = (off - ret) & (size - 1);
> +
> +	if (current->mm->get_unmapped_area == arch_get_unmapped_area_topdown &&
> +	    !off_sub)
> +		return ret + size;
> +
> +	ret += off_sub;
>  	return ret;
>  }