From: Yang Shi
Date: Thu, 21 Dec 2023 11:45:51 -0800
Subject: Re: [PATCH] mm: remove VM_EXEC requirement for THP eligibility
To: Fangrui Song
Cc: Andrew Morton, linux-mm@kvack.org, Song Liu, Miaohe Lin,
    linux-kernel@vger.kernel.org, Zhouyi Zhou
References: <20231220054123.1266001-1-maskray@google.com>

On Thu, Dec 21, 2023 at 11:31 AM Yang Shi wrote:
>
> On Wed, Dec 20, 2023 at 8:53 PM Fangrui Song wrote:
> >
> > On Wed, Dec 20, 2023 at 3:42 PM Yang Shi wrote:
> > >
> > > On Tue, Dec 19, 2023 at 9:41 PM Fangrui Song wrote:
> > > >
> > > > Commit e6be37b2e7bd ("mm/huge_memory.c: add missing read-only THP
> > > > checking in transparent_hugepage_enabled()") introduced the VM_EXEC
> > > > requirement, which is not strictly needed.
> > > >
> > > > lld's default --rosegment option and GNU ld's -z separate-code option
> > > > (default on Linux/x86 since binutils 2.31) create a read-only PT_LOAD
> > > > segment without the PF_X flag, which should be eligible for THP.
> > > >
> > > > Certain architectures support medium and large code models, where
> > > > .lrodata may be placed in a separate read-only PT_LOAD segment, which
> > > > should be eligible for THP as well.
> > >
> > > Yeah, it doesn't have to be VM_EXEC. The original implementation was
> > > restricted to VM_EXEC to minimize the blast radius and the target use
> > > case is for large text segments. Out of curiosity, did you see any
> > > noticeable improvement with this change?
> >
> > Hi Yang,
> >
> > Thanks for the comment. Frankly, I am not familiar with huge pages...
> > I noticed this VM_EXEC condition when I was writing this
> > hugepage-related section in
> > https://maskray.me/blog/2023-12-17-exploring-the-section-layout-in-linker-output#transparent-huge-pages-for-mapped-files
> > (Thanks to Alexander Monakov's comment about
> > CONFIG_READ_ONLY_THP_FOR_FS in
> > https://mazzo.li/posts/check-huge-page.html).
>
> Thanks for sharing the article, learnt something about linker and loader.

BTW, the kernel should try to map the segments (the size has to be >= 2M)
to a 2M-aligned address even though the loading address is not 2M
aligned, for ext4/xfs/btrfs since v5.18. See commit 1854bc6e2420
("mm/readahead: Align file mappings for non-DAX"). Did you see this
behavior?
>
> >
> > As dTLB for read-only data is also an important optimization of
> > file-backed THP, it seems straightforward that we should drop the
> > VM_EXEC condition :)
>
> Yeah, as long as the use case is valid, it is definitely fine to lift
> the restriction.
>
> >
> > On my Arch Linux machine, the r--p page gets split if I invoke
> > madvise(__ehdr_start, HPAGE_SIZE, MADV_HUGEPAGE). I haven't figured out
> > why it behaves so in the presence of the VM_EXEC check.
>
> What do you mean by "split"? Did the THP get split into small pages? It
> depends on the address of __ehdr_start. If it is in the middle of a
> VMA, the VMA is going to be split due to the different huge page
> attributes.
>
> >
> > % g++ test.cc -o ~/tmp/test -O2 -fuse-ld=lld -Wl,-z,max-page-size=2097152 && sudo ~/tmp/test
> > __ehdr_start: 0x55f3b1c00000
> > 55f3b1c00000-55f3b1e00000 r--p 00000000 103:03 555277119 /home/ray/tmp/test
> > 55f3b1e00000-55f3b1e01000 r--p 00200000 103:03 555277119 /home/ray/tmp/test
> > 55f3b2000000-55f3b2002000 r-xp 00200000 103:03 555277119 /home/ray/tmp/test
> > 55f3b2201000-55f3b2202000 r--p 00201000 103:03 555277119 /home/ray/tmp/test
> > 55f3b2401000-55f3b2402000 rw-p 00201000 103:03 555277119 /home/ray/tmp/test
> > 55f3b3a9a000-55f3b3abb000 rw-p 00000000 00:00 0 [heap]
> >
> >
> > It'd be greatly appreciated if someone familiar with
> > CONFIG_READ_ONLY_THP_FOR_FS could provide some notes on how to use
> > this feature:)
>
> I think your blog covered all the points. If you don't mind, you could
> add some notes in Documentation/admin-guide/mm/transhuge.rst.
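The madvise() discussion quoted above can be sketched in C. This is a minimal illustration, assuming 2 MiB PMD-sized huge pages (x86-64 with 4 KiB base pages). struct range, the hpage_* helpers and madvise_would_split() are hypothetical names, not kernel or libc APIs. The split check is a simplification: it only models the case where the advised range covers part of a VMA whose huge page attributes differ from the advice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: PMD-sized huge pages are 2 MiB (x86-64, 4 KiB base pages). */
#define HPAGE_SIZE ((uint64_t)2 << 20)

/* Hypothetical helper type: half-open address range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Round addr up to the next multiple of HPAGE_SIZE. */
static uint64_t hpage_align_up(uint64_t addr)
{
    return (addr + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
}

/* Round addr down to the previous multiple of HPAGE_SIZE. */
static uint64_t hpage_align_down(uint64_t addr)
{
    return addr & ~(HPAGE_SIZE - 1);
}

/* Largest HPAGE_SIZE-aligned sub-range of r; empty (start == end) if none. */
static struct range hpage_subrange(struct range r)
{
    assert(r.start <= r.end);
    struct range out = { hpage_align_up(r.start), hpage_align_down(r.end) };
    if (out.start >= out.end)
        out.end = out.start;
    return out;
}

/* Number of whole huge pages that fit inside r. */
static uint64_t hpage_count(struct range r)
{
    struct range s = hpage_subrange(r);
    return (s.end - s.start) / HPAGE_SIZE;
}

/*
 * Simplified model: advice that covers only part of a VMA gives that part
 * different attributes, so the VMA has to be split at the boundaries.
 */
static bool madvise_would_split(struct range vma, struct range advised)
{
    return advised.start > vma.start || advised.end < vma.end;
}
```

Applied to the quoted listing, 55f3b1c00000-55f3b1e00000 holds exactly one aligned 2 MiB page, while the 4 KiB tail 55f3b1e00000-55f3b1e01000 holds none. One plausible reading of the two adjacent r--p entries is that they were a single VMA before the madvise() call on the first HPAGE_SIZE bytes, which the model reports as a split.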
>
> >
> > > >
> > > > Signed-off-by: Fangrui Song
> > > > ---
> > > >  include/linux/huge_mm.h | 1 -
> > > >  1 file changed, 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > > > index fa0350b0812a..4c9e67e9000f 100644
> > > > --- a/include/linux/huge_mm.h
> > > > +++ b/include/linux/huge_mm.h
> > > > @@ -126,7 +126,6 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
> > > >          inode = vma->vm_file->f_inode;
> > > >
> > > >          return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) &&
> > > > -               (vma->vm_flags & VM_EXEC) &&
> > > >                 !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> > > >  }
> > > >
> > > > --
> > > > 2.43.0.472.g3155946c3a-goog
> > > >
> > > >
> >
> > --
> > 宋方睿
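
The effect of the quoted hunk can be modelled in user space. The following is a rough sketch, not kernel code: struct mapping_model and both file_thp_model_* functions are hypothetical stand-ins, and the model covers only the conditions visible in the quoted hunk of file_thp_enabled().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the inputs the quoted return statement reads. */
struct mapping_model {
    bool config_read_only_thp_for_fs; /* IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) */
    bool vm_exec;                     /* vma->vm_flags & VM_EXEC */
    bool open_for_write;              /* inode_is_open_for_write(inode) */
    bool regular_file;                /* S_ISREG(inode->i_mode) */
};

/* Model of the check before the patch: VM_EXEC is required. */
static bool file_thp_model_before(const struct mapping_model *m)
{
    assert(m != NULL);
    return m->config_read_only_thp_for_fs &&
           m->vm_exec &&
           !m->open_for_write && m->regular_file;
}

/* Model of the check after the patch: the VM_EXEC term is dropped. */
static bool file_thp_model_after(const struct mapping_model *m)
{
    assert(m != NULL);
    return m->config_read_only_thp_for_fs &&
           !m->open_for_write && m->regular_file;
}
```

In this model a read-only, non-executable file mapping (such as the r--p PT_LOAD segment produced by lld's --rosegment) is rejected by the first function and accepted by the second, while an r-xp text mapping is accepted by both. A file that is open for writing is rejected either way.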