From mboxrd@z Thu Jan 1 00:00:00 1970
CC: alexlzhu
Subject: [PATCH] x86/sys_x86_64: fix VMA alignment for mmap file to THP
Date: Fri, 29 Jul 2022 12:42:14 -0700
Message-ID: <20220729194214.1309313-1-alexlzhu@fb.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Type: text/plain
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: alexlzhu

With CONFIG_READ_ONLY_THP_FOR_FS, the Linux kernel supports using THPs for
read-only mmapped files, such as shared libraries. However, on x86 the
kernel makes no attempt to align those mappings on 2MB boundaries, which
makes it impossible to use those THPs most of the time. This issue applies
to general file mapping THP as well as existing setups using
CONFIG_READ_ONLY_THP_FOR_FS. This is easily fixed by using the alignment
info passed to vm_unmapped_area().

The problem can be seen in /proc/PID/smaps, where THPeligible is set to 0
on mappings of eligible shared object files, as shown below.

Before this patch:

7fc6a7e18000-7fc6a80cc000 r-xp 00000000 00:1e 199856 /usr/lib64/libcrypto.so.1.1.1k
Size:           2768 kB
THPeligible:    0
VmFlags: rd ex mr mw me

With this patch the library is mapped at a 2MB-aligned address:

7fbdfe200000-7fbdfe4b4000 r-xp 00000000 00:1e 199856 /usr/lib64/libcrypto.so.1.1.1k
Size:           2768 kB
THPeligible:    1
VmFlags: rd ex mr mw me

This fixes the alignment of VMAs for any mmap of a file that has the rd
and ex permissions and size >= 2MB. The VMA alignment and THPeligible
field for shared and anonymous memory are handled separately and are thus
not affected by this change.

Signed-off-by: alexlzhu
---
 arch/x86/entry/vdso/vma.c    |  2 +-
 arch/x86/include/asm/elf.h   |  2 +-
 arch/x86/kernel/sys_x86_64.c | 29 ++++++++++++++++++-----------
 3 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 1000d457c332..da916040d2ba 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -337,7 +337,7 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
 	 * Forcibly align the final address in case we have a hardware
 	 * issue that requires alignment for performance reasons.
 	 */
-	addr = align_vdso_addr(addr);
+	addr = align_vdso_addr(addr, len);
 
 	return addr;
 }
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index cb0ff1055ab1..65a09a0e0e97 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -396,5 +396,5 @@ struct va_alignment {
 } ____cacheline_aligned;
 
 extern struct va_alignment va_align;
-extern unsigned long align_vdso_addr(unsigned long);
+extern unsigned long align_vdso_addr(unsigned long addr, unsigned long len);
 #endif /* _ASM_X86_ELF_H */
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 8cc653ffdccd..2506242e37aa 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -25,11 +25,18 @@
 /*
  * Align a virtual address to avoid aliasing in the I$ on AMD F15h.
  */
-static unsigned long get_align_mask(void)
+static unsigned long get_align_mask(unsigned long len)
 {
 	/* handle 32- and 64-bit case with a single conditional */
-	if (va_align.flags < 0 || !(va_align.flags & (2 - mmap_is_ia32())))
+	if (va_align.flags < 0 || !(va_align.flags & (2 - mmap_is_ia32()))) {
+		/*
+		 * Read-only file mappings can be mapped using transparent huge pages;
+		 * make sure that large mappings are 2MB aligned.
+		 */
+		if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && len >= PMD_SIZE)
+			return PMD_SIZE - 1;
 		return 0;
+	}
 
 	if (!(current->flags & PF_RANDOMIZE))
 		return 0;
@@ -47,16 +54,16 @@ static unsigned long get_align_mask(void)
  * value before calling vm_unmapped_area() or ORed directly to the
  * address.
  */
-static unsigned long get_align_bits(void)
+static unsigned long get_align_bits(unsigned long len)
 {
-	return va_align.bits & get_align_mask();
+	return va_align.bits & get_align_mask(len);
 }
 
-unsigned long align_vdso_addr(unsigned long addr)
+unsigned long align_vdso_addr(unsigned long addr, unsigned long len)
 {
-	unsigned long align_mask = get_align_mask();
+	unsigned long align_mask = get_align_mask(len);
 	addr = (addr + align_mask) & ~align_mask;
-	return addr | get_align_bits();
+	return addr | get_align_bits(len);
 }
 
 static int __init control_va_addr_alignment(char *str)
@@ -151,8 +158,8 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	info.align_mask = 0;
 	info.align_offset = pgoff << PAGE_SHIFT;
 	if (filp) {
-		info.align_mask = get_align_mask();
-		info.align_offset += get_align_bits();
+		info.align_mask = get_align_mask(len);
+		info.align_offset += get_align_bits(len);
 	}
 	return vm_unmapped_area(&info);
 }
@@ -209,8 +216,8 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	info.align_mask = 0;
 	info.align_offset = pgoff << PAGE_SHIFT;
 	if (filp) {
-		info.align_mask = get_align_mask();
-		info.align_offset += get_align_bits();
+		info.align_mask = get_align_mask(len);
+		info.align_offset += get_align_bits(len);
 	}
 	addr = vm_unmapped_area(&info);
 	if (!(addr & ~PAGE_MASK))
-- 
2.30.2
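
The mask selection added to get_align_mask() and the smaps addresses quoted in the commit message can be cross-checked with a small userspace sketch. This is only a model under stated assumptions, not kernel code: every `model_*` name and `MODEL_PMD_SIZE` are hypothetical, PMD_SIZE is assumed to be 2 MB (x86-64 with 4 kB base pages), and the AMD F15h and PF_RANDOMIZE paths of the real function are left out. The "after" start address is taken as 0x7fbdfe200000, which is what the listed end address and the 2768 kB size imply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: x86-64 with 4 kB base pages, so PMD_SIZE is 2 MB. */
#define MODEL_PMD_SIZE (UINT64_C(1) << 21)

/*
 * Hypothetical model of the fallback branch added to get_align_mask():
 * with the AMD F15h alignment inactive, mappings of at least PMD_SIZE
 * get a 2 MB mask and everything else gets none. The F15h and
 * PF_RANDOMIZE paths of the real function are deliberately omitted.
 */
static uint64_t model_align_mask(uint64_t len, bool thp_enabled)
{
	if (thp_enabled && len >= MODEL_PMD_SIZE)
		return MODEL_PMD_SIZE - 1;
	return 0;
}

/* Same round-up expression that align_vdso_addr() applies with its mask. */
static uint64_t model_align_up(uint64_t addr, uint64_t mask)
{
	return (addr + mask) & ~mask;
}

/* True if addr is a multiple of 2 MB. */
static bool model_pmd_aligned(uint64_t addr)
{
	return (addr & (MODEL_PMD_SIZE - 1)) == 0;
}

/* Checks against the two smaps entries quoted in the commit message. */
static void model_check_smaps_examples(void)
{
	const uint64_t before = UINT64_C(0x7fc6a7e18000);
	const uint64_t after = UINT64_C(0x7fbdfe200000);
	const uint64_t len = UINT64_C(2768) << 10;	/* Size: 2768 kB */
	const uint64_t mask = model_align_mask(len, true);

	assert(!model_pmd_aligned(before));	/* THPeligible: 0 entry */
	assert(model_pmd_aligned(after));	/* THPeligible: 1 entry */
	assert(mask == MODEL_PMD_SIZE - 1);	/* 2768 kB >= 2 MB */
	assert(model_pmd_aligned(model_align_up(before, mask)));
}
```

Calling model_check_smaps_examples() from a test harness exercises all three helpers. A mapping shorter than 2 MB gets a mask of 0 in this model, so its address is left unchanged.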