From: Tudor Brindus
Date: Sun, 27 Oct 2024 12:00:12 -0400
Subject: Bug/inconsistency in hugepage permission checking between mmap and shmget
To: linux-mm@kvack.org

Hi,

The documentation for both `man 2 mmap` and `man 2 shmget` states that
anonymous hugepage mappings can only be obtained if the caller has
`CAP_IPC_LOCK`, or is part of the `sysctl_hugetlb_shm_group`.

I believe that in practice this is only enforced for `shmget`, and not `mmap`.
This is true as of current master, and has been true at least since 4.x (I have
not looked further back).

`mm/mmap.c` contains, in a `MAP_HUGETLB` branch:

  file = hugetlb_file_setup(HUGETLB_ANON_FILE, len,
      VM_NORESERVE,
      HUGETLB_ANONHUGE_INODE,
      (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);

While `fs/hugetlbfs/inode.c` contains:

  static int can_do_hugetlb_shm(void)
  {
    kgid_t shm_group;
    shm_group = make_kgid(&init_user_ns, sysctl_hugetlb_shm_group);
    return capable(CAP_IPC_LOCK) || in_group_p(shm_group);
  }

  ...

  struct file *hugetlb_file_setup(const char *name, size_t size,
        vm_flags_t acctflag, int creat_flags,
        int page_size_log)
  {
  ...
    if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
      ...
      return ERR_PTR(-EPERM);
    }

i.e., it only checks `can_do_hugetlb_shm` when `creat_flags ==
HUGETLB_SHMFS_INODE`, whereas the callsite in `mm/mmap.c` passes in
`HUGETLB_ANONHUGE_INODE`.

A simple test program that tries allocating hugepage memory with `mmap` and
`shmget` while not possessing `CAP_IPC_LOCK` and not being in the
`sysctl_hugetlb_shm_group` confirms this behavior.
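
A minimal sketch of such a test follows. It is not necessarily the program
used for this report. The helper names (`try_mmap_hugetlb`,
`try_shmget_hugetlb`, `describe`) are made up for illustration, and a 2 MiB
hugepage size is assumed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>

/* Hypothetical helper: try an anonymous MAP_HUGETLB mapping of len bytes.
 * Returns 0 on success, otherwise the errno set by mmap(). */
int try_mmap_hugetlb(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)
        return errno;
    munmap(p, len);
    return 0;
}

/* Hypothetical helper: try a SHM_HUGETLB System V segment of len bytes.
 * Returns 0 on success, otherwise the errno set by shmget(). */
int try_shmget_hugetlb(size_t len)
{
    int id = shmget(IPC_PRIVATE, len, IPC_CREAT | SHM_HUGETLB | 0600);
    if (id < 0)
        return errno;
    shmctl(id, IPC_RMID, NULL); /* remove the segment right away */
    return 0;
}

/* Hypothetical helper: turn a result into a short label for printing. */
const char *describe(int err)
{
    if (err == 0)
        return "allowed";
    if (err == EPERM)
        return "denied (EPERM)";
    return strerror(err); /* any other failure, e.g. no hugepages reserved */
}
```

To reproduce, reserve a few hugepages first (for example through
`/proc/sys/vm/nr_hugepages`), then call both helpers with `2UL << 20` from
`main` as an unprivileged user and print `describe()` of each result. If the
analysis above is right, `mmap` should be reported as allowed while `shmget`
is denied with `EPERM`.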

What's the right course of action here?

- The logic in `hugetlb_file_setup` could be modified to enforce the
  permissions on `mmap` calls. This might break userspace code that's been
  relying on this working, though.
- The restriction could be removed from `shmget`.
- The inconsistency between `mmap` and `shmget` could be accepted as a fact of
  life, and the documentation fixed to match this reality.
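
For reference, the difference between the current gate and the first option
can be modelled in userspace. This is only an illustrative model, not kernel
code: the enum and both functions are hypothetical stand-ins, and the
`can_do_shm` argument stands for the result of `can_do_hugetlb_shm()`.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins, not the kernel's definitions. */
enum model_inode_kind { MODEL_SHMFS_INODE, MODEL_ANONHUGE_INODE };

/* Models the gate quoted earlier: the permission predicate is consulted
 * only for the shmget() flavour. Returns 0 or -EPERM. */
int model_file_setup(enum model_inode_kind kind, bool can_do_shm)
{
    if (kind == MODEL_SHMFS_INODE && !can_do_shm)
        return -EPERM;
    return 0;
}

/* Models the first option: the same predicate applies to both callers,
 * so the inode kind no longer matters. */
int model_file_setup_strict(enum model_inode_kind kind, bool can_do_shm)
{
    (void)kind;
    return can_do_shm ? 0 : -EPERM;
}
```

With `can_do_shm` false, the first function lets the `mmap` flavour through
and rejects only the `shmget` flavour, while the strict variant rejects both.
That is exactly the case where existing unprivileged `mmap` users would start
seeing `EPERM`.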

-- Tudor