From: Alexey Gladkov
To: LKML, io-uring@vger.kernel.org, Kernel Hardening, Linux Containers, linux-mm@kvack.org
Cc: Alexey Gladkov, Andrew Morton, Christian Brauner, "Eric W . Biederman", Jann Horn, Jens Axboe, Kees Cook, Linus Torvalds, Oleg Nesterov
Subject: [PATCH v6 4/7] Reimplement RLIMIT_MSGQUEUE on top of ucounts
Date: Mon, 15 Feb 2021 13:41:11 +0100
Message-Id: <8a3c7bc4c0f45d9b8313ef395f3fa180eef01d67.1613392826.git.gladkov.alexey@gmail.com>
X-Mailer: git-send-email 2.29.2

The rlimit counter is tied to the uid in the user_namespace. This allows
rlimit values to be specified in a userns even if the user has already
exceeded them globally. However, the values set in the parent
user_namespaces still cannot be exceeded.

Signed-off-by: Alexey Gladkov
---
 include/linux/sched/user.h     |  4 ----
 include/linux/user_namespace.h |  1 +
 ipc/mqueue.c                   | 29 +++++++++++++++--------------
 kernel/fork.c                  |  1 +
 kernel/ucount.c                |  1 +
 kernel/user_namespace.c        |  1 +
 6 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched/user.h b/include/linux/sched/user.h
index d33d867ad6c1..8a34446681aa 100644
--- a/include/linux/sched/user.h
+++ b/include/linux/sched/user.h
@@ -18,10 +18,6 @@ struct user_struct {
 #endif
 #ifdef CONFIG_EPOLL
 	atomic_long_t epoll_watches; /* The number of file descriptors currently watched */
-#endif
-#ifdef CONFIG_POSIX_MQUEUE
-	/* protected by mq_lock	*/
-	unsigned long mq_bytes;	/* How many bytes can be allocated to mqueue? */
 #endif
 	unsigned long locked_shm; /* How many pages of mlocked shm ? */
 	unsigned long unix_inflight;	/* How many files in flight in unix sockets */
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 0a27cd049404..52453143fe23 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -51,6 +51,7 @@ enum ucount_type {
 	UCOUNT_INOTIFY_WATCHES,
 #endif
 	UCOUNT_RLIMIT_NPROC,
+	UCOUNT_RLIMIT_MSGQUEUE,
 	UCOUNT_COUNTS,
 };
 
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index beff0cfcd1e8..05fcf067131f 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -144,7 +144,7 @@ struct mqueue_inode_info {
 	struct pid *notify_owner;
 	u32 notify_self_exec_id;
 	struct user_namespace *notify_user_ns;
-	struct user_struct *user;	/* user who created, for accounting */
+	struct ucounts *ucounts;	/* user who created, for accounting */
 	struct sock *notify_sock;
 	struct sk_buff *notify_cookie;
 
@@ -292,7 +292,6 @@ static struct inode *mqueue_get_inode(struct super_block *sb,
 		struct ipc_namespace *ipc_ns, umode_t mode,
 		struct mq_attr *attr)
 {
-	struct user_struct *u = current_user();
 	struct inode *inode;
 	int ret = -ENOMEM;
 
@@ -309,6 +308,8 @@ static struct inode *mqueue_get_inode(struct super_block *sb,
 	if (S_ISREG(mode)) {
 		struct mqueue_inode_info *info;
 		unsigned long mq_bytes, mq_treesize;
+		struct ucounts *ucounts;
+		bool overlimit;
 
 		inode->i_fop = &mqueue_file_operations;
 		inode->i_size = FILENT_SIZE;
@@ -321,7 +322,7 @@ static struct inode *mqueue_get_inode(struct super_block *sb,
 		info->notify_owner = NULL;
 		info->notify_user_ns = NULL;
 		info->qsize = 0;
-		info->user = NULL;	/* set when all is ok */
+		info->ucounts = NULL;	/* set when all is ok */
 		info->msg_tree = RB_ROOT;
 		info->msg_tree_rightmost = NULL;
 		info->node_cache = NULL;
@@ -371,19 +372,19 @@ static struct inode *mqueue_get_inode(struct super_block *sb,
 		if (mq_bytes + mq_treesize < mq_bytes)
 			goto out_inode;
 		mq_bytes += mq_treesize;
+		ucounts = current_ucounts();
 		spin_lock(&mq_lock);
-		if (u->mq_bytes + mq_bytes < u->mq_bytes ||
-		    u->mq_bytes + mq_bytes > rlimit(RLIMIT_MSGQUEUE)) {
+		overlimit = inc_rlimit_ucounts_and_test(ucounts, UCOUNT_RLIMIT_MSGQUEUE,
+				mq_bytes, rlimit(RLIMIT_MSGQUEUE));
+		if (overlimit) {
+			dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_MSGQUEUE, mq_bytes);
 			spin_unlock(&mq_lock);
 			/* mqueue_evict_inode() releases info->messages */
 			ret = -EMFILE;
 			goto out_inode;
 		}
-		u->mq_bytes += mq_bytes;
 		spin_unlock(&mq_lock);
-
-		/* all is ok */
-		info->user = get_uid(u);
+		info->ucounts = get_ucounts(ucounts);
 	} else if (S_ISDIR(mode)) {
 		inc_nlink(inode);
 		/* Some things misbehave if size == 0 on a directory */
@@ -497,7 +498,7 @@ static void mqueue_free_inode(struct inode *inode)
 static void mqueue_evict_inode(struct inode *inode)
 {
 	struct mqueue_inode_info *info;
-	struct user_struct *user;
+	struct ucounts *ucounts;
 	struct ipc_namespace *ipc_ns;
 	struct msg_msg *msg, *nmsg;
 	LIST_HEAD(tmp_msg);
@@ -520,8 +521,8 @@ static void mqueue_evict_inode(struct inode *inode)
 		free_msg(msg);
 	}
 
-	user = info->user;
-	if (user) {
+	ucounts = info->ucounts;
+	if (ucounts) {
 		unsigned long mq_bytes, mq_treesize;
 
 		/* Total amount of bytes accounted for the mqueue */
@@ -533,7 +534,7 @@ static void mqueue_evict_inode(struct inode *inode)
 					  info->attr.mq_msgsize);
 
 		spin_lock(&mq_lock);
-		user->mq_bytes -= mq_bytes;
+		dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_MSGQUEUE, mq_bytes);
 		/*
 		 * get_ns_from_inode() ensures that the
 		 * (ipc_ns = sb->s_fs_info) is either a valid ipc_ns
@@ -543,7 +544,7 @@ static void mqueue_evict_inode(struct inode *inode)
 		if (ipc_ns)
 			ipc_ns->mq_queues_count--;
 		spin_unlock(&mq_lock);
-		free_uid(user);
+		put_ucounts(ucounts);
 	}
 	if (ipc_ns)
 		put_ipc_ns(ipc_ns);
diff --git a/kernel/fork.c b/kernel/fork.c
index 812b023ecdce..0a939332efcc 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -823,6 +823,7 @@ void __init fork_init(void)
 		init_user_ns.ucount_max[i] = max_threads/2;
 
 	init_user_ns.ucount_max[UCOUNT_RLIMIT_NPROC] = task_rlimit(&init_task, RLIMIT_NPROC);
+	init_user_ns.ucount_max[UCOUNT_RLIMIT_MSGQUEUE] = task_rlimit(&init_task, RLIMIT_MSGQUEUE);
 
 #ifdef CONFIG_VMAP_STACK
 	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache",
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 2f42d2ee6e27..6fb2ebdef0bc 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -81,6 +81,7 @@ static struct ctl_table user_table[] = {
 	UCOUNT_ENTRY("max_inotify_instances"),
 	UCOUNT_ENTRY("max_inotify_watches"),
 #endif
+	{ },
 	{ },
 	{ }
 };
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 2434b13b02e5..cc90d5203acf 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -122,6 +122,7 @@ int create_user_ns(struct cred *new)
 		ns->ucount_max[i] = INT_MAX;
 	}
 	ns->ucount_max[UCOUNT_RLIMIT_NPROC] = rlimit(RLIMIT_NPROC);
+	ns->ucount_max[UCOUNT_RLIMIT_MSGQUEUE] = rlimit(RLIMIT_MSGQUEUE);
 	ns->ucounts = ucounts;
 
 	/* Inherit USERNS_SETGROUPS_ALLOWED from our parent */
-- 
2.29.2
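
Illustration only, not part of the patch: the accounting rule from the
commit message (a charge made in a user namespace must also fit under the
ceiling of every parent namespace) can be modelled in plain userspace C
roughly as below. Everything here is hypothetical: struct ns_counter,
model_try_charge() and model_uncharge() are made-up names and do not
reproduce the kernel's struct ucounts or the helpers used in the diff. The
model also checks before adding, whereas the patch increments first and
undoes the charge with dec_rlimit_ucounts() on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one per-namespace counter. */
struct ns_counter {
	struct ns_counter *parent;	/* counter in the parent namespace, NULL at the root */
	unsigned long used;		/* bytes currently charged at this level */
	unsigned long max;		/* ceiling recorded when this level was created */
};

/*
 * Charge 'bytes' at the leaf and at every ancestor.
 * Returns false, changing nothing, if any level would exceed its ceiling.
 */
static bool model_try_charge(struct ns_counter *leaf, unsigned long bytes)
{
	struct ns_counter *c;

	/* First pass: check every level without modifying anything. */
	for (c = leaf; c; c = c->parent) {
		/* Written so that neither side of the comparison can wrap. */
		if (bytes > c->max || c->used > c->max - bytes)
			return false;
	}

	/* Second pass: all levels have room, so commit the charge. */
	for (c = leaf; c; c = c->parent)
		c->used += bytes;

	return true;
}

/* Release a charge previously made with model_try_charge(). */
static void model_uncharge(struct ns_counter *leaf, unsigned long bytes)
{
	struct ns_counter *c;

	for (c = leaf; c; c = c->parent) {
		assert(c->used >= bytes);	/* never uncharge more than was charged */
		c->used -= bytes;
	}
}
```

With a root ceiling of 1000 bytes and a child ceiling of 400, a 300-byte
charge in the child is counted at both levels, and a later charge is
refused when either the child or the root would go over its ceiling.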