From: Alexey Gladkov
To: LKML, io-uring@vger.kernel.org, Kernel Hardening, Linux Containers, linux-mm@kvack.org
Cc: Alexey Gladkov, Andrew Morton, Christian Brauner, "Eric W. Biederman", Jann Horn, Jens Axboe, Kees Cook, Linus Torvalds, Oleg Nesterov
Subject: [PATCH v4 1/7] Add a reference to ucounts for each cred
Date: Fri, 22 Jan 2021 14:00:10 +0100

For RLIMIT_NPROC and some other rlimits the user_struct that holds the
global limit is kept alive for the lifetime of a process by keeping it in
struct cred. Add a ucounts reference to struct cred, so that
RLIMIT_NPROC can switch from using a per user limit to using a per user
per user namespace limit.

Signed-off-by: Alexey Gladkov
---
 include/linux/cred.h           |  1 +
 include/linux/user_namespace.h |  7 ++++--
 kernel/cred.c                  | 20 +++++++++++++--
 kernel/ucount.c                | 46 ++++++++++++++++++++++++++--------
 kernel/user_namespace.c        |  1 +
 5 files changed, 61 insertions(+), 14 deletions(-)

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 18639c069263..307744fcc387 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -144,6 +144,7 @@ struct cred {
 #endif
 	struct user_struct *user; /* real user ID subscription */
 	struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
+	struct ucounts *ucounts;
 	struct group_info *group_info; /* supplementary groups for euid/fsgid */
 	/* RCU deletion */
 	union {
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 64cf8ebdc4ec..4cf93f9f93a6 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -85,7 +85,7 @@ struct user_namespace {
 	struct ctl_table_header *sysctls;
 #endif
 	struct ucounts *ucounts;
-	int ucount_max[UCOUNT_COUNTS];
+	long ucount_max[UCOUNT_COUNTS];
 } __randomize_layout;
 
 struct ucounts {
@@ -93,7 +93,7 @@ struct ucounts {
 	struct user_namespace *ns;
 	kuid_t uid;
 	int count;
-	atomic_t ucount[UCOUNT_COUNTS];
+	atomic_long_t ucount[UCOUNT_COUNTS];
 };
 
 extern struct user_namespace init_user_ns;
@@ -102,6 +102,9 @@ bool setup_userns_sysctls(struct user_namespace *ns);
 void retire_userns_sysctls(struct user_namespace *ns);
 struct ucounts *inc_ucount(struct user_namespace *ns, kuid_t uid, enum ucount_type type);
 void dec_ucount(struct ucounts *ucounts, enum ucount_type type);
+struct ucounts *get_ucounts(struct ucounts *ucounts);
+void put_ucounts(struct ucounts *ucounts);
+void set_cred_ucounts(struct cred *cred, struct user_namespace *ns, kuid_t uid);
 
 #ifdef CONFIG_USER_NS
 
diff --git a/kernel/cred.c b/kernel/cred.c
index 421b1149c651..9473e71e784c 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -119,6 +119,8 @@ static void put_cred_rcu(struct rcu_head *rcu)
 	if (cred->group_info)
 		put_group_info(cred->group_info);
 	free_uid(cred->user);
+	if (cred->ucounts)
+		put_ucounts(cred->ucounts);
 	put_user_ns(cred->user_ns);
 	kmem_cache_free(cred_jar, cred);
 }
@@ -144,6 +146,9 @@ void __put_cred(struct cred *cred)
 	BUG_ON(cred == current->cred);
 	BUG_ON(cred == current->real_cred);
 
+	if (cred->ucounts)
+		BUG_ON(cred->ucounts->ns != cred->user_ns);
+
 	if (cred->non_rcu)
 		put_cred_rcu(&cred->rcu);
 	else
@@ -270,6 +275,7 @@ struct cred *prepare_creds(void)
 	get_group_info(new->group_info);
 	get_uid(new->user);
 	get_user_ns(new->user_ns);
+	get_ucounts(new->ucounts);
 
 #ifdef CONFIG_KEYS
 	key_get(new->session_keyring);
@@ -363,6 +369,7 @@ int copy_creds(struct task_struct *p, unsigned long clone_flags)
 		ret = create_user_ns(new);
 		if (ret < 0)
 			goto error_put;
+		set_cred_ucounts(new, new->user_ns, new->euid);
 	}
 
 #ifdef CONFIG_KEYS
@@ -485,8 +492,11 @@ int commit_creds(struct cred *new)
 	 * in set_user().
 	 */
 	alter_cred_subscribers(new, 2);
-	if (new->user != old->user)
-		atomic_inc(&new->user->processes);
+	if (new->user != old->user || new->user_ns != old->user_ns) {
+		if (new->user != old->user)
+			atomic_inc(&new->user->processes);
+		set_cred_ucounts(new, new->user_ns, new->euid);
+	}
 	rcu_assign_pointer(task->real_cred, new);
 	rcu_assign_pointer(task->cred, new);
 	if (new->user != old->user)
@@ -661,6 +671,11 @@ void __init cred_init(void)
 	/* allocate a slab in which we can store credentials */
 	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
+	/*
+	 * This is needed here because this is the first cred and there is no
+	 * ucount reference to copy.
+	 */
+	set_cred_ucounts(&init_cred, &init_user_ns, GLOBAL_ROOT_UID);
 }
 
 /**
@@ -704,6 +719,7 @@ struct cred *prepare_kernel_cred(struct task_struct *daemon)
 	get_uid(new->user);
 	get_user_ns(new->user_ns);
 	get_group_info(new->group_info);
+	get_ucounts(new->ucounts);
 
 #ifdef CONFIG_KEYS
 	new->session_keyring = NULL;
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 11b1596e2542..8d3cf7369ee7 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -125,7 +125,7 @@ static struct ucounts *find_ucounts(struct user_namespace *ns, kuid_t uid, struc
 	return NULL;
 }
 
-static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
+static struct ucounts *__get_ucounts(struct user_namespace *ns, kuid_t uid)
 {
 	struct hlist_head *hashent = ucounts_hashentry(ns, uid);
 	struct ucounts *ucounts, *new;
@@ -160,7 +160,7 @@ static struct ucounts *get_ucounts(struct user_namespace *ns, kuid_t uid)
 	return ucounts;
 }
 
-static void put_ucounts(struct ucounts *ucounts)
+void put_ucounts(struct ucounts *ucounts)
 {
 	unsigned long flags;
 
@@ -175,14 +175,40 @@ static void put_ucounts(struct ucounts *ucounts)
 	kfree(ucounts);
 }
 
-static inline bool atomic_inc_below(atomic_t *v, int u)
+struct ucounts *get_ucounts(struct ucounts *ucounts)
 {
-	int c, old;
-	c = atomic_read(v);
+	unsigned long flags;
+
+	if (ucounts) {
+		spin_lock_irqsave(&ucounts_lock, flags);
+		if (ucounts->count == INT_MAX)
+			WARN_ONCE(1, "ucounts: counter has reached its maximum value");
+		else
+			ucounts->count += 1;
+		spin_unlock_irqrestore(&ucounts_lock, flags);
+	}
+
+	return ucounts;
+}
+
+void set_cred_ucounts(struct cred *cred, struct user_namespace *ns, kuid_t uid)
+{
+	struct ucounts *old = cred->ucounts;
+	if (old && old->ns == ns && uid_eq(old->uid, uid))
+		return;
+	cred->ucounts = __get_ucounts(ns, uid);
+	if (old)
+		put_ucounts(old);
+}
+
+static inline bool atomic_long_inc_below(atomic_long_t *v, int u)
+{
+	long c, old;
+	c = atomic_long_read(v);
 	for (;;) {
 		if (unlikely(c >= u))
 			return false;
-		old = atomic_cmpxchg(v, c, c+1);
+		old = atomic_long_cmpxchg(v, c, c+1);
 		if (likely(old == c))
 			return true;
 		c = old;
@@ -194,19 +220,19 @@ struct ucounts *inc_ucount(struct user_namespace *ns, kuid_t uid,
 {
 	struct ucounts *ucounts, *iter, *bad;
 	struct user_namespace *tns;
-	ucounts = get_ucounts(ns, uid);
+	ucounts = __get_ucounts(ns, uid);
 	for (iter = ucounts; iter; iter = tns->ucounts) {
 		int max;
 		tns = iter->ns;
 		max = READ_ONCE(tns->ucount_max[type]);
-		if (!atomic_inc_below(&iter->ucount[type], max))
+		if (!atomic_long_inc_below(&iter->ucount[type], max))
 			goto fail;
 	}
 	return ucounts;
 fail:
 	bad = iter;
 	for (iter = ucounts; iter != bad; iter = iter->ns->ucounts)
-		atomic_dec(&iter->ucount[type]);
+		atomic_long_dec(&iter->ucount[type]);
 
 	put_ucounts(ucounts);
 	return NULL;
@@ -216,7 +242,7 @@ void dec_ucount(struct ucounts *ucounts, enum ucount_type type)
 {
 	struct ucounts *iter;
 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
-		int dec = atomic_dec_if_positive(&iter->ucount[type]);
+		int dec = atomic_long_dec_if_positive(&iter->ucount[type]);
 		WARN_ON_ONCE(dec < 0);
 	}
 	put_ucounts(ucounts);
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index af612945a4d0..4b8a4468d391 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -1280,6 +1280,7 @@ static int userns_install(struct nsset *nsset, struct ns_common *ns)
 
 	put_user_ns(cred->user_ns);
 	set_cred_user_ns(cred, get_user_ns(user_ns));
+	set_cred_ucounts(cred, user_ns, cred->euid);
 
 	return 0;
 }
-- 
2.29.2
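
For readers who want to experiment outside the kernel, the bounded increment in atomic_long_inc_below() and the charge-then-roll-back walk in inc_ucount() can be approximated in userspace with C11 atomics. This is only a rough sketch, not the kernel code: struct level, inc_below(), charge() and uncharge() are hypothetical names, the parent pointer stands in for the ns->ucounts chain, and the limit is passed as a long here, whereas the helper in the patch declares it as int.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one ucounts level; not the kernel's struct. */
struct level {
	struct level *parent;	/* next level up, NULL at the root */
	atomic_long count;	/* current usage at this level */
	long max;		/* limit at this level */
};

/* Increment *v only while it is strictly below u (compare-and-swap loop). */
static bool inc_below(atomic_long *v, long u)
{
	long c = atomic_load(v);

	for (;;) {
		if (c >= u)
			return false;
		/* On failure, c is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &c, c + 1))
			return true;
	}
}

/* Charge every level from leaf to root; undo the partial charge on failure. */
static bool charge(struct level *leaf)
{
	struct level *iter, *bad;

	for (iter = leaf; iter; iter = iter->parent) {
		if (!inc_below(&iter->count, iter->max))
			goto fail;
	}
	return true;
fail:
	bad = iter;
	for (iter = leaf; iter != bad; iter = iter->parent)
		atomic_fetch_sub(&iter->count, 1);
	return false;
}

/* Release one charge at every level from leaf to root. */
static void uncharge(struct level *leaf)
{
	for (struct level *iter = leaf; iter; iter = iter->parent) {
		long old = atomic_fetch_sub(&iter->count, 1);
		/* The patch warns on underflow instead; this sketch asserts. */
		assert(old > 0);
	}
}
```

With a root limited to 1 and a child limited to 5, a first charge(&child) succeeds at both levels. A second one fails at the root and rolls the child back, so both counters stay at 1.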