Date: Tue, 14 Nov 2023 08:57:58 -0800
Subject: Re: [RFC PATCH -mm 0/4] mm, security, bpf: Fine-grained control over memory policy adjustments with lsm bpf
To: Yafang Shao, Michal Hocko
Cc: akpm@linux-foundation.org, paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com, linux-mm@kvack.org, linux-security-module@vger.kernel.org, bpf@vger.kernel.org, ligang.bdlg@bytedance.com, Casey Schaufler
References: <20231112073424.4216-1-laoar.shao@gmail.com> <188dc90e-864f-4681-88a5-87401c655878@schaufler-ca.com>
From: Casey Schaufler

On 11/14/2023 3:59 AM, Yafang Shao wrote:
> On Tue, Nov 14, 2023 at 6:15 PM Michal Hocko wrote:
>> On Mon 13-11-23 11:15:06, Yafang Shao wrote:
>>> On Mon, Nov 13, 2023 at 12:45 AM Casey Schaufler wrote:
>>>> On 11/11/2023 11:34 PM, Yafang Shao wrote:
>>>>> Background
>>>>> ==========
>>>>>
>>>>> In our containerized environment, we've identified unexpected OOM events
>>>>> where the OOM-killer terminates tasks despite having ample free memory.
>>>>> This anomaly is traced back to tasks within a container using mbind(2) to
>>>>> bind memory to a specific NUMA node. When the allocated memory on this node
>>>>> is exhausted, the OOM-killer, prioritizing tasks based on oom_score,
>>>>> indiscriminately kills tasks. This becomes more critical with guaranteed
>>>>> tasks (oom_score_adj: -998) aggravating the issue.
>>>> Is there some reason why you can't fix the callers of mbind(2)?
>>>> This looks like an user space configuration error rather than a
>>>> system security issue.
>>> It appears my initial description may have caused confusion. In this
>>> scenario, the caller is an unprivileged user lacking any capabilities.
>>> While a privileged user, such as root, experiencing this issue might
>>> indicate a user space configuration error, the concerning aspect is
>>> the potential for an unprivileged user to disrupt the system easily.
>>> If this is perceived as a misconfiguration, the question arises: What
>>> is the correct configuration to prevent an unprivileged user from
>>> utilizing mbind(2)?"
>> How is this any different than a non NUMA (mbind) situation?
> In a UMA system, each gigabyte of memory carries the same cost.
> Conversely, in a NUMA architecture, opting to confine processes within
> a specific NUMA node incurs additional costs. In the worst-case
> scenario, if all containers opt to bind their memory exclusively to
> specific nodes, it will result in significant memory wastage.

That still sounds like you've misconfigured your containers such
that they expect to get more memory than is available, and that
they have more control over it than they really do.

>> You can
>> still have an unprivileged user to allocate just until the OOM triggers
>> and disrupt other workload consuming more memory. Sure the mempolicy
>> based OOM is less precise and it might select a victim with only a small
>> consumption on a target NUMA node but fundamentally the situation is
>> very similar. I do not think disallowing mbind specifically is solving a
>> real problem.
> How would you recommend addressing this more effectively?
>