From: Jann Horn
Date: Mon, 19 May 2025 23:24:16 +0200
Subject: Re: [RFC next v2 0/2] ucounts: turn the atomic rlimit to percpu_counter
References: <20250519131151.988900-1-chenridong@huaweicloud.com>
To: Alexey Gladkov
Cc: Chen Ridong, akpm@linux-foundation.org, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, vbabka@suse.cz, pfalcato@suse.de, bigeasy@linutronix.de, paulmck@kernel.org, chenridong@huawei.com, roman.gushchin@linux.dev, brauner@kernel.org, pmladek@suse.com, geert@linux-m68k.org, mingo@kernel.org, rrangel@chromium.org, francesco@valla.it, kpsingh@kernel.org, guoweikang.kernel@gmail.com, link@vivo.com, viro@zeniv.linux.org.uk, neil@brown.name, nichen@iscas.ac.cn, tglx@linutronix.de,
 frederic@kernel.org, peterz@infradead.org, oleg@redhat.com, joel.granados@kernel.org, linux@weissschuh.net, avagin@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lujialin4@huawei.com, "Serge E. Hallyn", David Howells

On Mon, May 19, 2025 at 11:01 PM Alexey Gladkov wrote:
> On Mon, May 19, 2025 at 09:32:17PM +0200, Jann Horn wrote:
> > On Mon, May 19, 2025 at 3:25 PM Chen Ridong wrote:
> > > From: Chen Ridong
> > >
> > > The will-it-scale test case signal1 [1] has been observed, and the test
> > > results reveal that the signal sending system call lacks linearity.
> > > To further investigate this issue, we initiated a series of tests by
> > > launching varying numbers of dockers and closely monitored the throughput
> > > of each individual docker. The detailed test outcomes are presented as
> > > follows:
> > >
> > > | Dockers    |1      |4      |8      |16     |32     |64     |
> > > | Throughput |380068 |353204 |308948 |306453 |180659 |129152 |
> > >
> > > The data clearly demonstrates a discernible trend: as the quantity of
> > > dockers increases, the throughput per container progressively declines.
> >
> > But is that actually a problem? Do you have real workloads that
> > concurrently send so many signals, or create inotify watches so
> > quickly, that this has an actual performance impact?
> >
> > > In-depth analysis has identified the root cause of this performance
> > > degradation. The ucounts module conducts statistics on rlimit, which
> > > involves a significant number of atomic operations. These atomic
> > > operations, when acting on the same variable, trigger a substantial number
> > > of cache misses or remote accesses, ultimately resulting in a drop in
> > > performance.
> >
> > You're probably running into the namespace-associated ucounts here? So
> > the issue is probably that Docker creates all your containers with the
> > same owner UID (EUID at namespace creation), causing them all to
> > account towards a single ucount, while normally outside of containers,
> > each RUID has its own ucount instance?
> >
> > Sharing of rlimits between containers is probably normally undesirable
> > even without the cacheline bouncing, because it means that too much
> > resource usage in one container can cause resource allocations in
> > another container to fail... so I think the real problem here is at a
> > higher level, in the namespace setup code. Maybe root should be able
> > to create a namespace that doesn't inherit ucount limits of its owner
> > UID, or something like that...
>
> If we allow rlimits not to be inherited in the userns being created, the
> user will be able to bypass their rlimits by running a fork bomb inside
> the new userns.
>
> Or did I miss your point?

You're right, I guess it would actually still be necessary to have one
shared limit across the entire container, so rather than not having a
namespace-level ucount, maybe it would make more sense to have a
private ucount instance for a container...

(But to be clear I'm not invested in this suggestion at all, I just
looked at that patch and was wondering about alternatives if that is
actually a real performance problem...)