Date: Wed, 21 May 2025 09:41:09 +0800
From: Chen Ridong
Subject: Re: [RFC next v2 0/2] ucounts: turn the atomic rlimit to percpu_counter
To: Jann Horn
Cc: akpm@linux-foundation.org, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, vbabka@suse.cz, pfalcato@suse.de, bigeasy@linutronix.de, paulmck@kernel.org, chenridong@huawei.com, roman.gushchin@linux.dev, brauner@kernel.org, pmladek@suse.com, geert@linux-m68k.org, mingo@kernel.org, rrangel@chromium.org, francesco@valla.it, kpsingh@kernel.org, guoweikang.kernel@gmail.com, link@vivo.com, viro@zeniv.linux.org.uk, neil@brown.name, nichen@iscas.ac.cn, tglx@linutronix.de, frederic@kernel.org, peterz@infradead.org, oleg@redhat.com, joel.granados@kernel.org, linux@weissschuh.net, avagin@google.com, legion@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lujialin4@huawei.com, "Serge E. Hallyn", David Howells
References: <20250519131151.988900-1-chenridong@huaweicloud.com>

On 2025/5/20 3:32, Jann Horn wrote:
> On Mon, May 19, 2025 at 3:25 PM Chen Ridong wrote:
>> From: Chen Ridong
>>
>> The will-it-scale test case signal1 [1] has been observed, and the test
>> results reveal that the signal sending system call lacks linearity.
>> To further investigate this issue, we initiated a series of tests by
>> launching varying numbers of dockers and closely monitored the throughput
>> of each individual docker. The detailed test outcomes are presented as
>> follows:
>>
>> | Dockers    |1      |4      |8      |16     |32     |64     |
>> | Throughput |380068 |353204 |308948 |306453 |180659 |129152 |
>>
>> The data clearly demonstrates a discernible trend: as the quantity of
>> dockers increases, the throughput per container progressively declines.
>
> But is that actually a problem? Do you have real workloads that
> concurrently send so many signals, or create inotify watches so
> quickly, that this has an actual performance impact?
>

Thanks, Jann.

Unfortunately, I do not have a specific scenario.

>> In-depth analysis has identified the root cause of this performance
>> degradation. The ucounts module conducts statistics on rlimit, which
>> involves a significant number of atomic operations. These atomic
>> operations, when acting on the same variable, trigger a substantial number
>> of cache misses or remote accesses, ultimately resulting in a drop in
>> performance.
>
> You're probably running into the namespace-associated ucounts here? So
> the issue is probably that Docker creates all your containers with the
> same owner UID (EUID at namespace creation), causing them all to
> account towards a single ucount, while normally outside of containers,
> each RUID has its own ucount instance?
>

Yes. Every time an rlimit count changes in a container, the parent's
rlimit count has to be updated as well, and those updates are atomic
operations, even if the containers have their own user_namespace.

Best regards,
Ridong

> Sharing of rlimits between containers is probably normally undesirable
> even without the cacheline bouncing, because it means that too much
> resource usage in one container can cause resource allocations in
> another container to fail... so I think the real problem here is at a
> higher level, in the namespace setup code. Maybe root should be able
> to create a namespace that doesn't inherit ucount limits of its owner
> UID, or something like that...
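
For readers following the thread, the accounting pattern discussed above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the kernel's ucounts code: `struct counter_node`, `charge()`, `uncharge()`, `struct local_slot`, `batched_add()` and `BATCH` are all hypothetical names invented for illustration. The first half shows why containers that share one parent all hit the same atomic variable on every charge; the second half shows the general idea behind a batched per-CPU style counter, which touches the shared variable only once per batch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one level in a counter hierarchy. */
struct counter_node {
	struct counter_node *parent;	/* NULL at the root */
	atomic_long usage;		/* shared by every child charging through this node */
	long limit;
};

/*
 * Charge n units to node and to every ancestor. If any level would
 * exceed its limit, undo the levels already charged and fail.
 * Every call performs an atomic read-modify-write on each ancestor,
 * so siblings with a common parent contend on the parent's cacheline.
 */
static bool charge(struct counter_node *node, long n)
{
	struct counter_node *iter, *undo;

	assert(n > 0);
	for (iter = node; iter; iter = iter->parent) {
		long now = atomic_fetch_add(&iter->usage, n) + n;

		if (now > iter->limit) {
			for (undo = node; ; undo = undo->parent) {
				atomic_fetch_sub(&undo->usage, n);
				if (undo == iter)
					break;
			}
			return false;
		}
	}
	return true;
}

/* Release n units from node and from every ancestor. */
static void uncharge(struct counter_node *node, long n)
{
	for (; node; node = node->parent)
		atomic_fetch_sub(&node->usage, n);
}

/* Batched alternative: one private slot per thread (or CPU). */
#define BATCH 32

struct local_slot {
	long delta;	/* changes not yet folded into the shared counter */
};

/* Accumulate locally; touch the shared counter only when a batch fills. */
static void batched_add(atomic_long *shared, struct local_slot *slot, long n)
{
	slot->delta += n;
	if (slot->delta >= BATCH || slot->delta <= -BATCH) {
		atomic_fetch_add(shared, slot->delta);
		slot->delta = 0;
	}
}
```

In this model, two children with a generous limit of their own can still fail a charge because the shared parent is full, which is the cross-container interference Jann describes. The batched variant trades exactness for fewer shared writes: the shared value can lag behind by up to one batch per slot, so a limit check close to the limit would need to sum the slots first.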