From: Yosry Ahmed
Date: Wed, 23 Aug 2023 07:55:40 -0700
Subject: Re: [PATCH 3/3] mm: memcg: use non-unified stats flushing for userspace reads
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Roman Gushchin, Shakeel Butt,
    Muchun Song, Ivan Babrou, Tejun Heo, linux-mm@kvack.org,
    cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20230821205458.1764662-1-yosryahmed@google.com>
    <20230821205458.1764662-4-yosryahmed@google.com>

On Wed, Aug 23, 2023 at 12:33 AM Michal Hocko wrote:
>
> On Tue 22-08-23 08:30:05, Yosry Ahmed wrote:
> > On Tue, Aug 22, 2023 at 2:06 AM Michal Hocko wrote:
> > >
> > > On Mon 21-08-23 20:54:58, Yosry Ahmed wrote:
> [...]
> > So to answer your question, I don't think a random user can really
> > affect the system in a significant way by constantly flushing. In
> > fact, in the test script (which I am now attaching, in case you're
> > interested), there are hundreds of threads that are reading stats of
> > different cgroups every 1s, and I don't see any negative effects on
> > in-kernel flushers in this case (reclaimers).
>
> I suspect you have missed my point.

I suspect you are right :)

> Maybe I am just misunderstanding
> the code but it seems to me that the lock dropping inside
> cgroup_rstat_flush_locked effectively allows unbounded number of
> contenders which is really dangerous when it is triggerable from the
> userspace. The number of spinners at a moment is always bound by the
> number CPUs but depending on timing many potential spinners might be
> cond_rescheded and the worst time latency to complete can be really
> high. Makes more sense?

I think I understand better now. So basically because we might drop
the lock and resched, there can be nr_cpus spinners + other spinners
that are currently scheduled away, so these will need to wait to be
scheduled and then start spinning on the lock. This may happen for one
reader multiple times during its read, which is what can cause a high
worst case latency.

I hope I understood you correctly this time. Did I?

So the logic to give up the lock and sleep was introduced by commit
0fa294fb1985 ("cgroup: Replace cgroup_rstat_mutex with a spinlock") in
4.18. It has been possible for userspace to trigger this scenario by
reading cpu.stat, which has been using rstat since then. On the memcg
side, it was also possible to trigger this behavior between commit
2d146aa3aa84 ("mm: memcontrol: switch to rstat") and commit
fd25a9e0e23b ("memcg: unify memcg stat flushing") (i.e. between 5.13
and 5.16).

I am not sure there have been any problems from this, but perhaps Tejun
can answer this better than I can.

The way I think about it is that random userspace reads will mostly be
reading their subtrees, which are generally not very large (and can be
limited), so every individual read should be cheap enough. Also,
subsequent flushes on overlapping subtrees will have very little to do
as there won't be many pending updates, so they should also be very
cheap.
So unless multiple jobs on the same machine are collectively
trying to act maliciously (purposefully or not) and concurrently spawn
multiple readers of different parts of the hierarchy (and maintain
enough activity to generate stat updates to flush), I don't think it's
a concern.

I also imagine (but haven't checked) that there is some locking at
some level that will throttle a malicious job that spawns multiple
readers to the same memory.stat file.

I hope this answers your question.

> --
> Michal Hocko
> SUSE Labs
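
For reference, the userspace read pattern discussed in this thread (many
threads, each repeatedly reading the stats file of a different cgroup)
could be approximated with something like the sketch below. It is not the
test script mentioned above; the function name read_stat_latencies and its
parameters are hypothetical, and it only records the worst read latency
each reader thread observes, which is the quantity the worst-case concern
is about.

```python
import threading
import time
from pathlib import Path


def read_stat_latencies(stat_files, rounds=1, interval=0.0):
    """Read every file in stat_files from its own thread, `rounds` times.

    Returns {path: worst observed read latency in seconds}.
    Hypothetical helper for illustration only.
    """
    worst = {}
    lock = threading.Lock()

    def reader(path):
        local_worst = 0.0
        for _ in range(rounds):
            start = time.monotonic()
            # The read itself is the interesting part: per this thread,
            # reading memory.stat is what triggers a stats flush.
            Path(path).read_bytes()
            local_worst = max(local_worst, time.monotonic() - start)
            time.sleep(interval)
        with lock:
            worst[str(path)] = local_worst

    threads = [threading.Thread(target=reader, args=(p,)) for p in stat_files]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return worst
```

Assuming cgroup v2 is mounted at /sys/fs/cgroup, passing the paths matched
by Path("/sys/fs/cgroup").glob("*/memory.stat") with rounds=10 and
interval=1.0 would roughly mimic readers polling different cgroups once a
second. Comparing the largest returned value across runs with different
reader counts gives a rough view of how worst-case read latency scales.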