From: Yosry Ahmed
Date: Thu, 16 Jan 2025 07:35:25 -0800
Subject: Re: [PATCH 0/9 RFC] cgroup: separate rstat trees
To: Michal Koutný
Cc: Shakeel Butt, JP Kobryn, hannes@cmpxchg.org, akpm@linux-foundation.org,
    linux-mm@kvack.org, cgroups@vger.kernel.org, Tejun Heo
References: <20241224011402.134009-1-inwardvessel@gmail.com>
    <3wew3ngaqq7cjqphpqltbq77de5rmqviolyqphneer4pfzu5h5@4ucytmd6rpfa>

On Thu, Jan 16, 2025 at 7:19 AM Michal Koutný wrote:
>
> Hello.
>
> On Mon, Jan 13, 2025 at 10:25:34AM -0800, Shakeel Butt wrote:
> > > and flushing effectiveness depends on how individual readers are
> > > correlated,
> >
> > Sorry I am confused by the above statement, can you please expand on
> > what you meant by it?
> >
> > > OTOH writer correlation affects
> > > updaters when extending the update tree.
> >
> > Here I am confused about the difference between writer and updater.
>
> reader -- a call site that'd need to call cgroup_rstat_flush() to
>   calculate aggregated stats
> writer (or updater) -- a call site that calls cgroup_rstat_updated()
>   when it modifies whatever datum
>
> By correlated readers I meant that stats for multiple controllers are
> read close to each other (time-wise). First such a reader does the heavy
> lifting, consequent readers enjoy quick access.
> (With per-controller flushing, each reader would need to do the flush
> and I'm suspecting the total time non-linear wrt parts.)

In this case, I actually think it's better if every reader pays for
the flush they asked for (and only that). There is a bit of repeated
work if we read memory stats then io stats right after, but in cases
where we don't, paying to flush all subsystems because they are likely
to be flushed soon is not necessarily a good thing imo.

>
> Similarly for writers, if multiple controller's data change in short
> window, only the first one has to construct the rstat tree from top down
> to self, the other are updating the same tree.

This I agree about. If we have consecutive updates from two different
subsystems to the same cgroup, almost all the work is repeated.
Whether that causes a tangible performance difference or not is
something the numbers should show. In my experience, real regressions
on the update side are usually caught by LKP and are somewhat easy to
surface in benchmarks (I used netperf in the past).

>
> > In-kernel memcg stats readers will be unaffected most of the time with
> > this change. The only difference will be when they flush, they will only
> > flush memcg stats.
>
> That "most of the time" is what depends on how other controller's
> readers are active.

Since readers of other controllers are only in userspace (AFAICT), I
think it's unlikely that they are correlated with in-kernel memcg stat
readers in general.

>
> > Here I am assuming you meant measurements in terms of cpu cost or do you
> > have something else in mind?
>
> I have in mind something like Tejun's point 2:
> | 2. It has noticeable benefits in the targeted use cases.
>
> The cover letter mentions some old problems (which may not be problems
> nowadays with memcg flushing reworks) and it's not clear how the
> separation into per-controller trees impacts (today's) problems.
>
> (I can imagine if the problem is stated like: io.stat readers are
> unnecessarily waiting for memory.stat flushing, the benefit can be shown
> (unless io.stat readers could benefit from flushing triggered by e.g.
> memory). But I didn't get if _that_ is the problem.)

Yeah I hope/expect that numbers will show that reading memcg stats (in
userspace or the kernel) becomes a bit faster, while reading other
subsystem stats should be significantly faster (at least in some
cases). We will see how that turns out.
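
To make the reader/writer trade-off discussed above a bit more concrete,
here is a rough userspace toy model. It is not kernel code and it measures
nothing real: it only counts tree-walk steps. The class RstatModel, its
methods and the simulate() helper are hypothetical names made up for this
sketch. The model assumes that an update walks from the cgroup towards the
root until it meets a node already on the updated tree, and that flushing a
shared tree pays for every controller on each updated node, while a
per-controller tree pays only for the controller being read.

```python
class RstatModel:
    """Toy model of rstat update/flush work. Counts steps, not CPU time."""

    def __init__(self, parent, controllers, split):
        self.parent = parent            # child -> parent, root maps to None
        self.controllers = controllers
        self.split = split              # True: one updated tree per controller
        keys = controllers if split else ["shared"]
        self.updated = {key: set() for key in keys}
        self.update_steps = 0
        self.flush_steps = 0

    def _tree(self, controller):
        return self.updated[controller if self.split else "shared"]

    def mark_updated(self, cgroup, controller):
        """Writer side: link cgroup and its ancestors into the updated tree."""
        tree = self._tree(controller)
        node = cgroup
        # Stop at the first node that is already linked (assumed behaviour).
        while node is not None and node not in tree:
            tree.add(node)
            self.update_steps += 1
            node = self.parent[node]

    def flush(self, controller):
        """Reader side: visit every updated node, then empty the tree."""
        tree = self._tree(controller)
        # Shared tree: each visited node is flushed for all controllers.
        per_node = 1 if self.split else len(self.controllers)
        self.flush_steps += len(tree) * per_node
        tree.clear()


def simulate(split):
    """Two controllers update the same leaf, then only memory stats are read."""
    parent = {"root": None, "a": "root", "b": "a"}
    model = RstatModel(parent, ["memory", "io"], split)
    model.mark_updated("b", "memory")
    model.mark_updated("b", "io")
    model.flush("memory")
    return model.update_steps, model.flush_steps
```

In this model, simulate(split=False) gives 3 update steps and 6 flush steps,
while simulate(split=True) gives 6 update steps and 3 flush steps: the
writers repeat the walk with separate trees, and the lone memory reader no
longer pays for io. Whether either side matters in practice is exactly what
the real numbers would have to show.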