From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E0A5C4167B for ; Mon, 4 Dec 2023 20:13:07 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 96E8D6B02F0; Mon, 4 Dec 2023 15:13:06 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 91ED06B02F1; Mon, 4 Dec 2023 15:13:06 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7E6136B02F3; Mon, 4 Dec 2023 15:13:06 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 6B1C56B02F0 for ; Mon, 4 Dec 2023 15:13:06 -0500 (EST) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 3E9A8A032C for ; Mon, 4 Dec 2023 20:13:06 +0000 (UTC) X-FDA: 81530234772.13.83FD420 Received: from mail-lj1-f169.google.com (mail-lj1-f169.google.com [209.85.208.169]) by imf29.hostedemail.com (Postfix) with ESMTP id 5A5F0120019 for ; Mon, 4 Dec 2023 20:13:04 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=VjoPnsrB; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf29.hostedemail.com: domain of yosryahmed@google.com designates 209.85.208.169 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1701720784; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=EG2n1RagWpJyioFCtKEQGNRIi6T2whr4+NS0jQsnqJc=; b=41KXmrBVyA735ARAqQPGWstO5iCRdhZqoZEmERBBtAh12wKDUsUIym56F7hCLBcNTMRXpw jgjlJiPAUrT7GQ7tyRZE4Subdud8EylPAIL+/G5BxThNRKe1LwE1EJFg3I0S6e+LTfN7Sn ISdfdhXpyS2O9k9iU9S6juK5OeoHeTo= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=VjoPnsrB; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf29.hostedemail.com: domain of yosryahmed@google.com designates 209.85.208.169 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1701720784; a=rsa-sha256; cv=none; b=4iFooosIA1mAlJagaV+zN7B/fyFo80KKQSs+JBDz3fp7m5+w15U3jkfEDAvNFCRpOxtJQZ nTLIpRnZpPAGCoHQGSsK9v8B9vY5vX0UeqYB42mncsLn2E5yNtst8Z7uD4Hg90kNmKabYX mtsd3iXeh7Lsi2r5VDXWKEP0d3TzFr4= Received: by mail-lj1-f169.google.com with SMTP id 38308e7fff4ca-2c9f559b82cso24023221fa.0 for ; Mon, 04 Dec 2023 12:13:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701720782; x=1702325582; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=EG2n1RagWpJyioFCtKEQGNRIi6T2whr4+NS0jQsnqJc=; b=VjoPnsrBostEKLPTvvRUdfW9Hj8F3VjcxRGW8/I+kKS68AspwiN4ddRCLxfvMo7MGb 61u3fH+1jmYPm+zzDOtmug4OtzQIWCJNN3qTY7tds68sFGjSI1xS/RpwZmjqE6FtZEKQ yojaMjnRNKI97U12v2HJVQ7F2UFZX/QOn6Bj/KfBbf6G6r1cw9419+bc5ll2aMNnGgBZ +xENNK3hz6A++LqBM7inSUTp4j7+pZ2Xw6DpRcIpF47V2j5Z/xwZEeInv9FB733gdh3s K6QFo0RtaIXWvnQYrStD1ERKwLXVMuPTNGP3IK2iKTf8mhlO776UDQI9EWDMchEfrbvk DVwA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701720782; x=1702325582; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=EG2n1RagWpJyioFCtKEQGNRIi6T2whr4+NS0jQsnqJc=; b=qPoxiFTS0AEfr7yt3n/EqgPGZhFR8yXTKMiqyi0qbvjATgATC74ByM28j8LFPInilK coU7zkzWFCLoLIkdVp4NTxCoRuVsgvJ5YkmTvcSyTQRgkYVSDUc3UDg6OaTDzQ8xXuHM 3dxU9jAbTnfofXeyNy83hAJRLlO3nfRJZizP8l1eVK4KgE7lD7+kILQMgNMOPRDKrvr0 uhO233efNL5lIry0TJ93JY/ywmqXCFMKFsQA517mpG4V0Ey3zHDU0yIkeqVmfpKHOopy 3sx2g65jOTYXEXZH/Yhjk7FcXTQgWELsGwPfaGbakU2N+XxuW8ZNSKgh+8tE4/4dQrLR HRuQ== X-Gm-Message-State: AOJu0YziNVjpk2pLKB3XH6US8O+i1aD73exrvFrg2OgT+9Zi3PhQ+Ls4 1ZDdU73i3lY3RVEaebsipkG+UvsnH5vKwMwFkDAxPg== X-Google-Smtp-Source: AGHT+IEAqW8tj2cjKzLL1LEqxk6D1GCavzEx5M1oQoWziuX/KyNNqhq2NyR/1wV/bzk3AHA7UVHZ98keWs1mbybvl0w= X-Received: by 2002:a2e:9a87:0:b0:2c9:efa3:e1e8 with SMTP id p7-20020a2e9a87000000b002c9efa3e1e8mr1610758lji.33.1701720782130; Mon, 04 Dec 2023 12:13:02 -0800 (PST) MIME-Version: 1.0 References: <20231129032154.3710765-1-yosryahmed@google.com> <20231129032154.3710765-6-yosryahmed@google.com> <20231202083129.3pmds2cddy765szr@google.com> In-Reply-To: <20231202083129.3pmds2cddy765szr@google.com> From: Yosry Ahmed Date: Mon, 4 Dec 2023 12:12:25 -0800 Message-ID: Subject: Re: [mm-unstable v4 5/5] mm: memcg: restore subtree stats flushing To: Shakeel Butt Cc: Andrew Morton , Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Ivan Babrou , Tejun Heo , =?UTF-8?Q?Michal_Koutn=C3=BD?= , Waiman Long , kernel-team@cloudflare.com, Wei Xu , Greg Thelen , Domenico Cerasuolo , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 5A5F0120019 X-Stat-Signature: q84agb78r5ar9tsjub1d4bczpdyqmogj X-Rspam-User: X-HE-Tag: 1701720784-660491 X-HE-Meta: U2FsdGVkX18gu1eSPLCHlm1lgMUuAnOlppmmKQTu1P+3i9oDjgdXn2+lhoWXvhlWih3DL+DyQT5gTRJZn1dbgrg6LKYXv+MYMmiIPRr+3qu4GqDe37vLbLvy15+oZkyRUoRJ/9Bwy+cz1qoyhRlAOa5MJIWCmmnwcGDwKcGxFs09Ek8hFJOBW74JZ2X7JF/JLORpMMXxYTaXobOE+nuyQnXO6sa3TErM92QT1qevYhQsYWxmigZ7ZZwny0K6BdGBpQkfDrbbCPl3G1TIr31DrXxv754ikZMb3tFxd6tJ00rn3KsYEkE+f3ktm5EG+Kluj3wkpUqI+T7sYpKJE3P3MLMn/bdZTmWKAnPw3ly3VKwhN++HC78nDOqb1VQPqsXtfxut9DWCyiKQN9m9djrEpSG+7avFBVZavem3qkSi7xcbk2Qlxlm+2Xkw+UJUpCZFnwb4pPBvOk9Od1qdZdog+y2ryvxwGWQMcBSmLJYBOXhAMDxQeCSZPdHmyi9hi7MIrv4OEB20HwM+pRN+KIDREg57E3ujx9XfhXTofB7VecCtho+WQbcAgS8OzVzqCWm34k5hj7RaEArwih6M/UQWSIaWGqodJ5OBLOIvk+vezPIFDHDS+jiQFhXXphY1XHTQoCzsWeyA/jVr6ttQBR/pa/ezsbcB/saCwRVCQ1MkW5dZLDmnUxXKBjGzUitl/24CvDcVKGgMmuAXD2tw+QEgXDrPvRtyrp3v9OUuxC80wfhRyZmA86WyIlt3Fvk4ppcXYV+ZyN3v7WGfA4xJHzpogdEul3X88KWG9hDwNyfOuQuQQTialUsYN57X4i1MIZ6rO/+3lCsUDBK0UEUmRDcv9SVooF2Nllj0QbaSVXDfvcrIjIwvcI+tcKP7TBQok5lT5V0ZsQ2XDvb+wBXhbbvg8HvfGbhYJwHLm54P4hgZGbsuqcKgiq5quU1jHbG3Ez468M7nEgMKGb0YUWl8tYM GXHLkv0J unOSiesAfabnWnj4iEfcLdxnLzZ1I0oPwYRLKmdQUYT1fhGEQEnT8axWyE8URMsBwfMPwxt8gncfw5pHqyCJ6ZxBHrMzhOPb0am/JE2tOQFge5cIZPP7rvcTn727p66a8xrkHovzucfEvcgCJJfDjtzfy6RwTjT/l0pXkhczJ5ZZN9j91jgkvA/eJibSrZwKscYZP67X8DRyEyKVpvnznwbt2G12RN3jSRhpmNpr+0P7bRcx9FPmWjReLVvevc62U5qhUAUl20uCtb/dkxKp2JCnygChaNerWK28cak4nehHRtVgRnql5XXC+7V46L+yLAVShlyPxzRZTw259GugopkPGLeR6c5YrxR1IQIZYTcIUgSRfA5pK86cFmg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Sat, Dec 2, 2023 at 12:31=E2=80=AFAM Shakeel Butt = wrote: > > On Wed, Nov 29, 2023 at 03:21:53AM +0000, Yosry Ahmed wrote: > [...] > > +void mem_cgroup_flush_stats(struct mem_cgroup *memcg) > > { > > - if (memcg_should_flush_stats(root_mem_cgroup)) > > - do_flush_stats(); > > + static DEFINE_MUTEX(memcg_stats_flush_mutex); > > + > > + if (mem_cgroup_disabled()) > > + return; > > + > > + if (!memcg) > > + memcg =3D root_mem_cgroup; > > + > > + if (memcg_should_flush_stats(memcg)) { > > + mutex_lock(&memcg_stats_flush_mutex); > > What's the point of this mutex now? What is it providing? I understand > we can not try_lock here due to targeted flushing. Why not just let the > global rstat serialize the flushes? Actually this mutex can cause > latency hiccups as the mutex owner can get resched during flush and then > no one can flush for a potentially long time. I was hoping this was clear from the commit message and code comments, but apparently I was wrong, sorry. Let me give more context. In previous versions and/or series, the mutex was only used with flushes from userspace to guard in-kernel flushers against high contention from userspace. Later on, I kept the mutex for all memcg flushers for the following reasons: (a) Allow waiters to sleep: Unlike other flushers, the memcg flushing path can see a lot of concurrency. The mutex avoids having a lot of CPUs spinning (e.g. concurrent reclaimers) by allowing waiters to sleep. (b) Check the threshold under lock but before calling cgroup_rstat_flush(): The calls to cgroup_rstat_flush() are not very cheap even if there's nothing to flush, as we still need to iterate all CPUs. If flushers contend directly on the rstat lock, overlapping flushes will unnecessarily do the percpu iteration once they hold the lock. With the mutex, they will check the threshold again once they hold the mutex. (c) Protect non-memcg flushers from contention from memcg flushers. This is not as strong of an argument as protecting in-kernel flushers from userspace flushers. There has been discussions before about changing the rstat lock itself to be a mutex, which would resolve (a), but there are concerns about priority inversions if a low priority task holds the mutex and gets preempted, as well as the amount of time the rstat lock holder keeps the lock for: https://lore.kernel.org/lkml/ZO48h7c9qwQxEPPA@slm.duckdns.org/ I agree about possible hiccups due to the inner lock being dropped while the mutex is held. Running a synthetic test with high concurrency between reclaimers (in-kernel flushers) and stats readers show no material performance difference with or without the mutex. Maybe things cancel out, or don't really matter in practice. I would prefer to keep the current code as I think (a) and (b) could cause problems in the future, and the current form of the code (with the mutex) has already seen mileage with production workloads.