From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 31 Aug 2023 16:56:07 +0000
Message-ID: <20230831165611.2610118-1-yosryahmed@google.com>
Subject: [PATCH v4 0/4] memcg: non-unified flushing for userspace stats
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Mailer: git-send-email 2.42.0.rc2.253.gd59a3bf2b4-goog

Most memcg flushing contexts use "unified" flushing, where only one flusher is allowed at a time (others skip), and all flushers need to
flush the entire tree. This works well with high concurrency, which mostly comes from in-kernel flushers (e.g. reclaim, refault, ...). For userspace reads, unified flushing leads to non-deterministic stats staleness and reading cost. This series clarifies and documents the differences between unified and non-unified flushing (patches 1 & 2), then opts userspace reads out of unified flushing (patch 3).

This patch series is a follow-up to the discussion in [1], a patch proposing that userspace reads wait for ongoing unified flushers to complete before returning. There were concerns about the latency this introduces to userspace reads, especially with ongoing reports of expensive stat reads even with unified flushing. Hence, this series takes a different approach and opts userspace reads out of unified flushing completely. The cost of a userspace read is now deterministic and depends on the size of the subtree being read. This should fix both the *sometimes* expensive reads (due to flushing the entire tree) and the occasional staleness (due to skipping flushing).

I attempted to remove unified flushing completely, but noticed that in-kernel flushers with high concurrency (e.g. hundreds of concurrent reclaimers) still benefit from it. This sort of concurrency is not expected from userspace reads. More details about testing and some numbers are in the last patch's changelog.

v4 -> v5:
- Fixed build error in the last patch with W=1 because of a missed 'static'.

v4: https://lore.kernel.org/lkml/20230830175335.1536008-1-yosryahmed@google.com/

Yosry Ahmed (4):
  mm: memcg: properly name and document unified stats flushing
  mm: memcg: add a helper for non-unified stats flushing
  mm: memcg: let non-unified root stats flushes help unified flushes
  mm: memcg: use non-unified stats flushing for userspace reads

 include/linux/memcontrol.h |   8 +--
 mm/memcontrol.c            | 106 +++++++++++++++++++++++++++----------
 mm/vmscan.c                |   2 +-
 mm/workingset.c            |   4 +-
 4 files changed, 85 insertions(+), 35 deletions(-)

-- 
2.42.0.rc2.253.gd59a3bf2b4-goog