Date: Sun, 11 Aug 2024 13:16:53 -0700 (PDT)
From: David Rientjes
To: Kaiyang Zhao
cc: linux-mm@kvack.org, cgroups@vger.kernel.org, roman.gushchin@linux.dev,
    shakeel.butt@linux.dev, muchun.song@linux.dev, akpm@linux-foundation.org,
    mhocko@kernel.org, nehagholkar@meta.com, abhishekd@meta.com,
    hannes@cmpxchg.org
Subject: Re: [PATCH] mm,memcg: provide per-cgroup counters for NUMA balancing operations
In-Reply-To: <20240809212115.59291-1-kaiyang2@cs.cmu.edu>
References: <20240809212115.59291-1-kaiyang2@cs.cmu.edu>

On Fri, 9 Aug 2024, kaiyang2@cs.cmu.edu wrote:

> From: Kaiyang Zhao
>
> The ability to observe the demotion and promotion decisions made by the
> kernel on a per-cgroup basis is important for monitoring and tuning
> containerized workloads on either NUMA machines or machines
> equipped with tiered memory.
>
> Different containers in the system may experience drastically different
> memory tiering actions that cannot be distinguished from the global
> counters alone.
>
> For example, a container running a workload that has much hotter
> memory accesses will likely see more promotions and fewer demotions,
> potentially depriving a colocated container of top tier memory to such
> an extent that its performance degrades unacceptably.
>
> For another example, some containers may exhibit longer periods between
> data reuse, causing many more numa_hint_faults than numa_pages_migrated.
> In this case, tuning hot_threshold_ms may be appropriate, but the signal
> can easily be lost if only global counters are available.
>
> This patch set adds five counters to
> memory.stat in a cgroup: numa_pages_migrated, numa_pte_updates,
> numa_hint_faults, pgdemote_kswapd and pgdemote_direct.
>
> count_memcg_events_mm() is added to count multiple event occurrences at
> once, and get_mem_cgroup_from_folio() is added because we need to get a
> reference to the memcg of a folio before it's migrated to track
> numa_pages_migrated. The accounting of PGDEMOTE_* is moved to
> shrink_inactive_list() before being changed to per-cgroup.
>
> Signed-off-by: Kaiyang Zhao

Hi Kaiyang, have you considered per-memcg control over NUMA balancing
operations as well?

Wondering if that's the direction that you're heading in, because it would
be very useful to be able to control NUMA balancing at memcg granularity
on multi-tenant systems.

I mentioned this at LSF/MM/BPF this year. If people believe this is out of
scope for memcg, that would be good feedback as well.
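
To make the monitoring use case from the quoted description concrete, here
is a minimal userspace sketch of how the proposed counters might be read.
It is not part of the patch. It assumes memory.stat keeps its existing
one "name value" pair per line layout and that the five counter names
appear exactly as listed above; the helper names and the example cgroup
path are hypothetical.

```python
# Sketch only: assumes memory.stat keeps its "name value" per-line layout and
# that the five counter names from the patch description appear in it.
from pathlib import Path

NUMA_COUNTERS = (
    "numa_pages_migrated",
    "numa_pte_updates",
    "numa_hint_faults",
    "pgdemote_kswapd",
    "pgdemote_direct",
)


def parse_numa_counters(stat_text):
    """Return the NUMA balancing counters found in memory.stat text."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Ignore unrelated entries and anything that is not "name value".
        if len(fields) == 2 and fields[0] in NUMA_COUNTERS:
            counters[fields[0]] = int(fields[1])
    return counters


def faults_per_migration(counters):
    """Hint faults per migrated page; None when the ratio is undefined."""
    migrated = counters.get("numa_pages_migrated", 0)
    if migrated == 0 or "numa_hint_faults" not in counters:
        return None
    return counters["numa_hint_faults"] / migrated


def read_cgroup_numa_counters(cgroup_dir):
    """Read counters for one cgroup, e.g. a hypothetical /sys/fs/cgroup/job1."""
    return parse_numa_counters((Path(cgroup_dir) / "memory.stat").read_text())
```

A high faults_per_migration() value for one cgroup, compared with its
neighbours, would be the kind of per-container signal the description says
is lost in the global counters. What threshold counts as "high" is left
open here.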