From: Wei Xu <weixugc@google.com>
Date: Tue, 14 Jun 2022 17:27:51 -0700
Subject: Re: [RFC PATCH 2/3] mm/memory-tiers: Use page counter to track toptier memory usage
To: Tim Chen
Cc: Linux MM, Andrew Morton, Huang Ying, Greg Thelen, Yang Shi, Davidlohr Bueso, Brice Goglin, Michal Hocko, Linux Kernel Mailing List, Hesham Almatary, Dave Hansen, Jonathan Cameron, Alistair Popple, Dan Williams, Feng Tang, Jagdish Gediya, Baolin Wang, David Rientjes, Aneesh Kumar K.V, Shakeel Butt

On Tue, Jun 14, 2022 at 3:26 PM Tim Chen wrote:
> If we need to restrict toptier memory usage for a cgroup,
> we need to retrieve usage of toptier memory efficiently.
> Add a page counter to track toptier memory usage directly
> so its value can be returned right away.
> ---
>  include/linux/memcontrol.h |  1 +
>  mm/memcontrol.c            | 50 ++++++++++++++++++++++++++++++++------
>  2 files changed, 43 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 9ecead1042b9..b4f727cba1de 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -241,6 +241,7 @@ struct mem_cgroup {
>
>  	/* Accounted resources */
>  	struct page_counter memory;		/* Both v1 & v2 */
> +	struct page_counter toptier;
>
>  	union {
>  		struct page_counter swap;	/* v2 only */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2f6e95e6d200..2f20ec2712b8 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -848,6 +848,23 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
>  	__this_cpu_add(memcg->vmstats_percpu->nr_page_events, nr_pages);
>  }
>
> +static inline void mem_cgroup_charge_toptier(struct mem_cgroup *memcg,
> +					     int nid,
> +					     int nr_pages)
> +{
> +	if (!node_is_toptier(nid) || !memcg)
> +		return;
> +
> +	if (nr_pages >= 0) {
> +		page_counter_charge(&memcg->toptier,
> +				    (unsigned long) nr_pages);
> +	} else {
> +		nr_pages = -nr_pages;
> +		page_counter_uncharge(&memcg->toptier,
> +				      (unsigned long) nr_pages);
> +	}
> +}
> +

When we don't know which pages are being charged, we should still charge the usage to toptier (assuming that toptier always includes the default tier), e.g. from try_charge_memcg().

The idea is that when lower-tier memory is not used, memcg->toptier and memcg->memory should have the same value. Otherwise, it can cause confusion about where the pages of (memcg->memory - memcg->toptier) go.
>  static bool mem_cgroup_event_ratelimit(struct mem_cgroup *memcg,
>  					enum mem_cgroup_events_target target)
>  {
> @@ -3027,6 +3044,8 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
>  		if (!ret) {
>  			page->memcg_data = (unsigned long)objcg |
>  				MEMCG_DATA_KMEM;
> +			mem_cgroup_charge_toptier(page_memcg(page),
> +						  page_to_nid(page), 1 << order);
>  			return 0;
>  		}
>  		obj_cgroup_put(objcg);
> @@ -3050,6 +3069,8 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
>
>  	objcg = __folio_objcg(folio);
>  	obj_cgroup_uncharge_pages(objcg, nr_pages);
> +	mem_cgroup_charge_toptier(page_memcg(page),
> +				  page_to_nid(page), -nr_pages);
>  	folio->memcg_data = 0;
>  	obj_cgroup_put(objcg);
>  }
> @@ -3947,13 +3968,10 @@ unsigned long mem_cgroup_memtier_usage(struct mem_cgroup *memcg,
>
>  unsigned long mem_cgroup_toptier_usage(struct mem_cgroup *memcg)
>  {
> -	struct memory_tier *top_tier;
> -
> -	top_tier = list_first_entry(&memory_tiers, struct memory_tier, list);
> -	if (top_tier)
> -		return mem_cgroup_memtier_usage(memcg, top_tier);
> -	else
> +	if (!memcg)
>  		return 0;
> +
> +	return page_counter_read(&memcg->toptier);
>  }
>
>  #endif /* CONFIG_NUMA */
> @@ -5228,11 +5246,13 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  		memcg->oom_kill_disable = parent->oom_kill_disable;
>
>  		page_counter_init(&memcg->memory, &parent->memory);
> +		page_counter_init(&memcg->toptier, &parent->toptier);
>  		page_counter_init(&memcg->swap, &parent->swap);
>  		page_counter_init(&memcg->kmem, &parent->kmem);
>  		page_counter_init(&memcg->tcpmem, &parent->tcpmem);
>  	} else {
>  		page_counter_init(&memcg->memory, NULL);
> +		page_counter_init(&memcg->toptier, NULL);
>  		page_counter_init(&memcg->swap, NULL);
>  		page_counter_init(&memcg->kmem, NULL);
>  		page_counter_init(&memcg->tcpmem, NULL);
> @@ -5678,6 +5698,8 @@ static int mem_cgroup_move_account(struct page *page,
>  	memcg_check_events(to, nid);
>  	mem_cgroup_charge_statistics(from, -nr_pages);
>  	memcg_check_events(from, nid);
> +	mem_cgroup_charge_toptier(to, nid, nr_pages);
> +	mem_cgroup_charge_toptier(from, nid, -nr_pages);
>  	local_irq_enable();
>  out_unlock:
>  	folio_unlock(folio);
> @@ -6761,6 +6783,7 @@ static int charge_memcg(struct folio *folio, struct mem_cgroup *memcg,
>
>  	local_irq_disable();
>  	mem_cgroup_charge_statistics(memcg, nr_pages);
> +	mem_cgroup_charge_toptier(memcg, folio_nid(folio), nr_pages);
>  	memcg_check_events(memcg, folio_nid(folio));
>  	local_irq_enable();
>  out:
> @@ -6853,6 +6876,7 @@ struct uncharge_gather {
>  	unsigned long nr_memory;
>  	unsigned long pgpgout;
>  	unsigned long nr_kmem;
> +	unsigned long nr_toptier;
>  	int nid;
>  };
>
> @@ -6867,6 +6891,7 @@ static void uncharge_batch(const struct uncharge_gather *ug)
>
>  	if (ug->nr_memory) {
>  		page_counter_uncharge(&ug->memcg->memory, ug->nr_memory);
> +		page_counter_uncharge(&ug->memcg->toptier, ug->nr_toptier);
>  		if (do_memsw_account())
>  			page_counter_uncharge(&ug->memcg->memsw, ug->nr_memory);
>  		if (ug->nr_kmem)
> @@ -6929,12 +6954,18 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
>  		ug->nr_memory += nr_pages;
>  		ug->nr_kmem += nr_pages;
>
> +		if (node_is_toptier(folio_nid(folio)))
> +			ug->nr_toptier += nr_pages;
> +
>  		folio->memcg_data = 0;
>  		obj_cgroup_put(objcg);
>  	} else {
>  		/* LRU pages aren't accounted at the root level */
> -		if (!mem_cgroup_is_root(memcg))
> +		if (!mem_cgroup_is_root(memcg)) {
>  			ug->nr_memory += nr_pages;
> +			if (node_is_toptier(folio_nid(folio)))
> +				ug->nr_toptier += nr_pages;
> +		}
>  		ug->pgpgout++;
>
>  		folio->memcg_data = 0;
> @@ -7011,6 +7042,7 @@ void mem_cgroup_migrate(struct folio *old, struct folio *new)
>  	/* Force-charge the new page.
> The old one will be freed soon */
>  	if (!mem_cgroup_is_root(memcg)) {
>  		page_counter_charge(&memcg->memory, nr_pages);
> +		mem_cgroup_charge_toptier(memcg, folio_nid(new), nr_pages);
>  		if (do_memsw_account())
>  			page_counter_charge(&memcg->memsw, nr_pages);
>  	}
> @@ -7231,8 +7263,10 @@ void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry)
>
>  	folio->memcg_data = 0;
>
> -	if (!mem_cgroup_is_root(memcg))
> +	if (!mem_cgroup_is_root(memcg)) {
>  		page_counter_uncharge(&memcg->memory, nr_entries);
> +		mem_cgroup_charge_toptier(memcg, folio_nid(folio), -nr_entries);
> +	}
>
>  	if (!cgroup_memory_noswap && memcg != swap_memcg) {
>  		if (!mem_cgroup_is_root(swap_memcg))
> --
> 2.35.1