From: Tim Chen <tim.c.chen@linux.intel.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: Tim Chen, Wei Xu, Huang Ying, Greg Thelen, Yang Shi,
	Davidlohr Bueso, Brice Goglin, Michal Hocko,
	Linux Kernel Mailing List, Hesham Almatary, Dave Hansen,
	Jonathan Cameron, Alistair Popple, Dan Williams, Feng Tang,
	Jagdish Gediya, Baolin Wang, David Rientjes,
	"Aneesh Kumar K.V", Shakeel Butt
Subject: [RFC PATCH 2/3] mm/memory-tiers: Use page counter to track toptier memory usage
Date: Tue, 14 Jun 2022 15:25:34 -0700

If we need to restrict toptier memory usage for a cgroup, we have to be
able to retrieve the cgroup's toptier memory usage efficiently. Add a
page counter that tracks toptier memory usage directly, so its value
can be returned right away.
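[Note, not part of the patch: a minimal userspace sketch of the accounting
convention used below. The toy_counter type and helpers are made-up
stand-ins for the kernel's page_counter; the signed nr_pages argument
mirrors the new mem_cgroup_charge_toptier() helper, where a positive
count charges the counter, a negative count uncharges it, and reading
the usage is a single load.]

#include <assert.h>
#include <stdio.h>

struct toy_counter {
	unsigned long usage;		/* pages currently charged */
};

/* Positive nr_pages charges, negative nr_pages uncharges. */
static void toy_charge(struct toy_counter *c, long nr_pages)
{
	if (nr_pages >= 0)
		c->usage += nr_pages;
	else
		c->usage -= -nr_pages;
}

int main(void)
{
	struct toy_counter toptier = { 0 };

	toy_charge(&toptier, 512);	/* charge 512 top-tier pages */
	toy_charge(&toptier, -128);	/* uncharge 128 of them */

	/* Usage is returned right away, no walk over memory tiers. */
	printf("toptier usage: %lu pages\n", toptier.usage);
	assert(toptier.usage == 384);
	return 0;
}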
---
 include/linux/memcontrol.h |  1 +
 mm/memcontrol.c            | 50 ++++++++++++++++++++++++++++++++------
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 9ecead1042b9..b4f727cba1de 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -241,6 +241,7 @@ struct mem_cgroup {
 
 	/* Accounted resources */
 	struct page_counter memory;		/* Both v1 & v2 */
+	struct page_counter toptier;
 
 	union {
 		struct page_counter swap;	/* v2 only */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2f6e95e6d200..2f20ec2712b8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -848,6 +848,23 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
 	__this_cpu_add(memcg->vmstats_percpu->nr_page_events, nr_pages);
 }
 
+static inline void mem_cgroup_charge_toptier(struct mem_cgroup *memcg,
+				int nid,
+				int nr_pages)
+{
+	if (!node_is_toptier(nid) || !memcg)
+		return;
+
+	if (nr_pages >= 0) {
+		page_counter_charge(&memcg->toptier,
+					(unsigned long) nr_pages);
+	} else {
+		nr_pages = -nr_pages;
+		page_counter_uncharge(&memcg->toptier,
+					(unsigned long) nr_pages);
+	}
+}
+
 static bool mem_cgroup_event_ratelimit(struct mem_cgroup *memcg,
 				       enum mem_cgroup_events_target target)
 {
@@ -3027,6 +3044,8 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
 		if (!ret) {
 			page->memcg_data = (unsigned long)objcg |
 				MEMCG_DATA_KMEM;
+			mem_cgroup_charge_toptier(page_memcg(page),
+					page_to_nid(page), 1 << order);
 			return 0;
 		}
 		obj_cgroup_put(objcg);
@@ -3050,6 +3069,8 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
 
 	objcg = __folio_objcg(folio);
 	obj_cgroup_uncharge_pages(objcg, nr_pages);
+	mem_cgroup_charge_toptier(page_memcg(page),
+			page_to_nid(page), -nr_pages);
 	folio->memcg_data = 0;
 	obj_cgroup_put(objcg);
 }
@@ -3947,13 +3968,10 @@ unsigned long mem_cgroup_memtier_usage(struct mem_cgroup *memcg,
 
 unsigned long mem_cgroup_toptier_usage(struct mem_cgroup *memcg)
 {
-	struct memory_tier *top_tier;
-
-	top_tier = list_first_entry(&memory_tiers, struct memory_tier, list);
-	if (top_tier)
-		return mem_cgroup_memtier_usage(memcg, top_tier);
-	else
+	if (!memcg)
 		return 0;
+
+	return page_counter_read(&memcg->toptier);
 }
 #endif /* CONFIG_NUMA */
 
@@ -5228,11 +5246,13 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 		memcg->oom_kill_disable = parent->oom_kill_disable;
 
 		page_counter_init(&memcg->memory, &parent->memory);
+		page_counter_init(&memcg->toptier, &parent->toptier);
 		page_counter_init(&memcg->swap, &parent->swap);
 		page_counter_init(&memcg->kmem, &parent->kmem);
 		page_counter_init(&memcg->tcpmem, &parent->tcpmem);
 	} else {
 		page_counter_init(&memcg->memory, NULL);
+		page_counter_init(&memcg->toptier, NULL);
 		page_counter_init(&memcg->swap, NULL);
 		page_counter_init(&memcg->kmem, NULL);
 		page_counter_init(&memcg->tcpmem, NULL);
@@ -5678,6 +5698,8 @@ static int mem_cgroup_move_account(struct page *page,
 	memcg_check_events(to, nid);
 	mem_cgroup_charge_statistics(from, -nr_pages);
 	memcg_check_events(from, nid);
+	mem_cgroup_charge_toptier(to, nid, nr_pages);
+	mem_cgroup_charge_toptier(from, nid, -nr_pages);
 	local_irq_enable();
 out_unlock:
 	folio_unlock(folio);
@@ -6761,6 +6783,7 @@ static int charge_memcg(struct folio *folio, struct mem_cgroup *memcg,
 
 	local_irq_disable();
 	mem_cgroup_charge_statistics(memcg, nr_pages);
+	mem_cgroup_charge_toptier(memcg, folio_nid(folio), nr_pages);
 	memcg_check_events(memcg, folio_nid(folio));
 	local_irq_enable();
 out:
@@ -6853,6 +6876,7 @@ struct uncharge_gather {
 	unsigned long nr_memory;
 	unsigned long pgpgout;
 	unsigned long nr_kmem;
+	unsigned long nr_toptier;
 	int nid;
 };
 
@@ -6867,6 +6891,7 @@ static void uncharge_batch(const struct uncharge_gather *ug)
 
 	if (ug->nr_memory) {
 		page_counter_uncharge(&ug->memcg->memory, ug->nr_memory);
+		page_counter_uncharge(&ug->memcg->toptier, ug->nr_toptier);
 		if (do_memsw_account())
 			page_counter_uncharge(&ug->memcg->memsw, ug->nr_memory);
 		if (ug->nr_kmem)
@@ -6929,12 +6954,18 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
 		ug->nr_memory += nr_pages;
 		ug->nr_kmem += nr_pages;
 
+		if (node_is_toptier(folio_nid(folio)))
+			ug->nr_toptier += nr_pages;
+
 		folio->memcg_data = 0;
 		obj_cgroup_put(objcg);
 	} else {
 		/* LRU pages aren't accounted at the root level */
-		if (!mem_cgroup_is_root(memcg))
+		if (!mem_cgroup_is_root(memcg)) {
 			ug->nr_memory += nr_pages;
+			if (node_is_toptier(folio_nid(folio)))
+				ug->nr_toptier += nr_pages;
+		}
 		ug->pgpgout++;
 
 		folio->memcg_data = 0;
@@ -7011,6 +7042,7 @@ void mem_cgroup_migrate(struct folio *old, struct folio *new)
 	/* Force-charge the new page. The old one will be freed soon */
 	if (!mem_cgroup_is_root(memcg)) {
 		page_counter_charge(&memcg->memory, nr_pages);
+		mem_cgroup_charge_toptier(memcg, folio_nid(new), nr_pages);
 		if (do_memsw_account())
 			page_counter_charge(&memcg->memsw, nr_pages);
 	}
@@ -7231,8 +7263,10 @@ void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry)
 
 	folio->memcg_data = 0;
 
-	if (!mem_cgroup_is_root(memcg))
+	if (!mem_cgroup_is_root(memcg)) {
 		page_counter_uncharge(&memcg->memory, nr_entries);
+		mem_cgroup_charge_toptier(memcg, folio_nid(folio), -nr_entries);
+	}
 
 	if (!cgroup_memory_noswap && memcg != swap_memcg) {
 		if (!mem_cgroup_is_root(swap_memcg))
-- 
2.35.1
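[Illustration only, not part of this series: a sketch of how a
hypothetical in-kernel consumer could report the new counter. The
handler name toptier_current_read and the bytes scaling are
assumptions; the user-visible interface is expected to come from a
separate patch. The point is that mem_cgroup_toptier_usage() is now a
plain page_counter_read(), so the read is O(1) instead of a walk over
the memory tiers.]

/* Hypothetical cgroup read handler (assumption, not in this patch). */
static u64 toptier_current_read(struct cgroup_subsys_state *css,
				struct cftype *cft)
{
	struct mem_cgroup *memcg = mem_cgroup_from_css(css);

	/* mem_cgroup_toptier_usage() returns pages; report bytes. */
	return (u64)mem_cgroup_toptier_usage(memcg) * PAGE_SIZE;
}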