From: Youngjun Park <youngjun.park@lge.com>
To: akpm@linux-foundation.org, hannes@cmpxchg.org
Cc: mhocko@kernel.org, roman.gushchin@linux.dev, shakeel.butt@linux.dev,
	muchun.song@linux.dev, shikemeng@huaweicloud.com, kasong@tencent.com,
	nphamcs@gmail.com, bhe@redhat.com, baohua@kernel.org, chrisl@kernel.org,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, gunho.lee@lge.com,
	iamjoonsoo.kim@lge.com, taejoon.song@lge.com,
	Youngjun Park <youngjun.park@lge.com>
Subject: [PATCH 2/4] mm: swap: Apply per-cgroup swap priority mechanism to swap layer
Date: Thu, 17 Jul 2025 05:20:04 +0900
Message-Id: <20250716202006.3640584-3-youngjun.park@lge.com>
In-Reply-To: <20250716202006.3640584-1-youngjun.park@lge.com>
References: <20250716202006.3640584-1-youngjun.park@lge.com>

This patch applies the per-cgroup swap priority mechanism to the swap layer.

It implements:
- Swap device ID assignment based on the cgroup's effective priority
- Swap device selection respecting cgroup-specific priorities
- Swap on/off propagation logic that updates per-cgroup settings accordingly

Currently, the per-CPU swap cluster cache is bypassed, since different
cgroups may select different devices based on their configured priorities.

Signed-off-by: Youngjun Park <youngjun.park@lge.com>
---
 mm/swap_cgroup_priority.c |  6 ++---
 mm/swapfile.c             | 46 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/mm/swap_cgroup_priority.c b/mm/swap_cgroup_priority.c
index abbefa6de63a..979bc18d2eed 100644
--- a/mm/swap_cgroup_priority.c
+++ b/mm/swap_cgroup_priority.c
@@ -243,9 +243,9 @@ bool swap_alloc_cgroup_priority(struct mem_cgroup *memcg,
 	unsigned long offset;
 	int node;
 
-	/* TODO
-	 * Per-cpu swapdev cache can't be used directly as cgroup-specific
-	 * priorities may select different devices.
+	/*
+	 * TODO: Per-cpu swap cluster cache can't be used directly
+	 * as cgroup-specific priorities may select different devices.
 	 */
 	spin_lock(&swap_avail_lock);
 	node = numa_node_id();
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4b56f117b2b0..bfd0532ad250 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1029,6 +1029,7 @@ static void del_from_avail_list(struct swap_info_struct *si, bool swapoff)
 	for_each_node(nid)
 		plist_del(&si->avail_lists[nid], &swap_avail_heads[nid]);
 
+	deactivate_swap_cgroup_priority(si, swapoff);
 skip:
 	spin_unlock(&swap_avail_lock);
 }
@@ -1072,6 +1073,7 @@ static void add_to_avail_list(struct swap_info_struct *si, bool swapon)
 	for_each_node(nid)
 		plist_add(&si->avail_lists[nid], &swap_avail_heads[nid]);
 
+	activate_swap_cgroup_priority(si, swapon);
 skip:
 	spin_unlock(&swap_avail_lock);
 }
@@ -1292,8 +1294,10 @@ int folio_alloc_swap(struct folio *folio, gfp_t gfp)
 	}
 
 	local_lock(&percpu_swap_cluster.lock);
-	if (!swap_alloc_fast(&entry, order))
-		swap_alloc_slow(&entry, order);
+	if (!swap_alloc_cgroup_priority(folio_memcg(folio), &entry, order)) {
+		if (!swap_alloc_fast(&entry, order))
+			swap_alloc_slow(&entry, order);
+	}
 	local_unlock(&percpu_swap_cluster.lock);
 
 	/* Need to call this even if allocation failed, for MEMCG_SWAP_FAIL. */
@@ -2778,6 +2782,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	if (!p->bdev || !bdev_nonrot(p->bdev))
 		atomic_dec(&nr_rotate_swap);
 
+	purge_swap_cgroup_priority();
 	mutex_lock(&swapon_mutex);
 	spin_lock(&swap_lock);
 	spin_lock(&p->lock);
@@ -2895,6 +2900,8 @@ static void swap_stop(struct seq_file *swap, void *v)
 	mutex_unlock(&swapon_mutex);
 }
 
+
+#ifndef CONFIG_SWAP_CGROUP_PRIORITY
 static int swap_show(struct seq_file *swap, void *v)
 {
 	struct swap_info_struct *si = v;
@@ -2921,6 +2928,34 @@ static int swap_show(struct seq_file *swap, void *v)
 			si->prio);
 	return 0;
 }
+#else
+static int swap_show(struct seq_file *swap, void *v)
+{
+	struct swap_info_struct *si = v;
+	struct file *file;
+	int len;
+	unsigned long bytes, inuse;
+
+	if (si == SEQ_START_TOKEN) {
+		seq_puts(swap, "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\t\tId\n");
+		return 0;
+	}
+
+	bytes = K(si->pages);
+	inuse = K(swap_usage_in_pages(si));
+
+	file = si->swap_file;
+	len = seq_file_path(swap, file, " \t\n\\");
+	seq_printf(swap, "%*s%s\t%lu\t%s%lu\t%s%d\t\t\t%llu\n",
+			len < 40 ? 40 - len : 1, " ",
+			S_ISBLK(file_inode(file)->i_mode) ?
+				"partition" : "file\t",
+			bytes, bytes < 10000000 ? "\t" : "",
+			inuse, inuse < 10000000 ? "\t" : "",
+			si->prio, si->id);
+	return 0;
+}
+#endif
 
 static const struct seq_operations swaps_op = {
 	.start = swap_start,
@@ -3463,6 +3498,13 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		goto free_swap_zswap;
 	}
 
+	error = prepare_swap_cgroup_priority(si->type);
+	if (error) {
+		inode->i_flags &= ~S_SWAPFILE;
+		goto free_swap_zswap;
+	}
+	get_swapdev_id(si);
+
 	mutex_lock(&swapon_mutex);
 	prio = -1;
 	if (swap_flags & SWAP_FLAG_PREFER)
-- 
2.34.1
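
Note on the allocation order: the folio_alloc_swap() hunk above changes which allocator is tried first. The following is a minimal userspace sketch of that order, not kernel code. The enum, the struct, and model_alloc_order() are hypothetical names invented for illustration, and each boolean only stands in for whether the corresponding kernel allocator would succeed. It assumes nothing beyond what the hunk shows, namely that swap_alloc_fast() and swap_alloc_slow() run only when swap_alloc_cgroup_priority() returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum alloc_path {
	PATH_NONE,	/* every allocator failed */
	PATH_CGROUP,	/* stands in for swap_alloc_cgroup_priority() */
	PATH_FAST,	/* stands in for swap_alloc_fast() */
	PATH_SLOW,	/* stands in for swap_alloc_slow() */
};

/* Which of the three allocators would succeed for a given request. */
struct alloc_model {
	bool cgroup_ok;
	bool fast_ok;
	bool slow_ok;
};

/*
 * Mirror the order of calls in the patched folio_alloc_swap():
 * try the cgroup-priority allocator first, and fall back to the
 * fast path and then the slow path only when it reports failure.
 */
static enum alloc_path model_alloc_order(const struct alloc_model *m)
{
	assert(m != NULL);

	if (m->cgroup_ok)
		return PATH_CGROUP;
	if (m->fast_ok)
		return PATH_FAST;
	if (m->slow_ok)
		return PATH_SLOW;
	return PATH_NONE;
}
```

With cgroup_ok set, the model reports PATH_CGROUP regardless of the other two flags. This matches the hunk: once the cgroup-priority allocator succeeds, the per-CPU fast path is not consulted.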