Date: Fri, 13 Jun 2025 16:11:01 +0900
From: YoungJun Park <youngjun.park@lge.com>
To: Nhat Pham
Cc: Kairui Song, linux-mm@kvack.org, akpm@linux-foundation.org,
	hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
	shakeel.butt@linux.dev, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, shikemeng@huaweicloud.com,
	bhe@redhat.com, baohua@kernel.org, chrisl@kernel.org,
	muchun.song@linux.dev, iamjoonsoo.kim@lge.com,
	taejoon.song@lge.com, gunho.lee@lge.com
Subject: Re: [RFC PATCH 2/2] mm: swap: apply per cgroup swap priority
 mechanism on swap layer
References: <20250612103743.3385842-1-youngjun.park@lge.com>
	<20250612103743.3385842-3-youngjun.park@lge.com>
On Thu, Jun 12, 2025 at 01:08:08PM -0700, Nhat Pham wrote:
> On Thu, Jun 12, 2025 at 11:20 AM Kairui Song wrote:
> >
> > On Fri, Jun 13, 2025 at 1:28 AM Nhat Pham wrote:
> > >
> > > On Thu, Jun 12, 2025 at 4:14 AM Kairui Song wrote:
> > > >
> > > > On Thu, Jun 12, 2025 at 6:43 PM wrote:
> > > > >
> > > > > From: "youngjun.park"
> > > > >
> > > >
> > > > Hi, Youngjun,
> > > >
> > > > Thanks for sharing this series.
> > > >
> > > > > This patch implements swap device selection and swap on/off
> > > > > propagation when a cgroup-specific swap priority is set.
> > > > >
> > > > > There is one workaround in this implementation, as follows.
> > > > > The current per-cpu swap cluster enforces swap device selection
> > > > > based solely on CPU locality, overriding the swap cgroup's
> > > > > configured priorities.
> > > >
> > > > I've been thinking about this; we can switch to a per-cgroup-per-cpu
> > > > next cluster selector. The problem with the current code is that swap
> > >
> > > What about per-cpu-per-order-per-swap-device :-? The number of swap
> > > devices is gonna be smaller than the number of cgroups, right?
> >
> > Hi Nhat,
> >
> > The problem is that per-cgroup makes more sense (I was advised to use
> > cgroup-level locality at the very beginning of the allocator's
> > implementation on the mailing list, but it was hard to do so at that
> > time): for container environments, a cgroup is a container that runs
> > one type of workload, so it has its own locality. Things like systemd
> > also organize different desktop workloads into cgroups. The whole
> > point is the cgroup.
>
> Yeah, I know what a cgroup represents. Which is why I mentioned in the
> next paragraph that we are still making decisions per-cgroup - we
> just organize the per-cpu cache based on swap devices.
> This way, two cgroups with a similar/same priority list can share the
> clusters, for each swapfile, on each CPU. There will be a lot less
> duplication and overhead. And two cgroups with different priority
> lists won't interfere with each other, since they'll target different
> swapfiles.
>
> Unless we want to nudge the swapfiles/clusters to be self-partitioned
> among the cgroups? :) IOW, each cluster contains pages mostly from a
> single cgroup (with some stragglers mixed in). I suppose that would be
> very useful for swap on rotational drives, where read contiguity is
> imperative, but not sure about other backends :-?
>
> Anyway, no strong opinions, to be completely honest :) Was just
> throwing out some ideas. Per-cgroup-per-cpu-per-order sounds good to
> me too, if it's easy to do.

Good point! I agree with your observations about self-partitioned
clusters and duplicated priority lists. One concern is the cost of
synchronization, specifically the cost incurred when accessing the
prioritized swap device.

From a simple performance perspective, a per-cgroup-per-CPU
implementation seems favorable - in line with the current swap
allocation fast path.

It seems most reasonable to carefully compare the pros and cons of the
two approaches. To summarize:

Option 1: per-cgroup-per-cpu
  Pros: fits the current upstream allocator; performance.
  Cons: duplicated priority structures (some memory consumption cost);
        self-partitioned clusters.

Option 2: per-cpu-per-order(-per-device)
  Pros: avoids the cons of Option 1 (no per-cgroup duplication).
  Cons: gives up the pros of Option 1 (cgroup-level locality on the
        fast path).

(A rough structural sketch of the two options follows at the end of
this mail.)

It's not easy to draw a definitive conclusion right away, and I should
also evaluate other pros and cons that may arise during the actual
implementation. So I'd like to take some time to review things in more
detail and share my thoughts and conclusions in the next patch series.

What do you think, Nhat and Kairui?
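
To make the comparison a bit more concrete, here is a rough sketch of
the data layout I currently have in mind for the two options. All type
and field names below are hypothetical and only for illustration; they
are not taken from the patches, and the existing allocator structures
may differ in detail:

/*
 * Option 1: every cgroup owns a per-CPU cache of the next cluster to
 * scan, one slot per allocation order, e.g. hanging off
 * struct mem_cgroup. The swap fast path stays a direct per-CPU
 * lookup, but each cgroup duplicates the whole structure (the memory
 * consumption cost above), and clusters become self-partitioned per
 * cgroup.
 */
struct swap_cgroup_pcp_cluster {		/* hypothetical name */
	unsigned int next[SWAP_NR_ORDERS];	/* next cluster to try, by order */
};

/*
 * Option 2: the same per-CPU, per-order cache hangs off each swap
 * device (struct swap_info_struct) instead. Cgroups whose priority
 * lists resolve to the same device share the cache, so nothing is
 * duplicated per cgroup, but clusters are not partitioned by cgroup
 * either.
 */
struct swap_device_pcp_cluster {		/* hypothetical name */
	unsigned int next[SWAP_NR_ORDERS];	/* next cluster to try, by order */
};

Either way, the per-cgroup priority list from this series would still
decide which device to look at first; the two options only differ in
where the per-CPU cluster hint lives and who shares it.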