linux-mm.kvack.org archive mirror
From: Shakeel Butt <shakeel.butt@linux.dev>
To: YoungJun Park <youngjun.park@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org,  Chris Li <chrisl@kernel.org>,
	Kairui Song <kasong@tencent.com>,
	 Kemeng Shi <shikemeng@huaweicloud.com>,
	Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
	 Barry Song <baohua@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	 Muchun Song <muchun.song@linux.dev>,
	gunho.lee@lge.com, taejoon.song@lge.com, austin.kim@lge.com
Subject: Re: [RFC PATCH v2 0/5] mm/swap, memcg: Introduce swap tiers for cgroup based swap control
Date: Sun, 22 Feb 2026 21:56:13 -0800
Message-ID: <aZvX0HZy1PDylL8A@linux.dev>
In-Reply-To: <aZnBo+P3ifskts9J@yjaykim-PowerEdge-T330>

Hi YoungJun,

I see you have sent a separate email with BPF-specific questions; I will respond
to that separately. Here I will respond to the other questions and comments.

On Sat, Feb 21, 2026 at 11:30:59PM +0900, YoungJun Park wrote:
> On Fri, Feb 20, 2026 at 07:47:22PM -0800, Shakeel Butt wrote:
[...]
> 
> > Taking a step back, can you describe your use-case a bit more and share
> > requirements?
> 
> Our use case is simple for now.
> We have two swap devices with different performance
> characteristics and want to assign different swap devices to different
> workloads (cgroups).

If you don't mind, can you share a bit more about the cgroup hierarchy structure
of your deployment? Do you use cgroup v1 or v2 in your production environment?

> 
> For some background, when I initially proposed this, I suggested allowing
> per-cgroup swap device priorities so that it could also accommodate the
> broader scenarios you mentioned. However, since even our own use case
> does not require reversing swap priorities within a cgroup, we pivoted
> to the "swap tier" mechanism that Chris proposed.
> 
> > 1. If more than one device is assigned to a workload, do you want to have
> >    some kind of ordering between them for the workload or do you want the option
> >    to have a round-robin kind of policy?
> 
> Both. If devices are in the same tier with the same priority, round robin.
> If they are in the same tier with different priorities, or in different
> tiers, ordering applies. The current tier structure should be able to
> satisfy either preference.

I assume these are the same swap priorities as today, right? You want similar
priority behavior within a tier.

> 
> > 2. What's the reason to use 'tiers' in the name? Is it similar to memory tiers
> >    and you want promotion/demotion among the tiers?
> 
> This was originally Chris's idea. I think he explained the rationale
> well in his reply.
> 
> > 3. If a workload has multiple swap devices assigned, can you describe the
> >    scenario where such workloads need to partition/divide given devices to their
> >    sub-workloads?
> 
> One possible scenario is reducing lock contention by partitioning swap
> devices between parent and child cgroups.

The lock contention is orthogonal (and a distraction here).

> 
> > Let's start with these questions. Please note that I want us to not just look at
> > the current use-case but brainstorm more future use-cases and then come up with
> > the solution which is more future proof.
> 
> We have clear production use cases from both us and Chris, and I also
> presented a deployment example in the cover letter.
> 
> I think it is hard to design concretely for future use cases at this
> point. When those needs become clearer, BPF with its flexibility
> would be a better fit then. I see BPF as a natural extension path
> rather than a starting point.
> 
> For now, guarding the memcg & tier behind a CONFIG option would
> let us move forward without committing to a stable interface, and
> we can always pivot to BPF later if needed.

I think your use-case is very clear. Before committing to any option, I want us
to brainstorm all the options, gather pros and cons, and then make an informed
decision. Anyway, I will respond to your other email in a day or two.

Shakeel


