Date: Fri, 13 Jun 2025 15:49:02 +0900
From: YoungJun Park
To: Kairui Song
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev, shakeel.butt@linux.dev, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, shikemeng@huaweicloud.com, nphamcs@gmail.com, bhe@redhat.com, baohua@kernel.org, chrisl@kernel.org, muchun.song@linux.dev, iamjoonsoo.kim@lge.com, taejoon.song@lge.com, gunho.lee@lge.com
Subject: Re: [RFC PATCH 2/2] mm: swap: apply per cgroup swap priority mechansim on swap layer
References: <20250612103743.3385842-1-youngjun.park@lge.com> <20250612103743.3385842-3-youngjun.park@lge.com>

On Thu, Jun 12, 2025 at 07:14:20PM +0800, Kairui Song wrote:
> On Thu, Jun 12, 2025 at 6:43 PM wrote:
> >
> > From: "youngjun.park"
> >
>
> Hi, Youngjun,
>
> Thanks for sharing this series.
>
> > This patch implements swap device selection and swap on/off propagation
> > when a cgroup-specific swap priority is set.
> >
> > There is one workaround to this implementation as follows.
> > Current per-cpu swap cluster enforces swap device selection based solely
> > on CPU locality, overriding the swap cgroup's configured priorities.
>
> I've been thinking about this, we can switch to a per-cgroup-per-cpu
> next cluster selector, the problem with current code is that swap
> allocator is not designed with folio / cgroup in mind at all, so it's
> really ugly to implement, which is why I have following two patches in
> the swap table series:

This seems to be the most suitable alternative for upstream at the moment.
I think there are still a few things that need to be considered, though.
(Nhat pointed them out well, and I've shared my thoughts in that context.)

> https://lore.kernel.org/linux-mm/20250514201729.48420-18-ryncsn@gmail.com/
> https://lore.kernel.org/linux-mm/20250514201729.48420-22-ryncsn@gmail.com/
>
> The first one makes all swap allocation starts with a folio, the
> second one makes the allocator always folio aware. So you can know
> which cgroup is doing the allocation at anytime inside the allocator
> (and it reduced the number of argument, also improving performance :)
> )
> So the allocator can just use cgroup's swap info if available, plist,
> percpu cluster, and fallback to global locality in a very natural way.
>

Wow! This is exactly what I needed. I found it awkward to pass a memcg
parameter around. If the memcg can be identified naturally inside the
allocator, as you mentioned, that would be good both performance-wise
and design-wise.

> > Therefore, when a swap cgroup priority is assigned, we fall back to
> > using per-CPU clusters per swap device, similar to the previous behavior.
> >
> > A proper fix for this workaround will be evaluated in the next patch.
>
> Hmm, but this is already the last patch in the series?

Ah! "The next patch" refers to the next patch series. I'm still evaluating
this part and wasn't confident enough to include it in the current version.
First, I wanted to get feedback on the core part that I'm currently pursuing.
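
For illustration only, the selection order discussed above (use the cgroup's
swap info if it is available, otherwise fall back to the global order) could
be sketched in userspace C roughly as follows. This is not kernel code and is
not taken from either patch series: every name here (struct swap_dev,
struct cgroup_swap_prio, struct folio_ctx, pick_swap_dev) is hypothetical,
and the choice to return NULL when all of a cgroup's devices are full is an
assumption, not a decided policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a swap device; not a kernel structure. */
struct swap_dev {
	const char *name;
	int free_slots;
};

/* Hypothetical per-cgroup ordering, highest priority first. */
struct cgroup_swap_prio {
	struct swap_dev **devs;
	size_t nr;
};

/*
 * Hypothetical stand-in for "the allocator starts from a folio":
 * the folio tells us which cgroup, if any, has its own priorities.
 */
struct folio_ctx {
	const struct cgroup_swap_prio *cg; /* NULL: no cgroup priority set */
};

/* Return the first device in list order that still has free slots. */
static struct swap_dev *first_with_space(struct swap_dev **devs, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (devs[i]->free_slots > 0)
			return devs[i];
	return NULL;
}

/*
 * Use the cgroup's own ordering when one is configured, otherwise fall
 * back to the global ordering. Assumption: a cgroup with configured
 * priorities never spills over to the global list.
 */
static struct swap_dev *pick_swap_dev(const struct folio_ctx *folio,
				      struct swap_dev **global,
				      size_t nr_global)
{
	assert(folio != NULL);

	if (folio->cg)
		return first_with_space(folio->cg->devs, folio->cg->nr);
	return first_with_space(global, nr_global);
}
```

The point of the sketch is only that no memcg argument has to be passed down:
the folio context carries the cgroup, so the caller's signature stays the same
whether or not a cgroup priority is set.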