From: Chengming Zhou <chengming.zhou@linux.dev>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>,
	Nhat Pham <nphamcs@gmail.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"yosryahmed@google.com" <yosryahmed@google.com>,
	"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	"21cnbao@gmail.com" <21cnbao@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"Zou, Nanhai" <nanhai.zou@intel.com>,
	"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
	"Gopal, Vinodh" <vinodh.gopal@intel.com>,
	Usama Arif <usamaarif642@gmail.com>
Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
Date: Fri, 30 Aug 2024 12:52:06 +0800
Message-ID: <8545b4d8-ba21-4607-8217-2b7b02ccb4d8@linux.dev>
In-Reply-To: <SJ0PR11MB5678AEDB9E47BB6267D5885CC9962@SJ0PR11MB5678.namprd11.prod.outlook.com>

On 2024/8/30 03:38, Sridhar, Kanchana P wrote:
> Hi Nhat,
> 
>> -----Original Message-----
>> From: Nhat Pham <nphamcs@gmail.com>
>> Sent: Thursday, August 29, 2024 10:11 AM
>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
>> hannes@cmpxchg.org; yosryahmed@google.com; ryan.roberts@arm.com;
>> Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com;
>> akpm@linux-foundation.org; Zou, Nanhai <nanhai.zou@intel.com>;
>> Feghali, Wajdi K <wajdi.k.feghali@intel.com>;
>> Gopal, Vinodh <vinodh.gopal@intel.com>;
>> Usama Arif <usamaarif642@gmail.com>;
>> Chengming Zhou <chengming.zhou@linux.dev>
>> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
>>
>> On Wed, Aug 28, 2024 at 5:06 PM Sridhar, Kanchana P
>> <kanchana.p.sridhar@intel.com> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Nhat Pham <nphamcs@gmail.com>
>>>> Sent: Wednesday, August 28, 2024 2:35 PM
>>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>>>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
>>>> hannes@cmpxchg.org; yosryahmed@google.com; ryan.roberts@arm.com;
>>>> Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com;
>>>> akpm@linux-foundation.org; Zou, Nanhai <nanhai.zou@intel.com>;
>>>> Feghali, Wajdi K <wajdi.k.feghali@intel.com>;
>>>> Gopal, Vinodh <vinodh.gopal@intel.com>
>>>> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
>>>>
>>>> On Wed, Aug 28, 2024 at 2:35 AM Kanchana P Sridhar
>>>> <kanchana.p.sridhar@intel.com> wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> This patch-series enables zswap_store() to accept and store mTHP
>>>>> folios. The most significant contribution in this series is from the
>>>>> earlier RFC submitted by Ryan Roberts [1]. Ryan's original RFC has been
>>>>> migrated to v6.11-rc3 in patch 2/3 of this series.
>>>>>
>>>>> [1]: [RFC PATCH v1] mm: zswap: Store large folios without splitting
>>>>>       https://lore.kernel.org/linux-mm/20231019110543.3284654-1-ryan.roberts@arm.com/T/#u
>>>>>
>>>>> Additionally, there is an attempt to modularize some of the functionality
>>>>> in zswap_store(), to make it more amenable to supporting any-order
>>>>> mTHPs. For instance, the function zswap_store_entry() stores a
>>>>> zswap_entry in the xarray. Likewise, zswap_delete_stored_offsets() can be
>>>>> used to delete all offsets corresponding to a higher order folio stored
>>>>> in zswap.
>>>>>
>>>>
>>>> Will this have any conflict with mTHP swap work? Especially with mTHP
>>>> swap-in and zswap writeback.
>>>>
>>>> My understanding is that, from zswap's perspective, the large folio is
>>>> broken apart into independent subpages, correct? What happens when we
>>>> have a partially written-back mTHP (i.e. some subpages are still in
>>>> zswap, whereas others have been written back to swap)? Would this
>>>> automatically prevent mTHP swapin?
>>>
>>> That is a good point. To begin with, this patch-series would make the
>>> default behavior for mTHP swapout/storage and swapin for ZSWAP on par
>>> with ZRAM. From zswap's perspective, imo this is a significant step
>>> forward towards realizing cold memory storage with mTHP folios. However,
>>> it is only a starting point that makes the behavior uniform across
>>> zswap/zram. Initially, workloads would see a one-time benefit with
>>> reclaim being able to swapout mTHP folios to zswap without splitting.
>>> If the mTHPs were cold memory, then we would have derived latency gains
>>> towards memory savings (with zswap).
>>>
>>> However, if the mTHP were part of "not so cold" memory, this would result
>>> in a one-way mTHP conversion to 4K folios. Depending on workloads and
>>> their access patterns, we could either see individual 4K folios being
>>> swapped in, or entire chunks, if not the entire (original) mTHP, needing
>>> to be swapped in.
>>>
>>> It should be noted that this is more of a performance vs. cold memory
>>> preservation trade-off that needs to drive mTHP reclaim, storage, swapin
>>> and writeback policy. Different workloads could require different
>>> policies. However, even though this patch is only a starting point, it is
>>> still functionally correct by being equivalent to zram-mTHP, and
>>> compatible with the rest of mm and swap as far as mTHP is concerned.
>>> Another important functionality/data consistency decision I made in this
>>> patch series is error handling during zswap_store() of mTHP: in case of
>>> any errors, all swap offsets for the mTHP are deleted from the zswap
>>> xarray/zpool, since we know that the mTHP will now have to be stored in
>>> the backing swap device. IOW, an mTHP is either entirely stored in zswap,
>>> or entirely not stored in zswap.
>>>
>>> To answer your question, we would need to come up with what the
>>> semantics would need to be for zswap zpool storage granularity, swapin
>>> granularity, readahead granularity and writeback wrt mTHP, and how the
>>> overall swap sub-system needs to "preserve" mTHP vs. splitting mTHP into
>>> 4K/lower-order folios during swapout. Once we have a good understanding
>>> of these policies, we could implement them in zswap. Alternatively, we
>>> could develop an abstraction that is one level above zswap/zram and makes
>>> things easier and shareable between zswap and zram. By this, I mean
>>> fundamental assumptions such as consecutive swap offsets (for instance).
>>> To some extent, this implies that an mTHP as a swap entity is defined by
>>> consecutiveness of swap offsets. Maybe the policy to keep mTHPs in the
>>> system over extended duration might be to assemble them dynamically based
>>> on swapin_readahead() decisions (which are based on workload access
>>> patterns). In other words, mTHPs could be a useful abstraction that can
>>> be static or even dynamic based on working set characteristics and cold
>>> memory preservation. This is quite a complex topic imho.
>>>
>>> As we know, Barry Song and Chuanhua Han have started the discussion on
>>> this in their zram mTHP swapin series [1].
>>
>> Yeah I'm a bit more concerned with the correctness aspect. As long as
>> it's not buggy, then we can implement mTHP zswapout first, and force
>> individual subpage (z)swapin for now (since we cannot control
>> writeback from writing individual subpages).
> 
> Absolutely, this sounds like the way to go!
> 
>>
>> We can discuss strategy to harmonize mTHP, zswap (with writeback) as
>> we go along.
> 
> Sounds great :)
> 
>>
>> BTW, I think we're not cc-ing Chengming? Is the get_maintainer.pl script
>> not working properly... Let me manually add him in - please include
>> him in future submissions and responses, as he is also a zswap reviewer
>> :)
> 
> I think when I ran get_maintainer.pl, I was on v6.10. For sure, I will include
> Chengming in future submissions and responses :)

I may be a little late to the party, but I will take a look ASAP.
This is interesting and great work.

Thanks!

> 
>>
>> Also cc-ing Usama who is interested in this work.
> 
> Sounds great.
> 
> Thanks,
> Kanchana
> 
>>
>>>
>>> [1] https://lore.kernel.org/all/20240821074541.516249-3-hanchuanhua@oppo.com/T/#u
>>>
>>> Thanks,
>>> Kanchana
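
For readers following the thread, the all-or-nothing rule quoted above (on any
error, all swap offsets for the mTHP are deleted, so the mTHP is either
entirely in zswap or entirely not) can be modeled in a few lines. This is only
a user-space sketch under stated assumptions, not the kernel code: the dict
stands in for the zswap xarray, zlib stands in for the compressor, and
store_folio() and its parameters are hypothetical names chosen for
illustration.

```python
import zlib
from typing import Callable, Dict, List, Optional


def store_folio(
    store: Dict[int, bytes],
    first_offset: int,
    pages: List[bytes],
    compress: Callable[[bytes], Optional[bytes]],
) -> bool:
    """Store every subpage of a folio at consecutive offsets, or none of them.

    `store` is a stand-in for the zswap xarray (offset -> compressed data).
    `compress` returns None to signal a per-page failure (hypothetical).
    """
    offsets = range(first_offset, first_offset + len(pages))
    for offset, page in zip(offsets, pages):
        blob = compress(page)
        if blob is None:
            # Error path: drop every offset belonging to this folio, so the
            # caller can fall back to the backing swap device for all of it.
            for off in offsets:
                store.pop(off, None)
            return False
        store[offset] = blob
    return True
```

A caller would treat a False return as "nothing from this folio is in the
store" and send the whole folio to the backing device, which is the behavior
the quoted paragraph describes.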

