From: Chengming Zhou <chengming.zhou@linux.dev>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>,
Nhat Pham <nphamcs@gmail.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"yosryahmed@google.com" <yosryahmed@google.com>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"Huang, Ying" <ying.huang@intel.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>,
Usama Arif <usamaarif642@gmail.com>
Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
Date: Fri, 30 Aug 2024 12:52:06 +0800 [thread overview]
Message-ID: <8545b4d8-ba21-4607-8217-2b7b02ccb4d8@linux.dev> (raw)
In-Reply-To: <SJ0PR11MB5678AEDB9E47BB6267D5885CC9962@SJ0PR11MB5678.namprd11.prod.outlook.com>
On 2024/8/30 03:38, Sridhar, Kanchana P wrote:
> Hi Nhat,
>
>> -----Original Message-----
>> From: Nhat Pham <nphamcs@gmail.com>
>> Sent: Thursday, August 29, 2024 10:11 AM
>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
>> hannes@cmpxchg.org; yosryahmed@google.com; ryan.roberts@arm.com;
>> Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-
>> foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K
>> <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>;
>> Usama Arif <usamaarif642@gmail.com>; Chengming Zhou
>> <chengming.zhou@linux.dev>
>> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
>>
>> On Wed, Aug 28, 2024 at 5:06 PM Sridhar, Kanchana P
>> <kanchana.p.sridhar@intel.com> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Nhat Pham <nphamcs@gmail.com>
>>>> Sent: Wednesday, August 28, 2024 2:35 PM
>>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>>>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
>>>> hannes@cmpxchg.org; yosryahmed@google.com;
>> ryan.roberts@arm.com;
>>>> Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-
>>>> foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K
>>>> <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>
>>>> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
>>>>
>>>> On Wed, Aug 28, 2024 at 2:35 AM Kanchana P Sridhar
>>>> <kanchana.p.sridhar@intel.com> wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> This patch-series enables zswap_store() to accept and store mTHP
>>>>> folios. The most significant contribution in this series is from the
>>>>> earlier RFC submitted by Ryan Roberts [1]. Ryan's original RFC has been
>>>>> migrated to v6.11-rc3 in patch 2/4 of this series.
>>>>>
>>>>> [1]: [RFC PATCH v1] mm: zswap: Store large folios without splitting
>>>>> https://lore.kernel.org/linux-mm/20231019110543.3284654-1-
>>>> ryan.roberts@arm.com/T/#u
>>>>>
>>>>> Additionally, there is an attempt to modularize some of the functionality
>>>>> in zswap_store(), to make it more amenable to supporting any-order
>>>>> mTHPs. For instance, the function zswap_store_entry() stores a
>>>> zswap_entry
>>>>> in the xarray. Likewise, zswap_delete_stored_offsets() can be used to
>>>>> delete all offsets corresponding to a higher order folio stored in zswap.
>>>>>
>>>>
>>>> Will this have any conflict with mTHP swap work? Especially with mTHP
>>>> swap-in and zswap writeback.
>>>>
>>>> My understanding is from zswap's perspective, the large folio is
>>>> broken apart into independent subpages, correct? What happens when
>> we
>>>> have partially written back mTHP (i.e some subpages are in zswap
>>>> still, whereas others are written back to swap). Would this
>>>> automatically prevent mTHP swapin?
>>>
>>> That is a good point. To begin with, this patch-series would make the default
>>> behavior for mTHP swapout/storage and swapin for ZSWAP to be on par
>> with
>>> ZRAM. From zswap's perspective, imo this is a significant step forward
>> towards
>>> realizing cold memory storage with mTHP folios. However, it is only a
>> starting
>>> point that makes the behavior uniform across zswap/zram. Initially,
>> workloads
>>> would see a one-time benefit with reclaim being able to swapout mTHP
>>> folios without splitting, to zswap. If the mTHPs were cold memory, then we
>>> would have derived latency gains towards memory savings (with zswap).
>>>
>>> However, if the mTHP were part of "not so cold" memory, this would result
>>> in a one-way mTHP conversion to 4K folios. Depending on workloads and
>> their
>>> access patterns, we could either see individual 4K folios being swapped in,
>>> or entire chunks if not the entire (original) mTHP needing to be swapped in.
>>>
>>> It should be noted that this is more of a performance vs. cold memory
>>> preservation trade-off that needs to drive mTHP reclaim, storage, swapin
>> and
>>> writeback policy. Different workloads could require different policies.
>> However,
>>> even though this patch is only a starting point, it is still functionally correct
>>> by being equivalent to zram-mTHP, and compatible with the rest of mm and
>>> swap as far as mTHP. Another important functionality/data consistency
>> decision
>>> I made in this patch series is error handling during zswap_store() of mTHP:
>>> in case of any errors, all swap offsets for the mTHP are deleted from the
>>> zswap xarray/zpool, since we know that the mTHP will now have to be
>> stored
>>> in the backing swap device. IOW, an mTHP is either entirely stored in zswap,
>>> or entirely not stored in zswap.
>>>
>>> To answer your question, we would need to come up with what the
>> semantics
>>> would need to be for zswap zpool storage granularity, swapin granularity,
>>> readahead granularity and writeback wrt mTHP and how the overall swap
>>> sub-system needs to "preserve" mTHP vs. splitting mTHP into 4K/lower-
>> order
>>> folios during swapout. Once we have a good understanding of these policies,
>>> we could implement them in zswap. Alternately, develop an abstraction that
>> is
>>> one level above zswap/zram and makes things easier and shareable
>> between
>>> zswap and zram. By this, I mean fundamental assumptions such as
>> consecutive
>>> swap offsets (for instance). To some extent, this implies that an mTHP as a
>>> swap entity is defined by consecutiveness of swap offsets. Maybe the policy
>>> to keep mTHPs in the system over extended duration might be to assemble
>>> them dynamically based on swapin_readahead() decisions (which is based
>> on
>>> workload access patterns). In other words, mTHPs could be a useful
>> abstraction
>>> that can be static or even dynamic based on working set characteristics, and
>>> cold memory preservation. This is quite a complex topic imho.
>>>
>>> As we know, Barry Song and Chuanhua Han have started the discussion on
>>> this in their zram mTHP swapin series [1].
>>
>> Yeah I'm a bit more concerned with the correctness aspect. As long as
>> it's not buggy, then we can implement mTHP zswapout first, and force
>> individual subpage (z)swapin for now (since we cannot control
>> writeback from writing individual subpages).
>
> Absolutely, this sounds like the way to go!
>
>>
>> We can discuss strategy to harmonize mTHP, zswap (with writeback) as
>> we go along.
>
> Sounds great :)
>
>>
>> BTW, I think we're not cc-ing Chengming? Is the get_maintainers script
>> not working properly... Let me manually add him in - please include
>> him in future submission and responses, as he is also a zswap reviewer
>> :)
>
> I think when I ran get_maintainers.pl, I was in v6.10. For sure, will include
> Chengming in future submissions and responses :)
Maybe a little late for the party, will take a look ASAP.
It's an interesting and great work.
Thanks!
>
>>
>> Also cc-ing Usama who is interested in this work.
>
> Sounds great.
>
> Thanks,
> Kanchana
>
>>
>>>
>>> [1] https://lore.kernel.org/all/20240821074541.516249-3-
>> hanchuanhua@oppo.com/T/#u
>>>
>>> Thanks,
>>> Kanchana
next prev parent reply other threads:[~2024-08-30 4:52 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-28 9:35 Kanchana P Sridhar
2024-08-28 9:35 ` [PATCH v5 1/3] mm: Define obj_cgroup_get() if CONFIG_MEMCG is not defined Kanchana P Sridhar
2024-08-28 9:35 ` [PATCH v5 2/3] mm: zswap: zswap_store() extended to handle mTHP folios Kanchana P Sridhar
2024-08-28 9:35 ` [PATCH v5 3/3] mm: swap: Count successful mTHP ZSWAP stores in sysfs mTHP zswpout stats Kanchana P Sridhar
2024-08-28 15:55 ` [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios Nhat Pham
2024-08-28 17:23 ` Nhat Pham
2024-08-28 19:30 ` Sridhar, Kanchana P
2024-08-28 19:24 ` Sridhar, Kanchana P
2024-08-28 21:35 ` Nhat Pham
2024-08-29 0:06 ` Sridhar, Kanchana P
2024-08-29 17:10 ` Nhat Pham
2024-08-29 19:38 ` Sridhar, Kanchana P
2024-08-30 4:52 ` Chengming Zhou [this message]
2024-09-20 2:34 ` Sridhar, Kanchana P
2024-08-29 3:59 ` Sridhar, Kanchana P
2024-08-28 22:37 ` Yosry Ahmed
2024-08-29 0:20 ` Sridhar, Kanchana P
2024-08-29 1:01 ` Yosry Ahmed
2024-08-29 3:10 ` Sridhar, Kanchana P
2024-08-29 23:33 ` Nhat Pham
2024-08-29 23:38 ` Yosry Ahmed
2024-08-29 23:47 ` Nhat Pham
2024-08-29 23:55 ` Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8545b4d8-ba21-4607-8217-2b7b02ccb4d8@linux.dev \
--to=chengming.zhou@linux.dev \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=kanchana.p.sridhar@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nanhai.zou@intel.com \
--cc=nphamcs@gmail.com \
--cc=ryan.roberts@arm.com \
--cc=usamaarif642@gmail.com \
--cc=vinodh.gopal@intel.com \
--cc=wajdi.k.feghali@intel.com \
--cc=ying.huang@intel.com \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox