* [RFC PATCH v2 0/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
@ 2024-11-16 9:16 Chen Ridong
2024-11-16 9:16 ` [RFC PATCH v2 1/1] " Chen Ridong
0 siblings, 1 reply; 15+ messages in thread
From: Chen Ridong @ 2024-11-16 9:16 UTC (permalink / raw)
To: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
ryan.roberts, baohua
Cc: linux-mm, linux-kernel, chenridong, wangweiyang2, xieym_ict
From: Chen Ridong <chenridong@huawei.com>
The issue has been discussed in [1]. This patch follows Barry's
suggestion to fix it.
---
v2:
- detect folios whose writeback has completed and move them to the tail
of the LRU, as suggested by Barry Song
v1:
[1] https://lore.kernel.org/linux-kernel/20241010081802.290893-1-chenridong@huaweicloud.com/
Chen Ridong (1):
mm/vmscan: move the written-back folios to the tail of LRU after
shrinking
mm/vmscan.c | 37 +++++++++++++++++++++++++++++--------
1 file changed, 29 insertions(+), 8 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-16 9:16 [RFC PATCH v2 0/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking Chen Ridong
@ 2024-11-16 9:16 ` Chen Ridong
2024-11-17 3:26 ` Barry Song
2024-11-18 4:03 ` Matthew Wilcox
0 siblings, 2 replies; 15+ messages in thread
From: Chen Ridong @ 2024-11-16 9:16 UTC (permalink / raw)
To: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
ryan.roberts, baohua
Cc: linux-mm, linux-kernel, chenridong, wangweiyang2, xieym_ict
From: Chen Ridong <chenridong@huawei.com>
An issue was found with the following testing steps:
1. Compile with CONFIG_TRANSPARENT_HUGEPAGE=y.
2. Mount memcg v1, create a memcg named test_memcg, and set its
limit_in_bytes=2.1G and memsw.limit_in_bytes=3G.
3. Create a 1G swap file, and allocate 2.2G of anon memory in test_memcg.
It was found that:
cat memory.usage_in_bytes
2144940032
cat memory.memsw.usage_in_bytes
2255056896
free -h
total used free
Mem: 31Gi 2.1Gi 27Gi
Swap: 1.0Gi 618Mi 405Mi
As shown above, test_memcg accounts for only about 100M of swap, but more
than 600M of swap is in use system-wide, which means roughly 500M may be
wasted because other memcgs cannot use that swap space.
It can be explained as follows:
1. When entering shrink_inactive_list, folios are isolated from the LRU
from tail to head. To keep it simple, assume only folioN is taken off
the LRU.
inactive lru: folio1<->folio2<->folio3...<->folioN-1
isolated list: folioN
2. In shrink_folio_list, if folioN is a THP (2M), it may be split and
added to the swap cache folio by folio. After being added to the swap
cache, writeback I/O to swap is submitted for each folio, which is
asynchronous. When shrink_folio_list finishes, the isolated folios are
moved back to the head of the inactive LRU, which may then look like
this, with 512 folios moved to its head:
folioN512<->folioN511<->...folioN1<->folio1<->folio2...<->folioN-1
I/O was committed from folioN1 through folioN512, and each later folio
was added to the head of 'ret_folios' in shrink_folio_list, so the
resulting order is folioN512->folioN511->...->folioN1.
3. When a folio's writeback I/O completes, the folio is rotated to the
tail of the LRU, one by one. Assuming folioN1, folioN2, ..., folioN512
complete in order (I/O was committed in that order), they are rotated
to the tail of the LRU in that order (folioN1<->...folioN511<->folioN512).
Therefore, the folios at the tail of the LRU can be reclaimed as soon
as possible.
folio1<->folio2<->...<->folioN-1<->folioN1<->...folioN511<->folioN512
4. However, shrink_folio_list and folio writeback run asynchronously.
If a THP is split, shrink_folio_list loops at least 512 times, which
means some folios can finish writeback before shrink_folio_list
completes, and those folios then fail to be rotated to the tail of the
LRU. The LRU may look like this:
folioN50<->folioN49<->...folioN1<->folio1<->folio2...<->folioN-1<->
folioN51<->folioN52<->...folioN511<->folioN512
Although folios N1-N50 have finished writing back, they are still at
the head of the LRU. Their writeback completed while shrink_folio_list()
was still looping, so folio_rotate_reclaimable(), called from
folio_end_writeback(), could not move them to the tail of the LRU: at
that point they were not on the LRU but still on the local 'folio_list'.
Since isolation scans the LRU from tail to head, it is hard to scan
these folios again soon.
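As a simplified illustration (this is not the exact upstream code), the
rotation is only attempted when a condition like the one below holds;
an isolated folio fails the folio_test_lru() check because isolation
clears its LRU flag:

static bool can_rotate_to_tail(struct folio *folio)
{
	/* Hypothetical helper, for illustration only. */
	return !folio_test_locked(folio) && !folio_test_dirty(folio) &&
	       !folio_test_unevictable(folio) && folio_test_lru(folio);
}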
As a result, a large number of folios may have been added to the swap
cache but cannot be reclaimed in time, which reduces reclaim efficiency
and prevents other memcgs from using this swap space even if they
trigger OOM.
To fix this issue, folios whose writeback has completed should be moved
to the tail of the LRU instead of always being placed at its head when
shrink_folio_list finishes. This is done in two steps:
1. In shrink_folio_list, the folios that had I/O committed (paged out)
are added to the head of 'folio_list', which is returned to the
caller.
2. When shrink_folio_list finishes, the number of folios that were
paged out is known, and they are all at the head of 'folio_list',
ready to be moved back to the LRU. So, in move_folios_to_lru, if any
of the first 'nr_io' folios (those that were paged out) have completed
writeback, move them to the tail of the LRU; otherwise, move them to
the head of the LRU.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
mm/vmscan.c | 37 +++++++++++++++++++++++++++++--------
1 file changed, 29 insertions(+), 8 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 76378bc257e3..04f7eab9d818 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1046,6 +1046,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
struct folio_batch free_folios;
LIST_HEAD(ret_folios);
LIST_HEAD(demote_folios);
+ LIST_HEAD(pageout_folios);
unsigned int nr_reclaimed = 0;
unsigned int pgactivate = 0;
bool do_demote_pass;
@@ -1061,7 +1062,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
struct address_space *mapping;
struct folio *folio;
enum folio_references references = FOLIOREF_RECLAIM;
- bool dirty, writeback;
+ bool dirty, writeback, is_pageout = false;
unsigned int nr_pages;
cond_resched();
@@ -1384,6 +1385,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
nr_pages = 1;
}
stat->nr_pageout += nr_pages;
+ is_pageout = true;
if (folio_test_writeback(folio))
goto keep;
@@ -1508,7 +1510,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
keep_locked:
folio_unlock(folio);
keep:
- list_add(&folio->lru, &ret_folios);
+ if (is_pageout)
+ list_add(&folio->lru, &pageout_folios);
+ else
+ list_add(&folio->lru, &ret_folios);
VM_BUG_ON_FOLIO(folio_test_lru(folio) ||
folio_test_unevictable(folio), folio);
}
@@ -1551,6 +1556,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
free_unref_folios(&free_folios);
list_splice(&ret_folios, folio_list);
+ list_splice(&pageout_folios, folio_list);
count_vm_events(PGACTIVATE, pgactivate);
if (plug)
@@ -1826,11 +1832,14 @@ static bool too_many_isolated(struct pglist_data *pgdat, int file,
/*
* move_folios_to_lru() moves folios from private @list to appropriate LRU list.
+ * @lruvec: The lruvec the folios are moved to.
+ * @list: The list of folios to move to the lruvec.
+ * @nr_io: The number of leading folios on @list that had I/O committed.
*
* Returns the number of pages moved to the given lruvec.
*/
static unsigned int move_folios_to_lru(struct lruvec *lruvec,
- struct list_head *list)
+ struct list_head *list, unsigned int nr_io)
{
int nr_pages, nr_moved = 0;
struct folio_batch free_folios;
@@ -1880,9 +1889,21 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec,
* inhibits memcg migration).
*/
VM_BUG_ON_FOLIO(!folio_matches_lruvec(folio, lruvec), folio);
- lruvec_add_folio(lruvec, folio);
+ /*
+ * If the folio had I/O committed and has been completely written
+ * back, add it to the tail of the LRU so it can be reclaimed as
+ * soon as possible.
+ */
+ if (nr_io > 0 &&
+ !folio_test_reclaim(folio) &&
+ !folio_test_writeback(folio))
+ lruvec_add_folio_tail(lruvec, folio);
+ else
+ lruvec_add_folio(lruvec, folio);
+
nr_pages = folio_nr_pages(folio);
nr_moved += nr_pages;
+ nr_io = nr_io > nr_pages ? (nr_io - nr_pages) : 0;
if (folio_test_active(folio))
workingset_age_nonresident(lruvec, nr_pages);
}
@@ -1960,7 +1981,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
spin_lock_irq(&lruvec->lru_lock);
- move_folios_to_lru(lruvec, &folio_list);
+ move_folios_to_lru(lruvec, &folio_list, stat.nr_pageout);
__mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
stat.nr_demoted);
@@ -2111,8 +2132,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
*/
spin_lock_irq(&lruvec->lru_lock);
- nr_activate = move_folios_to_lru(lruvec, &l_active);
- nr_deactivate = move_folios_to_lru(lruvec, &l_inactive);
+ nr_activate = move_folios_to_lru(lruvec, &l_active, 0);
+ nr_deactivate = move_folios_to_lru(lruvec, &l_inactive, 0);
__count_vm_events(PGDEACTIVATE, nr_deactivate);
__count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_deactivate);
@@ -4627,7 +4648,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
spin_lock_irq(&lruvec->lru_lock);
- move_folios_to_lru(lruvec, &list);
+ move_folios_to_lru(lruvec, &list, 0);
walk = current->reclaim_state->mm_walk;
if (walk && walk->batched) {
--
2.34.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-16 9:16 ` [RFC PATCH v2 1/1] " Chen Ridong
@ 2024-11-17 3:26 ` Barry Song
2024-11-18 2:18 ` Chen Ridong
2024-11-18 4:03 ` Matthew Wilcox
1 sibling, 1 reply; 15+ messages in thread
From: Barry Song @ 2024-11-17 3:26 UTC (permalink / raw)
To: Chen Ridong
Cc: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
ryan.roberts, linux-mm, linux-kernel, chenridong, wangweiyang2,
xieym_ict
On Sat, Nov 16, 2024 at 10:26 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>
> From: Chen Ridong <chenridong@huawei.com>
>
> An issue was found with the following testing step:
> 1. Compile with CONFIG_TRANSPARENT_HUGEPAGE=y
> 2. Mount memcg v1, and create memcg named test_memcg and set
> usage_in_bytes=2.1G, memsw.usage_in_bytes=3G.
> 3. Create a 1G swap file, and allocate 2.2G anon memory in test_memcg.
>
> It was found that:
>
> cat memory.usage_in_bytes
> 2144940032
> cat memory.memsw.usage_in_bytes
> 2255056896
>
> free -h
> total used free
> Mem: 31Gi 2.1Gi 27Gi
> Swap: 1.0Gi 618Mi 405Mi
>
> As shown above, the test_memcg used about 100M swap, but 600M+ swap memory
> was used, which means that 500M may be wasted because other memcgs can not
> use these swap memory.
>
> It can be explained as follows:
> 1. When entering shrink_inactive_list, it isolates folios from lru from
> tail to head. If it just takes folioN from lru(make it simple).
>
> inactive lru: folio1<->folio2<->folio3...<->folioN-1
> isolated list: folioN
>
> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> and added to swap cache folio by folio. After adding to swap cache,
> it will submit io to writeback folio to swap, which is asynchronous.
> When shrink_page_list is finished, the isolated folios list will be
> moved back to the head of inactive lru. The inactive lru may just look
> like this, with 512 filioes have been move to the head of inactive lru.
>
> folioN512<->folioN511<->...filioN1<->folio1<->folio2...<->folioN-1
>
> It committed io from folioN1 to folioN512, the later folios committed
> was added to head of the 'ret_folios' in the shrink_page_list function.
> As a result, the order was shown as folioN512->folioN511->...->folioN1.
>
> 3. When folio writeback io is completed, the folio may be rotated to tail
> of the lru one by one. It's assumed that filioN1,filioN2, ...,filioN512
> are completed in order(commit io in this order), and they are rotated to
> the tail of the LRU in order (filioN1<->...folioN511<->folioN512).
> Therefore, those folios that are tail of the lru will be reclaimed as
> soon as possible.
>
> folio1<->folio2<->...<->folioN-1<->filioN1<->...folioN511<->folioN512
>
> 4. However, shrink_page_list and folio writeback are asynchronous. If THP
> is splited, shrink_page_list loops at least 512 times, which means that
> shrink_page_list is not completed but some folios writeback have been
> completed, and this may lead to failure to rotate these folios to the
> tail of lru. The lru may look likes as below:
>
> folioN50<->folioN49<->...filioN1<->folio1<->folio2...<->folioN-1<->
> folioN51<->folioN52<->...folioN511<->folioN512
>
> Although those folios (N1-N50) have been finished writing back, they
> are still at the head of the lru. This is because their writeback_end
> occurred while it were still looping in shrink_folio_list(), causing
> folio_end_writeback()'s folio_rotate_reclaimable() to fail in moving
> these folios, which are not in the LRU but still in the 'folio_list',
> to the tail of the LRU.
> When isolating folios from lru, it scans from tail to head, so it is
> difficult to scan those folios again.
>
> What mentioned above may lead to a large number of folios have been added
> to swap cache but can not be reclaimed in time, which may reduce reclaim
> efficiency and prevent other memcgs from using this swap memory even if
> they trigger OOM.
>
> To fix this issue, the folios whose writeback has been completed should be
> move to the tail of the LRU instead of always placing them at the head of
> the LRU when the shrink_page_list is finished. It can be realized by
> following steps.
> 1. In the shrink_page_list function, the folios whose are committed to
It seems like there's a grammatical error here—whose something?
> are added to the head of 'folio_list', which will be return to the
> caller.
> 2. When shrink_page_list finishes, it is known that how many folios have
> been pageout, and they are all at the head of 'folio_list', which is
> ready be moved back to LRU. So, in the 'move_folios_to_lru function'
> function, if the first 'nr_io' folios (which have been pageout) have
> been written back completely, move them to the tail of LRU. Otherwise,
> move them to the head of the LRU.
>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
> mm/vmscan.c | 37 +++++++++++++++++++++++++++++--------
> 1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 76378bc257e3..04f7eab9d818 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1046,6 +1046,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> struct folio_batch free_folios;
> LIST_HEAD(ret_folios);
> LIST_HEAD(demote_folios);
> + LIST_HEAD(pageout_folios);
> unsigned int nr_reclaimed = 0;
> unsigned int pgactivate = 0;
> bool do_demote_pass;
> @@ -1061,7 +1062,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> struct address_space *mapping;
> struct folio *folio;
> enum folio_references references = FOLIOREF_RECLAIM;
> - bool dirty, writeback;
> + bool dirty, writeback, is_pageout = false;
> unsigned int nr_pages;
>
> cond_resched();
> @@ -1384,6 +1385,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> nr_pages = 1;
> }
> stat->nr_pageout += nr_pages;
> + is_pageout = true;
>
> if (folio_test_writeback(folio))
> goto keep;
> @@ -1508,7 +1510,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> keep_locked:
> folio_unlock(folio);
> keep:
> - list_add(&folio->lru, &ret_folios);
> + if (is_pageout)
> + list_add(&folio->lru, &pageout_folios);
> + else
> + list_add(&folio->lru, &ret_folios);
> VM_BUG_ON_FOLIO(folio_test_lru(folio) ||
> folio_test_unevictable(folio), folio);
> }
> @@ -1551,6 +1556,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> free_unref_folios(&free_folios);
>
> list_splice(&ret_folios, folio_list);
> + list_splice(&pageout_folios, folio_list);
Do we really need this pageout_folios list? Is the below not sufficient?
+ if (nr_io > 0 &&
+ !folio_test_reclaim(folio) &&
+ !folio_test_writeback(folio))
+ lruvec_add_folio_tail(lruvec, folio);
+ else
+ lruvec_add_folio(lruvec, folio);
> count_vm_events(PGACTIVATE, pgactivate);
>
> if (plug)
> @@ -1826,11 +1832,14 @@ static bool too_many_isolated(struct pglist_data *pgdat, int file,
>
> /*
> * move_folios_to_lru() moves folios from private @list to appropriate LRU list.
> + * @lruvec: The LRU vector the list is moved to.
> + * @list: The folio list are moved to lruvec
> + * @nr_io: The first nr folios of the list that have been committed io.
> *
> * Returns the number of pages moved to the given lruvec.
> */
> static unsigned int move_folios_to_lru(struct lruvec *lruvec,
> - struct list_head *list)
> + struct list_head *list, unsigned int nr_io)
> {
> int nr_pages, nr_moved = 0;
> struct folio_batch free_folios;
> @@ -1880,9 +1889,21 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec,
> * inhibits memcg migration).
> */
> VM_BUG_ON_FOLIO(!folio_matches_lruvec(folio, lruvec), folio);
> - lruvec_add_folio(lruvec, folio);
> + /*
> + * If the folio have been committed io and writed back completely,
> + * it should be added to the tailed to the lru, so it can
> + * be relaimed as soon as possible.
> + */
> + if (nr_io > 0 &&
> + !folio_test_reclaim(folio) &&
> + !folio_test_writeback(folio))
> + lruvec_add_folio_tail(lruvec, folio);
> + else
> + lruvec_add_folio(lruvec, folio);
> +
> nr_pages = folio_nr_pages(folio);
> nr_moved += nr_pages;
> + nr_io = nr_io > nr_pages ? (nr_io - nr_pages) : 0;
> if (folio_test_active(folio))
> workingset_age_nonresident(lruvec, nr_pages);
> }
> @@ -1960,7 +1981,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
> nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
>
> spin_lock_irq(&lruvec->lru_lock);
> - move_folios_to_lru(lruvec, &folio_list);
> + move_folios_to_lru(lruvec, &folio_list, stat.nr_pageout);
>
> __mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
> stat.nr_demoted);
> @@ -2111,8 +2132,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
> */
> spin_lock_irq(&lruvec->lru_lock);
>
> - nr_activate = move_folios_to_lru(lruvec, &l_active);
> - nr_deactivate = move_folios_to_lru(lruvec, &l_inactive);
> + nr_activate = move_folios_to_lru(lruvec, &l_active, 0);
> + nr_deactivate = move_folios_to_lru(lruvec, &l_inactive, 0);
>
> __count_vm_events(PGDEACTIVATE, nr_deactivate);
> __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_deactivate);
> @@ -4627,7 +4648,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
>
> spin_lock_irq(&lruvec->lru_lock);
>
> - move_folios_to_lru(lruvec, &list);
> + move_folios_to_lru(lruvec, &list, 0);
I'm not entirely convinced about using the 'nr' argument here.
Is the goal to differentiate between two cases?
1. we need to take care of written-back folios
2. we don't need to take care of written-back folios?
Would a bool be enough, or would it be better to provide separate helpers?
>
> walk = current->reclaim_state->mm_walk;
> if (walk && walk->batched) {
> --
> 2.34.1
>
Thanks
Barry
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-17 3:26 ` Barry Song
@ 2024-11-18 2:18 ` Chen Ridong
0 siblings, 0 replies; 15+ messages in thread
From: Chen Ridong @ 2024-11-18 2:18 UTC (permalink / raw)
To: Barry Song
Cc: akpm, mhocko, hannes, yosryahmed, yuzhao, david, willy,
ryan.roberts, linux-mm, linux-kernel, chenridong, wangweiyang2,
xieym_ict
On 2024/11/17 11:26, Barry Song wrote:
> On Sat, Nov 16, 2024 at 10:26 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>>
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> An issue was found with the following testing step:
>> 1. Compile with CONFIG_TRANSPARENT_HUGEPAGE=y
>> 2. Mount memcg v1, and create memcg named test_memcg and set
>> usage_in_bytes=2.1G, memsw.usage_in_bytes=3G.
>> 3. Create a 1G swap file, and allocate 2.2G anon memory in test_memcg.
>>
>> It was found that:
>>
>> cat memory.usage_in_bytes
>> 2144940032
>> cat memory.memsw.usage_in_bytes
>> 2255056896
>>
>> free -h
>> total used free
>> Mem: 31Gi 2.1Gi 27Gi
>> Swap: 1.0Gi 618Mi 405Mi
>>
>> As shown above, the test_memcg used about 100M swap, but 600M+ swap memory
>> was used, which means that 500M may be wasted because other memcgs can not
>> use these swap memory.
>>
>> It can be explained as follows:
>> 1. When entering shrink_inactive_list, it isolates folios from lru from
>> tail to head. If it just takes folioN from lru(make it simple).
>>
>> inactive lru: folio1<->folio2<->folio3...<->folioN-1
>> isolated list: folioN
>>
>> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
>> and added to swap cache folio by folio. After adding to swap cache,
>> it will submit io to writeback folio to swap, which is asynchronous.
>> When shrink_page_list is finished, the isolated folios list will be
>> moved back to the head of inactive lru. The inactive lru may just look
>> like this, with 512 filioes have been move to the head of inactive lru.
>>
>> folioN512<->folioN511<->...filioN1<->folio1<->folio2...<->folioN-1
>>
>> It committed io from folioN1 to folioN512, the later folios committed
>> was added to head of the 'ret_folios' in the shrink_page_list function.
>> As a result, the order was shown as folioN512->folioN511->...->folioN1.
>>
>> 3. When folio writeback io is completed, the folio may be rotated to tail
>> of the lru one by one. It's assumed that filioN1,filioN2, ...,filioN512
>> are completed in order(commit io in this order), and they are rotated to
>> the tail of the LRU in order (filioN1<->...folioN511<->folioN512).
>> Therefore, those folios that are tail of the lru will be reclaimed as
>> soon as possible.
>>
>> folio1<->folio2<->...<->folioN-1<->filioN1<->...folioN511<->folioN512
>>
>> 4. However, shrink_page_list and folio writeback are asynchronous. If THP
>> is splited, shrink_page_list loops at least 512 times, which means that
>> shrink_page_list is not completed but some folios writeback have been
>> completed, and this may lead to failure to rotate these folios to the
>> tail of lru. The lru may look likes as below:
>>
>> folioN50<->folioN49<->...filioN1<->folio1<->folio2...<->folioN-1<->
>> folioN51<->folioN52<->...folioN511<->folioN512
>>
>> Although those folios (N1-N50) have been finished writing back, they
>> are still at the head of the lru. This is because their writeback_end
>> occurred while it were still looping in shrink_folio_list(), causing
>> folio_end_writeback()'s folio_rotate_reclaimable() to fail in moving
>> these folios, which are not in the LRU but still in the 'folio_list',
>> to the tail of the LRU.
>> When isolating folios from lru, it scans from tail to head, so it is
>> difficult to scan those folios again.
>>
>> What mentioned above may lead to a large number of folios have been added
>> to swap cache but can not be reclaimed in time, which may reduce reclaim
>> efficiency and prevent other memcgs from using this swap memory even if
>> they trigger OOM.
>>
>> To fix this issue, the folios whose writeback has been completed should be
>> move to the tail of the LRU instead of always placing them at the head of
>> the LRU when the shrink_page_list is finished. It can be realized by
>> following steps.
>> 1. In the shrink_page_list function, the folios whose are committed to
>
> It seems like there's a grammatical error here—whose something?
>
>> are added to the head of 'folio_list', which will be return to the
>> caller.
>> 2. When shrink_page_list finishes, it is known that how many folios have
>> been pageout, and they are all at the head of 'folio_list', which is
>> ready be moved back to LRU. So, in the 'move_folios_to_lru function'
>> function, if the first 'nr_io' folios (which have been pageout) have
>> been written back completely, move them to the tail of LRU. Otherwise,
>> move them to the head of the LRU.
>>
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>> ---
>> mm/vmscan.c | 37 +++++++++++++++++++++++++++++--------
>> 1 file changed, 29 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 76378bc257e3..04f7eab9d818 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1046,6 +1046,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>> struct folio_batch free_folios;
>> LIST_HEAD(ret_folios);
>> LIST_HEAD(demote_folios);
>> + LIST_HEAD(pageout_folios);
>> unsigned int nr_reclaimed = 0;
>> unsigned int pgactivate = 0;
>> bool do_demote_pass;
>> @@ -1061,7 +1062,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>> struct address_space *mapping;
>> struct folio *folio;
>> enum folio_references references = FOLIOREF_RECLAIM;
>> - bool dirty, writeback;
>> + bool dirty, writeback, is_pageout = false;
>> unsigned int nr_pages;
>>
>> cond_resched();
>> @@ -1384,6 +1385,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>> nr_pages = 1;
>> }
>> stat->nr_pageout += nr_pages;
>> + is_pageout = true;
>>
>> if (folio_test_writeback(folio))
>> goto keep;
>> @@ -1508,7 +1510,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>> keep_locked:
>> folio_unlock(folio);
>> keep:
>> - list_add(&folio->lru, &ret_folios);
>> + if (is_pageout)
>> + list_add(&folio->lru, &pageout_folios);
>> + else
>> + list_add(&folio->lru, &ret_folios);
>> VM_BUG_ON_FOLIO(folio_test_lru(folio) ||
>> folio_test_unevictable(folio), folio);
>> }
>> @@ -1551,6 +1556,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>> free_unref_folios(&free_folios);
>>
>> list_splice(&ret_folios, folio_list);
>> + list_splice(&pageout_folios, folio_list);
>
> Do we really need this pageout_folios list? is the below not sufficient?
>
> + if (nr_io > 0 &&
> + !folio_test_reclaim(folio) &&
> + !folio_test_writeback(folio))
> + lruvec_add_folio_tail(lruvec, folio);
> + else
> + lruvec_add_folio(lruvec, folio);
>
Thank you for your reply.
I think it is needed. We want to move the written-back folios to the
tail of the LRU so that they can be scanned and reclaimed as soon as
possible.
How can move_folios_to_lru know which folios have been paged out and
written back?
PG_reclaim is set via folio_set_reclaim() when a folio is paged out,
and PG_writeback is set via folio_test_set_writeback() before the I/O
is committed. The 'pageout_folios' list tells the caller which folios
were paged out, and '!folio_test_reclaim(folio) &&
!folio_test_writeback(folio)' determines whether those folios have been
written back.
If we do not add 'pageout_folios', we can only rely on
'!folio_test_reclaim(folio) && !folio_test_writeback(folio)' to decide
whether to move a folio to the tail. That could mistakenly move folios
that were never paged out, but whose 'PG_writeback' and 'PG_reclaim'
flags happen to be clear, to the tail of the LRU, which may lead to
repeatedly scanning the same folios.
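Put differently, assuming a folio is already known to have been paged
out, a hypothetical helper like the one below (illustration only, not
part of the patch) captures the "writeback has finished" condition:

static bool pageout_io_completed(struct folio *folio)
{
	/*
	 * For a folio that was paged out, PG_reclaim and PG_writeback
	 * are both set before the I/O is submitted, and both end up
	 * clear once folio_end_writeback() has run.
	 */
	return !folio_test_reclaim(folio) && !folio_test_writeback(folio);
}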
>> count_vm_events(PGACTIVATE, pgactivate);
>>
>> if (plug)
>> @@ -1826,11 +1832,14 @@ static bool too_many_isolated(struct pglist_data *pgdat, int file,
>>
>> /*
>> * move_folios_to_lru() moves folios from private @list to appropriate LRU list.
>> + * @lruvec: The LRU vector the list is moved to.
>> + * @list: The folio list are moved to lruvec
>> + * @nr_io: The first nr folios of the list that have been committed io.
>> *
>> * Returns the number of pages moved to the given lruvec.
>> */
>> static unsigned int move_folios_to_lru(struct lruvec *lruvec,
>> - struct list_head *list)
>> + struct list_head *list, unsigned int nr_io)
>> {
>> int nr_pages, nr_moved = 0;
>> struct folio_batch free_folios;
>> @@ -1880,9 +1889,21 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec,
>> * inhibits memcg migration).
>> */
>> VM_BUG_ON_FOLIO(!folio_matches_lruvec(folio, lruvec), folio);
>> - lruvec_add_folio(lruvec, folio);
>> + /*
>> + * If the folio have been committed io and writed back completely,
>> + * it should be added to the tailed to the lru, so it can
>> + * be relaimed as soon as possible.
>> + */
>> + if (nr_io > 0 &&
>> + !folio_test_reclaim(folio) &&
>> + !folio_test_writeback(folio))
>> + lruvec_add_folio_tail(lruvec, folio);
>> + else
>> + lruvec_add_folio(lruvec, folio);
>> +
>> nr_pages = folio_nr_pages(folio);
>> nr_moved += nr_pages;
>> + nr_io = nr_io > nr_pages ? (nr_io - nr_pages) : 0;
>> if (folio_test_active(folio))
>> workingset_age_nonresident(lruvec, nr_pages);
>> }
>> @@ -1960,7 +1981,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
>> nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
>>
>> spin_lock_irq(&lruvec->lru_lock);
>> - move_folios_to_lru(lruvec, &folio_list);
>> + move_folios_to_lru(lruvec, &folio_list, stat.nr_pageout);
>>
>> __mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
>> stat.nr_demoted);
>> @@ -2111,8 +2132,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>> */
>> spin_lock_irq(&lruvec->lru_lock);
>>
>> - nr_activate = move_folios_to_lru(lruvec, &l_active);
>> - nr_deactivate = move_folios_to_lru(lruvec, &l_inactive);
>> + nr_activate = move_folios_to_lru(lruvec, &l_active, 0);
>> + nr_deactivate = move_folios_to_lru(lruvec, &l_inactive, 0);
>>
>> __count_vm_events(PGDEACTIVATE, nr_deactivate);
>> __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_deactivate);
>> @@ -4627,7 +4648,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
>>
>> spin_lock_irq(&lruvec->lru_lock);
>>
>> - move_folios_to_lru(lruvec, &list);
>> + move_folios_to_lru(lruvec, &list, 0);
>
> I'm not entirely convinced about using the 'nr' argument here.
> Is the goal to differentiate between two cases?
> 1. we need to take care of written-back folios
> 2. we don't need to take care of written-back folios?
>
isolate_lru_folios() scans the LRU from tail to head. Folios that have
already been scanned for shrinking are moved to the head of the LRU so
they are not scanned again right away. Only folios that have been paged
out and written back should be moved to the tail of the LRU, so they
can be reclaimed the next time they are scanned.
However, if folios have been paged out but not yet written back, they
should be moved to the head of the LRU. They will be moved to the tail
later, when folio_rotate_reclaimable() is invoked from
folio_end_writeback().
Folios that were not paged out and cannot be reclaimed should also be
moved to the head of the LRU to avoid repeated scanning. That is why we
need 'nr_io', which indicates which folios have been paged out.
> Would it be a bool or better to provide separate helpers?
>
+ if (nr_io > 0 &&
+ !folio_test_writeback(folio))
+ lruvec_add_folio_tail(lruvec, folio);
+ else
+ lruvec_add_folio(lruvec, folio);
+
I think this may be sufficient: use only 'PG_writeback' to determine
whether the folios have been written back.
Thank you again.
Best regards,
Ridong
>>
>> walk = current->reclaim_state->mm_walk;
>> if (walk && walk->batched) {
>> --
>> 2.34.1
>>
>
> Thanks
> Barry
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-16 9:16 ` [RFC PATCH v2 1/1] " Chen Ridong
2024-11-17 3:26 ` Barry Song
@ 2024-11-18 4:03 ` Matthew Wilcox
2024-11-18 4:14 ` Barry Song
1 sibling, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2024-11-18 4:03 UTC (permalink / raw)
To: Chen Ridong
Cc: akpm, mhocko, hannes, yosryahmed, yuzhao, david, ryan.roberts,
baohua, linux-mm, linux-kernel, chenridong, wangweiyang2,
xieym_ict
On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> and added to swap cache folio by folio. After adding to swap cache,
> it will submit io to writeback folio to swap, which is asynchronous.
> When shrink_page_list is finished, the isolated folios list will be
> moved back to the head of inactive lru. The inactive lru may just look
> like this, with 512 filioes have been move to the head of inactive lru.
I was hoping that we'd be able to stop splitting the folio when adding
to the swap cache. Ideally, we'd add the whole 2MB and write it back
as a single unit.
This is going to become much more important with memdescs. We'd have to
allocate 512 struct folios to do this, which would be about 10 4kB pages,
and if we're trying to swap out memory, we're probably low on memory.
So I don't like this solution you have at all because it doesn't help us
get to the solution we're going to need in about a year's time.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-18 4:03 ` Matthew Wilcox
@ 2024-11-18 4:14 ` Barry Song
2024-11-18 4:21 ` Matthew Wilcox
2024-11-18 9:41 ` chenridong
0 siblings, 2 replies; 15+ messages in thread
From: Barry Song @ 2024-11-18 4:14 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Chen Ridong, akpm, mhocko, hannes, yosryahmed, yuzhao, david,
ryan.roberts, linux-mm, linux-kernel, chenridong, wangweiyang2,
xieym_ict
On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> > 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> > and added to swap cache folio by folio. After adding to swap cache,
> > it will submit io to writeback folio to swap, which is asynchronous.
> > When shrink_page_list is finished, the isolated folios list will be
> > moved back to the head of inactive lru. The inactive lru may just look
> > like this, with 512 filioes have been move to the head of inactive lru.
>
> I was hoping that we'd be able to stop splitting the folio when adding
> to the swap cache. Ideally. we'd add the whole 2MB and write it back
> as a single unit.
This is already the case: adding to the swapcache doesn’t require splitting
THPs, but failing to allocate 2MB of contiguous swap slots will.
>
> This is going to become much more important with memdescs. We'd have to
> allocate 512 struct folios to do this, which would be about 10 4kB pages,
> and if we're trying to swap out memory, we're probably low on memory.
>
> So I don't like this solution you have at all because it doesn't help us
> get to the solution we're going to need in about a year's time.
>
Ridong might need to clarify why this splitting is occurring. If it’s due to the
failure to allocate swap slots, we still need a solution to address it.
Thanks
Barry
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-18 4:14 ` Barry Song
@ 2024-11-18 4:21 ` Matthew Wilcox
2024-11-25 1:19 ` chenridong
2024-11-27 0:08 ` Chris Li
2024-11-18 9:41 ` chenridong
1 sibling, 2 replies; 15+ messages in thread
From: Matthew Wilcox @ 2024-11-18 4:21 UTC (permalink / raw)
To: Barry Song
Cc: Chen Ridong, akpm, mhocko, hannes, yosryahmed, yuzhao, david,
ryan.roberts, linux-mm, linux-kernel, chenridong, wangweiyang2,
xieym_ict, Chris Li
On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> > > 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> > > and added to swap cache folio by folio. After adding to swap cache,
> > > it will submit io to writeback folio to swap, which is asynchronous.
> > > When shrink_page_list is finished, the isolated folios list will be
> > > moved back to the head of inactive lru. The inactive lru may just look
> > > like this, with 512 filioes have been move to the head of inactive lru.
> >
> > I was hoping that we'd be able to stop splitting the folio when adding
> > to the swap cache. Ideally. we'd add the whole 2MB and write it back
> > as a single unit.
>
> This is already the case: adding to the swapcache doesn’t require splitting
> THPs, but failing to allocate 2MB of contiguous swap slots will.
Agreed we need to understand why this is happening. As I've said a few
times now, we need to stop requiring contiguity. Real filesystems don't
need the contiguity (they become less efficient, but they can scatter a
single 2MB folio to multiple places).
Maybe Chris has a solution to this in the works?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-18 4:14 ` Barry Song
2024-11-18 4:21 ` Matthew Wilcox
@ 2024-11-18 9:41 ` chenridong
2024-11-18 9:55 ` Barry Song
1 sibling, 1 reply; 15+ messages in thread
From: chenridong @ 2024-11-18 9:41 UTC (permalink / raw)
To: Barry Song, Matthew Wilcox
Cc: Chen Ridong, akpm, mhocko, hannes, yosryahmed, yuzhao, david,
ryan.roberts, linux-mm, linux-kernel, wangweiyang2, xieym_ict
On 2024/11/18 12:14, Barry Song wrote:
> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
>>
>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
>>> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
>>> and added to swap cache folio by folio. After adding to swap cache,
>>> it will submit io to writeback folio to swap, which is asynchronous.
>>> When shrink_page_list is finished, the isolated folios list will be
>>> moved back to the head of inactive lru. The inactive lru may just look
>>> like this, with 512 filioes have been move to the head of inactive lru.
>>
>> I was hoping that we'd be able to stop splitting the folio when adding
>> to the swap cache. Ideally. we'd add the whole 2MB and write it back
>> as a single unit.
>
> This is already the case: adding to the swapcache doesn’t require splitting
> THPs, but failing to allocate 2MB of contiguous swap slots will.
>
>>
>> This is going to become much more important with memdescs. We'd have to
>> allocate 512 struct folios to do this, which would be about 10 4kB pages,
>> and if we're trying to swap out memory, we're probably low on memory.
>>
>> So I don't like this solution you have at all because it doesn't help us
>> get to the solution we're going to need in about a year's time.
>>
>
> Ridong might need to clarify why this splitting is occurring. If it’s due to the
> failure to allocate swap slots, we still need a solution to address it.
>
> Thanks
> Barry
shrink_folio_list
  add_to_swap
    folio_alloc_swap
      get_swap_pages
        scan_swap_map_slots
          /*
           * Swapfile is not block device or not using clusters so unable
           * to allocate large entries.
           */
          if (!(si->flags & SWP_BLKDEV) || !si->cluster_info)
                  return 0;
In my test, I use a regular file as swap, which is not 'SWP_BLKDEV',
so get_swap_pages() fails to allocate large entries.
I think this is a race between shrink_folio_list() running and the
asynchronous writeback. In my test, 512 folios (from the THP split)
were added to swap, and only about 60 of them had not been written back
by the time move_folios_to_lru() was invoked after shrink_folio_list().
What if writeback were even faster? This could happen even with only 32
folios (no THP) on the 'folio_list' passed to shrink_folio_list().
Best regards,
Ridong
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-18 9:41 ` chenridong
@ 2024-11-18 9:55 ` Barry Song
2024-11-27 0:17 ` Chris Li
0 siblings, 1 reply; 15+ messages in thread
From: Barry Song @ 2024-11-18 9:55 UTC (permalink / raw)
To: chenridong, Chris Li
Cc: Matthew Wilcox, Chen Ridong, akpm, mhocko, hannes, yosryahmed,
yuzhao, david, ryan.roberts, linux-mm, linux-kernel,
wangweiyang2, xieym_ict
On Mon, Nov 18, 2024 at 10:41 PM chenridong <chenridong@huawei.com> wrote:
>
>
>
> On 2024/11/18 12:14, Barry Song wrote:
> > On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> >>
> >> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> >>> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> >>> and added to swap cache folio by folio. After adding to swap cache,
> >>> it will submit io to writeback folio to swap, which is asynchronous.
> >>> When shrink_page_list is finished, the isolated folios list will be
> >>> moved back to the head of inactive lru. The inactive lru may just look
> >>> like this, with 512 filioes have been move to the head of inactive lru.
> >>
> >> I was hoping that we'd be able to stop splitting the folio when adding
> >> to the swap cache. Ideally. we'd add the whole 2MB and write it back
> >> as a single unit.
> >
> > This is already the case: adding to the swapcache doesn’t require splitting
> > THPs, but failing to allocate 2MB of contiguous swap slots will.
> >
> >>
> >> This is going to become much more important with memdescs. We'd have to
> >> allocate 512 struct folios to do this, which would be about 10 4kB pages,
> >> and if we're trying to swap out memory, we're probably low on memory.
> >>
> >> So I don't like this solution you have at all because it doesn't help us
> >> get to the solution we're going to need in about a year's time.
> >>
> >
> > Ridong might need to clarify why this splitting is occurring. If it’s due to the
> > failure to allocate swap slots, we still need a solution to address it.
> >
> > Thanks
> > Barry
>
> shrink_folio_list
> add_to_swap
> folio_alloc_swap
> get_swap_pages
> scan_swap_map_slots
> /*
> * Swapfile is not block device or not using clusters so unable
> * to allocate large entries.
> */
> if (!(si->flags & SWP_BLKDEV) || !si->cluster_info)
> return 0;
>
> In my test, I use a file as swap, which is not 'SWP_BLKDEV'. So it
> failed to get get_swap_pages.
Alright, a proper non-rotating swap block device would be much
better. In your case, though, cluster allocation isn’t supported.
>
> I think this is a race issue between 'shrink_folio_list' executing and
> writing back asynchronously. In my test, 512 folios(THP split) were
> added to swap, only about 60 folios had not been written back when
> 'move_folios_to_lru' was invoked after 'shrink_folio_list'. What if
> writing back faster? Maybe this will happen even 32 folios(without THP)
> are in the 'folio_list' of shrink_folio_list's inputs.
On a real non-rotating swap device, the race would occur only when
contiguous 2MB swap slots are unavailable.
Hi Chris,
I recall you mentioned unifying the code for swap devices and swap files, or
for non-rotating and rotating devices. I assume a swap file (not a block device)
would also be a practical use case?
>
> Best regards,
> Ridong
Thanks
Barry
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-18 4:21 ` Matthew Wilcox
@ 2024-11-25 1:19 ` chenridong
2024-11-28 23:08 ` Barry Song
2024-11-27 0:08 ` Chris Li
1 sibling, 1 reply; 15+ messages in thread
From: chenridong @ 2024-11-25 1:19 UTC (permalink / raw)
To: Matthew Wilcox, Barry Song, Chris Li
Cc: Chen Ridong, akpm, mhocko, hannes, yosryahmed, yuzhao, david,
ryan.roberts, linux-mm, linux-kernel, wangweiyang2, xieym_ict,
Chris Li
On 2024/11/18 12:21, Matthew Wilcox wrote:
> On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
>> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
>>>
>>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
>>>> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
>>>> and added to swap cache folio by folio. After adding to swap cache,
>>>> it will submit io to writeback folio to swap, which is asynchronous.
>>>> When shrink_page_list is finished, the isolated folios list will be
>>>> moved back to the head of inactive lru. The inactive lru may just look
>>>> like this, with 512 filioes have been move to the head of inactive lru.
>>>
>>> I was hoping that we'd be able to stop splitting the folio when adding
>>> to the swap cache. Ideally. we'd add the whole 2MB and write it back
>>> as a single unit.
>>
>> This is already the case: adding to the swapcache doesn’t require splitting
>> THPs, but failing to allocate 2MB of contiguous swap slots will.
>
> Agreed we need to understand why this is happening. As I've said a few
> times now, we need to stop requiring contiguity. Real filesystems don't
> need the contiguity (they become less efficient, but they can scatter a
> single 2MB folio to multiple places).
>
> Maybe Chris has a solution to this in the works?
>
Hi, Chris, do you have a better idea to solve this issue?
Best regards,
Ridong
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-18 4:21 ` Matthew Wilcox
2024-11-25 1:19 ` chenridong
@ 2024-11-27 0:08 ` Chris Li
1 sibling, 0 replies; 15+ messages in thread
From: Chris Li @ 2024-11-27 0:08 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Barry Song, Chen Ridong, akpm, mhocko, hannes, yosryahmed,
yuzhao, david, ryan.roberts, linux-mm, linux-kernel, chenridong,
wangweiyang2, xieym_ict
On Sun, Nov 17, 2024 at 8:22 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
> > On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> > > > 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> > > > and added to swap cache folio by folio. After adding to swap cache,
> > > > it will submit io to writeback folio to swap, which is asynchronous.
> > > > When shrink_page_list is finished, the isolated folios list will be
> > > > moved back to the head of inactive lru. The inactive lru may just look
> > > > like this, with 512 filioes have been move to the head of inactive lru.
> > >
> > > I was hoping that we'd be able to stop splitting the folio when adding
> > > to the swap cache. Ideally. we'd add the whole 2MB and write it back
> > > as a single unit.
> >
> > This is already the case: adding to the swapcache doesn’t require splitting
> > THPs, but failing to allocate 2MB of contiguous swap slots will.
>
> Agreed we need to understand why this is happening. As I've said a few
> times now, we need to stop requiring contiguity. Real filesystems don't
> need the contiguity (they become less efficient, but they can scatter a
> single 2MB folio to multiple places).
>
> Maybe Chris has a solution to this in the works?
Hi Matthew and Chenridong,
Sorry for the late reply.
I don't have a working solution yet. I just have some ideas.
One of the big challenges is what to do with the swap cache. Currently,
when a folio is added to the swap cache, it is assumed to occupy
contiguous swap entries, and breaking that assumption adds a lot of
complexity. To make things worse, the discontiguous swap entries might
belong to different xarrays due to the 64M swap address space sharding.
One idea is to have a special kind of swap device that does swap entry
redirecting.
For the swap-out path:
Let's say the real swapfile A is almost full and we want to allocate
four swap entries for folio F.
If there are contiguous swap entries in A, the swap allocator just
returns entries [A9..A12], with A9 as the head swap entry. That is the
same as the normal path we have now.
On the other hand, suppose there are no contiguous swap entries in A,
only the non-contiguous entries A1, A3, A5, A7.
In that case, we allocate R1, R2, R3, R4 from a special redirecting
swap device R, with an I/O redirect array [R1, A1, A3, A5, A7]. Swap
device R is virtual; there is no real file backing it, so its size can
grow or shrink as needed.
In add_to_swap_cache(), we set folio F->swap = R1 and add F to the swap
cache S with entries [R1..R4] pointing to folio F; in other words,
S[R1..R4] = F. We also add a lookup xarray L with L[R1..R4] =
[R1, A1, A3, A5, A7]. For the rest of the code, R1 is passed around as
the contiguous swap entry for folio F.
swap_writepage_bdev_async() will recognize R as a special device. It
looks up L[R1] to get [R1, A1, A3, A5, A7] and uses that entry list to
build the bio with 4 bio_vecs instead of 1, filling in [A1, A3, A5, A7].
That is the swap write path.
For swap-in, the page fault handler gets a fault at address X and looks
up the PTE containing swap entry R3. It looks up the swap cache at
S[R3] and gets nothing: folio F is not in the swap cache.
Recognizing that R is a remapping device, the swap core looks up L[R3]
= [R1, A1, A3, A5, A7]. If we want to swap in an order-2 folio, we
construct swap_read_folio_bdev_async() with the vector [A1, A3, A5, A7].
If we just want to swap in a single 4k page, we can construct the
vector as [A5] alone (the slot backing R3), given that the swap entries
start from R1.
That is the read path.
For simplicity, a lot of detail is omitted from this description. On
the implementation side there are also many possible optimizations,
e.g. using a pointer lookup for R1 instead of an xarray, or using a
struct to hold R1 and [A1, A3, A5, A7], etc.
This approach avoids a lot of the complexity of breaking the contiguity
assumption for swap cache entries, at the cost of an additional swap
cache address space R. The lookup mapping L[R1..R4] =
[R1, A1, A3, A5, A7] is the minimal data structure needed to track the
I/O remapping; I think that is unavoidable.
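To make that bookkeeping a bit more concrete, here is a rough sketch of
a possible lookup record (names and layout are hypothetical, not an
actual implementation):

struct swap_redirect {
	swp_entry_t  head;      /* R1, stored in folio->swap */
	unsigned int nr_slots;  /* 4 in the example above */
	swp_entry_t  real[];    /* A1, A3, A5, A7 on swapfile A */
};

L[R1..R4] would resolve to one such record, and both
swap_writepage_bdev_async() and swap_read_folio_bdev_async() would walk
real[] to build their bio_vecs.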
Please let me know if you see any problem with the above approach. As
always, feedback is welcome as well.
Thanks
Chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-18 9:55 ` Barry Song
@ 2024-11-27 0:17 ` Chris Li
0 siblings, 0 replies; 15+ messages in thread
From: Chris Li @ 2024-11-27 0:17 UTC (permalink / raw)
To: Barry Song
Cc: chenridong, Matthew Wilcox, Chen Ridong, akpm, mhocko, hannes,
yosryahmed, yuzhao, david, ryan.roberts, linux-mm, linux-kernel,
wangweiyang2, xieym_ict, Kairui Song
On Mon, Nov 18, 2024 at 1:56 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Mon, Nov 18, 2024 at 10:41 PM chenridong <chenridong@huawei.com> wrote:
> >
> >
> >
> > On 2024/11/18 12:14, Barry Song wrote:
> > > On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >>
> > >> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> > >>> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> > >>> and added to swap cache folio by folio. After adding to swap cache,
> > >>> it will submit io to writeback folio to swap, which is asynchronous.
> > >>> When shrink_page_list is finished, the isolated folios list will be
> > >>> moved back to the head of inactive lru. The inactive lru may just look
> > >>> like this, with 512 filioes have been move to the head of inactive lru.
> > >>
> > >> I was hoping that we'd be able to stop splitting the folio when adding
> > >> to the swap cache. Ideally. we'd add the whole 2MB and write it back
> > >> as a single unit.
> > >
> > > This is already the case: adding to the swapcache doesn’t require splitting
> > > THPs, but failing to allocate 2MB of contiguous swap slots will.
> > >
> > >>
> > >> This is going to become much more important with memdescs. We'd have to
> > >> allocate 512 struct folios to do this, which would be about 10 4kB pages,
> > >> and if we're trying to swap out memory, we're probably low on memory.
> > >>
> > >> So I don't like this solution you have at all because it doesn't help us
> > >> get to the solution we're going to need in about a year's time.
> > >>
> > >
> > > Ridong might need to clarify why this splitting is occurring. If it’s due to the
> > > failure to allocate swap slots, we still need a solution to address it.
> > >
> > > Thanks
> > > Barry
> >
> > shrink_folio_list
> > add_to_swap
> > folio_alloc_swap
> > get_swap_pages
> > scan_swap_map_slots
> > /*
> > * Swapfile is not block device or not using clusters so unable
> > * to allocate large entries.
> > */
> > if (!(si->flags & SWP_BLKDEV) || !si->cluster_info)
> > return 0;
> >
> > In my test, I use a file as swap, which is not 'SWP_BLKDEV'. So it
> > failed to get get_swap_pages.
>
> Alright, a proper non-rotating swap block device would be much
> better. In your case, though, cluster allocation isn’t supported.
Ah yes. The later part of the swap allocator series removes the
non-cluster allocation code path.
It is not merged into mm-unstable yet. With it, even a swapfile that is
not a block device will use the cluster allocator.
>
> >
> > I think this is a race issue between 'shrink_folio_list' executing and
> > writing back asynchronously. In my test, 512 folios(THP split) were
> > added to swap, only about 60 folios had not been written back when
> > 'move_folios_to_lru' was invoked after 'shrink_folio_list'. What if
> > writing back faster? Maybe this will happen even 32 folios(without THP)
> > are in the 'folio_list' of shrink_folio_list's inputs.
>
> On a real non-rotate swap device, the race condition would occur only when
> contiguous 2MB swap slots are unavailable.
>
> Hi Chris,
> I recall you mentioned unifying the code for swap devices and swap files, or
> for non-rotating and rotating devices. I assume a swap file (not a block device)
> would also be a practical user case?
I assume you mean non-SSD vs. SSD devices. In this follow-up series of
the swap allocator from Kairui, the old non-cluster allocator is
removed and the cluster allocator is used all the time.
https://lore.kernel.org/linux-mm/20241022192451.38138-4-ryncsn@gmail.com/
Chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-25 1:19 ` chenridong
@ 2024-11-28 23:08 ` Barry Song
2024-11-29 2:25 ` chenridong
0 siblings, 1 reply; 15+ messages in thread
From: Barry Song @ 2024-11-28 23:08 UTC (permalink / raw)
To: chenridong
Cc: Matthew Wilcox, Chris Li, Chen Ridong, akpm, mhocko, hannes,
yosryahmed, yuzhao, david, ryan.roberts, linux-mm, linux-kernel,
wangweiyang2, xieym_ict
On Mon, Nov 25, 2024 at 2:19 PM chenridong <chenridong@huawei.com> wrote:
>
>
>
> On 2024/11/18 12:21, Matthew Wilcox wrote:
> > On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
> >> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> >>>
> >>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> >>>> 2. In shrink_page_list function, if folioN is THP(2M), it may be splited
> >>>> and added to swap cache folio by folio. After adding to swap cache,
> >>>> it will submit io to writeback folio to swap, which is asynchronous.
> >>>> When shrink_page_list is finished, the isolated folios list will be
> >>>> moved back to the head of inactive lru. The inactive lru may just look
> >>>> like this, with 512 filioes have been move to the head of inactive lru.
> >>>
> >>> I was hoping that we'd be able to stop splitting the folio when adding
> >>> to the swap cache. Ideally. we'd add the whole 2MB and write it back
> >>> as a single unit.
> >>
> >> This is already the case: adding to the swapcache doesn’t require splitting
> >> THPs, but failing to allocate 2MB of contiguous swap slots will.
> >
> > Agreed we need to understand why this is happening. As I've said a few
> > times now, we need to stop requiring contiguity. Real filesystems don't
> > need the contiguity (they become less efficient, but they can scatter a
> > single 2MB folio to multiple places).
> >
> > Maybe Chris has a solution to this in the works?
> >
>
> Hi, Chris, do you have a better idea to solve this issue?
Not Chris here, but as I read the code again, we already have the code
below in evict_folios() to fix up the "missed folio_rotate_reclaimable()"
issue:
/* retry folios that may have missed
folio_rotate_reclaimable() */
list_move(&folio->lru, &clean);
Doesn't that work for you?
commit 359a5e1416caaf9ce28396a65ed3e386cc5de663
Author: Yu Zhao <yuzhao@google.com>
Date: Tue Nov 15 18:38:07 2022 -0700
mm: multi-gen LRU: retry folios written back while isolated
The page reclaim isolates a batch of folios from the tail of one of the
LRU lists and works on those folios one by one. For a suitable
swap-backed folio, if the swap device is async, it queues that folio for
writeback. After the page reclaim finishes an entire batch, it puts back
the folios it queued for writeback to the head of the original LRU list.
In the meantime, the page writeback flushes the queued folios also by
batches. Its batching logic is independent from that of the page reclaim.
For each of the folios it writes back, the page writeback calls
folio_rotate_reclaimable() which tries to rotate a folio to the tail.
folio_rotate_reclaimable() only works for a folio after the page reclaim
has put it back. If an async swap device is fast enough, the page
writeback can finish with that folio while the page reclaim is still
working on the rest of the batch containing it. In this case, that folio
will remain at the head and the page reclaim will not retry it before
reaching there.
This patch adds a retry to evict_folios(). After evict_folios() has
finished an entire batch and before it puts back folios it cannot free
immediately, it retries those that may have missed the rotation.
Before this patch, ~60% of folios swapped to an Intel Optane missed
folio_rotate_reclaimable(). After this patch, ~99% of missed folios were
reclaimed upon retry.
This problem affects relatively slow async swap devices like Samsung 980
Pro much less and does not affect sync swap devices like zram or zswap at
all.
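
To make that concrete, here is a much simplified sketch of the retry, not the
verbatim kernel code: the real loop in evict_folios() carries extra checks
(workingset, mapped, locked, skip_retry) and MGLRU flag fixups, the retry
condition below is an approximation, and the local names (list, clean, folio,
next, pgdat, sc, stat, reclaimed) are assumed from the surrounding context:

	/* after shrink_folio_list() has processed the whole isolated batch */
	list_for_each_entry_safe_reverse(folio, next, &list, lru) {
		/*
		 * A folio that came back clean and unreferenced most likely
		 * finished writeback while it was isolated and so missed
		 * folio_rotate_reclaimable(); retry it instead of putting it
		 * back at the head of the LRU.
		 */
		if (!folio_test_writeback(folio) && !folio_test_dirty(folio) &&
		    !folio_test_active(folio) && !folio_test_referenced(folio))
			list_move(&folio->lru, &clean);
	}

	if (!list_empty(&clean))
		reclaimed += shrink_folio_list(&clean, pgdat, sc, &stat, false);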
>
> Best regards,
> Ridong
Thanks
Barry
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-28 23:08 ` Barry Song
@ 2024-11-29 2:25 ` chenridong
2024-11-29 3:07 ` Barry Song
0 siblings, 1 reply; 15+ messages in thread
From: chenridong @ 2024-11-29 2:25 UTC (permalink / raw)
To: Barry Song, Yu Zhao
Cc: Matthew Wilcox, Chris Li, Chen Ridong, akpm, mhocko, hannes,
yosryahmed, yuzhao, david, ryan.roberts, linux-mm, linux-kernel,
wangweiyang2, xieym_ict
On 2024/11/29 7:08, Barry Song wrote:
> On Mon, Nov 25, 2024 at 2:19 PM chenridong <chenridong@huawei.com> wrote:
>>
>>
>>
>> On 2024/11/18 12:21, Matthew Wilcox wrote:
>>> On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
>>>> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
>>>>>
>>>>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
>>>>>> 2. In the shrink_page_list function, if folioN is a THP (2M), it may be split
>>>>>> and added to the swap cache folio by folio. After adding to the swap cache,
>>>>>> it will submit IO to write the folio back to swap, which is asynchronous.
>>>>>> When shrink_page_list is finished, the isolated folio list will be
>>>>>> moved back to the head of the inactive LRU. The inactive LRU may then look
>>>>>> like this, with 512 folios having been moved to the head of the inactive LRU.
>>>>>
>>>>> I was hoping that we'd be able to stop splitting the folio when adding
>>>>> to the swap cache. Ideally, we'd add the whole 2MB and write it back
>>>>> as a single unit.
>>>>
>>>> This is already the case: adding to the swapcache doesn’t require splitting
>>>> THPs, but failing to allocate 2MB of contiguous swap slots will.
>>>
>>> Agreed we need to understand why this is happening. As I've said a few
>>> times now, we need to stop requiring contiguity. Real filesystems don't
>>> need the contiguity (they become less efficient, but they can scatter a
>>> single 2MB folio to multiple places).
>>>
>>> Maybe Chris has a solution to this in the works?
>>>
>>
>> Hi Chris, do you have a better idea for solving this issue?
>
> Not Chris here. Reading the code again, we already have the code below in
> evict_folios() to fix up the "missed folio_rotate_reclaimable()" issue:
>
>	/* retry folios that may have missed folio_rotate_reclaimable() */
>	list_move(&folio->lru, &clean);
>
> Doesn't that work for you?
>
> commit 359a5e1416caaf9ce28396a65ed3e386cc5de663
> Author: Yu Zhao <yuzhao@google.com>
> Date: Tue Nov 15 18:38:07 2022 -0700
> mm: multi-gen LRU: retry folios written back while isolated
>
> The page reclaim isolates a batch of folios from the tail of one of the
> LRU lists and works on those folios one by one. For a suitable
> swap-backed folio, if the swap device is async, it queues that folio for
> writeback. After the page reclaim finishes an entire batch, it puts back
> the folios it queued for writeback to the head of the original LRU list.
>
> In the meantime, the page writeback flushes the queued folios also by
> batches. Its batching logic is independent from that of the page reclaim.
> For each of the folios it writes back, the page writeback calls
> folio_rotate_reclaimable() which tries to rotate a folio to the tail.
>
>
> folio_rotate_reclaimable() only works for a folio after the page reclaim
> has put it back. If an async swap device is fast enough, the page
> writeback can finish with that folio while the page reclaim is still
> working on the rest of the batch containing it. In this case, that folio
> will remain at the head and the page reclaim will not retry it before
> reaching there.
>
> This patch adds a retry to evict_folios(). After evict_folios() has
> finished an entire batch and before it puts back folios it cannot free
> immediately, it retries those that may have missed the rotation.
> Before this patch, ~60% of folios swapped to an Intel Optane missed
> folio_rotate_reclaimable(). After this patch, ~99% of missed folios were
> reclaimed upon retry.
>
> This problem affects relatively slow async swap devices like Samsung 980
> Pro much less and does not affect sync swap devices like zram or zswap at
> all.
>
>>
>> Best regards,
>> Ridong
>
> Thanks
> Barry
Thank you for your reply, Barry.
I found this issue on the 5.10 kernel and reproduced it on the next
tree with CONFIG_LRU_GEN_ENABLED disabled. I tested again with
CONFIG_LRU_GEN_ENABLED enabled, and the issue is indeed fixed there.

IIUC, commit 359a5e1416caaf9ce28396a65ed3e386cc5de663 only takes effect
when CONFIG_LRU_GEN_ENABLED is enabled, but the issue also exists when
CONFIG_LRU_GEN_ENABLED is disabled, and that case should be fixed too.

I read the code of commit 359a5e1416caaf9ce28396a65ed3e386cc5de663; it
detects folios that missed the rotation in a more involved way, but it
makes what is being done much clearer. Should I implement the fix in Yu
Zhao's way?
Best regards,
Ridong
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
2024-11-29 2:25 ` chenridong
@ 2024-11-29 3:07 ` Barry Song
0 siblings, 0 replies; 15+ messages in thread
From: Barry Song @ 2024-11-29 3:07 UTC (permalink / raw)
To: chenridong
Cc: Yu Zhao, Matthew Wilcox, Chris Li, Chen Ridong, akpm, mhocko,
hannes, yosryahmed, david, ryan.roberts, linux-mm, linux-kernel,
wangweiyang2, xieym_ict
On Fri, Nov 29, 2024 at 3:25 PM chenridong <chenridong@huawei.com> wrote:
>
>
>
> On 2024/11/29 7:08, Barry Song wrote:
> > On Mon, Nov 25, 2024 at 2:19 PM chenridong <chenridong@huawei.com> wrote:
> >>
> >>
> >>
> >> On 2024/11/18 12:21, Matthew Wilcox wrote:
> >>> On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
> >>>> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> >>>>>
> >>>>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> >>>>>> 2. In the shrink_page_list function, if folioN is a THP (2M), it may be split
> >>>>>> and added to the swap cache folio by folio. After adding to the swap cache,
> >>>>>> it will submit IO to write the folio back to swap, which is asynchronous.
> >>>>>> When shrink_page_list is finished, the isolated folio list will be
> >>>>>> moved back to the head of the inactive LRU. The inactive LRU may then look
> >>>>>> like this, with 512 folios having been moved to the head of the inactive LRU.
> >>>>>
> >>>>> I was hoping that we'd be able to stop splitting the folio when adding
> >>>>> to the swap cache. Ideally, we'd add the whole 2MB and write it back
> >>>>> as a single unit.
> >>>>
> >>>> This is already the case: adding to the swapcache doesn’t require splitting
> >>>> THPs, but failing to allocate 2MB of contiguous swap slots will.
> >>>
> >>> Agreed we need to understand why this is happening. As I've said a few
> >>> times now, we need to stop requiring contiguity. Real filesystems don't
> >>> need the contiguity (they become less efficient, but they can scatter a
> >>> single 2MB folio to multiple places).
> >>>
> >>> Maybe Chris has a solution to this in the works?
> >>>
> >>
> >> Hi Chris, do you have a better idea for solving this issue?
> >
> > Not Chris here. Reading the code again, we already have the code below in
> > evict_folios() to fix up the "missed folio_rotate_reclaimable()" issue:
> >
> >	/* retry folios that may have missed folio_rotate_reclaimable() */
> >	list_move(&folio->lru, &clean);
> >
> > Doesn't that work for you?
> >
> > commit 359a5e1416caaf9ce28396a65ed3e386cc5de663
> > Author: Yu Zhao <yuzhao@google.com>
> > Date: Tue Nov 15 18:38:07 2022 -0700
> > mm: multi-gen LRU: retry folios written back while isolated
> >
> > The page reclaim isolates a batch of folios from the tail of one of the
> > LRU lists and works on those folios one by one. For a suitable
> > swap-backed folio, if the swap device is async, it queues that folio for
> > writeback. After the page reclaim finishes an entire batch, it puts back
> > the folios it queued for writeback to the head of the original LRU list.
> >
> > In the meantime, the page writeback flushes the queued folios also by
> > batches. Its batching logic is independent from that of the page reclaim.
> > For each of the folios it writes back, the page writeback calls
> > folio_rotate_reclaimable() which tries to rotate a folio to the tail.
> >
> >
> > folio_rotate_reclaimable() only works for a folio after the page reclaim
> > has put it back. If an async swap device is fast enough, the page
> > writeback can finish with that folio while the page reclaim is still
> > working on the rest of the batch containing it. In this case, that folio
> > will remain at the head and the page reclaim will not retry it before
> > reaching there.
> >
> > This patch adds a retry to evict_folios(). After evict_folios() has
> > finished an entire batch and before it puts back folios it cannot free
> > immediately, it retries those that may have missed the rotation.
> > Before this patch, ~60% of folios swapped to an Intel Optane missed
> > folio_rotate_reclaimable(). After this patch, ~99% of missed folios were
> > reclaimed upon retry.
> >
> > This problem affects relatively slow async swap devices like Samsung 980
> > Pro much less and does not affect sync swap devices like zram or zswap at
> > all.
> >
> >>
> >> Best regards,
> >> Ridong
> >
> > Thanks
> > Barry
>
> Thank you for your reply, Barry.
> I found this issue on the 5.10 kernel and reproduced it on the next
> tree with CONFIG_LRU_GEN_ENABLED disabled. I tested again with
> CONFIG_LRU_GEN_ENABLED enabled, and the issue is indeed fixed there.
>
> IIUC, commit 359a5e1416caaf9ce28396a65ed3e386cc5de663 only takes effect
> when CONFIG_LRU_GEN_ENABLED is enabled, but the issue also exists when
> CONFIG_LRU_GEN_ENABLED is disabled, and that case should be fixed too.
>
> I read the code of commit 359a5e1416caaf9ce28396a65ed3e386cc5de663; it
> detects folios that missed the rotation in a more involved way, but it
> makes what is being done much clearer. Should I implement the fix in Yu
> Zhao's way?
Yes, this is exactly the same issue.
Since Yu only fixed it in MGLRU and you are still using the active/inactive
LRU, the same fix should be applied to the active/inactive LRU as well.
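
Purely as an illustration of where that could plug in, here is a minimal
sketch, assuming it is hooked into shrink_inactive_list() between
shrink_folio_list() and move_folios_to_lru(), reusing that function's
folio_list/stat/nr_reclaimed locals, with a deliberately simplified retry
condition and the stat accounting of the second pass glossed over; this is
not the actual patch:

	LIST_HEAD(clean);
	struct folio *folio, *next;

	nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);

	/*
	 * Folios that came back clean very likely finished writeback while
	 * isolated and therefore missed folio_rotate_reclaimable(); give
	 * reclaim another try instead of putting them back at the LRU head.
	 */
	list_for_each_entry_safe(folio, next, &folio_list, lru) {
		if (!folio_test_writeback(folio) && !folio_test_dirty(folio) &&
		    !folio_test_active(folio) && !folio_test_referenced(folio))
			list_move(&folio->lru, &clean);
	}

	if (!list_empty(&clean)) {
		nr_reclaimed += shrink_folio_list(&clean, pgdat, sc, &stat, false);
		/* survivors of the retry are put back together with the rest */
		list_splice_init(&clean, &folio_list);
	}

	/* ... then move_folios_to_lru(lruvec, &folio_list) as before ... */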
>
> Best regards,
> Ridong
thanks
barry
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2024-11-29 3:07 UTC | newest]
Thread overview: 15+ messages
2024-11-16 9:16 [RFC PATCH v2 0/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking Chen Ridong
2024-11-16 9:16 ` [RFC PATCH v2 1/1] " Chen Ridong
2024-11-17 3:26 ` Barry Song
2024-11-18 2:18 ` Chen Ridong
2024-11-18 4:03 ` Matthew Wilcox
2024-11-18 4:14 ` Barry Song
2024-11-18 4:21 ` Matthew Wilcox
2024-11-25 1:19 ` chenridong
2024-11-28 23:08 ` Barry Song
2024-11-29 2:25 ` chenridong
2024-11-29 3:07 ` Barry Song
2024-11-27 0:08 ` Chris Li
2024-11-18 9:41 ` chenridong
2024-11-18 9:55 ` Barry Song
2024-11-27 0:17 ` Chris Li