* [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
@ 2023-08-23 9:54 liwenyu01
2023-08-31 7:26 ` liwenyu01
0 siblings, 1 reply; 7+ messages in thread
From: liwenyu01 @ 2023-08-23 9:54 UTC (permalink / raw)
To: akpm, linux-mm; +Cc: linux-kernel, wangyun, liwenyu01
[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]
The current memory reclaim delay statistics only count direct memory
reclaim performed by the task in do_try_to_free_pages(). On systems with
NUMA enabled, some tasks occasionally experience slower response times
even though the total reclaim count does not increase; ftrace shows that
node_reclaim has occurred.
The memory reclaim that occurs in get_page_from_freelist() is also caused
by heavy memory load. To capture the impact of this reclaim on tasks, this
patch adds memory reclaim delay accounting to __node_reclaim().
Signed-off-by: Wen Yu Li <liwenyu01@bilibili.com>
---
mm/vmscan.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1080209a568b..d2471abce1ae 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in

 	cond_resched();
 	psi_memstall_enter(&pflags);
+	delayacct_freepages_start();
 	fs_reclaim_acquire(sc.gfp_mask);
 	/*
 	 * We need to be able to allocate from the reserves for RECLAIM_UNMAP
@@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	memalloc_noreclaim_restore(noreclaim_flag);
 	fs_reclaim_release(sc.gfp_mask);
 	psi_memstall_leave(&pflags);
+	delayacct_freepages_end();

 	trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
--
2.30.2
This message may contain confidential information, and is intended only for the use of the addressee(s) named above. If you have received this message in error, please contact the sender immediately and delete all copies from your system. You are hereby notified that any dissemination, distribution, preservation or copying of this message and/or attachments is strictly prohibited. Thank you for your understanding and cooperation.
* Re: [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
2023-08-23 9:54 [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist liwenyu01
@ 2023-08-31 7:26 ` liwenyu01
2023-09-02 23:44 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: liwenyu01 @ 2023-08-31 7:26 UTC (permalink / raw)
To: akpm, linux-mm; +Cc: linux-kernel, wangyun, liwenyu01
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb2312", Size: 2648 bytes --]
This patch just adds the delay statistics of get_page_from_freelist() for memory reclaim, without many modifications.
Anyone in particular I should cc to get this reviewed?
From: 文宇 <liwenyu01@bilibili.com>
Date: Wednesday, August 23, 2023 17:54
To: akpm@linux-foundation.org <akpm@linux-foundation.org>, linux-mm@kvack.org <linux-mm@kvack.org>
Cc: linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>, <wangyun@bilibili.com>, 文宇 <liwenyu01@bilibili.com>
Subject: [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
The current memory reclaim delay statistics only count the direct memory
reclaim of the task in do_try_to_free_pages(). In systems with NUMA
open, some tasks occasionally experience slower response times, but the
total count of reclaim does not increase, using ftrace can show that
node_reclaim has occurred.
The memory reclaim occurring in get_page_from_freelist() is also due to
heavy memory load. To get the impact of tasks in memory reclaim, this
patch adds the statistics of the memory reclaim delay statistics for
__node_reclaim().
Signed-off-by: Wen Yu Li <liwenyu01@bilibili.com>
---
mm/vmscan.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1080209a568b..d2471abce1ae 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
cond_resched();
psi_memstall_enter(&pflags);
+ delayacct_freepages_start();
fs_reclaim_acquire(sc.gfp_mask);
/*
* We need to be able to allocate from the reserves for RECLAIM_UNMAP
@@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
memalloc_noreclaim_restore(noreclaim_flag);
fs_reclaim_release(sc.gfp_mask);
psi_memstall_leave(&pflags);
+ delayacct_freepages_end();
trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
--
2.30.2
* Re: [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
2023-08-31 7:26 ` liwenyu01
@ 2023-09-02 23:44 ` Andrew Morton
2023-09-05 2:56 ` 答复: [External]Re: " liwenyu01
2023-09-05 5:32 ` liwenyu01
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2023-09-02 23:44 UTC (permalink / raw)
To: liwenyu01; +Cc: linux-mm, linux-kernel, wangyun
On Thu, 31 Aug 2023 07:26:20 +0000 "liwenyu01@bilibili.com" <liwenyu01@bilibili.com> wrote:
> reclaim of the task in do_try_to_free_pages(). In systems with NUMA
> open, some tasks occasionally experience slower response times, but the
> total count of reclaim does not increase, using ftrace can show that
> node_reclaim has occurred.
>
> The memory reclaim occurring in get_page_from_freelist() is also due to
> heavy memory load. To get the impact of tasks in memory reclaim, this
> patch adds the statistics of the memory reclaim delay statistics for
> __node_reclaim().
>
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>
> cond_resched();
> psi_memstall_enter(&pflags);
> + delayacct_freepages_start();
> fs_reclaim_acquire(sc.gfp_mask);
> /*
> * We need to be able to allocate from the reserves for RECLAIM_UNMAP
> @@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> memalloc_noreclaim_restore(noreclaim_flag);
> fs_reclaim_release(sc.gfp_mask);
> psi_memstall_leave(&pflags);
> + delayacct_freepages_end();
>
> trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
__node_reclaim() calls shrink_node() which at some point will call
do_try_to_free_pages() (yes?), which calls delayacct_freepages_start().
So we're effectively nesting calls to delayacct_freepages_start(),
which isn't designed for that?
* 答复: [External]Re: [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
2023-09-02 23:44 ` Andrew Morton
@ 2023-09-05 2:56 ` liwenyu01
2023-09-05 5:32 ` liwenyu01
1 sibling, 0 replies; 7+ messages in thread
From: liwenyu01 @ 2023-09-05 2:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, wangyun
[-- Attachment #1: Type: text/plain, Size: 2104 bytes --]
reclaim of the task in do_try_to_free_pages(). In systems with NUMA
> open, some tasks occasionally experience slower response times, but the
> total count of reclaim does not increase, using ftrace can show that
> node_reclaim has occurred.
>
> The memory reclaim occurring in get_page_from_freelist() is also due to
> heavy memory load. To get the impact of tasks in memory reclaim, this
> patch adds the statistics of the memory reclaim delay statistics for
> __node_reclaim().
>
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>
> cond_resched();
> psi_memstall_enter(&pflags);
> + delayacct_freepages_start();
> fs_reclaim_acquire(sc.gfp_mask);
> /*
> * We need to be able to allocate from the reserves for RECLAIM_UNMAP
> @@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> memalloc_noreclaim_restore(noreclaim_flag);
> fs_reclaim_release(sc.gfp_mask);
> psi_memstall_leave(&pflags);
> + delayacct_freepages_end();
>
> trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
__node_reclaim() calls shrink_node() which at some point will call
do_try_to_free_pages() (yes?), which calls delayacct_freepages_start().
So we're effectively nesting calls to delayacct_freepages_start(),
which isn't designed for that?
* Re: [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
2023-09-02 23:44 ` Andrew Morton
2023-09-05 2:56 ` 答复: [External]Re: " liwenyu01
@ 2023-09-05 5:32 ` liwenyu01
2023-09-06 0:53 ` Education Directorate
1 sibling, 1 reply; 7+ messages in thread
From: liwenyu01 @ 2023-09-05 5:32 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel, wangyun
[-- Attachment #1: Type: text/plain, Size: 2561 bytes --]
>> reclaim of the task in do_try_to_free_pages(). In systems with NUMA
>> open, some tasks occasionally experience slower response times, but the
>> total count of reclaim does not increase, using ftrace can show that
>> node_reclaim has occurred.
>>
>> The memory reclaim occurring in get_page_from_freelist() is also due to
>> heavy memory load. To get the impact of tasks in memory reclaim, this
>> patch adds the statistics of the memory reclaim delay statistics for
>> __node_reclaim().
>>
>> ...
>>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>>
>> cond_resched();
>> psi_memstall_enter(&pflags);
>> + delayacct_freepages_start();
>> fs_reclaim_acquire(sc.gfp_mask);
>> /*
>> * We need to be able to allocate from the reserves for RECLAIM_UNMAP
>> @@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>> memalloc_noreclaim_restore(noreclaim_flag);
>> fs_reclaim_release(sc.gfp_mask);
>> psi_memstall_leave(&pflags);
>> + delayacct_freepages_end();
>>
>> trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
>
> __node_reclaim() calls shrink_node() which at some point will call
> do_try_to_free_pages() (yes?), which calls delayacct_freepages_start().
>
> So we're effectively nesting calls to delayacct_freepages_start(),
> which isn't designed for that?
>
Sorry, the last reply was sent by mistake.
It seems that nothing in shrink_node() calls do_try_to_free_pages().
Moreover, do_try_to_free_pages() itself calls shrink_node() through
shrink_zones(). If shrink_node() did call do_try_to_free_pages() at some
point, wouldn't delayacct_freepages_start() already be nested today?
Best wishes.
* Re: [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
2023-09-05 5:32 ` liwenyu01
@ 2023-09-06 0:53 ` Education Directorate
2023-09-07 11:52 ` liwenyu01
0 siblings, 1 reply; 7+ messages in thread
From: Education Directorate @ 2023-09-06 0:53 UTC (permalink / raw)
To: liwenyu01; +Cc: akpm, linux-mm, linux-kernel, wangyun
On Tue, Sep 05, 2023 at 05:32:15AM +0000, liwenyu01@bilibili.com wrote:
> >> reclaim of the task in do_try_to_free_pages(). In systems with NUMA
> >> open, some tasks occasionally experience slower response times, but the
> >> total count of reclaim does not increase, using ftrace can show that
> >> node_reclaim has occurred.
> >>
> >> The memory reclaim occurring in get_page_from_freelist() is also due to
> >> heavy memory load. To get the impact of tasks in memory reclaim, this
> >> patch adds the statistics of the memory reclaim delay statistics for
> >> __node_reclaim().
> >>
> >> ...
> >>
> >> --- a/mm/vmscan.c
> >> +++ b/mm/vmscan.c
> >> @@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> >>
> >> cond_resched();
> >> psi_memstall_enter(&pflags);
> >> + delayacct_freepages_start();
> >> fs_reclaim_acquire(sc.gfp_mask);
> >> /*
> >> * We need to be able to allocate from the reserves for RECLAIM_UNMAP
> >> @@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> >> memalloc_noreclaim_restore(noreclaim_flag);
> >> fs_reclaim_release(sc.gfp_mask);
> >> psi_memstall_leave(&pflags);
> >> + delayacct_freepages_end();
> >>
> >> trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
> >
> > __node_reclaim() calls shrink_node() which at some point will call
> > do_try_to_free_pages() (yes?), which calls delayacct_freepages_start().
> >
> > So we're effectively nesting calls to delayacct_freepages_start(),
> > which isn't designed for that?
> >
> sorry, the last reply was a mistake.
>
> It seems that no point in shrink_node() will call do_try_to_free_pages().
> And do_try_to_free_pages() will call shrink_node() through shrink_zones(),
> if shrink_node() also has some point will call do_try_to_free_pages,then
> delayacct_freepages_start() is nested now?
That's because shrink_node() goes through shrink_list() via
shrink_lruvec()? do_try_to_free_pages() will call shrink_node(). Ideally
we should have some counters around __node_reclaim() and balance_pgdat()
like psi_memstall_* does. Do you want to mimic what psi_memstall_* does?
This would change the definition of delayacct free pages, but I don't think
it will make it worse.
Balbir Singh
* Re: [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist
2023-09-06 0:53 ` Education Directorate
@ 2023-09-07 11:52 ` liwenyu01
0 siblings, 0 replies; 7+ messages in thread
From: liwenyu01 @ 2023-09-07 11:52 UTC (permalink / raw)
To: Education Directorate, akpm; +Cc: linux-mm, linux-kernel, wangyun
[-- Attachment #1: Type: text/plain, Size: 4021 bytes --]
>> >> reclaim of the task in do_try_to_free_pages(). In systems with NUMA
>> >> open, some tasks occasionally experience slower response times, but the
>> >> total count of reclaim does not increase, using ftrace can show that
>> >> node_reclaim has occurred.
>> >>
>> >> The memory reclaim occurring in get_page_from_freelist() is also due to
>> >> heavy memory load. To get the impact of tasks in memory reclaim, this
>> >> patch adds the statistics of the memory reclaim delay statistics for
>> >> __node_reclaim().
>> >>
>> >> ...
>> >>
>> >> --- a/mm/vmscan.c
>> >> +++ b/mm/vmscan.c
>> >> @@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>> >>
>> >> cond_resched();
>> >> psi_memstall_enter(&pflags);
>> >> + delayacct_freepages_start();
>> >> fs_reclaim_acquire(sc.gfp_mask);
>> >> /*
>> >> * We need to be able to allocate from the reserves for RECLAIM_UNMAP
>> >> @@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>> >> memalloc_noreclaim_restore(noreclaim_flag);
>> >> fs_reclaim_release(sc.gfp_mask);
>> >> psi_memstall_leave(&pflags);
>> >> + delayacct_freepages_end();
>> >>
>> >> trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
>> >
>> > __node_reclaim() calls shrink_node() which at some point will call
>> > do_try_to_free_pages() (yes?), which calls delayacct_freepages_start().
>> >
>> > So we're effectively nesting calls to delayacct_freepages_start(),
>> > which isn't designed for that?
>> >
>> sorry, the last reply was a mistake.
>>
>> It seems that no point in shrink_node() will call do_try_to_free_pages().
>> And do_try_to_free_pages() will call shrink_node() through shrink_zones(),
>> if shrink_node() also has some point will call do_try_to_free_pages,then
>> delayacct_freepages_start() is nested now?
>
> That's because shrink_node() goes through shrink_list() via
> shrink_lruvec()? do_try_to_free_pages() will call shrink_node(). Ideally
> we should have some counters around __node_reclaim() and balance_pgdat()
> like psi_memstall_* does. Do you want to mimic what psi_memstall_* does?
> This would change the definition of delayacct free pages, but I don't think
> it will make it worse.
>
> Balbir Singh
The focus of delayacct should be the memory reclaim delay statistics for
each task, so it should have few direct connections with shrink_node()?
At least the current use of delayacct_freepages_start() does not appear to
be wrong, so is it really necessary to implement a new counting method?
Compared with adding delay statistics to balance_pgdat() for kswapd, isn't
it more meaningful to keep the definition of delayacct free pages and
account only for applications?
Keeping the definition of delayacct free pages and going back to this
simple patch: it does only one thing, which is to count the memory reclaim
delay caused by memory pressure on the memory allocation path of an
application. Currently the memory reclaim delay is measured only in
do_try_to_free_pages(); this patch adds a measurement point in
__node_reclaim(). Both do_try_to_free_pages() and __node_reclaim()
call shrink_node().
WenYu
Thread overview: 7+ messages
2023-08-23 9:54 [PATCH RFC] delayacct: add memory reclaim delay in get_page_from_freelist liwenyu01
2023-08-31 7:26 ` liwenyu01
2023-09-02 23:44 ` Andrew Morton
2023-09-05 2:56 ` 答复: [External]Re: " liwenyu01
2023-09-05 5:32 ` liwenyu01
2023-09-06 0:53 ` Education Directorate
2023-09-07 11:52 ` liwenyu01