* [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath
@ 2015-08-07  7:08 Pintu Kumar
  2015-08-07  7:44 ` Michal Hocko
  2015-08-07  7:50 ` Sergey Senozhatsky
  0 siblings, 2 replies; 10+ messages in thread
From: Pintu Kumar @ 2015-08-07  7:08 UTC (permalink / raw)
  To: akpm, linux-kernel, linux-mm, minchan, dave, mhocko, koct9i,
	mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin,
	cl, fengguang.wu, pintu.k
  Cc: cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr

This patch add new counter slowpath_entered in /proc/vmstat to
track how many times the system entered into slowpath after
first allocation attempt is failed.
This is useful to know the rate of allocation success within
the slowpath.
This patch is tested on ARM with 512MB RAM.
A sample output is shown below after successful boot-up:
shell> cat /proc/vmstat
nr_free_pages 4712
pgalloc_normal 1319432
pgalloc_movable 0
pageoutrun 379
allocstall 0
slowpath_entered 585
compact_stall 0
compact_fail 0
compact_success 0

From the above output we can see that the system entered
slowpath 585 times.
But the existing counter kswapd(pageoutrun), direct_reclaim(allocstall),
direct_compact(compact_stall) does not tell this value.
From the above value, it clearly indicates that the system have
entered slowpath 585 times. Out of which 379 times allocation passed
through kswapd, without performing direct reclaim/compaction.
That means the remaining 206 times the allocation would have succeeded
using the alloc_pages_high_priority.

Signed-off-by: Pintu Kumar <pintu.k@samsung.com>
---
 include/linux/vm_event_item.h |    2 +-
 mm/page_alloc.c               |    2 ++
 mm/vmstat.c                   |    2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 2b1cef8..9825f294 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -37,7 +37,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 #endif
 		PGINODESTEAL, SLABS_SCANNED, KSWAPD_INODESTEAL,
 		KSWAPD_LOW_WMARK_HIT_QUICKLY, KSWAPD_HIGH_WMARK_HIT_QUICKLY,
-		PAGEOUTRUN, ALLOCSTALL, PGROTATED,
+		PAGEOUTRUN, ALLOCSTALL, SLOWPATH_ENTERED, PGROTATED,
 		DROP_PAGECACHE, DROP_SLAB,
 #ifdef CONFIG_NUMA_BALANCING
 		NUMA_PTE_UPDATES,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2024d2e..4a5d487 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3029,6 +3029,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (IS_ENABLED(CONFIG_NUMA) && (gfp_mask & __GFP_THISNODE) && !wait)
 		goto nopage;
 
+	count_vm_event(SLOWPATH_ENTERED);
+
 retry:
 	if (!(gfp_mask & __GFP_NO_KSWAPD))
 		wake_all_kswapds(order, ac);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1fd0886..1c54fdf 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -778,7 +778,7 @@ const char * const vmstat_text[] = {
 	"kswapd_high_wmark_hit_quickly",
 	"pageoutrun",
 	"allocstall",
-
+	"slowpath_entered",
 	"pgrotated",
 
 	"drop_pagecache",
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread
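
The arithmetic in the changelog above (585 slowpath entries, 379 of them attributed to kswapd, 206 remaining) can be reproduced from userspace with a small lookup helper. The following is only a minimal sketch, not part of the patch: `vmstat_lookup` is a hypothetical helper name, it works on text already read from /proc/vmstat, and `slowpath_entered` will only be present on a kernel carrying this patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): look up one counter in
 * /proc/vmstat-style text, which holds one "name value" pair per line.
 * Returns 0 and stores the value in *out, or -1 if the name is absent.
 */
int vmstat_lookup(const char *text, const char *name,
		  unsigned long long *out)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require the full name followed by the separating space. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%llu", out) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline */
	}
	return -1;
}
```

With the sample output from the changelog, looking up `slowpath_entered` and `pageoutrun` and subtracting gives the 206 quoted there. The text buffer can be filled with an ordinary fopen()/fread() of /proc/vmstat.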
* Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath 2015-08-07 7:08 [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath Pintu Kumar @ 2015-08-07 7:44 ` Michal Hocko 2015-08-07 12:46 ` PINTU KUMAR 2015-08-07 7:50 ` Sergey Senozhatsky 1 sibling, 1 reply; 10+ messages in thread From: Michal Hocko @ 2015-08-07 7:44 UTC (permalink / raw) To: Pintu Kumar Cc: akpm, linux-kernel, linux-mm, minchan, dave, koct9i, mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr On Fri 07-08-15 12:38:54, Pintu Kumar wrote: > This patch add new counter slowpath_entered in /proc/vmstat to > track how many times the system entered into slowpath after > first allocation attempt is failed. This is too lowlevel to be exported in the regular user visible interface IMO. > This is useful to know the rate of allocation success within > the slowpath. What would be that information good for? Is a regular administrator expected to consume this value or this is aimed more to kernel developers? If the later then I think a trace point sounds like a better interface. > This patch is tested on ARM with 512MB RAM. > A sample output is shown below after successful boot-up: > shell> cat /proc/vmstat > nr_free_pages 4712 > pgalloc_normal 1319432 > pgalloc_movable 0 > pageoutrun 379 > allocstall 0 > slowpath_entered 585 > compact_stall 0 > compact_fail 0 > compact_success 0 > > >From the above output we can see that the system entered > slowpath 585 times. > But the existing counter kswapd(pageoutrun), direct_reclaim(allocstall), > direct_compact(compact_stall) does not tell this value. > >From the above value, it clearly indicates that the system have > entered slowpath 585 times. Out of which 379 times allocation passed > through kswapd, without performing direct reclaim/compaction. > That means the remaining 206 times the allocation would have succeeded > using the alloc_pages_high_priority. 
>
> Signed-off-by: Pintu Kumar <pintu.k@samsung.com>

-- 
Michal Hocko
SUSE Labs
* RE: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath 2015-08-07 7:44 ` Michal Hocko @ 2015-08-07 12:46 ` PINTU KUMAR 2015-08-07 14:30 ` Michal Hocko 2015-08-07 22:35 ` Andrew Morton 0 siblings, 2 replies; 10+ messages in thread From: PINTU KUMAR @ 2015-08-07 12:46 UTC (permalink / raw) To: 'Michal Hocko' Cc: akpm, linux-kernel, linux-mm, minchan, dave, koct9i, mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr Hi, > -----Original Message----- > From: Michal Hocko [mailto:mhocko@kernel.org] > Sent: Friday, August 07, 2015 1:14 PM > To: Pintu Kumar > Cc: akpm@linux-foundation.org; linux-kernel@vger.kernel.org; linux- > mm@kvack.org; minchan@kernel.org; dave@stgolabs.net; koct9i@gmail.com; > mgorman@suse.de; vbabka@suse.cz; js1304@gmail.com; > hannes@cmpxchg.org; alexander.h.duyck@redhat.com; > sasha.levin@oracle.com; cl@linux.com; fengguang.wu@intel.com; > cpgs@samsung.com; pintu_agarwal@yahoo.com; pintu.k@outlook.com; > vishnu.ps@samsung.com; rohit.kr@samsung.com > Subject: Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath > > On Fri 07-08-15 12:38:54, Pintu Kumar wrote: > > This patch add new counter slowpath_entered in /proc/vmstat to track > > how many times the system entered into slowpath after first allocation > > attempt is failed. > > This is too lowlevel to be exported in the regular user visible interface IMO. > I think its ok because I think this interface is for lowlevel debugging itself. > > This is useful to know the rate of allocation success within the > > slowpath. > > What would be that information good for? Is a regular administrator expected to > consume this value or this is aimed more to kernel developers? If the later then I > think a trace point sounds like a better interface. > This information is good for kernel developers. I found this information useful while debugging low memory situation and sluggishness behavior. 
I wanted to know how many times the first allocation is failing and how many times system entering slowpath. As I said, the existing counter does not give this information clearly. The pageoutrun, allocstall is too confusing. Also, if kswapd and compaction is disabled, we have no other counter for slowpath (except allocstall). Another problem is that allocstall can also be incremented from hibernation during shrink_all_memory calling. Which may create more confusion. Thus I found this interface useful to understand low memory behavior. If device sluggishness is happening because of too many slowpath or due to some other problem. Then we can decide what will be the best memory configuration for my device to reduce the slowpath. Regarding trace points, I am not sure if we can attach counter to it. Also trace may have more over-head and requires additional configs to be enabled to debug. Mostly these configs will not be enabled by default (at least in embedded, low memory device). I found the vmstat interface more easy and useful. Comments and suggestions are welcome. > > This patch is tested on ARM with 512MB RAM. > > A sample output is shown below after successful boot-up: > > shell> cat /proc/vmstat > > nr_free_pages 4712 > > pgalloc_normal 1319432 > > pgalloc_movable 0 > > pageoutrun 379 > > allocstall 0 > > slowpath_entered 585 > > compact_stall 0 > > compact_fail 0 > > compact_success 0 > > > > >From the above output we can see that the system entered > > slowpath 585 times. > > But the existing counter kswapd(pageoutrun), > > direct_reclaim(allocstall), > > direct_compact(compact_stall) does not tell this value. > > >From the above value, it clearly indicates that the system have > > entered slowpath 585 times. Out of which 379 times allocation passed > > through kswapd, without performing direct reclaim/compaction. > > That means the remaining 206 times the allocation would have succeeded > > using the alloc_pages_high_priority. 
> >
> > Signed-off-by: Pintu Kumar <pintu.k@samsung.com>
>
> --
> Michal Hocko
> SUSE Labs
* Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath 2015-08-07 12:46 ` PINTU KUMAR @ 2015-08-07 14:30 ` Michal Hocko 2015-08-07 22:35 ` Andrew Morton 1 sibling, 0 replies; 10+ messages in thread From: Michal Hocko @ 2015-08-07 14:30 UTC (permalink / raw) To: PINTU KUMAR Cc: akpm, linux-kernel, linux-mm, minchan, dave, koct9i, mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr On Fri 07-08-15 18:16:47, PINTU KUMAR wrote: [...] > > On Fri 07-08-15 12:38:54, Pintu Kumar wrote: > > > This patch add new counter slowpath_entered in /proc/vmstat to track > > > how many times the system entered into slowpath after first allocation > > > attempt is failed. > > > > This is too lowlevel to be exported in the regular user visible interface IMO. > > > I think its ok because I think this interface is for lowlevel debugging itself. Yes but this might change in future implementations where the counter might be misleading or even lacking any meaning. This is a user visible interface which has to be maintained practically for ever. We have made those mistakes in the past... [...] > This information is good for kernel developers. Then make it a trace point and you can dump even more information. E.g. timestamps, gfp_mask, order... [...] > Regarding trace points, I am not sure if we can attach counter to it. You do not need to have a counter. You just watch for the tracepoint while debugging your particular problem. > Also trace may have more over-head Tracepoints should be close to 0 overhead when disabled and certainly not a performance killer during the debugging session. > and requires additional configs to be enabled to debug. This is to be expected for the debugging sessions. And I am pretty sure that the static event tracepoints do not require anything really excessive. > Mostly these configs will not be enabled by default (at least in embedded, low > memory device). Are you sure? 
I thought that CONFIG_TRACING should be sufficient for
EVENT_TRACING but I am not familiar with this too deeply...

-- 
Michal Hocko
SUSE Labs
* Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath 2015-08-07 12:46 ` PINTU KUMAR 2015-08-07 14:30 ` Michal Hocko @ 2015-08-07 22:35 ` Andrew Morton 2015-08-10 9:45 ` PINTU KUMAR 1 sibling, 1 reply; 10+ messages in thread From: Andrew Morton @ 2015-08-07 22:35 UTC (permalink / raw) To: PINTU KUMAR Cc: 'Michal Hocko', linux-kernel, linux-mm, minchan, dave, koct9i, mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr On Fri, 07 Aug 2015 18:16:47 +0530 PINTU KUMAR <pintu.k@samsung.com> wrote: > > > This is useful to know the rate of allocation success within the > > > slowpath. > > > > What would be that information good for? Is a regular administrator expected > to > > consume this value or this is aimed more to kernel developers? If the later > then I > > think a trace point sounds like a better interface. > > > This information is good for kernel developers. > I found this information useful while debugging low memory situation and > sluggishness behavior. > I wanted to know how many times the first allocation is failing and how many > times system entering slowpath. > As I said, the existing counter does not give this information clearly. > The pageoutrun, allocstall is too confusing. > Also, if kswapd and compaction is disabled, we have no other counter for > slowpath (except allocstall). > Another problem is that allocstall can also be incremented from hibernation > during shrink_all_memory calling. > Which may create more confusion. > Thus I found this interface useful to understand low memory behavior. > If device sluggishness is happening because of too many slowpath or due to some > other problem. > Then we can decide what will be the best memory configuration for my device to > reduce the slowpath. > > Regarding trace points, I am not sure if we can attach counter to it. > Also trace may have more over-head and requires additional configs to be enabled > to debug. 
> Mostly these configs will not be enabled by default (at least in embedded, low
> memory device).
> I found the vmstat interface more easy and useful.

This does seem like a pretty basic and sensible thing to expose in
vmstat.  It probably makes more sense than some of the other things we
have in there.

Yes, it could be a tracepoint but practically speaking, a tracepoint
makes it developer-only.  You can ask a bug reporter or a customer
"what is /proc/vmstat:slowpath_entered" doing, but it's harder to ask
them to set up tracing.

And I don't think this will lock us into anything - vmstat is a big
dumping ground and I don't see a big problem with removing or changing
things later on.  IMO, debugfs rules apply here and vmstat would be in
debugfs, had debugfs existed at the time.

Two things:

- we appear to have forgotten to document /proc/vmstat

- How does one actually use slowpath_entered?  Obviously we'd like to
  know "what proportion of allocations entered the slowpath", so we
  calculate

	slowpath_entered/X

  how do we obtain "X"?  Is it by adding up all the pgalloc_*?  If
  so, perhaps we should really have slowpath_entered_dma,
  slowpath_entered_dma32, ...?
* RE: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath 2015-08-07 22:35 ` Andrew Morton @ 2015-08-10 9:45 ` PINTU KUMAR 2015-08-11 10:55 ` Michal Hocko 0 siblings, 1 reply; 10+ messages in thread From: PINTU KUMAR @ 2015-08-10 9:45 UTC (permalink / raw) To: 'Andrew Morton' Cc: 'Michal Hocko', linux-kernel, linux-mm, minchan, dave, koct9i, mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr Hi, > -----Original Message----- > From: Andrew Morton [mailto:akpm@linux-foundation.org] > Sent: Saturday, August 08, 2015 4:06 AM > To: PINTU KUMAR > Cc: 'Michal Hocko'; linux-kernel@vger.kernel.org; linux-mm@kvack.org; > minchan@kernel.org; dave@stgolabs.net; koct9i@gmail.com; > mgorman@suse.de; vbabka@suse.cz; js1304@gmail.com; > hannes@cmpxchg.org; alexander.h.duyck@redhat.com; > sasha.levin@oracle.com; cl@linux.com; fengguang.wu@intel.com; > cpgs@samsung.com; pintu_agarwal@yahoo.com; pintu.k@outlook.com; > vishnu.ps@samsung.com; rohit.kr@samsung.com > Subject: Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath > > On Fri, 07 Aug 2015 18:16:47 +0530 PINTU KUMAR <pintu.k@samsung.com> > wrote: > > > > > This is useful to know the rate of allocation success within the > > > > slowpath. > > > > > > What would be that information good for? Is a regular administrator > > > expected > > to > > > consume this value or this is aimed more to kernel developers? If > > > the later > > then I > > > think a trace point sounds like a better interface. > > > > > This information is good for kernel developers. > > I found this information useful while debugging low memory situation > > and sluggishness behavior. > > I wanted to know how many times the first allocation is failing and > > how many times system entering slowpath. > > As I said, the existing counter does not give this information clearly. > > The pageoutrun, allocstall is too confusing. 
> > Also, if kswapd and compaction is disabled, we have no other counter > > for slowpath (except allocstall). > > Another problem is that allocstall can also be incremented from > > hibernation during shrink_all_memory calling. > > Which may create more confusion. > > Thus I found this interface useful to understand low memory behavior. > > If device sluggishness is happening because of too many slowpath or > > due to some other problem. > > Then we can decide what will be the best memory configuration for my > > device to reduce the slowpath. > > > > Regarding trace points, I am not sure if we can attach counter to it. > > Also trace may have more over-head and requires additional configs to > > be enabled to debug. > > Mostly these configs will not be enabled by default (at least in > > embedded, low memory device). > > I found the vmstat interface more easy and useful. > > This does seem like a pretty basic and sensible thing to expose in vmstat. It > probably makes more sense than some of the other things we have in there. > Thanks Andrew. Yes, as par my analysis, I feel that this is one of the useful and important interface. I added it in one of our internal product and found it to be very useful. Specially during shrink_memory and compact_nodes analysis I found it really useful. It helps me to prove that if higher-order pages are present, it can reduce the slowpath drastically. Also during my ELC presentation people asked me how to monitor the slowpath counts. > Yes, it could be a tracepoint but practically speaking, a tracepoint makes it > developer-only. You can ask a bug reporter or a customer "what is > /proc/vmstat:slowpath_entered" doing, but it's harder to ask them to set up > tracing. > Yes, at times tracing are painful to analyze. Also, in commercial user binaries, most of tracing support are disabled (with no root privileges). However, /proc/vmstat works with normal user binaries. 
When memory issues are reported, we just get log dumps and few interfaces like
this.
Most of the time these memory issues are hard to reproduce because it may happen
after long usage.

> And I don't think this will lock us into anything - vmstat is a big dumping ground
> and I don't see a big problem with removing or changing things later on.  IMO,
> debugfs rules apply here and vmstat would be in debugfs, had debugfs existed at
> the time.
>
>
> Two things:
>
> - we appear to have forgotten to document /proc/vmstat
>
Yes, I could not find any document on vmstat under kernel/Documentation.
I think it's a nice think to have.
May be, I can start this initiative to create one :)
If respective owner can update, it will be great.

> - How does one actually use slowpath_entered?  Obviously we'd like to
>   know "what proportion of allocations entered the slowpath", so we
>   calculate
>
> 	slowpath_entered/X
>
>   how do we obtain "X"?  Is it by adding up all the pgalloc_*?  If
>   so, perhaps we should really have slowpath_entered_dma,
>   slowpath_entered_dma32, ...?

I think the slowpath for other zones may not be required.
We just need to know how many times we entered slowpath and possibly do
something to reduce it.
But, I think, pgalloc_* count may also include success for fastpath.

How I use slowpath for analysis is:
VMSTAT                BEFORE      AFTER       %DIFF
----------            ----------  ----------  ------------
nr_free_pages         6726        12494       46.17%
pgalloc_normal        985836      1549333     36.37%
pageoutrun            2699        529         80.40%
allocstall            298         98          67.11%
slowpath_entered      16659       739         95.56%
compact_stall         244         21          91.39%
compact_fail          178         11          93.82%
compact_success       52          7           86.54%

The above values are from 512MB system with only NORMAL zone.
Before, the slowpath count was 16659.
After (memory shrinker + compaction), the slowpath reduced by 95%, for
the same scenario.
This is just an example.

If we are interested to know even allocation success/fail ratio in slowpath,
then I think we need more counters.
Such as; direct_reclaim_success/fail, kswapd_success/fail (just like compaction
success/fail).
OR, we can have pgalloc_success_fastpath counter.
Then we can do:
pgalloc_success_in_slowpath = (pgalloc_normal - pgalloc_success_fastpath)
Therefore, success_ratio for slowpath could be;
(pgalloc_success_in_slowpath/slowpath_entered) * 100

More comments, welcome.
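
The two calculations in this message can be written down as a short sketch. The %DIFF values in the BEFORE/AFTER table are consistent with the absolute change divided by the larger of the two samples, which is what `pct_diff` below computes. `slowpath_success_ratio` follows the proposed formula literally. Both function names are hypothetical, `pgalloc_success_fastpath` is only a counter suggested in this message and does not exist, and the ratio assumes that all three inputs count the same unit.

```c
/*
 * Relative change between two samples, as a percentage of the larger
 * one. For example 16659 -> 739 gives roughly 95.56.
 */
double pct_diff(unsigned long before, unsigned long after)
{
	unsigned long hi = before > after ? before : after;
	unsigned long lo = before > after ? after : before;

	return hi ? 100.0 * (double)(hi - lo) / (double)hi : 0.0;
}

/*
 * Proposed slowpath success ratio in percent. The fastpath counter is
 * hypothetical: it is only suggested in this thread. Returns 0.0 when
 * the inputs cannot produce a meaningful ratio.
 */
double slowpath_success_ratio(unsigned long pgalloc_normal,
			      unsigned long pgalloc_success_fastpath,
			      unsigned long slowpath_entered)
{
	unsigned long in_slowpath;

	if (!slowpath_entered || pgalloc_success_fastpath > pgalloc_normal)
		return 0.0;
	in_slowpath = pgalloc_normal - pgalloc_success_fastpath;
	return 100.0 * (double)in_slowpath / (double)slowpath_entered;
}
```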
* Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath 2015-08-10 9:45 ` PINTU KUMAR @ 2015-08-11 10:55 ` Michal Hocko 2015-08-12 14:52 ` PINTU KUMAR 0 siblings, 1 reply; 10+ messages in thread From: Michal Hocko @ 2015-08-11 10:55 UTC (permalink / raw) To: PINTU KUMAR Cc: 'Andrew Morton', linux-kernel, linux-mm, minchan, dave, koct9i, mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr On Mon 10-08-15 15:15:06, PINTU KUMAR wrote: [...] > > > Regarding trace points, I am not sure if we can attach counter to it. > > > Also trace may have more over-head and requires additional configs to > > > be enabled to debug. > > > Mostly these configs will not be enabled by default (at least in > > > embedded, low memory device). > > > I found the vmstat interface more easy and useful. > > > > This does seem like a pretty basic and sensible thing to expose in vmstat. It > > probably makes more sense than some of the other things we have in there. I still fail to see what exactly this number says. The allocator slowpath (aka __alloc_pages_slowpath) is more an organizational split up of the code than anything that would tell us about how costly the allocation is - e.g. zone_reclaim might happen before we enter the slowpath. > Thanks Andrew. > Yes, as par my analysis, I feel that this is one of the useful and important > interface. > I added it in one of our internal product and found it to be very useful. > Specially during shrink_memory and compact_nodes analysis I found it really > useful. > It helps me to prove that if higher-order pages are present, it can reduce the > slowpath drastically. I am not sure I understand but this is kind of obvious, no? > Also during my ELC presentation people asked me how to monitor the slowpath > counts. Isn't the allocation latency a much well defined metric? What does the slowpath without compaction/reclaim tell to user? 
> > Yes, it could be a tracepoint but practically speaking, a tracepoint makes it > > developer-only. You can ask a bug reporter or a customer "what is > > /proc/vmstat:slowpath_entered" doing, but it's harder to ask them to set up > > tracing. > > > Yes, at times tracing are painful to analyze. > Also, in commercial user binaries, most of tracing support are disabled (with no > root privileges). > However, /proc/vmstat works with normal user binaries. > When memory issues are reported, we just get log dumps and few interfaces like > this. > Most of the time these memory issues are hard to reproduce because it may happen > after long usage. Yes, I do understand that vmstat is much more convenient. No question about that. But the counter should be generally usable. When I see COMPACTSTALL increasing I know that the direct compaction had to be invoked and that tells me that the system is getting fragmented and COMPACTFAIL/COMPACTSUCCESS will tell me how successful the compaction is. Similarly when I see ALLOCSTALL I know that kswapd doesn't catch up and scan/reclaim will tell me how effective it is. Snapshoting ALLOCSTALL/time helped me to narrow down memory pressure peaks to further investigate other counters in a more detail. What will entered-slowpath without triggering neither compaction nor direct reclaim tell me? [...] > > Two things: > > > > - we appear to have forgotten to document /proc/vmstat > > > Yes, I could not find any document on vmstat under kernel/Documentation. > I think it's a nice think to have. > May be, I can start this initiative to create one :) That would be more than appreciated. > If respective owner can update, it will be great. > > > - How does one actually use slowpath_entered? Obviously we'd like to > > know "what proportion of allocations entered the slowpath", so we > > calculate > > > > slowpath_entered/X > > > > how do we obtain "X"? Is it by adding up all the pgalloc_*? 
It's not because pgalloc_ count number of pages while slowpath_entered
counts allocations requests.

> > If
> > so, perhaps we should really have slowpath_entered_dma,
> > slowpath_entered_dma32, ...?
>
> I think the slowpath for other zones may not be required.
> We just need to know how many times we entered slowpath and possibly do
> something to reduce it.
> But, I think, pgalloc_* count may also include success for fastpath.
>
> How I use slowpath for analysis is:
> VMSTAT                BEFORE      AFTER       %DIFF
> ----------            ----------  ----------  ------------
> nr_free_pages         6726        12494       46.17%
> pgalloc_normal        985836      1549333     36.37%
> pageoutrun            2699        529         80.40%
> allocstall            298         98          67.11%
> slowpath_entered      16659       739         95.56%
> compact_stall         244         21          91.39%
> compact_fail          178         11          93.82%
> compact_success       52          7           86.54%
>
> The above values are from 512MB system with only NORMAL zone.
> Before, the slowpath count was 16659.
> After (memory shrinker + compaction), the slowpath reduced by 95%, for
> the same scenario.
> This is just an example.

But what additional information does it give to us? We can see that the
direct reclaim has been reduced as well as the compaction which was even
more effective so the overall memory pressure was lighter and memory
less fragmented. I assume that your test has requested the same amount
of high order allocations and pgalloc_normal much higher in the second
case suggests they were more effective but we can see that clearly even
without slowpath_entered.

So I would argue that we do not need slowpath_entered. We already have
it, even specialized depending on which _slow_ path has been executed.
What we are missing is a number of all requests to have a reasonable
base. Whether adding such a counter in the hot path is justified is a
question. I haven't really needed it so far and I am looking into vmstat
and meminfo to debug memory reclaim related issues quite often.

> If we are interested to know even allocation success/fail ratio in slowpath,
> then I think we need more counters.
> Such as; direct_reclaim_success/fail, kswapd_success/fail (just like compaction
> success/fail).
> OR, we can have pgalloc_success_fastpath counter.

This all sounds like exposing more and more details about internal
implementation. This all fits into tracepoints world IMO.
-- 
Michal Hocko
SUSE Labs
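
The snapshot-over-time use of allocstall described in this reply comes down to a rate between two samples of a counter. A minimal sketch, assuming the caller samples /proc/vmstat itself and measures the elapsed time; `counter_rate` is a hypothetical name and not an existing kernel or libc interface:

```c
/*
 * Events per second between two snapshots of a monotonically
 * increasing counter such as allocstall. Returns 0.0 when no time has
 * elapsed or the counter went backwards (for example after a reboot).
 */
double counter_rate(unsigned long long prev, unsigned long long cur,
		    double elapsed_sec)
{
	if (elapsed_sec <= 0.0 || cur < prev)
		return 0.0;
	return (double)(cur - prev) / elapsed_sec;
}
```

Peaks in this rate mark the intervals where the other counters are worth a closer look.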
* RE: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath 2015-08-11 10:55 ` Michal Hocko @ 2015-08-12 14:52 ` PINTU KUMAR 2015-08-13 9:07 ` Michal Hocko 0 siblings, 1 reply; 10+ messages in thread From: PINTU KUMAR @ 2015-08-12 14:52 UTC (permalink / raw) To: 'Michal Hocko' Cc: 'Andrew Morton', linux-kernel, linux-mm, minchan, dave, koct9i, mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr, iqbal.ams Hi, > -----Original Message----- > From: Michal Hocko [mailto:mhocko@kernel.org] > Sent: Tuesday, August 11, 2015 4:25 PM > To: PINTU KUMAR > Cc: 'Andrew Morton'; linux-kernel@vger.kernel.org; linux-mm@kvack.org; > minchan@kernel.org; dave@stgolabs.net; koct9i@gmail.com; > mgorman@suse.de; vbabka@suse.cz; js1304@gmail.com; > hannes@cmpxchg.org; alexander.h.duyck@redhat.com; > sasha.levin@oracle.com; cl@linux.com; fengguang.wu@intel.com; > cpgs@samsung.com; pintu_agarwal@yahoo.com; pintu.k@outlook.com; > vishnu.ps@samsung.com; rohit.kr@samsung.com > Subject: Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath > > On Mon 10-08-15 15:15:06, PINTU KUMAR wrote: > [...] > > > > Regarding trace points, I am not sure if we can attach counter to it. > > > > Also trace may have more over-head and requires additional configs > > > > to be enabled to debug. > > > > Mostly these configs will not be enabled by default (at least in > > > > embedded, low memory device). > > > > I found the vmstat interface more easy and useful. > > > > > > This does seem like a pretty basic and sensible thing to expose in > > > vmstat. It probably makes more sense than some of the other things we have > in there. > > I still fail to see what exactly this number says. The allocator slowpath (aka > __alloc_pages_slowpath) is more an organizational split up of the code than > anything that would tell us about how costly the allocation is - e.g. zone_reclaim > might happen before we enter the slowpath. 
> > > Thanks Andrew. > > Yes, as par my analysis, I feel that this is one of the useful and > > important interface. > > I added it in one of our internal product and found it to be very useful. > > Specially during shrink_memory and compact_nodes analysis I found it > > really useful. > > It helps me to prove that if higher-order pages are present, it can > > reduce the slowpath drastically. > > I am not sure I understand but this is kind of obvious, no? > Yes, but it's hard to prove to management that the slowpath count is reduced. As we have seen, most of the time this kind of performance issues are hard to reproduce. > > Also during my ELC presentation people asked me how to monitor the > > slowpath counts. > > Isn't the allocation latency a much well defined metric? What does the slowpath > without compaction/reclaim tell to user? > The current metrics in slowpath is the story half told. > > > Yes, it could be a tracepoint but practically speaking, a tracepoint > > > makes it developer-only. You can ask a bug reporter or a customer > > > "what is /proc/vmstat:slowpath_entered" doing, but it's harder to > > > ask them to set up tracing. > > > > > Yes, at times tracing are painful to analyze. > > Also, in commercial user binaries, most of tracing support are > > disabled (with no root privileges). > > However, /proc/vmstat works with normal user binaries. > > When memory issues are reported, we just get log dumps and few > > interfaces like this. > > Most of the time these memory issues are hard to reproduce because it > > may happen after long usage. > > Yes, I do understand that vmstat is much more convenient. No question about > that. But the counter should be generally usable. > > When I see COMPACTSTALL increasing I know that the direct compaction had to > be invoked and that tells me that the system is getting fragmented and > COMPACTFAIL/COMPACTSUCCESS will tell me how successful the compaction is. 
>
> Similarly when I see ALLOCSTALL I know that kswapd doesn't catch up and
> scan/reclaim will tell me how effective it is. Snapshoting ALLOCSTALL/time
> helped me to narrow down memory pressure peaks to further investigate other
> counters in a more detail.
>
> What will entered-slowpath without triggering neither compaction nor direct
> reclaim tell me?
>
The slowpath count gives the actual number, irrespective of
compaction/reclaim/kswapd.
Other things happen in the slowpath for which we don't have counters.
Thus a single _slowpath_ counter is enough for all situations, even when
KSWAP/COMPACTION is disabled or not used.

> [...]
>
> > > Two things:
> > >
> > > - we appear to have forgotten to document /proc/vmstat
> > >
> > Yes, I could not find any document on vmstat under kernel/Documentation.
> > I think it's a nice think to have.
> > May be, I can start this initiative to create one :)
>
> That would be more than appreciated.
>
OK, I will start a basic vmstat.txt in Documentation and release a first
version.
Thanks.

> > If respective owner can update, it will be great.
> >
> > > - How does one actually use slowpath_entered?  Obviously we'd like to
> > >   know "what proportion of allocations entered the slowpath", so we
> > >   calculate
> > >
> > >	slowpath_entered/X
> > >
> > >   how do we obtain "X"?  Is it by adding up all the pgalloc_*?
>
> It's not because pgalloc_ count number of pages while slowpath_entered counts
> allocations requests.
>
> > >   If
> > >   so, perhaps we should really have slowpath_entered_dma,
> > >   slowpath_entered_dma32, ...?
> >
> > I think the slowpath for other zones may not be required.
> > We just need to know how many times we entered slowpath and possibly
> > do something to reduce it.
> > But, I think, pgalloc_* count may also include success for fastpath.
> >
> > How I use slowpath for analysis is:
> > VMSTAT              BEFORE      AFTER       %DIFF
> > ----------          ----------  ----------  ------------
> > nr_free_pages       6726        12494       46.17%
> > pgalloc_normal      985836      1549333     36.37%
> > pageoutrun          2699        529         80.40%
> > allocstall          298         98          67.11%
> > slowpath_entered    16659       739         95.56%
> > compact_stall       244         21          91.39%
> > compact_fail        178         11          93.82%
> > compact_success     52          7           86.54%
> >
> > The above values are from 512MB system with only NORMAL zone.
> > Before, the slowpath count was 16659.
> > After (memory shrinker + compaction), the slowpath reduced by 95%, for
> > the same scenario.
> > This is just an example.
>
> But what additional information does it give to us? We can see that the
> direct reclaim has been reduced as well as the compaction which was even more
> effective so the overall memory pressure was lighter and memory less
> fragmented. I assume that your test has requested the same amount of high
> order allocations and pgalloc_normal much higher in the second case suggests
> they were more effective but we can see that clearly even without
> slowpath_entered.
>
The thing to note here is that the slowpath count is 16659 (which is 100%
actual, with no confusion).
However, if you look at the other slowpath counters (pageoutrun: 2699,
allocstall: 298, compact_stall: 244) and add all of them up
(2699 + 298 + 244 = 3241), the sum is much less than the actual slowpath
count.
So these counters don't really tell what actually happened in the slowpath.
There are other factors that affect the slowpath (like allocation without
watermarks).
Moreover, with the _retry_ and _rebalance_ mechanisms, the
allocstall/compact_stall counters will keep increasing, but the slowpath
count will remain the same.
Also, in some systems KSWAP can be disabled, so pageoutrun will always be 0.
Similarly, COMPACTION can be disabled, so compact_stall will not be present.
In this scenario, we are left with only allocstall.
Also, as I said earlier, allocstall can also be incremented from other
places, such as shrink_all_memory.

Consider another situation, like below:
VMSTAT
-------------------------------------
nr_free_pages       59982
pgalloc_normal      364163
pgalloc_high        2046
pageoutrun          1
allocstall          0
compact_stall       0
compact_fail        0
compact_success     0
------------------------------------
From the above, is it possible to tell how many times the slowpath was
entered?
Now, I will add slowpath here and check again. I don't have that data right
now.

Thus, the point is that just one counter is enough to quickly analyze the
behavior in the slowpath.
More suggestions are welcome!

> So I would argue that we do not need slowpath_entered. We already have it,
> even specialized depending on which _slow_ path has been executed.
> What we are missing is a number of all requests to have a reasonable base.
> Whether adding such a counter in the hot path is justified is a question. I
> haven't really needed it so far and I am looking into vmstat and meminfo to
> debug memory reclaim related issues quite often.
>
> > If we are interested to know even allocation success/fail ratio in
> > slowpath, then I think we need more counters.
> > Such as; direct_reclaim_success/fail, kswapd_success/fail (just like
> > compaction success/fail).
> > OR, we can have pgalloc_success_fastpath counter.
>
> This all sounds like exposing more and more details about internal
> implementation. This all fits into tracepoints world IMO.
>
> --
> Michal Hocko
> SUSE Labs
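
The arithmetic in the message above can be checked with a small C sketch. It assumes, based only on the numbers in the table, that %DIFF is the change taken relative to the larger of the two samples; the thread does not state the formula. The helper names pct_diff and slowpath_unaccounted are hypothetical and are not part of the patch.

```c
#include <stdio.h>

/*
 * Relative change between two counter samples, as a percentage of the
 * larger sample. Assumption: this is how the %DIFF column was derived;
 * the formula is inferred from the table, not given in the thread.
 */
static double pct_diff(unsigned long before, unsigned long after)
{
	unsigned long hi = before > after ? before : after;
	unsigned long lo = before > after ? after : before;

	if (hi == 0)
		return 0.0;	/* both samples are zero: nothing changed */
	return 100.0 * (double)(hi - lo) / (double)hi;
}

/*
 * Slowpath entries that are not matched by any of the three existing
 * events. Returns 0 if the existing events add up to more than the
 * slowpath count, which can happen because one slowpath entry may
 * stall more than once.
 */
static unsigned long slowpath_unaccounted(unsigned long slowpath,
					  unsigned long pageoutrun,
					  unsigned long allocstall,
					  unsigned long compact_stall)
{
	unsigned long known = pageoutrun + allocstall + compact_stall;

	return known > slowpath ? 0 : slowpath - known;
}

/* Print one table row in the same layout as the message. */
static void print_row(const char *name, unsigned long before,
		      unsigned long after)
{
	printf("%-18s %-10lu %-10lu %.2f%%\n",
	       name, before, after, pct_diff(before, after));
}
```

With the BEFORE column, slowpath_unaccounted(16659, 2699, 298, 244) gives 16659 - 3241 = 13418 entries that none of the three existing events explain, which is the gap the message points at.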
* Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath
  2015-08-12 14:52 ` PINTU KUMAR
@ 2015-08-13  9:07 ` Michal Hocko
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Hocko @ 2015-08-13 9:07 UTC
To: PINTU KUMAR
Cc: 'Andrew Morton', linux-kernel, linux-mm, minchan, dave, koct9i,
	mgorman, vbabka, js1304, hannes, alexander.h.duyck, sasha.levin,
	cl, fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps,
	rohit.kr, iqbal.ams

On Wed 12-08-15 20:22:10, PINTU KUMAR wrote:
> > On Mon 10-08-15 15:15:06, PINTU KUMAR wrote:
[...]
> > > Yes, as par my analysis, I feel that this is one of the useful and
> > > important interface.
> > > I added it in one of our internal product and found it to be very useful.
> > > Specially during shrink_memory and compact_nodes analysis I found it
> > > really useful.
> > > It helps me to prove that if higher-order pages are present, it can
> > > reduce the slowpath drastically.
> >
> > I am not sure I understand but this is kind of obvious, no?
> >
> Yes, but it's hard to prove to management that the slowpath count is reduced.
> As we have seen, most of the time performance issues of this kind are hard to
> reproduce.

But the counter doesn't tell you much as I've tried to explain in my
previous email. You simply do not have the base to compare it to. The
fact is that slow path in this context is quite ambiguous. As I've
mentioned the fast path (as per the code organization) can already do
expensive operations (e.g. zone_reclaim). So what you are exporting is
more a slow path from the code organization POV. Management might be
happy about comparing two arbitrary numbers but that doesn't mean it is
relevant...

[...]
> > When I see COMPACTSTALL increasing I know that the direct compaction had to
> > be invoked and that tells me that the system is getting fragmented and
> > COMPACTFAIL/COMPACTSUCCESS will tell me how successful the compaction is.
> >
> > Similarly when I see ALLOCSTALL I know that kswapd doesn't catch up and
> > scan/reclaim will tell me how effective it is. Snapshoting ALLOCSTALL/time
> > helped me to narrow down memory pressure peaks to further investigate other
> > counters in a more detail.
> >
> > What will entered-slowpath without triggering neither compaction nor direct
> > reclaim tell me?
> >
> The slowpath count gives the actual number, irrespective of
> compaction/reclaim/kswapd.

If we are missing them and they are significant to make a picture of what
is causing allocation delays then let's focus on those.

> Other things happen in the slowpath for which we don't have counters.

Which would be interesting enough to account for?

[...]
> > > How I use slowpath for analysis is:
> > > VMSTAT              BEFORE      AFTER       %DIFF
> > > ----------          ----------  ----------  ------------
> > > nr_free_pages       6726        12494       46.17%
> > > pgalloc_normal      985836      1549333     36.37%
> > > pageoutrun          2699        529         80.40%
> > > allocstall          298         98          67.11%
> > > slowpath_entered    16659       739         95.56%
> > > compact_stall       244         21          91.39%
> > > compact_fail        178         11          93.82%
> > > compact_success     52          7           86.54%
> > >
> > > The above values are from 512MB system with only NORMAL zone.
> > > Before, the slowpath count was 16659.
> > > After (memory shrinker + compaction), the slowpath reduced by 95%, for
> > > the same scenario.
> > > This is just an example.
> >
> > But what additional information does it give to us? We can see that the
> > direct reclaim has been reduced as well as the compaction which was even
> > more effective so the overall memory pressure was lighter and memory less
> > fragmented. I assume that your test has requested the same amount of high
> > order allocations and pgalloc_normal much higher in the second case
> > suggests they were more effective but we can see that clearly even without
> > slowpath_entered.
> >
> The thing to note here is that the slowpath count is 16659 (which is 100%
> actual, with no confusion).

100% against what? It certainly is not 100% of all costly allocations
because of what has been said already. Moreover this number is really
meaningless without knowing how many allocation requests were done in
total.

> However, if you look at the other slowpath counters (pageoutrun: 2699,
> allocstall: 298, compact_stall: 244) and add all of them up
> (2699 + 298 + 244 = 3241), the sum is much less than the actual slowpath
> count.

Yes, because the allocation might have succeeded before the compaction
and/or direct reclaim. Such an allocation could be marginally slower
than what is not accounted as a fastpath.

> So these counters don't really tell what actually happened in the slowpath.

No they are not and that is not their purpose. They aim at telling you
about costly allocation paths and they give you quite a good view into
how they operate. At least they've been serving good for me so far. If
there are gaps then let's fill them.

> There are other factors that affect the slowpath (like allocation without
> watermarks).
> Moreover, with the _retry_ and _rebalance_ mechanisms, the
> allocstall/compact_stall counters will keep increasing, but the slowpath
> count will remain the same.

I am not sure direct reclaims per one slow path is a super important
information. It's been quite sufficient for me to see that there have
been many direct reclaims per time unit to debug what is causing the
memory peak.

> Also, in some systems KSWAP can be disabled, so pageoutrun will always be 0.

Such a system would be really unhealthy but that is really irrelevant to
the discussion.

> Similarly, COMPACTION can be disabled, so compact_stall will not be present.
> In this scenario, we are left with only allocstall.

Yes and so what?

> Also, as I said earlier, allocstall can also be incremented from other
> places, such as shrink_all_memory.

But shrink_all_memory is really uninteresting because this is a
hibernation path. You can save the file before and after the hibernation
to exclude it.

> Consider another situation, like below:
> VMSTAT
> -------------------------------------
> nr_free_pages       59982
> pgalloc_normal      364163
> pgalloc_high        2046
> pageoutrun          1
> allocstall          0
> compact_stall       0
> compact_fail        0
> compact_success     0
> ------------------------------------
> From the above, is it possible to tell how many times the slowpath was
> entered?

No and I would argue this is not really that interesting. Because we
know that neither the direct reclaim nor compaction had to be triggered.
So from my point of view those allocations were still in a good shape.
entered_slowpath would tell me marginally more. Merely the fact that I
had to go via get_page_from_freelist one more time and as this doesn't
have a constant cost I would have to go for tracing to have a better
picture.

That being said, this counter alone is IMHO useless for any reasonable
analysis. I would even argue it is actively misleading because it
doesn't mark all the slow paths during the allocation. So NAK to this
patch.

Nevertheless, I can imagine some additional counters could help for
debugging.
ALLOC_REQUESTS - to count all requests
ALLOC_FAILS - to count number of failed requests
ALLOC_OOM - to count OOM events
COMPACTBACKOFF - compaction backed off because it wouldn't be worth it

I could find a way without them until now so I am not so sure they are
really necessary but if somebody has a usecase and the additional overhead
(especially for ALLOC_REQUESTS which is the hot path) is worth it I
wouldn't mind.

--
Michal Hocko
SUSE Labs
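
The approach described in the message above, saving /proc/vmstat at two points in time and looking at how fast a counter such as allocstall grows, could be sketched in C roughly as follows. The helpers vmstat_lookup and vmstat_rate are hypothetical names; they work on buffers that were already read from /proc/vmstat and assume the usual layout of one "name value" pair per line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in a /proc/vmstat-style buffer, where every line
 * is "name value". Returns 0 and stores the value in *out on success,
 * or -1 if the counter is not present or cannot be parsed.
 */
static int vmstat_lookup(const char *buf, const char *name,
			 unsigned long long *out)
{
	size_t len = strlen(name);
	const char *line = buf;

	while (line && *line) {
		/* Require a space after the name so "alloc" != "allocstall". */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%llu", out) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}

/*
 * Events per second for one counter between two snapshots taken
 * `seconds` apart. Returns -1.0 if the counter is missing from either
 * snapshot, the interval is not positive, or the counter went backwards.
 */
static double vmstat_rate(const char *before, const char *after,
			  const char *name, double seconds)
{
	unsigned long long b = 0, a = 0;

	if (seconds <= 0.0)
		return -1.0;
	if (vmstat_lookup(before, name, &b) || vmstat_lookup(after, name, &a))
		return -1.0;
	if (a < b)
		return -1.0;
	return (double)(a - b) / seconds;
}
```

A caller would read /proc/vmstat into a buffer (for example with fopen and fread), wait, read it again, and pass both buffers along with the elapsed time. The same pair of snapshots can also bracket a known event, such as a hibernation cycle, so that its contribution to allocstall can be subtracted out.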
* Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath
  2015-08-07  7:08 [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath Pintu Kumar
  2015-08-07  7:44 ` Michal Hocko
@ 2015-08-07  7:50 ` Sergey Senozhatsky
  1 sibling, 0 replies; 10+ messages in thread
From: Sergey Senozhatsky @ 2015-08-07 7:50 UTC
To: Pintu Kumar
Cc: akpm, linux-kernel, linux-mm, minchan, dave, mhocko, koct9i, mgorman,
	vbabka, js1304, hannes, alexander.h.duyck, sasha.levin, cl,
	fengguang.wu, cpgs, pintu_agarwal, pintu.k, vishnu.ps, rohit.kr

On (08/07/15 12:38), Pintu Kumar wrote:
> This patch add new counter slowpath_entered in /proc/vmstat to
> track how many times the system entered into slowpath after
> first allocation attempt is failed.
> This is useful to know the rate of allocation success within
> the slowpath.
> This patch is tested on ARM with 512MB RAM.
> A sample output is shown below after successful boot-up:
> shell> cat /proc/vmstat
> nr_free_pages 4712
> pgalloc_normal 1319432
> pgalloc_movable 0
> pageoutrun 379
> allocstall 0
> slowpath_entered 585
> compact_stall 0
> compact_fail 0
> compact_success 0
>
> From the above output we can see that the system entered
> slowpath 585 times.

so what can you do with this number?

	-ss

> But the existing counter kswapd(pageoutrun), direct_reclaim(allocstall),
> direct_compact(compact_stall) does not tell this value.
> From the above value, it clearly indicates that the system have
> entered slowpath 585 times. Out of which 379 times allocation passed
> through kswapd, without performing direct reclaim/compaction.
> That means the remaining 206 times the allocation would have succeeded
> using the alloc_pages_high_priority.
>
> Signed-off-by: Pintu Kumar <pintu.k@samsung.com>
> ---
>  include/linux/vm_event_item.h |    2 +-
>  mm/page_alloc.c               |    2 ++
>  mm/vmstat.c                   |    2 +-
>  3 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> index 2b1cef8..9825f294 100644
> --- a/include/linux/vm_event_item.h
> +++ b/include/linux/vm_event_item.h
> @@ -37,7 +37,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
>  #endif
>  		PGINODESTEAL, SLABS_SCANNED, KSWAPD_INODESTEAL,
>  		KSWAPD_LOW_WMARK_HIT_QUICKLY, KSWAPD_HIGH_WMARK_HIT_QUICKLY,
> -		PAGEOUTRUN, ALLOCSTALL, PGROTATED,
> +		PAGEOUTRUN, ALLOCSTALL, SLOWPATH_ENTERED, PGROTATED,
>  		DROP_PAGECACHE, DROP_SLAB,
>  #ifdef CONFIG_NUMA_BALANCING
>  		NUMA_PTE_UPDATES,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2024d2e..4a5d487 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3029,6 +3029,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	if (IS_ENABLED(CONFIG_NUMA) && (gfp_mask & __GFP_THISNODE) && !wait)
>  		goto nopage;
>
> +	count_vm_event(SLOWPATH_ENTERED);
> +
>  retry:
>  	if (!(gfp_mask & __GFP_NO_KSWAPD))
>  		wake_all_kswapds(order, ac);
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 1fd0886..1c54fdf 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -778,7 +778,7 @@ const char * const vmstat_text[] = {
>  	"kswapd_high_wmark_hit_quickly",
>  	"pageoutrun",
>  	"allocstall",
> -
> +	"slowpath_entered",
>  	"pgrotated",
>
>  	"drop_pagecache",
> --
> 1.7.9.5
end of thread, other threads:[~2015-08-13 9:07 UTC | newest]

Thread overview: 10+ messages
2015-08-07  7:08 [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath Pintu Kumar
2015-08-07  7:44 ` Michal Hocko
2015-08-07 12:46 ` PINTU KUMAR
2015-08-07 14:30 ` Michal Hocko
2015-08-07 22:35 ` Andrew Morton
2015-08-10  9:45 ` PINTU KUMAR
2015-08-11 10:55 ` Michal Hocko
2015-08-12 14:52 ` PINTU KUMAR
2015-08-13  9:07 ` Michal Hocko
2015-08-07  7:50 ` Sergey Senozhatsky