* Re: + mm-vmscan-do-not-pass-reclaimed-slab-to-vmpressure.patch added to -mm tree
       [not found] <58a38a94.nb3wSoo24sv+3Kju%akpm@linux-foundation.org>
@ 2017-02-22 10:43 ` Michal Hocko
  2017-02-23  9:01   ` Vinayak Menon
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2017-02-22 10:43 UTC
  To: akpm
  Cc: vinmenon, anton.vorontsov, hannes, mgorman, minchan, riel,
	shashim, vbabka, vdavydov.dev, mm-commits, linux-mm

On Tue 14-02-17 14:54:12, akpm@linux-foundation.org wrote:
> From: Vinayak Menon <vinmenon@codeaurora.org>
> Subject: mm: vmscan: do not pass reclaimed slab to vmpressure
> 
> During global reclaim, the nr_reclaimed passed to vmpressure includes the
> pages reclaimed from slab, but the corresponding scanned slab pages are
> not passed.  This affects the vmpressure values.  While moving from
> kernel version 3.18 to 4.4, a difference is seen in the vmpressure
> values for the same workload, resulting in different behaviour of the
> vmpressure consumer.  One such case is a vmpressure-based
> lowmemorykiller.  The vmpressure events are received later and in
> smaller numbers, so tasks are not killed at the right time.  The
> following numbers show the impact on reclaim activity due to the change
> in behaviour of lowmemorykiller on a 4GB device.  The test launches a
> number of apps in sequence and repeats it multiple times.
> 
>                       v4.4           v3.18
> pgpgin                163016456      145617236
> pgpgout               4366220        4188004
> workingset_refault    29857868       26781854
> workingset_activate   6293946        5634625
> pswpin                1327601        1133912
> pswpout               3593842        3229602
> pgalloc_dma           99520618       94402970
> pgalloc_normal        104046854      98124798
> pgfree                203772640      192600737
> pgmajfault            2126962        1851836
> pgsteal_kswapd_dma    19732899       18039462
> pgsteal_kswapd_normal 19945336       17977706
> pgsteal_direct_dma    206757         131376
> pgsteal_direct_normal 236783         138247
> pageoutrun            116622         108370
> allocstall            7220           4684
> compact_stall         931            856
> 
> This is a regression introduced by commit 6b4f7799c6a5 ("mm: vmscan:
> invoke slab shrinkers from shrink_zone()").
> 
> So do not consider reclaimed slab pages for vmpressure calculation.  The
> reclaimed pages from slab can be excluded because the freeing of a page by
> slab shrinking depends on each slab's object population, making the cost
> model (i.e.  scan:free) different from that of LRU.  Also, not every
> shrinker accounts the pages it reclaims.  Ideally the pages reclaimed
> from slab should be passed to vmpressure, otherwise higher vmpressure
> levels can be triggered even when there is reclaim progress.  But
> accounting only the reclaimed slab pages without the scanned ones, and
> adding something which does not fit into the cost model, just adds noise
> to the vmpressure values.

I believe there are still some of my questions which are not answered by
the changelog update. Namely
- vmstat numbers without mentioning vmpressure events for those 2
  kernels have basically no meaning.
- the changelog doesn't mention that the test case basically benefits
  from as many lmk interventions as possible. Does this represent a real
  life workload? If not, is there any real life workload which would
  benefit from the new behavior?
- I would be also very careful calling this a regression without having
  any real workload as an example
- Arguments about the cost model are true but the resulting code is
  not a 100% win either and the changelog should be explicit about the
  consequences - aka more critical events can fire early while slab
  reclaim is still making progress.
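
To make the trade-off in the last point concrete, here is a minimal
userspace sketch of how a scanned/reclaimed pair maps to a pressure
level. It assumes the formula and the 60/95 thresholds used by
mm/vmpressure.c in kernels of this era, including the
reclaimed >= scanned guard from the underflow fix. pressure_pct() and
pressure_level() are names made up for this illustration, not kernel
symbols.

```c
/*
 * Userspace sketch of the vmpressure level calculation.
 * ASSUMPTION: the formula and the 60/95 thresholds follow mm/vmpressure.c
 * of the v4.x era; check your own tree before relying on them.
 */
enum level { LEVEL_LOW, LEVEL_MEDIUM, LEVEL_CRITICAL };

/* Pressure as a percentage: high when few of the scanned pages were reclaimed. */
unsigned long pressure_pct(unsigned long scanned, unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure;

	/*
	 * Nothing scanned, or reclaimed >= scanned (possible when slab pages
	 * are added to reclaimed but not to scanned): report no pressure.
	 */
	if (scanned == 0 || reclaimed >= scanned)
		return 0;

	pressure = scale - (reclaimed * scale / scanned);
	return pressure * 100 / scale;
}

/* Map a percentage to a level using the assumed thresholds. */
enum level pressure_level(unsigned long pct)
{
	if (pct >= 95)
		return LEVEL_CRITICAL;
	if (pct >= 60)
		return LEVEL_MEDIUM;
	return LEVEL_LOW;
}
```

With 1000 LRU pages scanned and 20 reclaimed, pressure_pct(1000, 20) is
98, a critical event. If 400 pages freed from slab are folded into
reclaimed, pressure_pct(1000, 420) is 58, which stays at the low level.
The page counts are invented for illustration only.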
 
> Fixes: 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()")
> Link: http://lkml.kernel.org/r/1486641577-11685-2-git-send-email-vinmenon@codeaurora.org
> Acked-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
> Cc: Shiraz Hashim <shashim@codeaurora.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  mm/vmscan.c |   17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff -puN mm/vmscan.c~mm-vmscan-do-not-pass-reclaimed-slab-to-vmpressure mm/vmscan.c
> --- a/mm/vmscan.c~mm-vmscan-do-not-pass-reclaimed-slab-to-vmpressure
> +++ a/mm/vmscan.c
> @@ -2603,16 +2603,23 @@ static bool shrink_node(pg_data_t *pgdat
>  				    sc->nr_scanned - nr_scanned,
>  				    node_lru_pages);
>  
> +		/*
> +		 * Record the subtree's reclaim efficiency. The reclaimed
> +		 * pages from slab is excluded here because the corresponding
> +		 * scanned pages is not accounted. Moreover, freeing a page
> +		 * by slab shrinking depends on each slab's object population,
> +		 * making the cost model (i.e. scan:free) different from that
> +		 * of LRU.
> +		 */
> +		vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
> +			   sc->nr_scanned - nr_scanned,
> +			   sc->nr_reclaimed - nr_reclaimed);
> +
>  		if (reclaim_state) {
>  			sc->nr_reclaimed += reclaim_state->reclaimed_slab;
>  			reclaim_state->reclaimed_slab = 0;
>  		}
>  
> -		/* Record the subtree's reclaim efficiency */
> -		vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
> -			   sc->nr_scanned - nr_scanned,
> -			   sc->nr_reclaimed - nr_reclaimed);
> -
>  		if (sc->nr_reclaimed - nr_reclaimed)
>  			reclaimable = true;
>  
> _
> 
> Patches currently in -mm which might be from vinmenon@codeaurora.org are
> 
> mm-vmpressure-fix-sending-wrong-events-on-underflow.patch
> mm-vmscan-do-not-pass-reclaimed-slab-to-vmpressure.patch
> 
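
The effect of moving the vmpressure() call in the hunk above can be
mimicked in userspace. The sketch below is a hypothetical mock: the
struct and field names mirror the patch, but account() and struct report
are invented here and no kernel code is involved. It shows that, with
the patched ordering, the reclaimed count reported to vmpressure
excludes slab pages while the reclaimable check still sees them.

```c
/* Hypothetical mock: only the field names are borrowed from the patch. */
struct scan_control { unsigned long nr_scanned, nr_reclaimed; };
struct reclaim_state { unsigned long reclaimed_slab; };

/* What would be handed to vmpressure(), plus the reclaimable decision. */
struct report { unsigned long scanned, reclaimed; int reclaimable; };

/*
 * One accounting step after the LRU and slab shrinkers have run.
 * prev_scanned/prev_reclaimed are the counters sampled before the step.
 * report_before_slab != 0 models the patched ordering.
 */
struct report account(struct scan_control *sc, struct reclaim_state *rs,
		      unsigned long prev_scanned, unsigned long prev_reclaimed,
		      int report_before_slab)
{
	struct report r = { 0, 0, 0 };

	if (report_before_slab) {
		/* Patched: sample the deltas before slab pages are folded in. */
		r.scanned = sc->nr_scanned - prev_scanned;
		r.reclaimed = sc->nr_reclaimed - prev_reclaimed;
	}

	/* Fold slab-reclaimed pages into the overall reclaim counter. */
	sc->nr_reclaimed += rs->reclaimed_slab;
	rs->reclaimed_slab = 0;

	if (!report_before_slab) {
		/* Unpatched: the deltas already include slab pages. */
		r.scanned = sc->nr_scanned - prev_scanned;
		r.reclaimed = sc->nr_reclaimed - prev_reclaimed;
	}

	/* Either way, slab progress still counts as reclaim progress. */
	r.reclaimable = (sc->nr_reclaimed - prev_reclaimed) != 0;
	return r;
}
```

Starting from nr_scanned = 1000, nr_reclaimed = 20 and reclaimed_slab =
400 (invented numbers), the patched ordering reports 20 reclaimed pages
and the unpatched ordering reports 420, while sc->nr_reclaimed ends up
at 420 in both cases.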

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: + mm-vmscan-do-not-pass-reclaimed-slab-to-vmpressure.patch added to -mm tree
  2017-02-22 10:43 ` + mm-vmscan-do-not-pass-reclaimed-slab-to-vmpressure.patch added to -mm tree Michal Hocko
@ 2017-02-23  9:01   ` Vinayak Menon
  2017-02-23 13:31     ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: Vinayak Menon @ 2017-02-23  9:01 UTC
  To: Michal Hocko, akpm
  Cc: anton.vorontsov, hannes, mgorman, minchan, riel, shashim, vbabka,
	vdavydov.dev, mm-commits, linux-mm


On 2/22/2017 4:13 PM, Michal Hocko wrote:
> On Tue 14-02-17 14:54:12, akpm@linux-foundation.org wrote:
>> [...]
> I believe there are still some of my questions which are not answered by
> the changelog update. Namely
> - vmstat numbers without mentioning vmpressure events for those 2
>   kernels have basically no meaning.
Sending a new version. The vmpressure events difference is added.
> - the changelog doesn't mention that the test case basically benefits
>   from as many lmk interventions as possible. Does this represent a real
>   life workload? If not, is there any real life workload which would
>   benefit from the new behavior?
The use case does not actually benefit from as many lmk interventions as
possible, because it also has to maximize the number of applications
sustained. IMHO Android using a vmpressure based user space
lowmemorykiller is a real life workload. But the lowmemorykiller example
was just to show the difference in vmpressure events between the 2 kernel
versions. Wouldn't any workload which uses vmpressure be similar? As I
understand it, it would take an action such as killing tasks or releasing
some buffers. The patch was actually meant to fix the noise added to
vmpressure by adding reclaimed pages without accounting for the cost, and
the lmk example was just to indicate the difference in vmpressure events.
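
For reference, a vmpressure consumer of the kind discussed here
typically registers through the cgroup v1 memory controller by writing
"<event_fd> <fd of memory.pressure_level> <level>" to
cgroup.event_control. A minimal sketch, assuming a cgroup v1 memory
hierarchy mounted somewhere like /sys/fs/cgroup/memory (the path and all
helper names below are assumptions for illustration), could look like
this:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Build "<event_fd> <pressure_level_fd> <level>"; returns length or -1. */
int format_registration(char *buf, size_t len, int efd, int pfd,
			const char *level)
{
	int n = snprintf(buf, len, "%d %d %s", efd, pfd, level);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Register for "low", "medium" or "critical" events on a cgroup v1 memory
 * cgroup directory. Returns an eventfd to read from, or -1 on error.
 * *pressure_fd is left open for the lifetime of the listener (a
 * conservative choice in this sketch); the caller closes both fds.
 */
int register_pressure_listener(const char *cgroup_dir, const char *level,
			       int *pressure_fd)
{
	char path[4096], reg[64];
	int efd, pfd = -1, cfd = -1, n;

	efd = eventfd(0, 0);
	if (efd < 0)
		return -1;

	snprintf(path, sizeof(path), "%s/memory.pressure_level", cgroup_dir);
	pfd = open(path, O_RDONLY);
	if (pfd < 0)
		goto fail;

	snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	cfd = open(path, O_WRONLY);
	if (cfd < 0)
		goto fail;

	n = format_registration(reg, sizeof(reg), efd, pfd, level);
	if (n < 0 || write(cfd, reg, (size_t)n) != n)
		goto fail;

	close(cfd);
	*pressure_fd = pfd;
	return efd;

fail:
	if (cfd >= 0)
		close(cfd);
	if (pfd >= 0)
		close(pfd);
	close(efd);
	return -1;
}

/* Block until the kernel signals the registered level; 0 on success. */
int wait_for_pressure_event(int efd)
{
	uint64_t count;

	return read(efd, &count, sizeof(count)) == (ssize_t)sizeof(count) ? 0 : -1;
}
```

A consumer would call register_pressure_listener() with its cgroup
directory and a level such as "critical", then loop on
wait_for_pressure_event(). What it does on each event (killing tasks,
releasing buffers) is policy outside this sketch.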
> - I would be also very careful calling this a regression without having
>   any real workload as an example
Okay. I have removed that from changelog.
> - Arguments about the cost model are true but the resulting code is
>   not a 100% win either and the changelog should be explicit about the
>   consequences - aka more critical events can fire early while slab
>   reclaim is still making progress.
>  
This line was added to the changelog to indicate the consequence:
"Ideally the pages reclaimed from slab should be passed to vmpressure,
 otherwise higher vmpressure levels can be triggered even when there is
 reclaim progress."

Thanks,
Vinayak

* Re: + mm-vmscan-do-not-pass-reclaimed-slab-to-vmpressure.patch added to -mm tree
  2017-02-23  9:01   ` Vinayak Menon
@ 2017-02-23 13:31     ` Michal Hocko
  0 siblings, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2017-02-23 13:31 UTC
  To: Vinayak Menon
  Cc: akpm, anton.vorontsov, hannes, mgorman, minchan, riel, shashim,
	vbabka, vdavydov.dev, mm-commits, linux-mm

On Thu 23-02-17 14:31:51, Vinayak Menon wrote:
> 
> On 2/22/2017 4:13 PM, Michal Hocko wrote:
[...]
> > - the changelog doesn't mention that the test case basically benefits
> >   from as many lmk interventions as possible. Does this represent a real
> >   life workload? If not, is there any real life workload which would
> >   benefit from the new behavior?
>
> The use case does not actually benefit from as many lmk interventions
> as possible. Because it has to also take care of maximizing the number
> of applications sustained. 

Exactly, and that is why I am questioning more pessimistic events. LMK
is a disruptive action, so reporting critical events too early can have
a negative impact.

> IMHO Android using a vmpressure based user
> space lowmemorykiller is a real life workload. But the lowmemorykiller
> example was just to show the difference in vmpressure events
> between 2 kernel versions. Any workload which uses vmpressure would
> be something similar? It would take an action by killing tasks, or
> releasing some buffers etc as I understand. The patch was actually
> meant to fix the addition of noise to vmpressure by adding reclaimed
> without accounting the cost and the lmk example was just to indicate
> the difference in vmpressure events.

OK, it seems I have to repeat myself again. So what is the advantage of
getting more pessimistic events and potentially firing disruptive actions
sooner while we could still reclaim slab? Who is going to benefit from
this apart from the initial test case which, we agreed, is artificial?
Why does the "noise" even matter?

I am sorry, but this whole change smells like "let's fix the test case"
rather than "let's think about what the real life use cases will benefit
from" to me. As I've said, I will not block this change because the cost
model is so fuzzy that one way or another there will always be somebody
complaining about it... So please, at least, make sure that somebody
hunting a vmpressure misbehavior knows why this has been changed! If this
really is a test case motivated change then I would encourage you to
withdraw this patch and instead try to think about how to make vmpressure
more robust.

-- 
Michal Hocko
SUSE Labs
