* [PATCH] mm: vmscan: count slab shrinking results after each shrink_slab()
From: Johannes Weiner @ 2015-10-19 18:13 UTC
To: Andrew Morton; +Cc: Vladimir Davydov, linux-mm, cgroups, linux-kernel
cb731d6 ("vmscan: per memory cgroup slab shrinkers") sought to
optimize accumulating slab reclaim results in sc->nr_reclaimed only
once per zone, but the memcg hierarchy walk itself uses
sc->nr_reclaimed as an exit condition. This can lead to overreclaim.
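For reference, the walk's exit check looks roughly like this
(paraphrased from shrink_zone(); not verbatim):

	memcg = mem_cgroup_iter(root, NULL, &reclaim);
	do {
		/* scan LRUs, shrink slab, accumulate results ... */

		/*
		 * For memcg (limit) reclaim, bail out of the hierarchy
		 * walk once enough has been reclaimed.  Results that
		 * are only folded into sc->nr_reclaimed after the walk
		 * can never terminate it.
		 */
		if (!global_reclaim(sc) &&
		    sc->nr_reclaimed >= sc->nr_to_reclaim) {
			mem_cgroup_iter_break(root, memcg);
			break;
		}
	} while ((memcg = mem_cgroup_iter(root, memcg, &reclaim)));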
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/vmscan.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 27d580b..a02654e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2441,11 +2441,18 @@ static bool shrink_zone(struct zone *zone, struct scan_control *sc,
 			shrink_lruvec(lruvec, swappiness, sc, &lru_pages);
 			zone_lru_pages += lru_pages;
 
-			if (memcg && is_classzone)
+			if (memcg && is_classzone) {
 				shrink_slab(sc->gfp_mask, zone_to_nid(zone),
 					    memcg, sc->nr_scanned - scanned,
 					    lru_pages);
 
+				if (reclaim_state) {
+					sc->nr_reclaimed +=
+						reclaim_state->reclaimed_slab;
+					reclaim_state->reclaimed_slab = 0;
+				}
+			}
+
 			/*
 			 * Direct reclaim and kswapd have to scan all memory
 			 * cgroups to fulfill the overall scan target for the
@@ -2467,14 +2474,16 @@ static bool shrink_zone(struct zone *zone, struct scan_control *sc,
 		 * Shrink the slab caches in the same proportion that
 		 * the eligible LRU pages were scanned.
 		 */
-		if (global_reclaim(sc) && is_classzone)
+		if (global_reclaim(sc) && is_classzone) {
 			shrink_slab(sc->gfp_mask, zone_to_nid(zone), NULL,
 				    sc->nr_scanned - nr_scanned,
 				    zone_lru_pages);
 
-		if (reclaim_state) {
-			sc->nr_reclaimed += reclaim_state->reclaimed_slab;
-			reclaim_state->reclaimed_slab = 0;
+			if (reclaim_state) {
+				sc->nr_reclaimed +=
+					reclaim_state->reclaimed_slab;
+				reclaim_state->reclaimed_slab = 0;
+			}
 		}
 
 		vmpressure(sc->gfp_mask, sc->target_mem_cgroup,
--
2.6.1
* Re: [PATCH] mm: vmscan: count slab shrinking results after each shrink_slab()
From: Vladimir Davydov @ 2015-10-20 12:19 UTC
To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel
On Mon, Oct 19, 2015 at 02:13:35PM -0400, Johannes Weiner wrote:
> Commit cb731d6 ("vmscan: per memory cgroup slab shrinkers") sought to
> optimize by accumulating slab reclaim results in sc->nr_reclaimed only
> once per zone, but the memcg hierarchy walk itself uses
> sc->nr_reclaimed as an exit condition. Deferring the accumulation until
> after the walk can thus lead to overreclaim.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> mm/vmscan.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 27d580b..a02654e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2441,11 +2441,18 @@ static bool shrink_zone(struct zone *zone, struct scan_control *sc,
>  			shrink_lruvec(lruvec, swappiness, sc, &lru_pages);
>  			zone_lru_pages += lru_pages;
>  
> -			if (memcg && is_classzone)
> +			if (memcg && is_classzone) {
>  				shrink_slab(sc->gfp_mask, zone_to_nid(zone),
>  					    memcg, sc->nr_scanned - scanned,
>  					    lru_pages);
>  
> +				if (reclaim_state) {
current->reclaim_state is only set during global reclaim, so during
memcg reclaim we will never get here. And since nr_reclaimed is checked
in the loop only during memcg reclaim, this patch doesn't change
anything.

Setting current->reclaim_state during memcg reclaim doesn't seem to be
an option either, because it accounts objects freed by any cgroup (e.g.
via RCU callback) - see https://lkml.org/lkml/2015/1/20/91
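For reference, the accounting in the slab allocators is essentially
this (simplified sketch, not the exact allocator code):

	/*
	 * Roughly what the slab allocators do when returning pages to
	 * the page allocator.  An RCU callback runs in softirq context
	 * on whatever task happens to be current, so the credit can
	 * land on an unrelated task's reclaim_state.
	 */
	if (current->reclaim_state)
		current->reclaim_state->reclaimed_slab += nr_pages;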
Regarding the overreclaim that might happen due to the current
behavior: inodes and dentries are small and usually freed by RCU, so
not accounting them to nr_reclaimed shouldn't make much difference. The
only way I can see overreclaim happening is by ignoring the eviction of
an inode full of page cache, which makes me wonder whether it would be
better to refrain from dropping inodes that still have page cache
attached, at least unless the scan priority is low?
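Something like this in inode_lru_isolate(), say (just a sketch of the
idea - "low_priority_scan" is made up, the shrinker would first need to
learn about the scan priority somehow):

	/*
	 * Hypothetical: rotate inodes that still have page cache
	 * attached instead of evicting them, unless pressure is high.
	 */
	if (inode->i_data.nrpages && !low_priority_scan)
		return LRU_ROTATE;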
Thanks,
Vladimir
* Re: [PATCH] mm: vmscan: count slab shrinking results after each shrink_slab()
From: Johannes Weiner @ 2015-10-20 13:56 UTC
To: Vladimir Davydov; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel
On Tue, Oct 20, 2015 at 03:19:20PM +0300, Vladimir Davydov wrote:
> On Mon, Oct 19, 2015 at 02:13:35PM -0400, Johannes Weiner wrote:
> > [...]
> > +				if (reclaim_state) {
>
> current->reclaim_state is only set during global reclaim, so during
> memcg reclaim we will never get here. And since nr_reclaimed is
> checked in the loop only during memcg reclaim, this patch doesn't
> change anything.
>
> Setting current->reclaim_state during memcg reclaim doesn't seem to be
> an option either, because it accounts objects freed by any cgroup
> (e.g. via RCU callback) - see https://lkml.org/lkml/2015/1/20/91
Ah, I was not aware of that. Thanks for clarifying. Scratch this patch
then.
Do you think it would make sense to take the shrink_slab() return
value into account? Or are most objects expected to be RCU-freed
anyway so it wouldn't make a difference?
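Something like this, that is (sketch only - shrink_slab() returns the
number of objects freed, which isn't the same unit as pages):

	freed = shrink_slab(sc->gfp_mask, zone_to_nid(zone),
			    memcg, sc->nr_scanned - scanned,
			    lru_pages);
	/*
	 * freed counts objects, not pages, so at best this is a rough
	 * signal for the hierarchy walk's exit condition.
	 */
	sc->nr_reclaimed += freed;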
> Regarding the overreclaim that might happen due to the current
> behavior: inodes and dentries are small and usually freed by RCU, so
> not accounting them to nr_reclaimed shouldn't make much difference.
> The only way I can see overreclaim happening is by ignoring the
> eviction of an inode full of page cache, which makes me wonder whether
> it would be better to refrain from dropping inodes that still have
> page cache attached, at least unless the scan priority is low?
Unless we have evidence that it drops cache pages prematurely, I think
it should be okay to leave it as is.
* Re: [PATCH] mm: vmscan: count slab shrinking results after each shrink_slab()
From: Vladimir Davydov @ 2015-10-20 15:43 UTC
To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel
On Tue, Oct 20, 2015 at 09:56:06AM -0400, Johannes Weiner wrote:
> On Tue, Oct 20, 2015 at 03:19:20PM +0300, Vladimir Davydov wrote:
> > On Mon, Oct 19, 2015 at 02:13:35PM -0400, Johannes Weiner wrote:
> > > [...]
> > > +				if (reclaim_state) {
> >
> > current->reclaim_state is only set during global reclaim, so during
> > memcg reclaim we will never get here. And since nr_reclaimed is
> > checked in the loop only during memcg reclaim, this patch doesn't
> > change anything.
> >
> > Setting current->reclaim_state during memcg reclaim doesn't seem to
> > be an option either, because it accounts objects freed by any cgroup
> > (e.g. via RCU callback) - see https://lkml.org/lkml/2015/1/20/91
>
> Ah, I was not aware of that. Thanks for clarifying. Scratch this patch
> then.
>
> Do you think it would make sense to take the shrink_slab() return
> value into account? Or are most objects expected to be RCU-freed
> anyway so it wouldn't make a difference?
Under memcg pressure we don't shrink anything except inodes and
dentries, which are usually RCU-freed - see e.g. dentry_free,
destroy_inode, ext4_destroy_inode, xfs_fs_destroy_inode. So I don't
think the number of objects shrunk would tell us much.
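E.g. the dcache defers the actual free whenever the dentry may have
been visible to RCU-walk - roughly (from fs/dcache.c, simplified):

	if (dentry->d_flags & DCACHE_RCUACCESS)
		call_rcu(&dentry->d_u.d_rcu, __d_free);
	else
		__d_free(&dentry->d_u.d_rcu);

so the memory only comes back well after shrink_slab() has returned.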
Thanks,
Vladimir