linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] vfs: dcache: cond_resched in shrink_dentry_list
       [not found]       ` <xr934nfyg4ld.fsf@gthelen.mtv.corp.google.com>
@ 2013-04-10  0:37         ` Greg Thelen
  2013-04-10 23:44           ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Greg Thelen @ 2013-04-10  0:37 UTC (permalink / raw)
  To: akpm, Alexander Viro; +Cc: Dave Chinner, linux-fsdevel, linux-kernel, linux-mm

On Mon, Mar 25 2013, Greg Thelen wrote:

> On Mon, Mar 25 2013, Dave Chinner wrote:
>
>> On Mon, Mar 25, 2013 at 05:39:13PM -0700, Greg Thelen wrote:
>>> On Mon, Mar 25 2013, Dave Chinner wrote:
>>> > On Mon, Mar 25, 2013 at 10:22:31AM -0700, Greg Thelen wrote:
>>> >> Call cond_resched() from shrink_dentry_list() to preserve
>>> >> shrink_dcache_parent() interactivity.
>>> >> 
>>> >> void shrink_dcache_parent(struct dentry * parent)
>>> >> {
>>> >> 	while ((found = select_parent(parent, &dispose)) != 0)
>>> >> 		shrink_dentry_list(&dispose);
>>> >> }
>>> >> 
>>> >> select_parent() populates the dispose list with dentries which
>>> >> shrink_dentry_list() then deletes.  select_parent() carefully uses
>>> >> need_resched() to avoid doing too much work at once.  But neither
>>> >> shrink_dcache_parent() nor its called functions call cond_resched().
>>> >> So once need_resched() is set select_parent() will return single
>>> >> dentry dispose list which is then deleted by shrink_dentry_list().
>>> >> This is inefficient when there are a lot of dentry to process.  This
>>> >> can cause softlockup and hurts interactivity on non preemptable
>>> >> kernels.
>>> >
>>> > Hi Greg,
>>> >
>>> > I can see how this coul dcause problems, but isn't the problem then
>>> > that shrink_dcache_parent()/select_parent() itself is mishandling
>>> > the need for rescheduling rather than being a problem with
>>> > the shrink_dentry_list() implementation?  i.e. select_parent() is
>>> > aborting batching based on a need for rescheduling, but then not
>>> > doing that itself and assuming that someone else will do the
>>> > reschedule for it?
>>> >
>>> > Perhaps this is a better approach:
>>> >
>>> > -	while ((found = select_parent(parent, &dispose)) != 0)
>>> > +	while ((found = select_parent(parent, &dispose)) != 0) {
>>> >                 shrink_dentry_list(&dispose);
>>> > +		cond_resched();
>>> > +	}
>>> >
>>> > With this, select_parent() stops batching when a resched is needed,
>>> > we dispose of the list as a single batch and only then resched if it
>>> > was needed before we go and grab the next batch. That should fix the
>>> > "small batch" problem without the potential for changing the
>>> > shrink_dentry_list() behaviour adversely for other users....
>>> 
>>> I considered only modifying shrink_dcache_parent() as you show above.
>>> Either approach fixes the problem I've seen.  My initial approach adds
>>> cond_resched() deeper into shrink_dentry_list() because I thought that
>>> there might a secondary benefit: shrink_dentry_list() would be willing
>>> to give up the processor when working on a huge number of dentry.  This
>>> could improve interactivity during shrinker and umount.  I don't feel
>>> strongly on this and would be willing to test and post the
>>> add-cond_resched-to-shrink_dcache_parent approach.
>>
>> The shrinker has interactivity problems because of the global
>> dcache_lru_lock, not because of ithe size of the list passed to
>> shrink_dentry_list(). The amount of work that shrink_dentry_list()
>> does here is already bound by the shrinker batch size. Hence in the
>> absence of the RT folk complaining about significant holdoffs I
>> don't think there is an interactivity problem through the shrinker
>> path.
>
> No arguments from me.
>
>> As for the unmount path - shrink_dcache_for_umount_subtree() - that
>> doesn't use shrink_dentry_list() and so would need it's own internal
>> calls to cond_resched().  Perhaps it's shrink_dcache_sb() that you
>> are concerned about?  Either way, And there are lots more similar
>> issues in the unmount path such as evict_inodes(), so unless you are
>> going to give every possible path through unmount/remount/bdev
>> invalidation the same treatment then changing shrink_dentry_list()
>> won't significantly improve the interactivity of the system
>> situation in these paths...
>
> Ok.  As stated, I wasn't sure if the cond_resched() in
> shrink_dentry_list() had any appeal.  Apparently it doesn't.  I'll drop
> this approach in favor of the following:
>
> --->8---
>
> From: Greg Thelen <gthelen@google.com>
> Date: Sat, 23 Mar 2013 18:25:02 -0700
> Subject: [PATCH] vfs: dcache: cond_resched in shrink_dcache_parent
>
> Call cond_resched() in shrink_dcache_parent() to maintain
> interactivity.
>
> Before this patch:
>
> void shrink_dcache_parent(struct dentry * parent)
> {
> 	while ((found = select_parent(parent, &dispose)) != 0)
> 		shrink_dentry_list(&dispose);
> }
>
> select_parent() populates the dispose list with dentries which
> shrink_dentry_list() then deletes.  select_parent() carefully uses
> need_resched() to avoid doing too much work at once.  But neither
> shrink_dcache_parent() nor its called functions call cond_resched().
> So once need_resched() is set select_parent() will return single
> dentry dispose list which is then deleted by shrink_dentry_list().
> This is inefficient when there are a lot of dentry to process.  This
> can cause softlockup and hurts interactivity on non preemptable
> kernels.
>
> This change adds cond_resched() in shrink_dcache_parent().  The
> benefit of this is that need_resched() is quickly cleared so that
> future calls to select_parent() are able to efficiently return a big
> batch of dentry.
>
> These additional cond_resched() do not seem to impact performance, at
> least for the workload below.
>
> Here is a program which can cause soft lockup on a if other system
> activity sets need_resched().
>
> 	int main()
> 	{
> 	        struct rlimit rlim;
> 	        int i;
> 	        int f[100000];
> 	        char buf[20];
> 	        struct timeval t1, t2;
> 	        double diff;
>
> 	        /* cleanup past run */
> 	        system("rm -rf x");
>
> 	        /* boost nfile rlimit */
> 	        rlim.rlim_cur = 200000;
> 	        rlim.rlim_max = 200000;
> 	        if (setrlimit(RLIMIT_NOFILE, &rlim))
> 	                err(1, "setrlimit");
>
> 	        /* make directory for files */
> 	        if (mkdir("x", 0700))
> 	                err(1, "mkdir");
>
> 	        if (gettimeofday(&t1, NULL))
> 	                err(1, "gettimeofday");
>
> 	        /* populate directory with open files */
> 	        for (i = 0; i < 100000; i++) {
> 	                snprintf(buf, sizeof(buf), "x/%d", i);
> 	                f[i] = open(buf, O_CREAT);
> 	                if (f[i] == -1)
> 	                        err(1, "open");
> 	        }
>
> 	        /* close some of the files */
> 	        for (i = 0; i < 85000; i++)
> 	                close(f[i]);
>
> 	        /* unlink all files, even open ones */
> 	        system("rm -rf x");
>
> 	        if (gettimeofday(&t2, NULL))
> 	                err(1, "gettimeofday");
>
> 	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
> 	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));
>
> 	        printf("done: %g elapsed\n", diff/1e6);
> 	        return 0;
> 	}
>
> Signed-off-by: Greg Thelen <gthelen@google.com>
> Signed-off-by: Dave Chinner <david@fromorbit.com>
> ---
>  fs/dcache.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index fbfae008..e52c07e 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1230,8 +1230,10 @@ void shrink_dcache_parent(struct dentry * parent)
>  	LIST_HEAD(dispose);
>  	int found;
>  
> -	while ((found = select_parent(parent, &dispose)) != 0)
> +	while ((found = select_parent(parent, &dispose)) != 0) {
>  		shrink_dentry_list(&dispose);
> +		cond_resched();
> +	}
>  }
>  EXPORT_SYMBOL(shrink_dcache_parent);

Should this change go through Al's or Andrew's branch?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] vfs: dcache: cond_resched in shrink_dentry_list
  2013-04-10  0:37         ` [PATCH] vfs: dcache: cond_resched in shrink_dentry_list Greg Thelen
@ 2013-04-10 23:44           ` Andrew Morton
  2013-04-11  0:15             ` Greg Thelen
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2013-04-10 23:44 UTC (permalink / raw)
  To: Greg Thelen
  Cc: Alexander Viro, Dave Chinner, linux-fsdevel, linux-kernel, linux-mm

On Tue, 09 Apr 2013 17:37:20 -0700 Greg Thelen <gthelen@google.com> wrote:

> > Call cond_resched() in shrink_dcache_parent() to maintain
> > interactivity.
> >
> > Before this patch:
> >
> > void shrink_dcache_parent(struct dentry * parent)
> > {
> > 	while ((found = select_parent(parent, &dispose)) != 0)
> > 		shrink_dentry_list(&dispose);
> > }
> >
> > select_parent() populates the dispose list with dentries which
> > shrink_dentry_list() then deletes.  select_parent() carefully uses
> > need_resched() to avoid doing too much work at once.  But neither
> > shrink_dcache_parent() nor its called functions call cond_resched().
> > So once need_resched() is set select_parent() will return single
> > dentry dispose list which is then deleted by shrink_dentry_list().
> > This is inefficient when there are a lot of dentry to process.  This
> > can cause softlockup and hurts interactivity on non preemptable
> > kernels.
> >
> > This change adds cond_resched() in shrink_dcache_parent().  The
> > benefit of this is that need_resched() is quickly cleared so that
> > future calls to select_parent() are able to efficiently return a big
> > batch of dentry.
> >
> > These additional cond_resched() do not seem to impact performance, at
> > least for the workload below.
> >
> > Here is a program which can cause soft lockup on a if other system
> > activity sets need_resched().

I was unable to guess what word was missing from "on a if other" ;)

> Should this change go through Al's or Andrew's branch?

I'll fight him for it.


Softlockups are fairly serious, so I'll put a cc:stable in there.  Or
were the changes which triggered this problem added after 3.9?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] vfs: dcache: cond_resched in shrink_dentry_list
  2013-04-10 23:44           ` Andrew Morton
@ 2013-04-11  0:15             ` Greg Thelen
  0 siblings, 0 replies; 3+ messages in thread
From: Greg Thelen @ 2013-04-11  0:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Dave Chinner, linux-fsdevel, linux-kernel, linux-mm

On Wed, Apr 10 2013, Andrew Morton wrote:

> On Tue, 09 Apr 2013 17:37:20 -0700 Greg Thelen <gthelen@google.com> wrote:
>
>> > Call cond_resched() in shrink_dcache_parent() to maintain
>> > interactivity.
>> >
>> > Before this patch:
>> >
>> > void shrink_dcache_parent(struct dentry * parent)
>> > {
>> > 	while ((found = select_parent(parent, &dispose)) != 0)
>> > 		shrink_dentry_list(&dispose);
>> > }
>> >
>> > select_parent() populates the dispose list with dentries which
>> > shrink_dentry_list() then deletes.  select_parent() carefully uses
>> > need_resched() to avoid doing too much work at once.  But neither
>> > shrink_dcache_parent() nor its called functions call cond_resched().
>> > So once need_resched() is set select_parent() will return single
>> > dentry dispose list which is then deleted by shrink_dentry_list().
>> > This is inefficient when there are a lot of dentry to process.  This
>> > can cause softlockup and hurts interactivity on non preemptable
>> > kernels.
>> >
>> > This change adds cond_resched() in shrink_dcache_parent().  The
>> > benefit of this is that need_resched() is quickly cleared so that
>> > future calls to select_parent() are able to efficiently return a big
>> > batch of dentry.
>> >
>> > These additional cond_resched() do not seem to impact performance, at
>> > least for the workload below.
>> >
>> > Here is a program which can cause soft lockup on a if other system
>> > activity sets need_resched().
>
> I was unable to guess what word was missing from "on a if other" ;)

Less is more ;)  Reword to:

  Here is a program which can cause soft lockup if other system activity
  sets need_resched().

>> Should this change go through Al's or Andrew's branch?
>
> I'll fight him for it.

Thanks.

> Softlockups are fairly serious, so I'll put a cc:stable in there.  Or
> were the changes which triggered this problem added after 3.9?

This also applies to stable.  I see the problem at least back to v3.3.
I did not test earlier kernels, but could if you want.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-04-11  0:15 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1364232151-23242-1-git-send-email-gthelen@google.com>
     [not found] ` <20130325235614.GI6369@dastard>
     [not found]   ` <xr93fvzjgfke.fsf@gthelen.mtv.corp.google.com>
     [not found]     ` <20130326024032.GJ6369@dastard>
     [not found]       ` <xr934nfyg4ld.fsf@gthelen.mtv.corp.google.com>
2013-04-10  0:37         ` [PATCH] vfs: dcache: cond_resched in shrink_dentry_list Greg Thelen
2013-04-10 23:44           ` Andrew Morton
2013-04-11  0:15             ` Greg Thelen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox