* Problem in "prune_icache"
@ 2009-03-30 9:45 HongChao Zhang
2009-04-02 15:28 ` Jan Kara
From: HongChao Zhang @ 2009-03-30 9:45 UTC (permalink / raw)
To: linux-fsdevel, viro, linux-mm, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1728 bytes --]
Hi
I'm from the Lustre team. Lustre is a Sun Microsystems product that
implements a scalable distributed filesystem. We have encountered a deadlock
in prune_icache. The details are as follows.

While truncating a file, a new update is started in the current journal
transaction. Memory turns out to be low during this processing, so the task
calls try_to_free_pages to free some pages, which finally calls
shrink_icache_memory/prune_icache to free the cache memory occupied by inodes.
Note: prune_icache takes "iprune_mutex" and holds it for the whole pruning pass.

At the same time, kswapd has already called shrink_icache_memory/prune_icache
and holds "iprune_mutex". It finds some inodes to dispose of and calls
clear_inode/DQUOT_DROP/fs-specific-quota-drop-op ("ldiskfs_dquot_drop" in our case)
to drop the dquot. This fs-specific quota-drop op can call journal_start to
start a new update, but it finds that the number of buffers in the current
transaction has reached j_max_transaction_buffers, so it wakes up kjournald to
commit the transaction.
kjournald then calls journal_commit_transaction to commit the transaction,
which sets the transaction state to T_LOCKED and then checks whether there are
still pending updates for the committing transaction. It finds a pending
update (the one started by the truncate operation, see above), so it waits for
that update to complete. BUT the update can never complete, because the
truncating task cannot get the "iprune_mutex" held by kswapd, so the deadlock
is triggered.

Please see the attachment for a possible patch to fix this problem.
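
The wait cycle described above can be written down as a small wait-for
graph. This is only a userspace sketch, not kernel code: the task names, the
waits_on arrays and has_wait_cycle() are hypothetical, and the
kswapd -> kjournald edge assumes that journal_start blocks until the commit
makes progress, which is one reading of the description rather than something
stated explicitly.

```c
#include <assert.h>

/* The three tasks from the report; the names are illustrative only. */
enum { TRUNCATE, KSWAPD, KJOURNALD, NR_TASKS };
#define NOBODY (-1)

/*
 * waits_on[t] is the task that t is blocked on, or NOBODY.
 * Reported scenario:
 *   truncate  waits for iprune_mutex, held by kswapd
 *   kswapd    waits in journal_start for kjournald (assumed, see above)
 *   kjournald waits for the truncate update to complete
 */
static const int deadlock_waits[NR_TASKS] = {
	[TRUNCATE]  = KSWAPD,
	[KSWAPD]    = KJOURNALD,
	[KJOURNALD] = TRUNCATE,
};

/* Same graph if the truncating task never blocks on iprune_mutex. */
static const int no_block_waits[NR_TASKS] = {
	[TRUNCATE]  = NOBODY,
	[KSWAPD]    = KJOURNALD,
	[KJOURNALD] = TRUNCATE,
};

/* Return 1 if following the wait edges from start leads back to start. */
static int has_wait_cycle(const int waits_on[NR_TASKS], int start)
{
	int cur = start;
	int steps;

	assert(start >= 0 && start < NR_TASKS);
	for (steps = 0; steps < NR_TASKS; steps++) {
		cur = waits_on[cur];
		if (cur == NOBODY)
			return 0;	/* chain ends, somebody can run */
		if (cur == start)
			return 1;	/* back where we began: deadlock */
	}
	return 0;
}
```

Calling has_wait_cycle(deadlock_waits, TRUNCATE) walks all three edges and
comes back to the truncating task. Removing any single edge breaks the cycle,
which is what both fixes discussed in this thread try to do for the
truncate -> kswapd edge.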
Regards
Hongchao
[-- Attachment #2: patch.18399 --]
[-- Type: text/plain, Size: 407 bytes --]
--- fs/inode.c.orig 2009-01-24 03:28:57.000000000 +0800
+++ fs/inode.c 2009-01-24 03:30:18.000000000 +0800
@@ -418,7 +418,9 @@ static void prune_icache(int nr_to_scan)
 	int nr_scanned;
 	unsigned long reap = 0;
 
-	mutex_lock(&iprune_mutex);
+	if (!mutex_trylock(&iprune_mutex))
+		return;
+
 	spin_lock(&inode_lock);
 	for (nr_scanned = 0; nr_scanned < nr_to_scan; nr_scanned++) {
 		struct inode *inode;
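
The behaviour the patch above proposes (skip the pruning pass when somebody
else already holds the mutex) can be tried out in userspace. This is a rough
sketch with pthreads, not the kernel code: model_iprune_mutex and
model_prune_icache() are hypothetical stand-ins, and the return value is
added only so the skipped case is visible (the kernel function returns void).

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for the kernel's iprune_mutex. */
static pthread_mutex_t model_iprune_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns the number of entries scanned, or -1 if the pass was skipped
 * because the mutex was already held.
 *
 * Note the inverted convention: the kernel's mutex_trylock() returns 1 on
 * success, while pthread_mutex_trylock() returns 0 on success.
 */
static int model_prune_icache(int nr_to_scan)
{
	int nr_scanned;

	assert(nr_to_scan >= 0);

	if (pthread_mutex_trylock(&model_iprune_mutex) != 0)
		return -1;	/* another pruner is active: do not wait */

	for (nr_scanned = 0; nr_scanned < nr_to_scan; nr_scanned++)
		;		/* a real pass would examine one inode here */

	pthread_mutex_unlock(&model_iprune_mutex);
	return nr_scanned;
}
```

The trade-off is that a caller under memory pressure may free nothing from
the inode cache on that pass, so the model only shows that the caller no
longer blocks, not that reclaim makes progress.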
* Re: Problem in "prune_icache"
2009-03-30 9:45 Problem in "prune_icache" HongChao Zhang
@ 2009-04-02 15:28 ` Jan Kara
From: Jan Kara @ 2009-04-02 15:28 UTC (permalink / raw)
To: HongChao Zhang; +Cc: linux-fsdevel, viro, linux-mm, linux-kernel
Hi,
> I'm from the Lustre team. Lustre is a Sun Microsystems product that
> implements a scalable distributed filesystem. We have encountered a deadlock
> in prune_icache. The details are as follows.
>
> While truncating a file, a new update is started in the current journal
> transaction. Memory turns out to be low during this processing, so the task
> calls try_to_free_pages to free some pages, which finally calls
> shrink_icache_memory/prune_icache to free the cache memory occupied by inodes.
> Note: prune_icache takes "iprune_mutex" and holds it for the whole pruning pass.
>
> At the same time, kswapd has already called shrink_icache_memory/prune_icache
> and holds "iprune_mutex". It finds some inodes to dispose of and calls
> clear_inode/DQUOT_DROP/fs-specific-quota-drop-op ("ldiskfs_dquot_drop" in our case)
> to drop the dquot. This fs-specific quota-drop op can call journal_start to
> start a new update, but it finds that the number of buffers in the current
> transaction has reached j_max_transaction_buffers, so it wakes up kjournald to
> commit the transaction.
> kjournald then calls journal_commit_transaction to commit the transaction,
> which sets the transaction state to T_LOCKED and then checks whether there are
> still pending updates for the committing transaction. It finds a pending
> update (the one started by the truncate operation, see above), so it waits for
> that update to complete. BUT the update can never complete, because the
> truncating task cannot get the "iprune_mutex" held by kswapd, so the deadlock
> is triggered.
  Yes, this has happened with other filesystems as well (ext3,
ext4,...). The usual solution for this problem is to specify GFP_NOFS for
all allocations that happen while the transaction is open. That way the
allocation never recurses back into the filesystem. Is
there some reason why that is a no-go for you?
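
A small userspace model may help show why GFP_NOFS avoids the recursion:
the inode cache shrinker is only entered when the allocation's mask allows
filesystem recursion. The MODEL_* flag values and the model_* function names
below are made up for illustration (they are not the kernel's bit values or
symbols); the shape of the check follows the __GFP_FS test in
shrink_icache_memory in kernels of this era.

```c
#include <assert.h>

/* Illustrative flag bits; the numeric values are NOT the kernel's. */
#define MODEL_GFP_WAIT   0x1u
#define MODEL_GFP_IO     0x2u
#define MODEL_GFP_FS     0x4u
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)

/* Counts how often the pruning pass (and its mutex) would be entered. */
static int model_prune_calls;

static void model_prune_icache(int nr_to_scan)
{
	assert(nr_to_scan > 0);
	model_prune_calls++;	/* the real pass takes iprune_mutex here */
}

/*
 * Model of the shrinker entry point: refuse to prune when the caller's
 * allocation mask forbids recursing into the filesystem.
 */
static int model_shrink_icache_memory(int nr, unsigned int gfp_mask)
{
	if (nr) {
		if (!(gfp_mask & MODEL_GFP_FS))
			return -1;	/* caller may hold fs locks: back off */
		model_prune_icache(nr);
	}
	return 0;
}
```

With MODEL_GFP_NOFS the model returns -1 before reaching the pruning pass,
so a task that has a transaction open never waits for iprune_mutex from
inside its own allocation. With MODEL_GFP_KERNEL the pass runs as before.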
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs