Hi
 
I am from Lustre, a product of Sun Microsystems that implements a
scalable distributed file system. We have encountered a deadlock
in prune_icache. The details are as follows.
 
While truncating a file, a new update is started in the current journal
transaction. During processing, memory is found to be low, so
try_to_free_pages is called to free some pages, which finally calls
shrink_icache_memory/prune_icache to free the cache memory occupied by inodes.
Note: prune_icache takes "iprune_mutex" and holds it during its whole pruning work.
 
At the same time, however, kswapd has already called
shrink_icache_memory/prune_icache and holds "iprune_mutex". It finds some
inodes to dispose of and calls
clear_inode/DQUOT_DROP/fs-specific-quota-drop-op ("ldiskfs_dquot_drop" in our case)
to drop the dquot. This fs-specific quota drop op can call journal_start to
start a new update, but it finds that the number of buffers in the current
transaction has reached j_max_transaction_buffers, so it wakes up kjournald
to commit the transaction and waits for the commit.
kjournald then calls journal_commit_transaction to commit the transaction,
which sets the state of the transaction to T_LOCKED and then checks whether
there are still pending updates for the committing transaction. It finds one
pending update (the one started by the truncate operation, see above), so it
waits for that update to complete. BUT the update cannot complete because it
cannot get "iprune_mutex", which is held by kswapd, so the deadlock is triggered.
 
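To make the cycle above easier to see, here is a minimal sketch in plain
user-space C. It is not kernel code and not the attached patch: the names
wait_graph, has_deadlock and build_reported_scenario are hypothetical and
only model "who waits for whom" as described in this report.

```c
#include <assert.h>

/* Hypothetical model: every task waits for at most one other task. */
enum task { TRUNCATE, KSWAPD, KJOURNALD, NTASKS, NOBODY = -1 };

struct wait_graph {
    int waits_for[NTASKS];   /* waits_for[t] = task that t is blocked on */
};

static void graph_init(struct wait_graph *g)
{
    for (int i = 0; i < NTASKS; i++)
        g->waits_for[i] = NOBODY;
}

/* Follow the wait chain from 'start'; return 1 if it leads back to 'start'. */
static int has_deadlock(const struct wait_graph *g, int start)
{
    assert(start >= 0 && start < NTASKS);

    int cur = g->waits_for[start];
    for (int steps = 0; steps < NTASKS && cur != NOBODY; steps++) {
        if (cur == start)
            return 1;
        cur = g->waits_for[cur];
    }
    return 0;
}

/* Wait-for edges taken from the description in this report. */
static void build_reported_scenario(struct wait_graph *g)
{
    graph_init(g);
    /* truncate: has an open update, blocks on iprune_mutex held by kswapd */
    g->waits_for[TRUNCATE] = KSWAPD;
    /* kswapd: holds iprune_mutex, journal_start waits for the commit */
    g->waits_for[KSWAPD] = KJOURNALD;
    /* kjournald: transaction is T_LOCKED, waits for truncate's update */
    g->waits_for[KJOURNALD] = TRUNCATE;
}
```

After calling build_reported_scenario(&g), has_deadlock(&g, TRUNCATE) reports
a cycle. Removing any single edge, for example by not taking "iprune_mutex"
while an update is open, breaks the cycle in this model.
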
Please see the attachment for a possible patch to fix this problem.
 

Regards
Hongchao

