linux-mm.kvack.org archive mirror
* [BUG] memcg: panic when rmdir()
@ 2009-01-16  6:15 Li Zefan
  2009-01-16  6:19 ` KAMEZAWA Hiroyuki
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Li Zefan @ 2009-01-16  6:15 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki, Balbir Singh; +Cc: Daisuke Nishimura, linux-mm

Found this when testing memory resource controller, can be triggered
with:
- CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
- or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
- or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount

# mount -t cgroup -o memory xxx /mnt
# mkdir /mnt/0
# for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
# echo "low limit" > /mnt/0/memory.limit_in_bytes
# do whatever to allocate some memory
# swapoff -a
killed (by OOM)
# for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
# rmdir /mnt/0
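
The "do whatever to allocate some memory" step is left open in the report. One possible stand-in, a minimal sketch and not the workload actually used, is a small C helper that allocates heap memory and writes to all of it so the pages are really faulted in. The name `alloc_touched` is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate `bytes` of heap memory and write to all of it, so that every
 * page is actually faulted in rather than merely reserved.
 * Returns the buffer (caller frees it) or NULL if allocation failed.
 */
char *alloc_touched(size_t bytes)
{
	char *p = malloc(bytes);

	if (!p)
		return NULL;
	memset(p, 1, bytes);	/* touching the memory forces real allocation */
	return p;
}
```

Calling this repeatedly from a task that sits in /mnt/0, without freeing the buffers, would push the group toward its limit.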

------------[ cut here ]------------
WARNING: at kernel/res_counter.c:71 res_counter_uncharge_locked+0x25/0x36()
Hardware name: Aspire SA85
Modules linked in: bridge stp llc autofs4 dm_mirror dm_region_hash dm_log dm_mod parport_pc button sg r8169 mii parport sata_sis pata_sis ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 2548, comm: rmdir Tainted: G        W  2.6.29-rc1-mm1 #4
Call Trace:
 [<c042ecb6>] warn_slowpath+0x79/0x8f
 [<c04496ce>] ? clockevents_program_event+0xe0/0xef
 [<c0463a1b>] ? res_counter_charge+0x35/0xb0
 [<c04639b0>] ? res_counter_uncharge+0x29/0x5f
 [<c0463941>] res_counter_uncharge_locked+0x25/0x36
 [<c04639ba>] res_counter_uncharge+0x33/0x5f
 [<c049b9ef>] mem_cgroup_force_empty+0x21b/0x498
 [<c049c82f>] mem_cgroup_pre_destroy+0x12/0x14
 [<c0460776>] cgroup_rmdir+0x5e/0x27e
 [<c0621d08>] ? _spin_unlock+0x2c/0x41
 [<c04a67fe>] vfs_rmdir+0x5b/0x9c
 [<c04a7b8c>] do_rmdir+0x89/0xc8
 [<c04438ed>] ? up_read+0x1b/0x2e
 [<c0623de4>] ? do_page_fault+0x356/0x5ed
 [<c04a7c14>] sys_rmdir+0x15/0x17
 [<c0403485>] sysenter_do_call+0x12/0x35
---[ end trace 4eaa2a86a8e2da24 ]---
------------[ cut here ]------------
kernel BUG at kernel/cgroup.c:2517!
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/irq
Modules linked in: bridge stp llc autofs4 dm_mirror dm_region_hash dm_log dm_mod parport_pc button sg r8169 mii parport sata_sis pata_sis ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]

Pid: 2548, comm: rmdir Tainted: G        W  (2.6.29-rc1-mm1 #4) Aspire SA85
EIP: 0060:[<c04607f2>] EFLAGS: 00210046 CPU: 1
EIP is at cgroup_rmdir+0xda/0x27e
EAX: f442b800 EBX: ed00b3c0 ECX: c04607ca EDX: 00000000
ESI: c0778dc0 EDI: 00200246 EBP: ed252f30 ESP: ed252f14
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rmdir (pid: 2548, ti=ed252000 task=e19aa8c0 task.ti=ed252000)
Stack:
 00000000 f442b800 c0621d08 e18a8014 e2d96a00 fffffff0 e2ccadf0 ed252f44
 c04a67fe e2d96a00 00000000 0804ca00 ed252fa8 c04a7b8c ed269080 e2d745a0
 00002121 00000001 f46a3000 00000000 00000000 00000000 ed0a6d34 00000004
Call Trace:
 [<c0621d08>] ? _spin_unlock+0x2c/0x41
 [<c04a67fe>] ? vfs_rmdir+0x5b/0x9c
 [<c04a7b8c>] ? do_rmdir+0x89/0xc8
 [<c04438ed>] ? up_read+0x1b/0x2e
 [<c0623de4>] ? do_page_fault+0x356/0x5ed
 [<c04a7c14>] ? sys_rmdir+0x15/0x17
 [<c0403485>] ? sysenter_do_call+0x12/0x35
Code: c8 fe ff 8b 43 40 8b 70 0c eb 3e 8b 46 28 8b 44 83 20 89 45 e8 8b 55 e8 8b 52 04 83 fa 01 89 55 e4 0f 8f 5f ff ff ff 85 d2 75 04 <0f> 0b eb fe 8b 45 e4 31 c9 8b 55 e8 f0 0f b1 4a 04 8b 55 e4 39
EIP: [<c04607f2>] cgroup_rmdir+0xda/0x27e SS:ESP 0068:ed252f14
---[ end trace 4eaa2a86a8e2da25 ]---
	



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  6:15 [BUG] memcg: panic when rmdir() Li Zefan
@ 2009-01-16  6:19 ` KAMEZAWA Hiroyuki
  2009-01-16  6:24   ` Li Zefan
  2009-01-16  6:29 ` KAMEZAWA Hiroyuki
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  6:19 UTC (permalink / raw)
  To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm

On Fri, 16 Jan 2009 14:15:04 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:

> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> 
> # mount -t cgroup -o memory xxx /mnt
> # mkdir /mnt/0
> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> # echo "low limit" > /mnt/0/memory.limit_in_bytes
> # do whatever to allocate some memory
> # swapoff -a
> killed (by OOM)
> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> # rmdir /mnt/0
> 
Isn't this a problem Nishimura fixed today?

Could you try

memcg-get-put-parents-at-create-free.patch

in mm-commits?

Sorry for the inconvenience. I'll send you the patch by private mail if necessary.

Thanks,
-Kame





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  6:19 ` KAMEZAWA Hiroyuki
@ 2009-01-16  6:24   ` Li Zefan
  2009-01-16  6:30     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-01-16  6:24 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm

KAMEZAWA Hiroyuki wrote:
> On Fri, 16 Jan 2009 14:15:04 +0800
> Li Zefan <lizf@cn.fujitsu.com> wrote:
> 
>> Found this when testing memory resource controller, can be triggered
>> with:
>> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>>
>> # mount -t cgroup -o memory xxx /mnt
>> # mkdir /mnt/0
>> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
>> # echo "low limit" > /mnt/0/memory.limit_in_bytes
>> # do whatever to allocate some memory
>> # swapoff -a
>> killed (by OOM)
>> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
>> # rmdir /mnt/0
>>
> Isn't this a problem Nishimura fixed today ?
> 

Are you sure?

The changelog:
==========
The lifetime of struct cgroup and struct mem_cgroup is different and
mem_cgroup has its own reference count for handling references from swap_cgroup.

This causes strange problem that the parent mem_cgroup dies while
child mem_cgroup alive, and this problem causes a bug in case of use_hierarchy==1
because res_counter_uncharge climbs up the tree.
==========

I was not using hierarchy, and there was no "mem_cgroup dies while child
mem_cgroup alive" case in my test.

Anyway, I'll try it.

> could you try
> 
> memcg-get-put-parents-at-create-free.patch
> 
> in mm-commits ?
> 
> Sorry for the inconvenience. I'll send you the patch by private mail if necessary.
> 
> Thanks,
> -Kame
> 
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  6:15 [BUG] memcg: panic when rmdir() Li Zefan
  2009-01-16  6:19 ` KAMEZAWA Hiroyuki
@ 2009-01-16  6:29 ` KAMEZAWA Hiroyuki
  2009-01-16  6:58 ` Daisuke Nishimura
  2009-01-16  8:07 ` KAMEZAWA Hiroyuki
  3 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  6:29 UTC (permalink / raw)
  To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm

On Fri, 16 Jan 2009 14:15:04 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:

> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> 
> # mount -t cgroup -o memory xxx /mnt
> # mkdir /mnt/0
> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> # echo "low limit" > /mnt/0/memory.limit_in_bytes
> # do whatever to allocate some memory
> # swapoff -a
> killed (by OOM)
> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> # rmdir /mnt/0
> 

Hmm, it seems css->refcnt is bad (css->refcnt < 0). Maybe css_put() is
called without a matching css_get().

I'll chase it down. Thank you for testing.
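
The suspicion above can be illustrated with a small userspace model. This is only a sketch: `model_css`, `model_css_get` and `model_css_put` are hypothetical stand-ins, not the kernel's cgroup_subsys_state API, and the plain int counter ignores atomicity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; not the kernel struct. */
struct model_css {
	int refcnt;
};

/* Take one reference. */
void model_css_get(struct model_css *css)
{
	css->refcnt++;
}

/*
 * Drop one reference.
 * Returns false when the put was unbalanced, i.e. the count went negative.
 */
bool model_css_put(struct model_css *css)
{
	css->refcnt--;
	return css->refcnt >= 0;
}
```

In this model, one get followed by two puts leaves the count at -1, which is the kind of imbalance being suspected here.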

-Kame


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  6:24   ` Li Zefan
@ 2009-01-16  6:30     ` KAMEZAWA Hiroyuki
  2009-01-16  7:00       ` Li Zefan
  0 siblings, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  6:30 UTC (permalink / raw)
  To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm

On Fri, 16 Jan 2009 14:24:39 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:

> KAMEZAWA Hiroyuki wrote:
> > On Fri, 16 Jan 2009 14:15:04 +0800
> > Li Zefan <lizf@cn.fujitsu.com> wrote:
> > 
> >> Found this when testing memory resource controller, can be triggered
> >> with:
> >> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> >>
> >> # mount -t cgroup -o memory xxx /mnt
> >> # mkdir /mnt/0
> >> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> >> # echo "low limit" > /mnt/0/memory.limit_in_bytes
> >> # do whatever to allocate some memory
> >> # swapoff -a
> >> killed (by OOM)
> >> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> >> # rmdir /mnt/0
> >>
> > Isn't this a problem Nishimura fixed today ?
> > 
> 
> Are you sure?
> 
Sorry, I didn't see the BUG line in your log.

-Kame

> The changelog:
> ==========
> The lifetime of struct cgroup and struct mem_cgroup is different and
> mem_cgroup has its own reference count for handling references from swap_cgroup.
> 
> This causes strange problem that the parent mem_cgroup dies while
> child mem_cgroup alive, and this problem causes a bug in case of use_hierarchy==1
> because res_counter_uncharge climbs up the tree.
> ==========
> 
> I was not using hierarchy, and no "mem_cgroup dies while child mem_cgroup alive"
> in my test.
> 
> Anyway, I'll try.
> 



> > could you try
> > 
> > memcg-get-put-parents-at-create-free.patch
> > 
> > in mm-commits ?
> > 
> > Sorry for the inconvenience. I'll send you the patch by private mail if necessary.
> > 
> > Thanks,
> > -Kame
> > 
> > 
> > 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  6:15 [BUG] memcg: panic when rmdir() Li Zefan
  2009-01-16  6:19 ` KAMEZAWA Hiroyuki
  2009-01-16  6:29 ` KAMEZAWA Hiroyuki
@ 2009-01-16  6:58 ` Daisuke Nishimura
  2009-01-16  8:07 ` KAMEZAWA Hiroyuki
  3 siblings, 0 replies; 15+ messages in thread
From: Daisuke Nishimura @ 2009-01-16  6:58 UTC (permalink / raw)
  To: Li Zefan; +Cc: nishimura, KAMEZAWA Hiroyuki, Balbir Singh, linux-mm

On Fri, 16 Jan 2009 14:15:04 +0800, Li Zefan <lizf@cn.fujitsu.com> wrote:
> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> 
> # mount -t cgroup -o memory xxx /mnt
> # mkdir /mnt/0
> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> # echo "low limit" > /mnt/0/memory.limit_in_bytes
> # do whatever to allocate some memory
> # swapoff -a
> killed (by OOM)
> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> # rmdir /mnt/0
> 
I've reproduced this bug in my environment too.
I'll dig into it as well.

Thank you for reporting the bug.

Daisuke Nishimura.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  6:30     ` KAMEZAWA Hiroyuki
@ 2009-01-16  7:00       ` Li Zefan
  0 siblings, 0 replies; 15+ messages in thread
From: Li Zefan @ 2009-01-16  7:00 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm

KAMEZAWA Hiroyuki wrote:
> On Fri, 16 Jan 2009 14:24:39 +0800
> Li Zefan <lizf@cn.fujitsu.com> wrote:
> 
>> KAMEZAWA Hiroyuki wrote:
>>> On Fri, 16 Jan 2009 14:15:04 +0800
>>> Li Zefan <lizf@cn.fujitsu.com> wrote:
>>>
>>>> Found this when testing memory resource controller, can be triggered
>>>> with:
>>>> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
>>>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
>>>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>>>>
>>>> # mount -t cgroup -o memory xxx /mnt
>>>> # mkdir /mnt/0
>>>> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
>>>> # echo "low limit" > /mnt/0/memory.limit_in_bytes
>>>> # do whatever to allocate some memory
>>>> # swapoff -a
>>>> killed (by OOM)
>>>> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
>>>> # rmdir /mnt/0
>>>>
>>> Isn't this a problem Nishimura fixed today ?
>>>
>> Are you sure?
>>
> Sorry, I didn't see the BUG line in your log.
> 

I've tested with Nishimura's patch applied and, as expected, this bug
is entirely different from the one Nishimura fixed.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  6:15 [BUG] memcg: panic when rmdir() Li Zefan
                   ` (2 preceding siblings ...)
  2009-01-16  6:58 ` Daisuke Nishimura
@ 2009-01-16  8:07 ` KAMEZAWA Hiroyuki
  2009-01-16  8:26   ` Daisuke Nishimura
  2009-01-16  8:33   ` Li Zefan
  3 siblings, 2 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  8:07 UTC (permalink / raw)
  To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm, hugh

On Fri, 16 Jan 2009 14:15:04 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:

> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> 

Li-san, could you try this ? I myself can't reproduce the bug yet...
==

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Currently, at swapoff, the commit is executed even when try_charge() fails.
This is a bug, and it eventually drives the refcnt of cgroup_subsys_state negative.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Index: mmotm-2.6.29-Jan14/mm/swapfile.c
===================================================================
--- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
+++ mmotm-2.6.29-Jan14/mm/swapfile.c
@@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
 	pte_t *pte;
 	int ret = 1;
 
-	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
+	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
 		ret = -ENOMEM;
+		goto out_nolock;
+	}
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
@@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
 	activate_page(page);
 out:
 	pte_unmap_unlock(pte, ptl);
+out_nolock:
 	return ret;
 }
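
The control-flow change in the patch can be mimicked in userspace. The following is a rough sketch, not kernel code: `model_state`, `fake_try_charge`, `fake_commit` and `unuse_model` are invented names, and simple counters stand in for the page table lock and the commit side effect. It shows only that an early exit on charge failure keeps the commit step from running.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Tracks the side effects of the modeled function. */
struct model_state {
	int commits;	/* how many times the commit step ran */
	int locks_held;	/* stands in for the page table lock */
};

/* Model of a charge attempt: fails with -ENOMEM when over the limit. */
int fake_try_charge(bool over_limit)
{
	return over_limit ? -ENOMEM : 0;
}

/* Model of the commit step; only legal after a successful charge. */
void fake_commit(struct model_state *st)
{
	st->commits++;
}

/*
 * Model of the patched flow: on charge failure, jump past both the
 * locked section and its unlock, so nothing is committed.
 */
int unuse_model(struct model_state *st, bool over_limit)
{
	int ret = 1;

	if (fake_try_charge(over_limit)) {
		ret = -ENOMEM;
		goto out_nolock;
	}

	st->locks_held++;	/* take the (model) lock */
	fake_commit(st);
	st->locks_held--;	/* drop the (model) lock */
out_nolock:
	return ret;
}
```

A separate `out_nolock` label is needed because the failure happens before the lock is taken, so the existing `out` path, which unlocks, must be skipped.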
 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  8:07 ` KAMEZAWA Hiroyuki
@ 2009-01-16  8:26   ` Daisuke Nishimura
  2009-01-16  9:12     ` KAMEZAWA Hiroyuki
  2009-01-16  9:13     ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
  2009-01-16  8:33   ` Li Zefan
  1 sibling, 2 replies; 15+ messages in thread
From: Daisuke Nishimura @ 2009-01-16  8:26 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: nishimura, Li Zefan, Balbir Singh, linux-mm, hugh

On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Currently, at swapoff, the commit is executed even when try_charge() fails.
> This is a bug, and it eventually drives the refcnt of cgroup_subsys_state negative.
> 
Nice catch!

I think this bug can explain the problem I've seen.
Committing after a try_charge failure adds the pc (page_cgroup) to the LRU
without a corresponding charge and refcnt.
rmdir then uncharges the pc (so we get WARNING: at kernel/res_counter.c:71)
and decrements the refcnt (so we get BUG at kernel/cgroup.c:2517).

Even if the problem cannot be fixed by this patch, this patch is valid and needed.
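
A toy model may make the usage half of this argument concrete. It is a sketch under simplifying assumptions: `model_counter`, `model_charge` and `model_uncharge` are made-up names, one unit stands for one page, and the real res_counter code is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy usage counter: one unit per charged page. */
struct model_counter {
	unsigned long usage;
};

/* Record a successful charge of `val` units. */
void model_charge(struct model_counter *c, unsigned long val)
{
	c->usage += val;
}

/*
 * Uncharge `val` units. Returns false, the "warning" case, when more is
 * uncharged than was ever charged; the amount is clamped so that the
 * unsigned counter does not wrap around.
 */
bool model_uncharge(struct model_counter *c, unsigned long val)
{
	bool balanced = c->usage >= val;

	if (!balanced)
		val = c->usage;
	c->usage -= val;
	return balanced;
}
```

A page that reached the LRU without a charge corresponds to calling `model_uncharge()` with no earlier `model_charge()`, which takes the unbalanced branch.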

> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

I'll test it.


Thanks,
Daisuke Nishimura.

> ---
> Index: mmotm-2.6.29-Jan14/mm/swapfile.c
> ===================================================================
> --- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
> +++ mmotm-2.6.29-Jan14/mm/swapfile.c
> @@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
>  	pte_t *pte;
>  	int ret = 1;
>  
> -	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
> +	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
>  		ret = -ENOMEM;
> +		goto out_nolock;
> +	}
>  
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
> @@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
>  	activate_page(page);
>  out:
>  	pte_unmap_unlock(pte, ptl);
> +out_nolock:
>  	return ret;
>  }
>  


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  8:07 ` KAMEZAWA Hiroyuki
  2009-01-16  8:26   ` Daisuke Nishimura
@ 2009-01-16  8:33   ` Li Zefan
  2009-01-16  8:40     ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-01-16  8:33 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm, hugh

KAMEZAWA Hiroyuki wrote:
> On Fri, 16 Jan 2009 14:15:04 +0800
> Li Zefan <lizf@cn.fujitsu.com> wrote:
> 
>> Found this when testing memory resource controller, can be triggered
>> with:
>> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>>
> 
> Li-san, could you try this ? I myself can't reproduce the bug yet...

I've tested this patch, and the bug seems to disappear. :)

Tested-by: Li Zefan <lizf@cn.fujitsu.com>

I'm about to leave the office, so I'll do more testing next week to
confirm this.

> ==
> 
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Currently, at swapoff, the commit is executed even when try_charge() fails.
> This is a bug, and it eventually drives the refcnt of cgroup_subsys_state negative.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> Index: mmotm-2.6.29-Jan14/mm/swapfile.c
> ===================================================================
> --- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
> +++ mmotm-2.6.29-Jan14/mm/swapfile.c
> @@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
>  	pte_t *pte;
>  	int ret = 1;
>  
> -	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
> +	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
>  		ret = -ENOMEM;
> +		goto out_nolock;
> +	}
>  
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
> @@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
>  	activate_page(page);
>  out:
>  	pte_unmap_unlock(pte, ptl);
> +out_nolock:
>  	return ret;
>  }
>  
> 


* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  8:33   ` Li Zefan
@ 2009-01-16  8:40     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  8:40 UTC (permalink / raw)
  To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm, hugh

On Fri, 16 Jan 2009 16:33:08 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:

> KAMEZAWA Hiroyuki wrote:
> > On Fri, 16 Jan 2009 14:15:04 +0800
> > Li Zefan <lizf@cn.fujitsu.com> wrote:
> > 
> >> Found this when testing memory resource controller, can be triggered
> >> with:
> >> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> >>
> > 
> > Li-san, could you try this? I can't reproduce the bug myself yet...
> 
> I've tested this patch, and the bug seems to have disappeared. :)
> 
> Tested-by: Li Zefan <lizf@cn.fujitsu.com>
> 
> I'm going to be out of the office, so I'll do more testing next week
> to confirm this.
> 

Thank you!

-Kame


* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  8:26   ` Daisuke Nishimura
@ 2009-01-16  9:12     ` KAMEZAWA Hiroyuki
  2009-01-16  9:23       ` [BUGFIX] [PATCH] memcg: fix refcnt handling at swapoff KAMEZAWA Hiroyuki
  2009-01-16  9:13     ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
  1 sibling, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  9:12 UTC (permalink / raw)
  To: Daisuke Nishimura; +Cc: Li Zefan, Balbir Singh, linux-mm, hugh

On Fri, 16 Jan 2009 17:26:51 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > 
> > Currently, at swapoff, commit is executed even when try_charge() fails.
> > This is a bug, and it eventually makes the refcnt of cgroup_subsys_state negative.
> > 
> Nice catch!
> 
> I think this bug can explain the problem I've seen.
> Committing after a try_charge failure adds the pc (page_cgroup) to the LRU
> without a corresponding charge and refcnt.
> rmdir then uncharges the pc (so we get WARNING: at kernel/res_counter.c:71)
> and decrements the refcnt (so we get BUG at kernel/cgroup.c:2517).
> 
> Even if this patch does not fix the problem, it is valid and needed.
> 
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> 
> I'll test it.
> 
> 
I finally figured out how to reproduce it, and confirmed that this patch fixes the problem.

How-to-reproduce.

In shell-A
  #mount -t cgroup none /opt/cgroup
  #mkdir /opt/cgroup/xxx/
  #echo 0 > /opt/cgroup/xxx/tasks
  #Run malloc 100M on this and sleep. ---(*)

In shell-B.
  #echo 40M > /opt/cgroup/xxx/memory.limit_in_bytes
  Then, you'll see 60M of swap.
  #/sbin/swapoff -a
  Then, you'll see OOM-Kill against (*)
  #echo [pid of shell-A] > /opt/cgroup/tasks
  make /opt/cgroup/xxx/ empty
  #rmdir /opt/cgroup/xxx

=> panics.
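
Step (*) above needs a task that allocates about 100M and then sleeps.
A minimal sketch of such a helper, assuming a plain userspace C program;
the names hog_alloc_and_touch() and hog_run() are made up for illustration
and do not come from this thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` and write one byte into every page, so the memory is
 * really resident and gets charged, not just reserved as address space.
 * Returns NULL if the allocation fails.
 */
char *hog_alloc_and_touch(size_t bytes)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t step = page > 0 ? (size_t)page : 4096; /* fallback is an assumption */
	char *buf = malloc(bytes);

	if (buf == NULL)
		return NULL;
	for (size_t off = 0; off < bytes; off += step)
		buf[off] = 1;
	return buf;
}

/*
 * Hold the allocation and sleep until a signal terminates the process
 * (for example the OOM killer during swapoff).
 */
void hog_run(size_t bytes)
{
	char *buf = hog_alloc_and_touch(bytes);

	if (buf == NULL)
		exit(EXIT_FAILURE);
	for (;;)
		pause();
}
```

Calling hog_run(100UL << 20) from main() in shell-A, after the
"echo 0 > tasks" step, would stand in for (*). Touching each page matters
because malloc() alone may not make the pages resident.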

I'll add this swap-off test to memcg-debug.txt later.

BTW, the OOM kill against (*) itself also seems problematic.
But simply disabling OOM at swapoff cannot be a workaround...


-Kame

> Thanks,
> Daisuke Nishimura.
> 
> > ---
> > Index: mmotm-2.6.29-Jan14/mm/swapfile.c
> > ===================================================================
> > --- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
> > +++ mmotm-2.6.29-Jan14/mm/swapfile.c
> > @@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
> >  	pte_t *pte;
> >  	int ret = 1;
> >  
> > -	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
> > +	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
> >  		ret = -ENOMEM;
> > +		goto out_nolock;
> > +	}
> >  
> >  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> >  	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
> > @@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
> >  	activate_page(page);
> >  out:
> >  	pte_unmap_unlock(pte, ptl);
> > +out_nolock:
> >  	return ret;
> >  }
> >  
> 


* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  8:26   ` Daisuke Nishimura
  2009-01-16  9:12     ` KAMEZAWA Hiroyuki
@ 2009-01-16  9:13     ` Daisuke Nishimura
  2009-01-16  9:16       ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 15+ messages in thread
From: Daisuke Nishimura @ 2009-01-16  9:13 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: nishimura, Li Zefan, Balbir Singh, linux-mm, hugh

On Fri, 16 Jan 2009 17:26:51 +0900, Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > 
> > Currently, at swapoff, commit is executed even when try_charge() fails.
> > This is a bug, and it eventually makes the refcnt of cgroup_subsys_state negative.
> > 
> Nice catch!
> 
> I think this bug can explain the problem I've seen.
> Committing after a try_charge failure adds the pc (page_cgroup) to the LRU
> without a corresponding charge and refcnt.
> rmdir then uncharges the pc (so we get WARNING: at kernel/res_counter.c:71)
> and decrements the refcnt (so we get BUG at kernel/cgroup.c:2517).
> 
> Even if this patch does not fix the problem, it is valid and needed.
> 
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> 
> I'll test it.
> 
I've tested it several times, and the problem didn't happen.

Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>


Thanks,
Daisuke Nishimura.


* Re: [BUG] memcg: panic when rmdir()
  2009-01-16  9:13     ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
@ 2009-01-16  9:16       ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  9:16 UTC (permalink / raw)
  To: Daisuke Nishimura; +Cc: Li Zefan, Balbir Singh, linux-mm, hugh

On Fri, 16 Jan 2009 18:13:00 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> On Fri, 16 Jan 2009 17:26:51 +0900, Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > 
> > > Currently, at swapoff, commit is executed even when try_charge() fails.
> > > This is a bug, and it eventually makes the refcnt of cgroup_subsys_state negative.
> > > 
> > Nice catch!
> > 
> > I think this bug can explain the problem I've seen.
> > Committing after a try_charge failure adds the pc (page_cgroup) to the LRU
> > without a corresponding charge and refcnt.
> > rmdir then uncharges the pc (so we get WARNING: at kernel/res_counter.c:71)
> > and decrements the refcnt (so we get BUG at kernel/cgroup.c:2517).
> > 
> > Even if this patch does not fix the problem, it is valid and needed.
> > 
> > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > 
> > I'll test it.
> > 
> I've tested it several times, and the problem didn't happen.
> 
> Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> 

Thank you! I'll send the patch to Andrew.

-Kame

> 
> Thanks,
> Daisuke Nishimura.
> 


* [BUGFIX] [PATCH] memcg: fix refcnt handling at swapoff
  2009-01-16  9:12     ` KAMEZAWA Hiroyuki
@ 2009-01-16  9:23       ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16  9:23 UTC (permalink / raw)
  To: akpm; +Cc: Daisuke Nishimura, Li Zefan, Balbir Singh, linux-mm, hugh

On Fri, 16 Jan 2009 18:12:35 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> How-to-reproduce.
> 
> In shell-A
>   #mount -t cgroup none /opt/cgroup
>   #mkdir /opt/cgroup/xxx/
>   #echo 0 > /opt/cgroup/xxx/tasks
>   #Run malloc 100M on this and sleep. ---(*)
> 
> In shell-B.
>   #echo 40M > /opt/cgroup/xxx/memory.limit_in_bytes
>   Then, you'll see 60M of swap.
>   #/sbin/swapoff -a
>   Then, you'll see OOM-Kill against (*)
>   #echo [pid of shell-A] > /opt/cgroup/tasks
>   make /opt/cgroup/xxx/ empty
>   #rmdir /opt/cgroup/xxx
> 
> => panics.
> 
I'll update the how-to-test text under Documentation/ later.
-Kame
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Currently, at swapoff, commit is executed even when try_charge() fails.
This is a bug, and it eventually makes the refcnt of cgroup_subsys_state negative.

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
 mm/swapfile.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: mmotm-2.6.29-Jan14/mm/swapfile.c
===================================================================
--- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
+++ mmotm-2.6.29-Jan14/mm/swapfile.c
@@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
 	pte_t *pte;
 	int ret = 1;
 
-	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
+	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
 		ret = -ENOMEM;
+		goto out_nolock;
+	}
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
@@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
 	activate_page(page);
 out:
 	pte_unmap_unlock(pte, ptl);
+out_nolock:
 	return ret;
 }
 

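For readers who want to see why committing after a failed try_charge()
ends in negative counters at rmdir, here is a rough userspace model.
It is only a sketch: struct toy_memcg and all toy_* functions are
hypothetical stand-ins, not the kernel's res_counter or memcg API, and
the page table locking done by the real unuse_pte() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memory cgroup. */
struct toy_memcg {
	long usage;     /* pages currently charged */
	long limit;     /* maximum pages that may be charged */
	long refcnt;    /* one reference per successfully charged page */
	long committed; /* pages tracked on the group's LRU */
};

/* Try to charge one page: 0 on success, -1 when the limit is reached. */
int toy_try_charge(struct toy_memcg *cg)
{
	if (cg->usage >= cg->limit)
		return -1;
	cg->usage++;
	cg->refcnt++;
	return 0;
}

/* Commit: track the page on the LRU. Only valid after a successful charge. */
void toy_commit(struct toy_memcg *cg)
{
	cg->committed++;
}

/*
 * One swap-in step. bail_out_on_failure == false models the flow before
 * the patch (commit runs even when the charge failed); true models the
 * flow after the patch (jump straight to the exit label).
 */
int toy_unuse_pte(struct toy_memcg *cg, bool bail_out_on_failure)
{
	int ret = 1;

	if (toy_try_charge(cg)) {
		ret = -1; /* stands in for -ENOMEM */
		if (bail_out_on_failure)
			goto out_nolock;
	}
	toy_commit(cg);
out_nolock:
	return ret;
}

/*
 * rmdir-time teardown: uncharge every tracked page and drop its reference.
 * Returns true if both counters stayed non-negative.
 */
bool toy_force_empty(struct toy_memcg *cg)
{
	while (cg->committed > 0) {
		cg->committed--;
		cg->usage--;
		cg->refcnt--;
	}
	return cg->usage >= 0 && cg->refcnt >= 0;
}
```

With a limit of one page and two swap-in steps, the pre-patch flow tracks
two pages but has charged only one, so toy_force_empty() drives usage and
refcnt below zero. That roughly corresponds to the res_counter WARNING and
the cgroup BUG reported in this thread. With bail_out_on_failure set, only
the charged page is tracked and the counters return to zero.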

Thread overview: 15+ messages
2009-01-16  6:15 [BUG] memcg: panic when rmdir() Li Zefan
2009-01-16  6:19 ` KAMEZAWA Hiroyuki
2009-01-16  6:24   ` Li Zefan
2009-01-16  6:30     ` KAMEZAWA Hiroyuki
2009-01-16  7:00       ` Li Zefan
2009-01-16  6:29 ` KAMEZAWA Hiroyuki
2009-01-16  6:58 ` Daisuke Nishimura
2009-01-16  8:07 ` KAMEZAWA Hiroyuki
2009-01-16  8:26   ` Daisuke Nishimura
2009-01-16  9:12     ` KAMEZAWA Hiroyuki
2009-01-16  9:23       ` [BUGFIX] [PATCH] memcg: fix refcnt handling at swapoff KAMEZAWA Hiroyuki
2009-01-16  9:13     ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
2009-01-16  9:16       ` KAMEZAWA Hiroyuki
2009-01-16  8:33   ` Li Zefan
2009-01-16  8:40     ` KAMEZAWA Hiroyuki
