* [BUG] memcg: panic when rmdir()
@ 2009-01-16 6:15 Li Zefan
2009-01-16 6:19 ` KAMEZAWA Hiroyuki
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Li Zefan @ 2009-01-16 6:15 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki, Balbir Singh; +Cc: Daisuke Nishimura, linux-mm
Found this when testing memory resource controller, can be triggered
with:
- CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
- or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
- or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
# mount -t cgroup -o memory xxx /mnt
# mkdir /mnt/0
# for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
# echo "low limit" > /mnt/0/tasks
# do whatever to allocate some memory
# swapoff -a
killed (by OOM)
# for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
# rmdir /mnt/0
------------[ cut here ]------------
WARNING: at kernel/res_counter.c:71 res_counter_uncharge_locked+0x25/0x36()
Hardware name: Aspire SA85
Modules linked in: bridge stp llc autofs4 dm_mirror dm_region_hash dm_log dm_mod parport_pc button sg r8169 mii parport sata_sis pata_sis ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 2548, comm: rmdir Tainted: G W 2.6.29-rc1-mm1 #4
Call Trace:
[<c042ecb6>] warn_slowpath+0x79/0x8f
[<c04496ce>] ? clockevents_program_event+0xe0/0xef
[<c0463a1b>] ? res_counter_charge+0x35/0xb0
[<c04639b0>] ? res_counter_uncharge+0x29/0x5f
[<c0463941>] res_counter_uncharge_locked+0x25/0x36
[<c04639ba>] res_counter_uncharge+0x33/0x5f
[<c049b9ef>] mem_cgroup_force_empty+0x21b/0x498
[<c049c82f>] mem_cgroup_pre_destroy+0x12/0x14
[<c0460776>] cgroup_rmdir+0x5e/0x27e
[<c0621d08>] ? _spin_unlock+0x2c/0x41
[<c04a67fe>] vfs_rmdir+0x5b/0x9c
[<c04a7b8c>] do_rmdir+0x89/0xc8
[<c04438ed>] ? up_read+0x1b/0x2e
[<c0623de4>] ? do_page_fault+0x356/0x5ed
[<c04a7c14>] sys_rmdir+0x15/0x17
[<c0403485>] sysenter_do_call+0x12/0x35
---[ end trace 4eaa2a86a8e2da24 ]---
------------[ cut here ]------------
kernel BUG at kernel/cgroup.c:2517!
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/irq
Modules linked in: bridge stp llc autofs4 dm_mirror dm_region_hash dm_log dm_mod parport_pc button sg r8169 mii parport sata_sis pata_sis ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 2548, comm: rmdir Tainted: G W (2.6.29-rc1-mm1 #4) Aspire SA85
EIP: 0060:[<c04607f2>] EFLAGS: 00210046 CPU: 1
EIP is at cgroup_rmdir+0xda/0x27e
EAX: f442b800 EBX: ed00b3c0 ECX: c04607ca EDX: 00000000
ESI: c0778dc0 EDI: 00200246 EBP: ed252f30 ESP: ed252f14
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rmdir (pid: 2548, ti=ed252000 task=e19aa8c0 task.ti=ed252000)
Stack:
00000000 f442b800 c0621d08 e18a8014 e2d96a00 fffffff0 e2ccadf0 ed252f44
c04a67fe e2d96a00 00000000 0804ca00 ed252fa8 c04a7b8c ed269080 e2d745a0
00002121 00000001 f46a3000 00000000 00000000 00000000 ed0a6d34 00000004
Call Trace:
[<c0621d08>] ? _spin_unlock+0x2c/0x41
[<c04a67fe>] ? vfs_rmdir+0x5b/0x9c
[<c04a7b8c>] ? do_rmdir+0x89/0xc8
[<c04438ed>] ? up_read+0x1b/0x2e
[<c0623de4>] ? do_page_fault+0x356/0x5ed
[<c04a7c14>] ? sys_rmdir+0x15/0x17
[<c0403485>] ? sysenter_do_call+0x12/0x35
Code: c8 fe ff 8b 43 40 8b 70 0c eb 3e 8b 46 28 8b 44 83 20 89 45 e8 8b 55 e8 8b 52 04 83 fa 01 89 55 e4 0f 8f 5f ff ff ff 85 d2 75 04 <0f> 0b eb fe 8b 45 e4 31 c9 8b 55 e8 f0 0f b1 4a 04 8b 55 e4 39
EIP: [<c04607f2>] cgroup_rmdir+0xda/0x27e SS:ESP 0068:ed252f14
---[ end trace 4eaa2a86a8e2da25 ]---
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 6:15 [BUG] memcg: panic when rmdir() Li Zefan
@ 2009-01-16 6:19 ` KAMEZAWA Hiroyuki
2009-01-16 6:24 ` Li Zefan
2009-01-16 6:29 ` KAMEZAWA Hiroyuki
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 6:19 UTC (permalink / raw)
To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm
On Fri, 16 Jan 2009 14:15:04 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:
> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>
> # mount -t cgroup -o memory xxx /mnt
> # mkdir /mnt/0
> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> # echo "low limit" > /mnt/0/tasks
> # do whatever to allocate some memory
> # swapoff -a
> killed (by OOM)
> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> # rmdir /mnt/0
>
Isn't this a problem Nishimura fixed today?
Could you try
memcg-get-put-parents-at-create-free.patch
in mm-commits?
Sorry for the inconvenience. I'll send you the patch by private mail if necessary.
Thanks,
-Kame
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 6:19 ` KAMEZAWA Hiroyuki
@ 2009-01-16 6:24 ` Li Zefan
2009-01-16 6:30 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-01-16 6:24 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm
KAMEZAWA Hiroyuki wrote:
> On Fri, 16 Jan 2009 14:15:04 +0800
> Li Zefan <lizf@cn.fujitsu.com> wrote:
>
>> Found this when testing memory resource controller, can be triggered
>> with:
>> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>>
>> # mount -t cgroup -o memory xxx /mnt
>> # mkdir /mnt/0
>> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
>> # echo "low limit" > /mnt/0/tasks
>> # do whatever to allocate some memory
>> # swapoff -a
>> killed (by OOM)
>> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
>> # rmdir /mnt/0
>>
> Isn't this a problem Nishimura fixed today ?
>
Are you sure?
The changelog:
==========
The lifetime of struct cgroup and struct mem_cgroup is different and
mem_cgroup has its own reference count for handling references from swap_cgroup.
This causes strange problem that the parent mem_cgroup dies while
child mem_cgroup alive, and this problem causes a bug in case of use_hierarchy==1
because res_counter_uncharge climbs up the tree.
==========
I was not using hierarchy, and there was no "parent mem_cgroup dies while
child mem_cgroup alive" situation in my test.
Anyway, I'll try.
> could you try
>
> memcg-get-put-parents-at-create-free.patch
>
> in mm-commits ?
>
> Sorry for the inconvenience. I'll send you the patch by private mail if necessary.
>
> Thanks,
> -Kame
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 6:15 [BUG] memcg: panic when rmdir() Li Zefan
2009-01-16 6:19 ` KAMEZAWA Hiroyuki
@ 2009-01-16 6:29 ` KAMEZAWA Hiroyuki
2009-01-16 6:58 ` Daisuke Nishimura
2009-01-16 8:07 ` KAMEZAWA Hiroyuki
3 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 6:29 UTC (permalink / raw)
To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm
On Fri, 16 Jan 2009 14:15:04 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:
> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>
> # mount -t cgroup -o memory xxx /mnt
> # mkdir /mnt/0
> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> # echo "low limit" > /mnt/0/tasks
> # do whatever to allocate some memory
> # swapoff -a
> killed (by OOM)
> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> # rmdir /mnt/0
>
Hmm, it seems css->refcnt is bad (css->refcnt < 0). Maybe css_put() is
called without a matching css_get().
I will chase it. Thank you for testing.
-Kame
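
The suspected get/put imbalance can be illustrated with a toy reference counter. This is only a sketch under stated assumptions: `toy_css`, `toy_css_get()`, `toy_css_put()` and `toy_rmdir_check()` are hypothetical names, not the kernel's cgroup API, and the base count of 1 is assumed purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cgroup_subsys_state; not the kernel struct. */
struct toy_css {
	int refcnt;	/* assumed to start at 1: the base reference of a live group */
};

void toy_css_init(struct toy_css *css)
{
	css->refcnt = 1;
}

void toy_css_get(struct toy_css *css)
{
	assert(css->refcnt > 0);	/* taking a reference on a dead object is a bug */
	css->refcnt++;
}

void toy_css_put(struct toy_css *css)
{
	css->refcnt--;
}

/*
 * Toy version of a sanity check at rmdir time: an idle group must still
 * hold exactly its base reference. Anything lower means that some path
 * called put without a matching get.
 */
bool toy_rmdir_check(const struct toy_css *css)
{
	return css->refcnt == 1;
}

/* Balanced user: every get is paired with a put. */
void toy_balanced_user(struct toy_css *css)
{
	toy_css_get(css);
	toy_css_put(css);
}

/* Buggy user: drops a reference it never took. */
void toy_unbalanced_user(struct toy_css *css)
{
	toy_css_put(css);
}
```

After `toy_balanced_user()` the check still passes; a single call to `toy_unbalanced_user()` pulls the counter below the base reference, so the check fails.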
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 6:24 ` Li Zefan
@ 2009-01-16 6:30 ` KAMEZAWA Hiroyuki
2009-01-16 7:00 ` Li Zefan
0 siblings, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 6:30 UTC (permalink / raw)
To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm
On Fri, 16 Jan 2009 14:24:39 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:
> KAMEZAWA Hiroyuki wrote:
> > On Fri, 16 Jan 2009 14:15:04 +0800
> > Li Zefan <lizf@cn.fujitsu.com> wrote:
> >
> >> Found this when testing memory resource controller, can be triggered
> >> with:
> >> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> >>
> >> # mount -t cgroup -o memory xxx /mnt
> >> # mkdir /mnt/0
> >> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> >> # echo "low limit" > /mnt/0/tasks
> >> # do whatever to allocate some memory
> >> # swapoff -a
> >> killed (by OOM)
> >> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> >> # rmdir /mnt/0
> >>
> > Isn't this a problem Nishimura fixed today ?
> >
>
> Are you sure?
>
Sorry, I didn't see the BUG line in your log.
-Kame
> The changelog:
> ==========
> The lifetime of struct cgroup and struct mem_cgroup is different and
> mem_cgroup has its own reference count for handling references from swap_cgroup.
>
> This causes strange problem that the parent mem_cgroup dies while
> child mem_cgroup alive, and this problem causes a bug in case of use_hierarchy==1
> because res_counter_uncharge climbs up the tree.
> ==========
>
> I was not using hierarchy, and no "mem_cgroup dies while child mem_cgroup alive"
> in my test.
>
> Anyway, I'll try.
>
> > could you try
> >
> > memcg-get-put-parents-at-create-free.patch
> >
> > in mm-commits ?
> >
> > Sorry for the inconvenience. I'll send you the patch by private mail if necessary.
> >
> > Thanks,
> > -Kame
> >
> >
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 6:15 [BUG] memcg: panic when rmdir() Li Zefan
2009-01-16 6:19 ` KAMEZAWA Hiroyuki
2009-01-16 6:29 ` KAMEZAWA Hiroyuki
@ 2009-01-16 6:58 ` Daisuke Nishimura
2009-01-16 8:07 ` KAMEZAWA Hiroyuki
3 siblings, 0 replies; 15+ messages in thread
From: Daisuke Nishimura @ 2009-01-16 6:58 UTC (permalink / raw)
To: Li Zefan; +Cc: nishimura, KAMEZAWA Hiroyuki, Balbir Singh, linux-mm
On Fri, 16 Jan 2009 14:15:04 +0800, Li Zefan <lizf@cn.fujitsu.com> wrote:
> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>
> # mount -t cgroup -o memory xxx /mnt
> # mkdir /mnt/0
> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
> # echo "low limit" > /mnt/0/tasks
> # do whatever to allocate some memory
> # swapoff -a
> killed (by OOM)
> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
> # rmdir /mnt/0
>
I've reproduced this bug in my environment too.
I'll dig into it as well.
Thank you for reporting the bug.
Daisuke Nishimura.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 6:30 ` KAMEZAWA Hiroyuki
@ 2009-01-16 7:00 ` Li Zefan
0 siblings, 0 replies; 15+ messages in thread
From: Li Zefan @ 2009-01-16 7:00 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm
KAMEZAWA Hiroyuki wrote:
> On Fri, 16 Jan 2009 14:24:39 +0800
> Li Zefan <lizf@cn.fujitsu.com> wrote:
>
>> KAMEZAWA Hiroyuki wrote:
>>> On Fri, 16 Jan 2009 14:15:04 +0800
>>> Li Zefan <lizf@cn.fujitsu.com> wrote:
>>>
>>>> Found this when testing memory resource controller, can be triggered
>>>> with:
>>>> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
>>>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
>>>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>>>>
>>>> # mount -t cgroup -o memory xxx /mnt
>>>> # mkdir /mnt/0
>>>> # for pid in `cat /mnt/tasks`; do echo $pid > /mnt/0/tasks; done
>>>> # echo "low limit" > /mnt/0/tasks
>>>> # do whatever to allocate some memory
>>>> # swapoff -a
>>>> killed (by OOM)
>>>> # for pid in `cat /mnt/0/tasks`; do echo $pid > /mnt/tasks; done
>>>> # rmdir /mnt/0
>>>>
>>> Isn't this a problem Nishimura fixed today ?
>>>
>> Are you sure?
>>
> Sorry, I didn't see the BUG line in your log.
>
I've tested with Nishimura's patch applied and, as expected, this bug
is entirely different from the one Nishimura fixed.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 6:15 [BUG] memcg: panic when rmdir() Li Zefan
` (2 preceding siblings ...)
2009-01-16 6:58 ` Daisuke Nishimura
@ 2009-01-16 8:07 ` KAMEZAWA Hiroyuki
2009-01-16 8:26 ` Daisuke Nishimura
2009-01-16 8:33 ` Li Zefan
3 siblings, 2 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 8:07 UTC (permalink / raw)
To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm, hugh
On Fri, 16 Jan 2009 14:15:04 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:
> Found this when testing memory resource controller, can be triggered
> with:
> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>
Li-san, could you try this ? I myself can't reproduce the bug yet...
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Currently, at swapoff, commit is executed even when try_charge() fails.
This is a bug, and it eventually makes the refcnt of cgroup_subsys_state negative.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Index: mmotm-2.6.29-Jan14/mm/swapfile.c
===================================================================
--- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
+++ mmotm-2.6.29-Jan14/mm/swapfile.c
@@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
 	pte_t *pte;
 	int ret = 1;
 
-	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
+	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
 		ret = -ENOMEM;
+		goto out_nolock;
+	}
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
@@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
 	activate_page(page);
 out:
 	pte_unmap_unlock(pte, ptl);
+out_nolock:
 	return ret;
 }
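
The reason for a second exit label can be shown with a small stand-alone sketch. It is not the kernel code: `toy_lock`, `toy_lock_acquire()`, `toy_lock_release()` and `toy_unuse()` are hypothetical names, and the lock only counts calls. The point is that a failure before the lock is taken must not pass through the unlock at `out:`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy lock that only counts; not a kernel spinlock. */
struct toy_lock {
	int depth;		/* +1 on acquire, -1 on release; must end at 0 */
	int acquisitions;	/* how many times the lock was taken */
};

void toy_lock_acquire(struct toy_lock *l)
{
	l->depth++;
	l->acquisitions++;
}

void toy_lock_release(struct toy_lock *l)
{
	assert(l->depth > 0);	/* releasing an unheld lock is a bug */
	l->depth--;
}

/*
 * Same shape as the patched function: a failure before the lock is taken
 * jumps past the unlock, while an early exit after it goes through the unlock.
 */
int toy_unuse(struct toy_lock *l, bool charge_fails, bool entry_changed)
{
	int ret = 1;

	if (charge_fails) {
		ret = -ENOMEM;
		goto out_nolock;	/* lock not held yet: skip the unlock */
	}

	toy_lock_acquire(l);
	if (entry_changed) {
		ret = 0;
		goto out;		/* lock held: must unlock on the way out */
	}
	/* ... the real work would happen here ... */
out:
	toy_lock_release(l);
out_nolock:
	return ret;
}
```

In this sketch, jumping to `out` from the failed-charge branch would trip the assertion in `toy_lock_release()`, which is why the early failure needs its own label.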
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 8:07 ` KAMEZAWA Hiroyuki
@ 2009-01-16 8:26 ` Daisuke Nishimura
2009-01-16 9:12 ` KAMEZAWA Hiroyuki
2009-01-16 9:13 ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
2009-01-16 8:33 ` Li Zefan
1 sibling, 2 replies; 15+ messages in thread
From: Daisuke Nishimura @ 2009-01-16 8:26 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: nishimura, Li Zefan, Balbir Singh, linux-mm, hugh
On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Currently, at swapoff, commit is executed even when try_charge() fails.
> This is a bug, and it eventually makes the refcnt of cgroup_subsys_state negative.
>
Nice catch!
I think this bug can explain the problem I've seen.
Committing after a try_charge failure adds the page_cgroup (pc) to the LRU
without a corresponding charge and refcnt.
rmdir then uncharges the pc (so we get the WARNING at kernel/res_counter.c:71)
and decrements the refcnt (so we get the BUG at kernel/cgroup.c:2517).
Even if this patch does not fix the problem, the patch is valid and needed.
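
The sequence described above can be sketched as a toy model. Everything here is an assumption made for illustration: `toy_memcg`, `toy_page`, `toy_try_charge()`, `toy_commit()`, `toy_swapin()` and `toy_force_empty()` are hypothetical names and do not reproduce the kernel's data structures or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
struct toy_memcg {
	long usage;	/* pages currently charged */
	long limit;	/* maximum pages allowed */
	int refcnt;	/* base reference (1) plus one per charged page */
};

struct toy_page {
	struct toy_memcg *owner;	/* non-NULL once the charge is committed */
};

/* Step 1: reserve room for one page. Fails when the group is at its limit. */
int toy_try_charge(struct toy_memcg *mem)
{
	if (mem->usage + 1 > mem->limit)
		return -1;
	mem->usage++;
	mem->refcnt++;
	return 0;
}

/* Step 2: record that the page belongs to the group (the "LRU" in this toy). */
void toy_commit(struct toy_memcg *mem, struct toy_page *page)
{
	assert(mem != NULL && page != NULL);
	page->owner = mem;
}

/* Swap-in path. With commit_on_failure set it mimics the reported bug. */
int toy_swapin(struct toy_memcg *mem, struct toy_page *page, bool commit_on_failure)
{
	int ret = toy_try_charge(mem);

	if (ret && !commit_on_failure)
		return ret;	/* fixed behaviour: bail out before commit */
	toy_commit(mem, page);
	return ret;
}

/* rmdir-time cleanup: uncharge every page that claims to belong to the group. */
void toy_force_empty(struct toy_memcg *mem, struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pages[i].owner != mem)
			continue;
		mem->usage--;	/* goes negative if the page was never charged */
		mem->refcnt--;	/* drops below the base reference in that case */
		pages[i].owner = NULL;
	}
}
```

With a limit of 0, the buggy path leaves `usage` at -1 and `refcnt` at 0 after `toy_force_empty()`, while the path that skips the commit leaves them at 0 and 1.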
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
I'll test it.
Thanks,
Daisuke Nishimura.
> ---
> Index: mmotm-2.6.29-Jan14/mm/swapfile.c
> ===================================================================
> --- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
> +++ mmotm-2.6.29-Jan14/mm/swapfile.c
> @@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
> pte_t *pte;
> int ret = 1;
>
> - if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
> + if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
> ret = -ENOMEM;
> + goto out_nolock;
> + }
>
> pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
> @@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
> activate_page(page);
> out:
> pte_unmap_unlock(pte, ptl);
> +out_nolock:
> return ret;
> }
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 8:07 ` KAMEZAWA Hiroyuki
2009-01-16 8:26 ` Daisuke Nishimura
@ 2009-01-16 8:33 ` Li Zefan
2009-01-16 8:40 ` KAMEZAWA Hiroyuki
1 sibling, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-01-16 8:33 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm, hugh
KAMEZAWA Hiroyuki wrote:
> On Fri, 16 Jan 2009 14:15:04 +0800
> Li Zefan <lizf@cn.fujitsu.com> wrote:
>
>> Found this when testing memory resource controller, can be triggered
>> with:
>> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
>> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
>>
>
> Li-san, could you try this ? I myself can't reproduce the bug yet...
I've tested this patch, and the bug seems to disappear. :)
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
I'm going to be out of the office, so I'll do more testing next week
to confirm this.
> ==
>
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Currently, at swapoff, commit is executed even when try_charge() fails.
> This is a bug, and it eventually makes the refcnt of cgroup_subsys_state negative.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> Index: mmotm-2.6.29-Jan14/mm/swapfile.c
> ===================================================================
> --- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
> +++ mmotm-2.6.29-Jan14/mm/swapfile.c
> @@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
>  	pte_t *pte;
>  	int ret = 1;
>
> -	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
> +	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
>  		ret = -ENOMEM;
> +		goto out_nolock;
> +	}
>
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
> @@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
>  	activate_page(page);
>  out:
>  	pte_unmap_unlock(pte, ptl);
> +out_nolock:
>  	return ret;
>  }
>
>
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 8:33 ` Li Zefan
@ 2009-01-16 8:40 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 8:40 UTC (permalink / raw)
To: Li Zefan; +Cc: Balbir Singh, Daisuke Nishimura, linux-mm, hugh
On Fri, 16 Jan 2009 16:33:08 +0800
Li Zefan <lizf@cn.fujitsu.com> wrote:
> KAMEZAWA Hiroyuki wrote:
> > On Fri, 16 Jan 2009 14:15:04 +0800
> > Li Zefan <lizf@cn.fujitsu.com> wrote:
> >
> >> Found this when testing memory resource controller, can be triggered
> >> with:
> >> - CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> >> - or CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y && boot with noswapaccount
> >>
> >
> > Li-san, could you try this ? I myself can't reproduce the bug yet...
>
> I've tested this patch, and the bug seems to disappear. :)
>
> Tested-by: Li Zefan <lizf@cn.fujitsu.com>
>
> I'm going to be out of the office, so I'll do more testing to confirm this
> next week.
>
Thank you !
-Kame
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 8:26 ` Daisuke Nishimura
@ 2009-01-16 9:12 ` KAMEZAWA Hiroyuki
2009-01-16 9:23 ` [BUGFIX] [PATCH] memcg: fix refcnt handling at swapoff KAMEZAWA Hiroyuki
2009-01-16 9:13 ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
1 sibling, 1 reply; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 9:12 UTC (permalink / raw)
To: Daisuke Nishimura; +Cc: Li Zefan, Balbir Singh, linux-mm, hugh
On Fri, 16 Jan 2009 17:26:51 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > Currently, at swapoff, commit is executed even when try_charge() fails.
> > This is a bug and eventually makes the refcnt of cgroup_subsys_state negative.
> >
> Nice catch!
>
> I think this bug can explain the problem I've seen.
> Committing after a try_charge failure adds the pc to the LRU
> without a corresponding charge and refcnt.
> Then rmdir uncharges the pc (so we get WARNING: at kernel/res_counter.c:71)
> and decrements the refcnt (so we get BUG at kernel/cgroup.c:2517).
>
> Even if the problem cannot be fixed by this patch, this patch is valid and needed.
>
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> I'll test it.
>
>
I finally found a way to reproduce this and confirmed that the patch fixes the problem.

How to reproduce:

In shell-A:
 # mount -t cgroup none /opt/cgroup
 # mkdir /opt/cgroup/xxx/
 # echo 0 > /opt/cgroup/xxx/tasks
 # <run a program that mallocs 100M and then sleeps>   ---(*)

In shell-B:
 # echo 40M > /opt/cgroup/xxx/memory.limit_in_bytes
   Then, you'll see 60M of swap.
 # /sbin/swapoff -a
   Then, you'll see an OOM kill against (*).
 # echo <pid of shell-A> > /opt/cgroup/tasks
   (this makes /opt/cgroup/xxx/ empty)
 # rmdir /opt/cgroup/xxx

=> panics.

I'll add this swapoff test to memcg-debug.txt later.

BTW, the OOM kill against (*) itself also seems problematic.
But simply disabling OOM at swapoff cannot be a workaround...
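
Step (*) in the reproduction steps needs a process that allocates about 100M and then sleeps. The thread does not show the program that was used, so the following is only a minimal sketch of such a helper; the names alloc_and_touch() and hog_and_sleep() are hypothetical. It writes to the whole buffer because, on Linux, malloc'ed pages are normally backed by real memory (and so charged to the cgroup) only once they are touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate `bytes` and write to every byte so that
 * the pages are actually faulted in rather than merely reserved.
 * Returns NULL if the allocation fails.
 */
char *alloc_and_touch(size_t bytes)
{
	char *buf;

	assert(bytes > 0);
	buf = malloc(bytes);
	if (buf == NULL)
		return NULL;
	memset(buf, 0x5a, bytes);
	return buf;
}

/*
 * Hypothetical stand-in for step (*): hold the memory and sleep until
 * the process is killed. Returns 1 only if the allocation failed.
 */
int hog_and_sleep(size_t bytes)
{
	char *buf = alloc_and_touch(bytes);

	if (buf == NULL)
		return 1;
	for (;;)
		pause();	/* wake only for signals, then sleep again */
}
```

Calling hog_and_sleep(100UL << 20) from main() in shell-A, after the shell has been moved into /opt/cgroup/xxx, would play the role of (*).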
-Kame
> Thanks,
> Daisuke Nishimura.
>
> > ---
> > Index: mmotm-2.6.29-Jan14/mm/swapfile.c
> > ===================================================================
> > --- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
> > +++ mmotm-2.6.29-Jan14/mm/swapfile.c
> > @@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
> >  	pte_t *pte;
> >  	int ret = 1;
> >
> > -	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
> > +	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
> >  		ret = -ENOMEM;
> > +		goto out_nolock;
> > +	}
> >
> >  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> >  	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
> > @@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
> >  	activate_page(page);
> >  out:
> >  	pte_unmap_unlock(pte, ptl);
> > +out_nolock:
> >  	return ret;
> >  }
> >
>
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 8:26 ` Daisuke Nishimura
2009-01-16 9:12 ` KAMEZAWA Hiroyuki
@ 2009-01-16 9:13 ` Daisuke Nishimura
2009-01-16 9:16 ` KAMEZAWA Hiroyuki
1 sibling, 1 reply; 15+ messages in thread
From: Daisuke Nishimura @ 2009-01-16 9:13 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: nishimura, Li Zefan, Balbir Singh, linux-mm, hugh
On Fri, 16 Jan 2009 17:26:51 +0900, Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > Currently, at swapoff, commit is executed even when try_charge() fails.
> > This is a bug and eventually makes the refcnt of cgroup_subsys_state negative.
> >
> Nice catch!
>
> I think this bug can explain the problem I've seen.
> Committing after a try_charge failure adds the pc to the LRU
> without a corresponding charge and refcnt.
> Then rmdir uncharges the pc (so we get WARNING: at kernel/res_counter.c:71)
> and decrements the refcnt (so we get BUG at kernel/cgroup.c:2517).
>
> Even if the problem cannot be fixed by this patch, this patch is valid and needed.
>
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> I'll test it.
>
I've tested it several times, and the problem no longer happens.
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Thanks,
Daisuke Nishimura.
* Re: [BUG] memcg: panic when rmdir()
2009-01-16 9:13 ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
@ 2009-01-16 9:16 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 9:16 UTC (permalink / raw)
To: Daisuke Nishimura; +Cc: Li Zefan, Balbir Singh, linux-mm, hugh
On Fri, 16 Jan 2009 18:13:00 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> On Fri, 16 Jan 2009 17:26:51 +0900, Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > On Fri, 16 Jan 2009 17:07:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > >
> > > Currently, at swapoff, commit is executed even when try_charge() fails.
> > > This is a bug and eventually makes the refcnt of cgroup_subsys_state negative.
> > >
> > Nice catch!
> >
> > I think this bug can explain the problem I've seen.
> > Committing after a try_charge failure adds the pc to the LRU
> > without a corresponding charge and refcnt.
> > Then rmdir uncharges the pc (so we get WARNING: at kernel/res_counter.c:71)
> > and decrements the refcnt (so we get BUG at kernel/cgroup.c:2517).
> >
> > Even if the problem cannot be fixed by this patch, this patch is valid and needed.
> >
> > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> >
> > I'll test it.
> >
> I've tested it several times, and the problem no longer happens.
>
> Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
Thank you! I'll send the patch to Andrew.
-Kame
>
> Thanks,
> Daisuke Nishimura.
>
* [BUGFIX] [PATCH] memcg: fix refcnt handling at swapoff
2009-01-16 9:12 ` KAMEZAWA Hiroyuki
@ 2009-01-16 9:23 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 15+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-01-16 9:23 UTC (permalink / raw)
To: akpm; +Cc: Daisuke Nishimura, Li Zefan, Balbir Singh, linux-mm, hugh
On Fri, 16 Jan 2009 18:12:35 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> How to reproduce:
>
> In shell-A:
>  # mount -t cgroup none /opt/cgroup
>  # mkdir /opt/cgroup/xxx/
>  # echo 0 > /opt/cgroup/xxx/tasks
>  # <run a program that mallocs 100M and then sleeps>   ---(*)
>
> In shell-B:
>  # echo 40M > /opt/cgroup/xxx/memory.limit_in_bytes
>    Then, you'll see 60M of swap.
>  # /sbin/swapoff -a
>    Then, you'll see an OOM kill against (*).
>  # echo <pid of shell-A> > /opt/cgroup/tasks
>    (this makes /opt/cgroup/xxx/ empty)
>  # rmdir /opt/cgroup/xxx
>
> => panics.
>
I'll update how-to-test text under Documentation/ later.
-Kame
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Currently, at swapoff, commit is executed even when try_charge() fails.
This is a bug and eventually makes the refcnt of cgroup_subsys_state negative.
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
mm/swapfile.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Index: mmotm-2.6.29-Jan14/mm/swapfile.c
===================================================================
--- mmotm-2.6.29-Jan14.orig/mm/swapfile.c
+++ mmotm-2.6.29-Jan14/mm/swapfile.c
@@ -698,8 +698,10 @@ static int unuse_pte(struct vm_area_stru
 	pte_t *pte;
 	int ret = 1;
 
-	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr))
+	if (mem_cgroup_try_charge_swapin(vma->vm_mm, page, GFP_KERNEL, &ptr)) {
 		ret = -ENOMEM;
+		goto out_nolock;
+	}
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
@@ -723,6 +725,7 @@ static int unuse_pte(struct vm_area_stru
 	activate_page(page);
 out:
 	pte_unmap_unlock(pte, ptl);
+out_nolock:
 	return ret;
 }
 
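
The effect of the patch can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: struct toy_cgroup and all toy_*() functions are hypothetical stand-ins, not the kernel's data structures or API, and each page is modelled as one unit of usage plus one reference. It follows the explanation given in this thread: committing after a failed try_charge puts a page on the group's LRU without a matching charge and refcnt, so the later rmdir uncharges more than was ever charged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; not the kernel's struct mem_cgroup. */
struct toy_cgroup {
	long usage;	/* pages charged (models the res_counter) */
	long limit;	/* maximum pages that may be charged */
	long refcnt;	/* references held by charged pages */
	long lru_pages;	/* pages the group believes it owns */
};

/* Reserve one page. On failure nothing is charged. */
int toy_try_charge(struct toy_cgroup *cg)
{
	assert(cg != NULL);
	if (cg->usage >= cg->limit)
		return -1;
	cg->usage++;
	cg->refcnt++;
	return 0;
}

/* Record the page as owned by the group. */
void toy_commit(struct toy_cgroup *cg)
{
	cg->lru_pages++;
}

/*
 * Models the control flow of unuse_pte(). With fixed == false the
 * commit still runs after a failed charge (the reported bug); with
 * fixed == true the function bails out early, as in the patch.
 */
int toy_unuse_pte(struct toy_cgroup *cg, bool fixed)
{
	int ret = 1;

	if (toy_try_charge(cg)) {
		ret = -1;	/* stands in for -ENOMEM */
		if (fixed)
			goto out_nolock;
	}
	toy_commit(cg);
out_nolock:
	return ret;
}

/*
 * Models rmdir: uncharge every page on the LRU. Returns false if a
 * counter went negative, which corresponds to the WARNING and BUG
 * seen in the report.
 */
bool toy_force_empty(struct toy_cgroup *cg)
{
	while (cg->lru_pages > 0) {
		cg->lru_pages--;
		cg->usage--;
		cg->refcnt--;
	}
	return cg->usage >= 0 && cg->refcnt >= 0;
}
```

Starting from a group that is already at its limit with one page charged, toy_unuse_pte(&cg, false) followed by toy_force_empty(&cg) drives usage and refcnt to -1, while the same sequence with fixed set to true leaves both at 0.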
Thread overview: 15+ messages
2009-01-16 6:15 [BUG] memcg: panic when rmdir() Li Zefan
2009-01-16 6:19 ` KAMEZAWA Hiroyuki
2009-01-16 6:24 ` Li Zefan
2009-01-16 6:30 ` KAMEZAWA Hiroyuki
2009-01-16 7:00 ` Li Zefan
2009-01-16 6:29 ` KAMEZAWA Hiroyuki
2009-01-16 6:58 ` Daisuke Nishimura
2009-01-16 8:07 ` KAMEZAWA Hiroyuki
2009-01-16 8:26 ` Daisuke Nishimura
2009-01-16 9:12 ` KAMEZAWA Hiroyuki
2009-01-16 9:23 ` [BUGFIX] [PATCH] memcg: fix refcnt handling at swapoff KAMEZAWA Hiroyuki
2009-01-16 9:13 ` [BUG] memcg: panic when rmdir() Daisuke Nishimura
2009-01-16 9:16 ` KAMEZAWA Hiroyuki
2009-01-16 8:33 ` Li Zefan
2009-01-16 8:40 ` KAMEZAWA Hiroyuki