linux-mm.kvack.org archive mirror
* How to interpret this OOM situation?
@ 2014-11-16 14:11 Marki
  2014-11-16 16:39 ` Konstantin Khlebnikov
  0 siblings, 1 reply; 4+ messages in thread
From: Marki @ 2014-11-16 14:11 UTC (permalink / raw)
  To: linux-mm


Hey there,

I don't know where else to turn; maybe you can help me debug this OOM.

Questions, aside from "why is this happening in the end":
- The low nibble of the GFP mask (0xa) indicates a request for a movable page
in highmem. This is a 64-bit system and therefore has no highmem zone, so
what's going on?
- Swap is almost unused: why not use it before OOMing?
- The page cache is large: why not shrink it before OOMing? (There are almost
no dirty pages.)

Oh, and it's a machine with 4 GB of RAM on kernel 3.0.101 (SLES11 SP3).


<4>[598175.284914] cifsd invoked oom-killer: gfp_mask=0x200da, order=0,
oom_adj=0, oom_score_adj=0
<6>[598175.284919] cifsd cpuset=/ mems_allowed=0
<4>[598175.284921] Pid: 5529, comm: cifsd Tainted: G           E X
3.0.101-0.35-default #1
<4>[598175.284923] Call Trace:
<4>[598175.284934]  [<ffffffff81004935>] dump_trace+0x75/0x310
<4>[598175.284941]  [<ffffffff8145f2f3>] dump_stack+0x69/0x6f
<4>[598175.284947]  [<ffffffff810fc53e>] dump_header+0x8e/0x110
<4>[598175.284950]  [<ffffffff810fc8e6>] oom_kill_process+0xa6/0x350
<4>[598175.284954]  [<ffffffff810fce25>] out_of_memory+0x295/0x2f0
<4>[598175.284957]  [<ffffffff8110287e>] __alloc_pages_slowpath+0x78e/0x7d0
<4>[598175.284960]  [<ffffffff81102aa9>] __alloc_pages_nodemask+0x1e9/0x200
<4>[598175.284965]  [<ffffffff8113de60>] alloc_pages_vma+0xd0/0x1c0
<4>[598175.284969]  [<ffffffff81130bcd>] read_swap_cache_async+0x10d/0x160
<4>[598175.284972]  [<ffffffff81130c94>] swapin_readahead+0x74/0xd0
<4>[598175.284975]  [<ffffffff81120bfa>] do_swap_page+0xea/0x5f0
<4>[598175.284978]  [<ffffffff81121c21>] handle_pte_fault+0x1e1/0x230
<4>[598175.284982]  [<ffffffff81465bcd>] do_page_fault+0x1fd/0x4c0
<4>[598175.284985]  [<ffffffff814627e5>] page_fault+0x25/0x30
<4>[598175.285002]  [<00007f65a0891078>] 0x7f65a0891077

OK, it wants to swap something in but fails because apparently there is no
more physical memory.
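
For what it's worth, when I decode the mask with flag values from a 3.0-era
include/linux/gfp.h (my own reading, so treat this as a sketch), it comes out
as a perfectly ordinary user-page allocation; on x86_64, __GFP_HIGHMEM simply
falls back to ZONE_NORMAL:

# 0x02 HIGHMEM | 0x08 MOVABLE | 0x10 WAIT | 0x40 IO | 0x80 FS | 0x20000 HARDWALL
printf '0x%x\n' $(( 0x02 | 0x08 | 0x10 | 0x40 | 0x80 | 0x20000 ))
# prints 0x200da, i.e. GFP_HIGHUSER_MOVABLE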

<4>[598175.285003] Mem-Info:
<4>[598175.285004] Node 0 DMA per-cpu:
<4>[598175.285006] CPU    0: hi:    0, btch:   1 usd:   0
<4>[598175.285007] CPU    1: hi:    0, btch:   1 usd:   0
<4>[598175.285008] Node 0 DMA32 per-cpu:
<4>[598175.285010] CPU    0: hi:  186, btch:  31 usd:   9
<4>[598175.285011] CPU    1: hi:  186, btch:  31 usd:   7
<4>[598175.285012] Node 0 Normal per-cpu:
<4>[598175.285013] CPU    0: hi:  186, btch:  31 usd:  35
<4>[598175.285014] CPU    1: hi:  186, btch:  31 usd:  31
<4>[598175.285017] active_anon:218 inactive_anon:91 isolated_anon:0
<4>[598175.285018]  active_file:187788 inactive_file:451982 isolated_file:896
<4>[598175.285018]  unevictable:0 dirty:0 writeback:69 unstable:0
<4>[598175.285019]  free:21841 slab_reclaimable:8417 slab_unreclaimable:132175
<4>[598175.285020]  mapped:8168 shmem:4 pagetables:2639 bounce:0

Here we see a little over 3 GB in use, although I couldn't say exactly what
each of the entries means.
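
Converting the counters above (they are in 4 kB pages) into megabytes, my
arithmetic gives:

echo $(( (187788 + 451982) * 4 / 1024 ))   # file cache: ~2499 MB
echo $(( 132175 * 4 / 1024 ))              # unreclaimable slab: ~516 MB
echo $(( (218 + 91) * 4 ))                 # anon: ~1236 kB, next to nothing

So nearly everything is file cache plus half a gigabyte of unreclaimable
slab, and there is essentially no anonymous memory left to swap out.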

<4>[598175.285021] Node 0 DMA free:15880kB min:256kB low:320kB high:384kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
<4>[598175.285027] lowmem_reserve[]: 0 3000 4010 4010
<4>[598175.285029] Node 0 DMA32 free:54600kB min:50368kB low:62960kB high:75552kB active_anon:860kB inactive_anon:308kB active_file:600716kB inactive_file:1576184kB unevictable:0kB isolated(anon):0kB isolated(file):3328kB present:3072160kB mlocked:0kB dirty:0kB writeback:248kB mapped:26800kB shmem:16kB slab_reclaimable:23552kB slab_unreclaimable:412540kB kernel_stack:752kB pagetables:2412kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4169324 all_unreclaimable? yes
<4>[598175.285036] lowmem_reserve[]: 0 0 1010 1010
<4>[598175.285038] Node 0 Normal free:16884kB min:16956kB low:21192kB high:25432kB active_anon:12kB inactive_anon:56kB active_file:150436kB inactive_file:231744kB unevictable:0kB isolated(anon):0kB isolated(file):384kB present:1034240kB mlocked:0kB dirty:0kB writeback:28kB mapped:5872kB shmem:0kB slab_reclaimable:10116kB slab_unreclaimable:116160kB kernel_stack:2848kB pagetables:8144kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:688103 all_unreclaimable? yes
<4>[598175.285044] lowmem_reserve[]: 0 0 0 0
<4>[598175.285046] Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB
1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
<4>[598175.285051] Node 0 DMA32: 12620*4kB 3*8kB 0*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 54600kB
<4>[598175.285056] Node 0 Normal: 3195*4kB 1*8kB 0*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 16884kB

There seems to be a lot of fragmentation. But since an order-0 page (4 kB)
was requested (in highmem!?), and tons of those are available, that shouldn't
matter, should it?
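
As a sanity check on my reading of the buddy lists, the DMA32 line does add
up, and nearly all of its free memory is sitting in order-0 pages:

echo $(( 12620*4 + 3*8 + 1*4096 ))   # prints 54600, matching "= 54600kB"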

<4>[598175.285061] 375504 total pagecache pages

That's more than 1 GB of page cache. Shouldn't it shrink that cache before
declaring OOM?
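
In concrete terms:

echo $(( 375504 * 4 / 1024 ))   # ~1466 MB of page cache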

<4>[598175.285062] 268 pages in swap cache
<4>[598175.285064] Swap cache stats: add 1266107, delete 1265839, find
3666696/3838636
<4>[598175.285065] Free swap  = 4641856kB
<4>[598175.285066] Total swap = 5244924kB

Almost no swap is used. Shouldn't it swap before declaring OOM?
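
Putting a number on "almost no":

echo $(( (5244924 - 4641856) / 1024 ))   # ~588 MB of ~5 GB swap in use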

<4>[598175.285066] 1030522 pages RAM

Oh, and FWIW, here is the process list:

<6>[598175.285067] [ pid ]   uid  tgid total_vm      rss cpu oom_adj
oom_score_adj name
<6>[598175.285071] [  485]     0   485     4223       62   0     -17       
 -1000 udevd
<6>[598175.285073] [ 1434]     0  1434     1003       65   1       0       
     0 acpid
<6>[598175.285075] [ 1449]   100  1449     8585      112   0       0       
     0 dbus-daemon
<6>[598175.285077] [ 1475]     0  1475    36450      428   1       0       
     0 mono
<6>[598175.285079] [ 1772]     0  1772    21365      298   1       0       
     0 vmtoolsd
<6>[598175.285081] [ 1838]   101  1838    12322      180   0       0       
     0 hald
<6>[598175.285083] [ 1842]     0  1842    41067      187   1       0       
     0 console-kit-dae
<6>[598175.285085] [ 1843]     0  1843     4510       56   1       0       
     0 hald-runner
<6>[598175.285087] [ 1961]     0  1961     8691       17   0       0       
     0 hald-addon-inpu
<6>[598175.285107] [ 1984]     0  1984     8691       75   1       0       
     0 hald-addon-stor
<6>[598175.285109] [ 1992]   101  1992     9130        7   1       0       
     0 hald-addon-acpi
<6>[598175.285111] [ 1993]     0  1993     8691       77   0       0       
     0 hald-addon-stor
<6>[598175.285113] [ 2562]     0  2562    47184       78   1       0       
     0 httpstkd
<6>[598175.285115] [ 2581]     0  2581     5881      221   1       0       
     0 syslog-ng
<6>[598175.285117] [ 2584]     0  2584     1070       63   1       0       
     0 klogd
<6>[598175.285119] [ 2598]     0  2598    23796      104   1     -17       
 -1000 auditd
<6>[598175.285121] [ 2600]     0  2600    19995       87   1       0       
     0 audispd
<6>[598175.285123] [ 2621]     0  2621     2093       58   0       0       
     0 haveged
<6>[598175.285125] [ 2641]     0  2641     4728       81   1       0       
     0 rpcbind
<6>[598175.285127] [ 2680]     0  2680    77513      657   0       0       
     0 nsrexecd
<6>[598175.285129] [ 2753]     0  2753     4222       52   0     -17       
 -1000 udevd
<6>[598175.285131] [ 2832]     0  2832     2160       75   0       0       
     0 irqbalance
<6>[598175.285133] [ 2863]     0  2863     6778       53   1       0       
     0 mcelog
<6>[598175.285135] [ 3163]     0  3163    35027      170   1       0       
     0 gmond
<6>[598175.285137] [ 3177] 65534  3177    56670      185   0       0       
     0 gmetad
<6>[598175.285139] [ 3213]     0  3213    24991      107   1       0       
     0 sfcbd
<6>[598175.285141] [ 3214]     0  3214    16795        0   1       0       
     0 sfcbd
<6>[598175.285143] [ 3221]     0  3221    20445       78   1       0       
     0 sfcbd
<6>[598175.285145] [ 3222]     0  3222    41992      117   1       0       
     0 sfcbd
<6>[598175.285147] [ 3239]     0  3239    16092       58   1       0       
     0 pure-ftpd
<6>[598175.285149] [ 3240]     2  3240     6284       82   0       0       
     0 slpd
<6>[598175.285151] [ 3290]     0  3290    12855      120   0     -17       
 -1000 sshd
<6>[598175.285153] [ 3316]    74  3316     8070      152   0       0       
     0 ntpd
<6>[598175.285154] [ 3333]     0  3333    17945       90   1       0       
     0 cupsd
<6>[598175.285156] [ 3393]     0  3393    19365       31   1       0       
     0 sfcbd
<6>[598175.285158] [ 3395]     0  3395    21475      109   0       0       
     0 sfcbd
<6>[598175.285160] [ 3400]     0  3400    38331      129   1       0       
     0 sfcbd
<6>[598175.285162] [ 3479]     0  3479    38357      125   0       0       
     0 sfcbd
<6>[598175.285164] [ 3719]     0  3655   220311     2005   0       0       
     0 ndsd
<6>[598175.285166] [ 3893]    30  3893   177915      910   0       0       
     0 java
<6>[598175.285168] [ 3910]     0  3910    14968       97   1       0       
     0 nscd
<6>[598175.285170] [ 3961]     0  3961    47276      332   0       0       
     0 namcd
<6>[598175.285172] [ 4073]     0  4073    10998      104   0       0       
     0 master
<6>[598175.285174] [ 4099]    51  4099    14190      229   0       0       
     0 qmgr
<6>[598175.285176] [ 4135]     0  4135    33370       99   1       0       
     0 httpd2-prefork
<6>[598175.285178] [ 4136]    30  4136    35518       85   1       0       
     0 httpd2-prefork
<6>[598175.285180] [ 4137]    30  4137    35523      266   0       0       
     0 httpd2-prefork
<6>[598175.285182] [ 4138]    30  4138    35523      111   0       0       
     0 httpd2-prefork
<6>[598175.285184] [ 4139]    30  4139    35523      137   0       0       
     0 httpd2-prefork
<6>[598175.285186] [ 4140]    30  4140    35523      299   0       0       
     0 httpd2-prefork
<6>[598175.285188] [ 4168]     0  4168     5751       86   0       0       
     0 cron
<6>[598175.285190] [ 4349]     0  4349    43028      120   0       0       
     0 ndpapp
<6>[598175.285194] [ 4548]     0  4548    17722       33   0       0       
     0 adminusd
<6>[598175.285196] [ 4577]     0  4577    17136       26   1       0       
     0 jstcpd
<6>[598175.285198] [ 4580]     0  4580    12511        0   1       0       
     0 jstcpd
<6>[598175.285200] [ 4601]     0  4601    10976       42   1       0       
     0 vlrpc
<6>[598175.285202] [ 4621]     0  4621     4222       54   1     -17       
 -1000 udevd
<6>[598175.285204] [ 4672]     0  4672    21525       70   0       0       
     0 volmnd
<6>[598175.285206] [ 4693]     0  4693    48377      195   0       0       
     0 ncp2nss
<6>[598175.285208] [ 4942]    81  4942    40049       32   0       0       
     0 novell-xregd
<6>[598175.285210] [ 5195]     0  5195    90312      479   0       0       
     0 cifsd
<6>[598175.285212] [ 5240]     0  5240     9586        9   1       0       
     0 smdrd
<6>[598175.285214] [ 5279]     0  5279    55127      172   0       0       
     0 novfsd
<6>[598175.285216] [ 5327]   104  5327     9431       72   0       0       
     0 nrpe
<6>[598175.285218] [ 5337]     0  5337     3177       78   0       0       
     0 mingetty
<6>[598175.285219] [ 5338]     0  5338     3177       78   1       0       
     0 mingetty
<6>[598175.285221] [ 5339]     0  5339     3177       78   0       0       
     0 mingetty
<6>[598175.285223] [ 5340]     0  5340     3177       78   1       0       
     0 mingetty
<6>[598175.285225] [ 5341]     0  5341     3177       78   0       0       
     0 mingetty
<6>[598175.285227] [ 5342]     0  5342     3177       78   1       0       
     0 mingetty
<6>[598175.285229] [ 5520]     0  5520    67658       99   0       0       
     0 cifsd
<6>[598175.285231] [25139]     0 25139    17698      836   0       0       
     0 snmpd
<6>[598175.285233] [ 4842]    51  4842    14147      511   0       0       
     0 pickup
<6>[598175.285235] [ 7917]     0  7917    21027     2460   1       0       
     0 savepnpc
<3>[598175.285237] Out of memory: Kill process 3719 (ndsd) score 19 or
sacrifice child
<3>[598175.285239] Killed process 3719 (ndsd) total-vm:881244kB,
anon-rss:0kB, file-rss:8020kB



Thanks

Marki

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: How to interpret this OOM situation?
  2014-11-16 14:11 How to interpret this OOM situation? Marki
@ 2014-11-16 16:39 ` Konstantin Khlebnikov
  2014-11-16 17:26   ` mro2
  0 siblings, 1 reply; 4+ messages in thread
From: Konstantin Khlebnikov @ 2014-11-16 16:39 UTC (permalink / raw)
  To: Marki; +Cc: linux-mm

Swap can be used only for anon pages or for tmpfs. You have a lot of file
page cache. I guess this is a leak of page reference counts in some
filesystem, most likely in cifs.

Try to isolate which part of the workload causes this leak, for example by
switching to another filesystem.
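
If it helps, one minimal way to watch such a leak build up (an untested
sketch; the assumption is that the leaked pages show up as "Cached" that
keeps growing and that drop_caches cannot release):

while sleep 300; do
    date
    grep -E '^(Cached|SReclaimable|SUnreclaim):' /proc/meminfo
done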

On Sun, Nov 16, 2014 at 5:11 PM, Marki <mro2@gmx.net> wrote:
>
> Hey there,
>
> I don't know where else to turn; maybe you can help me debug this OOM.
>
> [full OOM report snipped; see the message above]


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Re: How to interpret this OOM situation?
  2014-11-16 16:39 ` Konstantin Khlebnikov
@ 2014-11-16 17:26   ` mro2
  2014-11-16 17:42     ` Konstantin Khlebnikov
  0 siblings, 1 reply; 4+ messages in thread
From: mro2 @ 2014-11-16 17:26 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm


When I manually drop caches, it looks like this:

# grep -i dirty /proc/meminfo ; free; sync ; sync ; sync ; echo 3 >
/proc/sys/vm/drop_caches ; free ; grep -i dirty /proc/meminfo
Dirty:              2224 kB
             total       used       free     shared    buffers     cached
Mem:       3926016    3690288     235728          0      85628    1376996
-/+ buffers/cache:    2227664    1698352
Swap:      5244924     323872    4921052
             total       used       free     shared    buffers     cached
Mem:       3926016    2604568    1321448          0        132     407696
-/+ buffers/cache:    2196740    1729276
Swap:      5244924     323580    4921344
Dirty:                 8 kB
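
For reference (Documentation/sysctl/vm.txt): echo 1 drops only the page
cache, echo 2 drops reclaimable slab (dentries and inodes), echo 3 drops
both. So a narrower experiment to separate the two would be something like:

sync; echo 1 > /proc/sys/vm/drop_caches   # page cache only
sync; echo 2 > /proc/sys/vm/drop_caches   # dentries and inodes only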


However, during backup times (usually when the OOM happens) the page cache
is not (automatically) emptied before OOMing.


Right now I tried to find out, using fincore, what exactly is in the page
cache, only to get allocation failures:


Nov 16 18:04:31 fs kernel: [554283.989323] fincore: page allocation failure: order:4, mode:0xd0
Nov 16 18:04:31 fs kernel: [554283.989331] Pid: 17921, comm: fincore Tainted: G           E X 3.0.101-0.35-default #1
Nov 16 18:04:31 fs kernel: [554283.989333] Call Trace:
Nov 16 18:04:31 fs kernel: [554283.989345]  [<ffffffff81004935>] dump_trace+0x75/0x310
Nov 16 18:04:31 fs kernel: [554283.989352]  [<ffffffff8145f2f3>] dump_stack+0x69/0x6f
Nov 16 18:04:31 fs kernel: [554283.989357]  [<ffffffff81100a46>] warn_alloc_failed+0xc6/0x170
Nov 16 18:04:31 fs kernel: [554283.989361]  [<ffffffff81102631>] __alloc_pages_slowpath+0x541/0x7d0
Nov 16 18:04:31 fs kernel: [554283.989364]  [<ffffffff81102aa9>] __alloc_pages_nodemask+0x1e9/0x200
Nov 16 18:04:31 fs kernel: [554283.989368]  [<ffffffff811439c3>] kmem_getpages+0x53/0x180
Nov 16 18:04:31 fs kernel: [554283.989372]  [<ffffffff811447c6>] fallback_alloc+0x196/0x270
Nov 16 18:04:31 fs kernel: [554283.989375]  [<ffffffff81145117>] kmem_cache_alloc_trace+0x207/0x2a0
Nov 16 18:04:31 fs kernel: [554283.989380]  [<ffffffff810dc466>] __tracing_open+0x66/0x330
Nov 16 18:04:31 fs kernel: [554283.989384]  [<ffffffff810dc783>] tracing_open+0x53/0xb0
Nov 16 18:04:31 fs kernel: [554283.989388]  [<ffffffff81158f68>] __dentry_open+0x198/0x310
Nov 16 18:04:31 fs kernel: [554283.989393]  [<ffffffff81168572>] do_last+0x1f2/0x800
Nov 16 18:04:31 fs kernel: [554283.989397]  [<ffffffff811697e9>] path_openat+0xd9/0x420
Nov 16 18:04:31 fs kernel: [554283.989400]  [<ffffffff81169c6c>] do_filp_open+0x4c/0xc0
Nov 16 18:04:31 fs kernel: [554283.989403]  [<ffffffff8115a90f>] do_sys_open+0x17f/0x250
Nov 16 18:04:31 fs kernel: [554283.989409]  [<ffffffff8146a012>] system_call_fastpath+0x16/0x1b
Nov 16 18:04:31 fs kernel: [554283.989453]  [<00007ff2d2b05fd0>] 0x7ff2d2b05fcf
Nov 16 18:04:31 fs kernel: [554283.989454] Mem-Info:
Nov 16 18:04:31 fs kernel: [554283.989455] Node 0 DMA per-cpu:
Nov 16 18:04:31 fs kernel: [554283.989457] CPU    0: hi:    0, btch:   1 usd:   0
Nov 16 18:04:31 fs kernel: [554283.989459] CPU    1: hi:    0, btch:   1 usd:   0
Nov 16 18:04:31 fs kernel: [554283.989460] Node 0 DMA32 per-cpu:
Nov 16 18:04:31 fs kernel: [554283.989461] CPU    0: hi:  186, btch:  31 usd:   0
Nov 16 18:04:31 fs kernel: [554283.989463] CPU    1: hi:  186, btch:  31 usd: 185
Nov 16 18:04:31 fs kernel: [554283.989464] Node 0 Normal per-cpu:
Nov 16 18:04:31 fs kernel: [554283.989465] CPU    0: hi:  186, btch:  31 usd:   0
Nov 16 18:04:31 fs kernel: [554283.989466] CPU    1: hi:  186, btch:  31 usd:  57
Nov 16 18:04:31 fs kernel: [554283.989469] active_anon:50701 inactive_anon:27212 isolated_anon:0
Nov 16 18:04:31 fs kernel: [554283.989470]  active_file:82852 inactive_file:268841 isolated_file:0
Nov 16 18:04:31 fs kernel: [554283.989471]  unevictable:7301 dirty:63 writeback:0 unstable:0
Nov 16 18:04:31 fs kernel: [554283.989471]  free:42190 slab_reclaimable:84053 slab_unreclaimable:239021
Nov 16 18:04:31 fs kernel: [554283.989472]  mapped:7031 shmem:29 pagetables:2934 bounce:0
Nov 16 18:04:31 fs kernel: [554283.989474] Node 0 DMA free:15880kB min:256kB low:320kB high:384kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Nov 16 18:04:31 fs kernel: [554283.989480] lowmem_reserve[]: 0 3000 4010 4010
Nov 16 18:04:31 fs kernel: [554283.989482] Node 0 DMA32 free:131040kB min:50368kB low:62960kB high:75552kB active_anon:172872kB inactive_anon:57708kB active_file:301912kB inactive_file:900636kB unevictable:22688kB isolated(anon):0kB isolated(file):0kB present:3072160kB mlocked:22688kB dirty:12kB writeback:0kB mapped:19944kB shmem:56kB slab_reclaimable:259512kB slab_unreclaimable:763412kB kernel_stack:1280kB pagetables:3560kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Nov 16 18:04:31 fs kernel: [554283.989489] lowmem_reserve[]: 0 0 1010 1010
Nov 16 18:04:31 fs kernel: [554283.989492] Node 0 Normal free:21840kB min:16956kB low:21192kB high:25432kB active_anon:29932kB inactive_anon:51140kB active_file:29496kB inactive_file:174728kB unevictable:6516kB isolated(anon):0kB isolated(file):0kB present:1034240kB mlocked:6516kB dirty:240kB writeback:0kB mapped:8180kB shmem:60kB slab_reclaimable:76700kB slab_unreclaimable:192672kB kernel_stack:3072kB pagetables:8176kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:21 all_unreclaimable? no
Nov 16 18:04:31 fs kernel: [554283.989499] lowmem_reserve[]: 0 0 0 0
Nov 16 18:04:31 fs kernel: [554283.989501] Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
Nov 16 18:04:31 fs kernel: [554283.989506] Node 0 DMA32: 1896*4kB 14924*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 131040kB
Nov 16 18:04:31 fs kernel: [554283.989512] Node 0 Normal: 4826*4kB 61*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 21840kB
Nov 16 18:04:31 fs kernel: [554283.989518] 197543 total pagecache pages
Nov 16 18:04:31 fs kernel: [554283.989519] 8325 pages in swap cache
Nov 16 18:04:31 fs kernel: [554283.989520] Swap cache stats: add 227787, delete 219462, find 1899386/1918583
Nov 16 18:04:31 fs kernel: [554283.989521] Free swap  = 4921412kB
Nov 16 18:04:31 fs kernel: [554283.989522] Total swap = 5244924kB
Nov 16 18:04:31 fs kernel: [554283.989523] 1030522 pages RAM
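
Note that unlike the OOM above, this failure is plausibly about
fragmentation: mode 0xd0 looks like GFP_KERNEL (3.0-era flag values, as far
as I can tell), and order:4 means a physically contiguous block is required:

echo $(( (1 << 4) * 4 ))   # order-4 = 16 contiguous pages = 64 kB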


Note that tmpfs is empty.


BTW, this is a Novell file server with their NSS filesystem. They just say
"give it more RAM" (duh).
 

> Swap can be used only for anon pages or for tmpfs. You have a lot of file
> page cache. I guess this is a leak of page reference counts in some
> filesystem, most likely in cifs.
>
> Try to isolate which part of the workload causes this leak, for example by
> switching to another filesystem.
> 
> On Sun, Nov 16, 2014 at 5:11 PM, Marki <mro2@gmx.net> wrote:
> > [original report snipped]
 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Re: How to interpret this OOM situation?
  2014-11-16 17:26   ` mro2
@ 2014-11-16 17:42     ` Konstantin Khlebnikov
  0 siblings, 0 replies; 4+ messages in thread
From: Konstantin Khlebnikov @ 2014-11-16 17:42 UTC (permalink / raw)
  To: Marki; +Cc: linux-mm

On Sun, Nov 16, 2014 at 8:26 PM,  <mro2@gmx.net> wrote:
>
> [drop_caches experiment snipped; see the message above]
>
> Right now I tried to find out, using fincore, what exactly is in the
> page cache, only to get allocation failures:

You may try a tool from the kernel sources: tools/vm/page-types.c.
Since 3.15 (or so) it can dump the file page cache (key -f) recursively for a tree.
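
Something along these lines, built inside a >= 3.15 kernel tree (the exact
options are from memory and the path is a placeholder, so check the tool's
usage output first):

cd tools/vm && make page-types
./page-types -f /path/to/suspect/mount -r   # dump page-cache flags under that path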

>
> [fincore allocation-failure log snipped; see the message above]
>
> Note that tmpfs is empty.
>
> BTW, this is a Novell file server with their NSS filesystem. They just say
> "give it more RAM" (duh).

Heh, you have an out-of-tree filesystem kernel module? That is much more
suspicious than cifs.



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread

Thread overview: 4+ messages
2014-11-16 14:11 How to interpret this OOM situation? Marki
2014-11-16 16:39 ` Konstantin Khlebnikov
2014-11-16 17:26   ` mro2
2014-11-16 17:42     ` Konstantin Khlebnikov
