From: Matthias Dahl <ml_linux-kernel@binary-island.eu>
To: linux-raid@vger.kernel.org
Cc: linux-mm@kvack.org, dm-devel@redhat.com, linux-kernel@vger.kernel.org
Subject: Page Allocation Failures/OOM with dm-crypt on software RAID10 (Intel Rapid Storage)
Date: Tue, 12 Jul 2016 10:27:37 +0200 [thread overview]
Message-ID: <02580b0a303da26b669b4a9892624b13@mail.ud19.udmedia.de> (raw)
[-- Attachment #1: Type: text/plain, Size: 3146 bytes --]
Hello,
I already posted this issue on linux-mm, linux-kernel and dm-devel a
few days ago, and after further investigation it seems that the
issue is somehow related to the fact that I am using an Intel Rapid
Storage RAID10, so I am summarizing everything again in this mail
and including linux-raid this time. Sorry for the noise... :(
I am currently setting up a new machine (since my old one broke down)
and ran into a lot of "Unable to allocate memory on node -1" warnings
while using dm-crypt. I have attached as much of the full log as I
could recover.
The encrypted device sits on a RAID10 (software RAID, Intel Rapid
Storage). I am currently limited to testing via Linux live images since
the machine is not yet properly set up, but I did my tests across
several of those.
Steps to reproduce are:
1) cryptsetup -s 512 -d /dev/urandom -c aes-xts-plain64 open --type plain /dev/md126p5 test-device
2) dd if=/dev/zero of=/dev/mapper/test-device status=progress bs=512K
While this runs and the memory usage is monitored with free, it can be
seen that the used memory increases rapidly; after just a few seconds
the system is out of memory, page allocation failures are issued and
the OOM killer gets involved.
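In case it helps with reproducing, I simply watch the memory from a
second terminal while the dd from step 2 is running -- nothing fancy,
roughly:

  # refresh every second; "used" climbs rapidly until the OOM killer kicks in
  watch -n 1 free -m
  # or, equivalently:
  free -m -s 1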
I have also seen this behavior with mkfs.ext4 being used on the same
device -- at least with e2fsprogs 1.43.1.
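The mkfs reproduction was nothing special either, roughly just:

  # shows the same symptoms as the dd run (tested with e2fsprogs 1.43.1)
  mkfs.ext4 /dev/mapper/test-device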
Using direct I/O works fine and does not cause any issue. Likewise, if
dm-crypt is out of the picture, the problem does not occur.
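For reference, the direct I/O variant that does not show the problem is
simply the same dd with O_DIRECT, roughly:

  # bypasses the page cache; no allocation failures, no OOM
  dd if=/dev/zero of=/dev/mapper/test-device oflag=direct status=progress bs=512K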
I did further tests:
1) the dd block size has no influence on the issue whatsoever
2) using dm-crypt on an image file located on an ext2 filesystem on the
   RAID10 works fine
3) using an external hard drive (connected through USB3) with two
   partitions, set up as either a RAID1 or a RAID10 via Linux software
   RAID with dm-crypt on top, also works fine (a sketch of that setup
   follows below)
But as soon as I use dm-crypt on the Intel Rapid Storage RAID10, the
issue is 100% reproducible.
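The USB3 comparison setup from test 3 looked roughly like this (device
names are placeholders, not my exact ones):

  # two partitions on the external drive mirrored via plain Linux software RAID
  # (the RAID10 variant was created analogously)
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdX1 /dev/sdX2

  # same plain dm-crypt mapping on top as in the original reproduction
  cryptsetup -s 512 -d /dev/urandom -c aes-xts-plain64 open --type plain /dev/md0 test-usb
  dd if=/dev/zero of=/dev/mapper/test-usb status=progress bs=512K   # no OOM here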
I tested all of this on a Fedora Rawhide Live image, as I am currently
still in the process of setting up the new machine. The images are
available for download here:
download.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/iso/
The machine itself has 32 GiB of RAM (plenty), no swap (live image)
and is a 6700K on a Z170 chipset. The kernel is the default provided
with the live image... right now that is a very recent git snapshot
after 4.7.0-rc6 but before -rc7. The issue also shows up on 4.4.8 and
4.5.5. The chunk (stripe) size of the RAID10 is 64K, if that matters.
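The 64K is simply what the kernel reports for the array (see the
attached mdstat), roughly:

  cat /proc/mdstat            # shows "... 64K chunks 2 near-copies ..."
  mdadm --detail /dev/md126   # should report the chunk size as well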
I am now pretty much out of ideas as to what else to test and where the
problem could stem from. Suffice it to say that this has impacted my
trust in this particular setup. I hope I can help to find the cause. If
there is anything I can do to help, please let me know.
Also, since I am not subscribed to the lists right now (I have to make
do with a crappy webmail interface until everything is set up), please
CC me accordingly. Thanks a lot.
With Kind Regards from Germany,
Matthias
--
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu
services: custom software [desktop, mobile, web], server administration
[-- Attachment #2: mdstat.txt --]
[-- Type: text/plain, Size: 296 bytes --]
Personalities : [raid10]
md126 : active raid10 sda[3] sdb[2] sdc[1] sdd[0]
3907023872 blocks super external:/md127/0 64K chunks 2 near-copies [4/4] [UUUU]
md127 : inactive sdc[3](S) sdb[2](S) sda[1](S) sdd[0](S)
10064 blocks super external:imsm
unused devices: <none>
[-- Attachment #3: vmstat.txt --]
[-- Type: text/plain, Size: 2738 bytes --]
nr_free_pages 7943696
nr_alloc_batch 5873
nr_inactive_anon 296
nr_active_anon 105347
nr_inactive_file 29921
nr_active_file 56980
nr_unevictable 13140
nr_mlock 2716
nr_anon_pages 107204
nr_mapped 27670
nr_file_pages 98500
nr_dirty 14
nr_writeback 0
nr_slab_reclaimable 8887
nr_slab_unreclaimable 16975
nr_page_table_pages 7137
nr_kernel_stack 490
nr_unstable 0
nr_bounce 0
nr_vmscan_write 9828326
nr_vmscan_immediate_reclaim 67360474
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 593
nr_dirtied 506466654
nr_written 506466595
nr_pages_scanned 0
numa_hit 5258670960
numa_miss 0
numa_foreign 0
numa_interleave 38217
numa_local 5258670960
numa_other 0
workingset_refault 336993
workingset_activate 61553
workingset_nodereclaim 7919435
nr_anon_transparent_hugepages 0
nr_free_cma 0
nr_dirty_threshold 1592267
nr_dirty_background_threshold 795161
pgpgin 10489537
pgpgout 2025884337
pswpin 0
pswpout 0
pgalloc_dma 684558
pgalloc_dma32 328009673
pgalloc_normal 5767713958
pgalloc_movable 0
pgfree 6106436104
pgactivate 813221
pgdeactivate 1284082
pgfault 1653795
pgmajfault 46351
pglazyfreed 0
pgrefill_dma 0
pgrefill_dma32 66114
pgrefill_normal 1407169
pgrefill_movable 0
pgsteal_kswapd_dma 0
pgsteal_kswapd_dma32 22181873
pgsteal_kswapd_normal 425875886
pgsteal_kswapd_movable 0
pgsteal_direct_dma 0
pgsteal_direct_dma32 10723905
pgsteal_direct_normal 45330060
pgsteal_direct_movable 0
pgscan_kswapd_dma 0
pgscan_kswapd_dma32 32470709
pgscan_kswapd_normal 758168190
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 55064390
pgscan_direct_normal 449388285
pgscan_direct_movable 0
pgscan_direct_throttle 16
zone_reclaim_failed 0
pginodesteal 329
slabs_scanned 75784518
kswapd_inodesteal 3324
kswapd_low_wmark_hit_quickly 18086579
kswapd_high_wmark_hit_quickly 562
pageoutrun 18100603
allocstall 739928
pgrotated 357590082
drop_pagecache 0
drop_slab 0
numa_pte_updates 0
numa_huge_pte_updates 0
numa_hint_faults 0
numa_hint_faults_local 0
numa_pages_migrated 0
pgmigrate_success 562476
pgmigrate_fail 34076511
compact_migrate_scanned 390290706
compact_free_scanned 17609026156
compact_isolated 37387419
compact_stall 17
compact_fail 10
compact_success 7
compact_daemon_wake 3013752
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 69728
unevictable_pgs_scanned 0
unevictable_pgs_rescued 57566
unevictable_pgs_mlocked 62928
unevictable_pgs_munlocked 59182
unevictable_pgs_cleared 18
unevictable_pgs_stranded 18
thp_fault_alloc 0
thp_fault_fallback 0
thp_collapse_alloc 0
thp_collapse_alloc_failed 0
thp_split_page 0
thp_split_page_failed 0
thp_deferred_split_page 0
thp_split_pmd 0
thp_zero_page_alloc 0
thp_zero_page_alloc_failed 0
balloon_inflate 0
balloon_deflate 0
balloon_migrate 0
[-- Attachment #4: crypto.txt.gz --]
[-- Type: application/x-gzip, Size: 1197 bytes --]
[-- Attachment #5: kernel.log.txt.gz --]
[-- Type: application/x-gzip, Size: 24060 bytes --]
[-- Attachment #6: sysctl.txt.gz --]
[-- Type: application/x-gzip, Size: 7591 bytes --]
Thread overview: 18+ messages
2016-07-12 8:27 Matthias Dahl [this message]
2016-07-12 9:50 ` Michal Hocko
2016-07-12 11:28 ` Matthias Dahl
2016-07-12 11:49 ` Michal Hocko
2016-07-12 11:59 ` Michal Hocko
2016-07-12 12:42 ` Matthias Dahl
2016-07-12 14:07 ` Michal Hocko
2016-07-12 14:56 ` Matthias Dahl
2016-07-13 11:21 ` Michal Hocko
2016-07-13 12:18 ` Michal Hocko
2016-07-13 13:18 ` Matthias Dahl
2016-07-13 13:47 ` Michal Hocko
2016-07-13 15:32 ` Matthias Dahl
2016-07-13 16:24 ` [dm-devel] " Ondrej Kozina
2016-07-13 18:24 ` Matthias Dahl
2016-07-14 11:18 ` Tetsuo Handa
2016-07-15 7:11 ` Page Allocation Failures/OOM with dm-crypt on software RAID10 (Intel Rapid Storage) with check/repair/sync Matthias Dahl
2016-07-18 7:24 ` Matthias Dahl