Hi Pintu,

I am also looking for a mailing list where we can share memory-related issues at the kernel level.
I tried subscribing to that mailing list, but it says Invalid ID:

>>>> --001a11436926db8726053c49ac67
**** Command '--001a11436926db8726053c49ac67' not recognized.
>>>> Content-Type: text/plain; charset=UTF-8
**** Command 'content-type:' not recognized.
>>>>
>>>> subscribe linux-mm@kvack.org
**** subscribe: unknown list 'linux-mm@kvack.org'.

Thanks

On Mon, Sep 12, 2016 at 11:42 AM, PINTU KUMAR wrote:
> Dear Ankur,
>
> I would suggest you register to linux-mm@kvack.org and explain your
> issues in detail.
> There are other experts here who can guide you.
>
> A few comments are inline below.
>
> > From: Ankur Tank [mailto:Ankur.Tank@LntTechservices.com]
> > Sent: Saturday, September 10, 2016 5:26 PM
> > To: pintu.k@samsung.com
> > Cc: artfri2@gmail.com
> > Subject: Memory fragmentation issue related suggestion request
> >
> > Hello Pintukumar,
> >
> > TL;DR
> > We have an issue on our Linux box that looks like a memory fragmentation issue.
> > While searching on the net I referred to the talk you gave at the Embedded Linux Conf.
> I have given several talks at ELC, so I am not sure which one you are referring to.
> Please point it out.
>
> > I have been facing this issue for a couple of weeks, so I thought I would ask you for suggestions.
> > Please forgive me if I have offended you by writing to you; ignore this mail if you feel so.
> >
> > Details
> > We are facing an issue on our embedded Linux board. It is a custom board based on the
> > BeagleBone Black, with 4GB eMMC as storage. We are using Linux kernel 3.12.
> In addition, you may need to provide the following information:
> RAM size?
> cat /proc/meminfo (before and after the operation)
> cat /proc/buddyinfo (before and after the operation)
> cat /proc/vmstat (before and after the operation)
>
> > Our firmware upgrade strategy uses backup partitions for the bootloader, kernel,
> > dtb and rootfs.
> > So,
> > During firmware upgrade with big rootfs and running dd to read the partition in raw
> > mode.
> > In short, it looks like those operations are overloading the system.
> >
> I am not sure, but I think this is a crude way of taking the backup.
> This will certainly overload your system.
> FOTA upgrade experts can give more comments here.
>
> > From the log below it looks like pages above 32KB in size are not available, and maybe
> > because of that the rootfs tar on the eMMC is failing.
> > I have the following queries in that regard:
> >
> > 1. Do you think it is memory fragmentation?
> Yes, if no pages above 32KB (order 3, i.e. 2^3 pages of 4KB) are available, and pages are
> available in the lower orders (2^0/1/2), then it is certainly a fragmentation problem.
> However, as I said, you need to provide the following output to confirm:
> cat /proc/buddyinfo
>
> > It may be silly to ask, but just to confirm: I had added software swap, yet even
> > with that the issue was reproducible, and swap was not full at
> > that time ☹
> >
> Well, adding swap should help a bit, but it may not solve the problem completely.
> How much swap did you actually allocate?
> What kind of swap did you use?
> Is it ZRAM/ZSWAP (with compression support)?
> What is the swappiness ratio? (/proc/sys/vm/swappiness)
>
> > 2. If so, how do we handle it? Is there some way, similar to your shrinker
> > utility, to reclaim the memory pages?
> >
> Not sure which shrinker utility you are referring to.
> Is it: /proc/sys/vm/shrink_memory ?
>
> > Any suggestion would help me move forward,
> >
> Did you try enabling CONFIG_COMPACTION?
> Try using ZRAM or ZSWAP (~30% of MemTotal).
> Try tuning: /proc/sys/vm/dirty_{background_ratio/bytes} and others.
> [Refer to the kernel documentation for the same]
>
> From the logs, I observed the following:
> > [ 6676.674219] mmcqd/1: page allocation failure: order:1, mode:0x200020
> An order-1 allocation is failing, so the free pages might be sitting in order-0.
> > [ 6676.674739] free_cma:1982
> About 7.7MB of your free pages (1982 pages of 4kB) are CMA pages, and these cannot be
> used for non-movable allocations.
> > [ 6676.674885] 51661 total pagecache pages
> You have a huge amount of memory sitting in caches. These can be reclaimed
> in the background (with slight performance degradation).
> To experiment and debug you can try: echo 3 > /proc/sys/vm/drop_caches
> > [ 6676.674925] Total swap = 0kB
> Swap is not enabled on your system.
>
> >
> > Regards,
> > Ankur
> >
> > Error log
> > ----------------------------
> >
> > [ 6676.674219] mmcqd/1: page allocation failure: order:1, mode:0x200020
> > [ 6676.674256] CPU: 0 PID: 612 Comm: mmcqd/1 Tainted: P O 3.12.10-005-ts-armv7l #2
> > [ 6676.674321] [] (unwind_backtrace+0x0/0xf4) from [] (show_stack+0x10/0x14)
> > [ 6676.674355] [] (show_stack+0x10/0x14) from [] (warn_alloc_failed+0xe0/0x118)
> > [ 6676.674383] [] (warn_alloc_failed+0xe0/0x118) from [] (__alloc_pages_nodemask+0x74c/0x8f8)
> > [ 6676.674413] [] (__alloc_pages_nodemask+0x74c/0x8f8) from [] (cache_alloc_refill+0x328/0x620)
> > [ 6676.674436] [] (cache_alloc_refill+0x328/0x620) from [] (__kmalloc+0xa0/0xe8)
> > [ 6676.674471] [] (__kmalloc+0xa0/0xe8) from [] (edma_prep_slave_sg+0x84/0x388)
> > [ 6676.674505] [] (edma_prep_slave_sg+0x84/0x388) from [] (omap_hsmmc_request+0x414/0x508)
> > [ 6676.674544] [] (omap_hsmmc_request+0x414/0x508) from [] (mmc_start_request+0xc4/0xe0)
> > [ 6676.674568] [] (mmc_start_request+0xc4/0xe0) from [] (mmc_start_req+0x2d8/0x38c)
> > [ 6676.674589] [] (mmc_start_req+0x2d8/0x38c) from [] (mmc_blk_issue_rw_rq+0xb4/0x9d8)
> > [ 6676.674611] [] (mmc_blk_issue_rw_rq+0xb4/0x9d8) from [] (mmc_blk_issue_rq+0x1a4/0x468)
> > [ 6676.674631] [] (mmc_blk_issue_rq+0x1a4/0x468) from [] (mmc_queue_thread+0x88/0x118)
> > [ 6676.674657] [] (mmc_queue_thread+0x88/0x118) from [] (kthread+0xb4/0xb8)
> > [ 6676.674681] [] (kthread+0xb4/0xb8) from [] (ret_from_fork+0x14/0x3c)
> > [ 6676.674691] Mem-info:
> > [ 6676.674700] Normal per-cpu:
> > [ 6676.674711] CPU 0: hi: 90, btch: 15 usd: 79
> > [ 6676.674739] active_anon:4889 inactive_anon:13 isolated_anon:0
> > [ 6676.674739] active_file:8082 inactive_file:43196 isolated_file:0
> > [ 6676.674739] unevictable:422 dirty:2 writeback:1152 unstable:0
> > [ 6676.674739] free:3286 slab_reclaimable:1090 slab_unreclaimable:915
> > [ 6676.674739] mapped:1593 shmem:39 pagetables:181 bounce:0
> > [ 6676.674739] free_cma:1982
> > [ 6676.674800] Normal free:13144kB min:2004kB low:2504kB high:3004kB active_anon:19556kB inactive_anon:52kB active_file:32328kB inactive_file:172784kB unevictable:o
> > [ 6676.674813] lowmem_reserve[]: 0 0 0
> > [ 6676.674831] Normal: 2584*4kB (UMC) 217*8kB (C) 57*16kB (C) 5*32kB (C) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 13144kB
> > [ 6676.674885] 51661 total pagecache pages
> > [ 6676.674900] 0 pages in swap cache
> > [ 6676.674910] Swap cache stats: add 0, delete 0, find 0/0
> > [ 6676.674918] Free swap = 0kB
> > [ 6676.674925] Total swap = 0kB
> > [ 6676.674938] SLAB: Unable to allocate memory on node 0 (gfp=0x20)
> > [ 6676.674949] cache: kmalloc-8192, object size: 8192, order: 1
> > [ 6676.674962] node 0: slabs: 3/3, objs: 3/3, free: 0
> > [ 6676.674984] omap_hsmmc 481d8000.mmc: prep_slave_sg() failed
> > [ 6676.674997] omap_hsmmc 481d8000.mmc: MMC start dma failure
> > [ 6676.676181] mmcblk0: unknown error -1 sending read/write command, card status 0x900
> > [ 6676.676300] end_request: I/O error, dev mmcblk0, sector 27648
> > [ 6676.676318] Buffer I/O error on device mmcblk0p9, logical block 896
> > [ 6676.676329] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676401] end_request: I/O error, dev mmcblk0, sector 27656
> > [ 6676.676415] Buffer I/O error on device mmcblk0p9, logical block 897
> > [ 6676.676425] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676450] end_request: I/O error, dev mmcblk0, sector 27664
> > [ 6676.676461] Buffer I/O error on device mmcblk0p9, logical block 898
> > [ 6676.676471] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676494] end_request: I/O error, dev mmcblk0, sector 27672
> > [ 6676.676505] Buffer I/O error on device mmcblk0p9, logical block 899
> > [ 6676.676515] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676537] end_request: I/O error, dev mmcblk0, sector 27680
> > [ 6676.676548] Buffer I/O error on device mmcblk0p9, logical block 900
> > [ 6676.676558] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676580] end_request: I/O error, dev mmcblk0, sector 27688
> > [ 6676.676591] Buffer I/O error on device mmcblk0p9, logical block 901
> > [ 6676.676601] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676622] end_request: I/O error, dev mmcblk0, sector 27696
> > [ 6676.676634] Buffer I/O error on device mmcblk0p9, logical block 902
> > [ 6676.676643] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676665] end_request: I/O error, dev mmcblk0, sector 27704
> > [ 6676.676676] Buffer I/O error on device mmcblk0p9, logical block 903
> > [ 6676.676685] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676707] end_request: I/O error, dev mmcblk0, sector 27712
> > [ 6676.676718] Buffer I/O error on device mmcblk0p9, logical block 904
> > [ 6676.676728] lost page write due to I/O error on mmcblk0p9
> > [ 6676.676749] end_request: I/O error, dev mmcblk0, sector 27720
> > [ 6676.678266] mmcqd/1: page allocation failure: order:1, mode:0x200020
> > [ 6676.678285] CPU: 0 PID: 612 Comm: mmcqd/1 Tainted: P O 3.12.10-005-ts-armv7l #2
> > [ 6676.678330] [] (unwind_backtrace+0x0/0xf4) from [] (show_stack+0x10/0x14)
> > [ 6676.678358] [] (show_stack+0x10/0x14) from [] (warn_alloc_failed+0xe0/0x118)
> > [ 6676.678385] [] (warn_alloc_failed+0xe0/0x118) from [] (__alloc_pages_nodemask+0x74c/0x8f8)
> > [ 6676.678412] [] (__alloc_pages_nodemask+0x74c/0x8f8) from [] (cache_alloc_refill+0x328/0x620)
> > [ 6676.678434] [] (cache_alloc_refill+0x328/0x620) from [] (__kmalloc+0xa0/0xe8)
> > [ 6676.678464] [] (__kmalloc+0xa0/0xe8) from [] (edma_prep_slave_sg+0x84/0x388)
> > [ 6676.678493] [] (edma_prep_slave_sg+0x84/0x388) from [] (omap_hsmmc_request+0x414/0x508)
> > [ 6676.678524] [] (omap_hsmmc_request+0x414/0x508) from [] (mmc_start_request+0xc4/0xe0)
> > [ 6676.678547] [] (mmc_start_request+0xc4/0xe0) from [] (mmc_start_req+0x2d8/0x38c)
> > [ 6676.678568] [] (mmc_start_req+0x2d8/0x38c) from [] (mmc_blk_issue_rw_rq+0x230/0x9d8)
> > [ 6676.678589] [] (mmc_blk_issue_rw_rq+0x230/0x9d8) from [] (mmc_blk_issue_rq+0x1a4/0x468)
> > [ 6676.678608] [] (mmc_blk_issue_rq+0x1a4/0x468) from [] (mmc_queue_thread+0x88/0x118)
> > [ 6676.678632] [] (mmc_queue_thread+0x88/0x118) from [] (kthread+0xb4/0xb8)
> > [ 6676.678655] [] (kthread+0xb4/0xb8) from [] (ret_from_fork+0x14/0x3c)
> > [ 6676.678664] Mem-info:
> >
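
In case it helps anyone reading the log above, here is a rough sketch of how the "Normal:" free-list line can be broken down per order to check the fragmentation point. It is only an illustration under some assumptions: the names `parse_free_list` and `non_cma_blocks_upper_bound` are made up for this note, the line format is assumed to be the `count*sizekB (types)` layout seen in this 3.12 log, and the letters in brackets are taken to be migrate types (as far as I understand, U = unmovable, M = movable, C = CMA). The log only says which types are present at an order, not how many blocks of each, so the second function can only give an upper bound.

```python
import re

# Free-list line copied from the error log in this thread.
LOG_LINE = (
    "Normal: 2584*4kB (UMC) 217*8kB (C) 57*16kB (C) 5*32kB (C) "
    "0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 13144kB"
)

# One entry looks like "217*8kB (C)"; the "(...)" part is absent for empty orders.
ENTRY = re.compile(r"(\d+)\*(\d+)kB(?: \(([A-Z]+)\))?")


def parse_free_list(line, page_kb=4):
    """Return a list of (order, block_kb, count, migrate_types) tuples.

    page_kb is the base page size in kB (assumed to be 4, as in the log).
    """
    entries = []
    for count, block_kb, types in ENTRY.findall(line):
        block_kb = int(block_kb)
        # 4kB -> order 0, 8kB -> order 1, 16kB -> order 2, ...
        order = (block_kb // page_kb).bit_length() - 1
        entries.append((order, block_kb, int(count), types))
    return entries


def non_cma_blocks_upper_bound(entries, min_order):
    """Upper bound on free blocks at min_order or above that are not CMA.

    An order whose only listed migrate type is "C" contributes nothing;
    an order with mixed types (e.g. "UMC") contributes its full count,
    because the per-type split is not visible in this log line.
    """
    return sum(
        count
        for order, _kb, count, types in entries
        if order >= min_order and count > 0 and types != "C"
    )


if __name__ == "__main__":
    entries = parse_free_list(LOG_LINE)
    for order, block_kb, count, types in entries:
        print(f"order {order:2d}: {count:5d} x {block_kb:5d}kB  types={types or '-'}")
    print("total free kB:", sum(kb * n for _o, kb, n, _t in entries))
    print("non-CMA blocks at order >= 1 (upper bound):",
          non_cma_blocks_upper_bound(entries, 1))
```

For the line in this log, the per-order sizes add up to the 13144kB the kernel reports, and the count of non-CMA blocks at order 1 or above comes out as zero. That matches the observation that the failing order-1 allocation has only CMA blocks to choose from above order 0.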