From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx110.postini.com [74.125.245.110])
    by kanga.kvack.org (Postfix) with SMTP id 9B0E56B0068
    for ; Fri, 28 Sep 2012 11:50:14 -0400 (EDT)
Received: by bkcjm1 with SMTP id jm1so4228058bkc.14
    for ; Fri, 28 Sep 2012 08:50:12 -0700 (PDT)
Subject: Re: mlx4_en_alloc_frag allocation failures
From: Eric Dumazet
In-Reply-To: <20120928151429.GB2731@BohrerMBP.rgmadvisors.com>
References: <20120928151429.GB2731@BohrerMBP.rgmadvisors.com>
Content-Type: text/plain; charset="UTF-8"
Date: Fri, 28 Sep 2012 17:50:08 +0200
Message-ID: <1348847408.5093.2548.camel@edumazet-glaptop>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Shawn Bohrer
Cc: netdev@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Fri, 2012-09-28 at 10:14 -0500, Shawn Bohrer wrote:
> We've got a new application that is receiving UDP multicast data using
> AF_PACKET and writing out the packets in a custom format to disk. The
> packet rates are bursty, but it seems to be roughly 100 Mbps on
> average for 1 minute periods. With this application running all day
> we get a lot of these messages:
>
> [1298269.103034] kswapd1: page allocation failure: order:2, mode:0x4020
> [1298269.103038] Pid: 80, comm: kswapd1 Not tainted 3.4.9-2.rgm.fc16.x86_64 #1
> [1298269.103040] Call Trace:
> [1298269.103041] [] warn_alloc_failed+0xf6/0x160
> [1298269.103053] [] ? skb_copy_bits+0x16d/0x2c0
> [1298269.103058] [] ? wakeup_kswapd+0x69/0x160
> [1298269.103060] [] __alloc_pages_nodemask+0x6e8/0x930
> [1298269.103064] [] alloc_pages_current+0xb6/0x120
> [1298269.103070] [] mlx4_en_alloc_frag+0x16b/0x1e0 [mlx4_en]
> [1298269.103073] [] mlx4_en_complete_rx_desc+0x120/0x1d0 [mlx4_en]
> [1298269.103076] [] mlx4_en_process_rx_cq+0x584/0x700 [mlx4_en]
> [1298269.103079] [] mlx4_en_poll_rx_cq+0x3f/0x80 [mlx4_en]
> [1298269.103083] [] net_rx_action+0x119/0x210
> [1298269.103086] [] __do_softirq+0xb0/0x220
> [1298269.103090] [] ? handle_irq_event+0x4d/0x70
> [1298269.103095] [] call_softirq+0x1c/0x30
> [1298269.103100] [] do_softirq+0x55/0x90
> [1298269.103101] [] irq_exit+0x75/0x80
> [1298269.103103] [] do_IRQ+0x63/0xe0
> [1298269.103107] [] common_interrupt+0x67/0x67
> [1298269.103108] [] ? _raw_spin_unlock_irqrestore+0xf/0x20
> [1298269.103113] [] compaction_alloc+0x361/0x3f0
> [1298269.103115] [] ? pagevec_lru_move_fn+0xd7/0xf0
> [1298269.103118] [] migrate_pages+0xa9/0x470
> [1298269.103120] [] ? perf_trace_mm_compaction_migratepages+0xd0/0xd0
> [1298269.103122] [] compact_zone+0x4cb/0x910
> [1298269.103124] [] __compact_pgdat+0x14b/0x190
> [1298269.103125] [] compact_pgdat+0x2d/0x30
> [1298269.103129] [] ? fragmentation_index+0x19/0x70
> [1298269.103131] [] balance_pgdat+0x6ef/0x710
> [1298269.103133] [] kswapd+0x14a/0x390
> [1298269.103136] [] ? add_wait_queue+0x60/0x60
> [1298269.103138] [] ? balance_pgdat+0x710/0x710
> [1298269.103140] [] kthread+0x93/0xa0
> [1298269.103142] [] kernel_thread_helper+0x4/0x10
> [1298269.103144] [] ? kthread_worker_fn+0x140/0x140
> [1298269.103146] [] ? gs_change+0xb/0xb
>
> The kernel is based on a Fedora 16 kernel and actually has the 3.4.10
> patches applied. I can easily test patches or different kernels.
>
> I'm mostly wondering if there is anything that can be done about these
> failures? It appears that these failures have to do with handling
> fragmented IP frames, but the majority of the packets this machines
> should not be fragmented (there are probably some that are).
>
> From a memory management point of view the system has 48GB of RAM, and
> typically 44GB of that is page cache. The dirty pages seem to hover
> around 5-6MB and the filesystem/disks don't seem to have any problems
> keeping up with writing out the data.

What is the value of /proc/sys/vm/min_free_kbytes ?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .