* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
@ 2009-06-16 17:24 starlight
0 siblings, 0 replies; 8+ messages in thread
From: starlight @ 2009-06-16 17:24 UTC
To: Mel Gorman
Cc: linux-kernel, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli
>Tried increasing a few /proc/slabinfo tuneable parameters today
>and this appears to have fixed the issue so far today.
Spoke too soon. A burst of allocation failures appeared
and some incoming data was lost. The 'e1000e' system had
no problem.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 6:12 ` Eric Dumazet
@ 2009-07-05 3:44 ` Herbert Xu
0 siblings, 0 replies; 8+ messages in thread
From: Herbert Xu @ 2009-07-05 3:44 UTC
To: Eric Dumazet
Cc: starlight, linux-kernel, mel, linux-mm, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, ebmunson, agl, apw, wli,
netdev
Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Because of slab rounding, this reallocation should be done only if resulting data
> portion is really smaller (50 %) than original skb.
If we're going to do this in the core then we should only do it
in the spots where the packet may be held indefinitely.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 9:19 ` Mel Gorman
@ 2009-06-16 15:25 ` starlight
0 siblings, 0 replies; 8+ messages in thread
From: starlight @ 2009-06-16 15:25 UTC
To: Mel Gorman
Cc: linux-kernel, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli
At 10:19 AM 6/16/2009 +0100, Mel Gorman wrote:
>Can you give an example of an allocation failure? Specifically, I want to
>see what sort of allocation it was and what order.
I think it's just the basic buffer allocation for
Ethernet frames arriving in the 'ixgbe' driver. Seems
like it's one allocation per frame. Per the original
message the allocations are made with the 'netdev_alloc_skb()'
kernel call. The function where this code appears is
named 'ixgbe_alloc_rx_buffers()' and the comment is
"Replace used receive buffers."
The code path in question does not generate an error. It just
increments the 'alloc_rx_buff_failed' counter for the ethX
device. In addition it appears that the frame is dropped
only if the PCIe hardware ring-queue associated with each
interface is full. So on the next interrupt the allocation
is retried and appears to be successful 99% of the time.
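The behaviour described above can be pictured with a small userspace model. This is only a sketch of the description, not the ixgbe code: RING_SIZE, struct rx_ring, try_alloc(), alloc_budget and refill() are all made-up names, and malloc() stands in for netdev_alloc_skb(). Only the counter name is taken from the ethtool output.

```c
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 8 /* hypothetical; real rings are much larger */

struct rx_ring {
    void *buf[RING_SIZE];               /* NULL = descriptor has no buffer */
    unsigned long alloc_rx_buff_failed; /* same name as the ethtool counter */
};

/* Stand-in allocator that can be made to fail, to mimic memory pressure. */
static int alloc_budget;

static void *try_alloc(size_t n)
{
    if (alloc_budget <= 0)
        return NULL;
    alloc_budget--;
    return malloc(n);
}

/*
 * Give every empty descriptor a buffer. On the first failed allocation,
 * bump the counter and stop (the model's version of "goto no_buffers").
 * No error is reported; the caller simply tries again later.
 * Returns the number of descriptors that are still empty.
 */
static unsigned int refill(struct rx_ring *ring, size_t bufsz)
{
    unsigned int i, empty = 0;

    for (i = 0; i < RING_SIZE; i++) {
        if (ring->buf[i])
            continue;
        ring->buf[i] = try_alloc(bufsz);
        if (!ring->buf[i]) {
            ring->alloc_rx_buff_failed++;
            break;
        }
    }
    for (i = 0; i < RING_SIZE; i++)
        if (!ring->buf[i])
            empty++;
    return empty;
}
```

In this model a failed refill leaves some descriptors empty but loses nothing by itself. Frames would only be lost if the hardware ran through all the remaining filled descriptors before a later refill() call succeeded.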
>For reliable protocols, an allocation failure should recover and the
>data get through but obviously there is a drop in network performance
>when this happens.
This is for a specialized high-volume UDP multicast application
where data loss of any kind is unacceptable.
>If the allocations are high-order and atomic, increasing min_free_kbytes
>can help, particularly in situations where there is a burst of network
>traffic. I won't know if they are atomic until I see an error message
>though.
Doesn't the use of the 'netdev_alloc_skb()' kernel primitive
imply what the nature of the allocation is? I followed the
call graph down into "kmem" land, but it's a complex place
and so I abandoned the review.
My impression is that 'min_free_kbytes' relates mainly to systems
where significant paging pressure exists. The servers have zero
paging pressure and lots of free memory, though mostly in the
form of instantly discardable file data cache pages. In the
past disabling the program that generates the cache pressure
has had no effect on data loss, though I haven't tried it in
relation to this specific issue.
Tried increasing a few /proc/slabinfo tuneable parameters today
and this appears to have fixed the issue so far today.
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 0:19 ` QUESTION: can netdev_alloc_skb() errors be reduced by tuning? starlight
2009-06-16 2:26 ` Eric Dumazet
@ 2009-06-16 9:19 ` Mel Gorman
2009-06-16 15:25 ` starlight
1 sibling, 1 reply; 8+ messages in thread
From: Mel Gorman @ 2009-06-16 9:19 UTC
To: starlight
Cc: linux-kernel, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli
On Mon, Jun 15, 2009 at 08:19:33PM -0400, starlight@binnacle.cx wrote:
> Hello,
>
> I submitted testcase for a hugepages bug that has been
> successfully resolved. Have an apparently obscure question
> related to MM, and so I am asking anyone who might have some idea
> on this. Nothing much turned up via Google and digging into
> the KMEM code looks daunting.
>
> Running Intel 82598/ixgbe 10 gig Ethernet under heavy stress.
> Generally is working well after tuning IRQ affinities, but a
> fair number of buffer allocation failures are occurring in the
> 'ixgbe' device driver and are reported via 'ethtool' statistics.
> This may be causing data loss.
>
Can you give an example of an allocation failure? Specifically, I want to
see what sort of allocation it was and what order.
For reliable protocols, an allocation failure should recover and the
data get through but obviously there is a drop in network performance
when this happens.
> The kernel primitive returning the error is netdev_alloc_skb().
>
> Are any tuneable parameters available that can reduce or
> eliminate these allocation failures? Have about eleven
> gigabytes of free memory, though most of that is consumed
> by non-dirty file cache data. Total system memory is 16GB with
> 4GB allocated to hugepages. Zero swap usage and activity though
> swap is enabled. Most application memory is hugepage or is
> 'mlock()'ed.
>
If the allocations are high-order and atomic, increasing min_free_kbytes
can help, particularly in situations where there is a burst of network
traffic. I won't know if they are atomic until I see an error message
though.
> Thank you.
>
>
>
>
>
> System rebooted before test run.
>
> Dual Xeon E5430, 16GB FB-DIMM RAM.
>
>
> $ cat /proc/meminfo
> MemTotal: 16443828 kB
> MemFree: 281176 kB
> Buffers: 53896 kB
> Cached: 11331924 kB
> SwapCached: 0 kB
> Active: 200740 kB
> Inactive: 11284312 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 16443828 kB
> LowFree: 281176 kB
> SwapTotal: 2031608 kB
> SwapFree: 2031400 kB
> Dirty: 4 kB
> Writeback: 0 kB
> AnonPages: 104464 kB
> Mapped: 14644 kB
> Slab: 440452 kB
> PageTables: 4032 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> CommitLimit: 8156368 kB
> Committed_AS: 122452 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 266872 kB
> VmallocChunk: 34359471043 kB
> HugePages_Total: 2048
> HugePages_Free: 735
> HugePages_Rsvd: 0
> Hugepagesize: 2048 kB
>
>
> # ethtool -S eth2 | egrep -v ': 0$'
> NIC statistics:
> rx_packets: 724246449
> tx_packets: 229847
> rx_bytes: 152691992335
> tx_bytes: 10573426
> multicast: 725997241
> broadcast: 6
> rx_csum_offload_good: 723051776
> alloc_rx_buff_failed: 7119
> tx_queue_0_packets: 229847
> tx_queue_0_bytes: 10573426
> rx_queue_0_packets: 340698332
> rx_queue_0_bytes: 70844299683
> rx_queue_1_packets: 385298923
> rx_queue_1_bytes: 82276167594
>
>
> ixgbe driver fragment
> =====================
> struct sk_buff *skb = netdev_alloc_skb(adapter->netdev, bufsz);
>
> if (!skb) {
>         adapter->alloc_rx_buff_failed++;
>         goto no_buffers;
> }
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 4:12 ` starlight
@ 2009-06-16 6:12 ` Eric Dumazet
2009-07-05 3:44 ` Herbert Xu
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2009-06-16 6:12 UTC
To: starlight
Cc: Eric Dumazet, linux-kernel, Mel Gorman, linux-mm, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, ebmunson, agl, apw, wli,
Linux Netdev List
Please don't top-post; we prefer it the other way around :)
starlight@binnacle.cx wrote:
> Eric,
>
> Great thought--thank you. Running a similar server with
> 82571/e1000e and it does not exhibit the problem. 'e1000e' has
> default copybreak=256 while 'ixgbe' has no copybreak. Rationale
> given is
>
> http://osdir.com/ml/linux.drivers.e1000.devel/2008-01/msg00103.html
>
> But the comparison is a bit apples-and-oranges since the 'e1000e'
> system is dual Opteron 2354 while the 'ixgbe' system is Xeon
> E5430 (a painful choice thus far). Also 'e1000e' system passes
> data via a PACKET socket while the 'ixgbe' system passes data
> via UDP (a configurable option).
>
> I'm not fully up on how this all works: am I to understand that
> the error could result from RX ring-queue buffers not freeing
> quickly enough because they have a use-count held non-zero as
> the packet travels the stack?
Well, the error is normal in a stress situation, when no more kernel
memory is available.
cat /proc/net/udp
can show you (in the last column) sockets where packets were dropped
by the UDP stack because their receive queue was full.
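For anyone who wants to watch that column from a program rather than by eye, here is a minimal sketch in C. It assumes only what is said above, that the drop count is the last whitespace-separated field of each socket line. The function name parse_last_column() is made up for illustration.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the last whitespace-separated field of `line` as a decimal count.
 * Returns 0 and stores the value in *out on success, or -1 if the line is
 * empty or its last field is not a plain number (e.g. the header line).
 */
static int parse_last_column(const char *line, unsigned long *out)
{
    size_t end = strlen(line);
    size_t start;
    char *stop;
    unsigned long v;

    /* Skip trailing whitespace, including the newline left by fgets(). */
    while (end > 0 && isspace((unsigned char)line[end - 1]))
        end--;
    /* Walk back to the start of the last field. */
    start = end;
    while (start > 0 && !isspace((unsigned char)line[start - 1]))
        start--;
    if (start == end || !isdigit((unsigned char)line[start]))
        return -1;

    errno = 0;
    v = strtoul(line + start, &stop, 10);
    if (errno != 0 || stop != line + end)
        return -1;
    *out = v;
    return 0;
}
```

A caller would read /proc/net/udp line by line with fgets(), skip the lines for which this returns -1, and report the sockets whose count keeps growing between two samples.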
>
> I've just doubled some SLAB tuneables that seem relevant, but
> if the cause is the aforementioned, this won't help. Will
> have the answer on the tweaks by the end of Tuesday.
>
> David
Copybreak in the drivers themselves is nice because the driver can recycle
its rx skbs much faster, but that is suboptimal in forwarding (router)
workloads. It's also a lot of duplicated code in every driver.
So we could do the skb trimming (i.e., reallocating the data portion to exactly
the size of the packet) in the core network stack, when we know the packet must be
handled by an application, and not dropped or forwarded by the kernel.
Because of slab rounding, this reallocation should be done only if the resulting data
portion is really smaller (50%) than the original skb.
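A rough userspace sketch of that 50% rule follows. It rests on two simplifying assumptions: allocation sizes round up to powers of two with a 32-byte minimum (real slab caches have other size classes as well), and "really smaller" means the trimmed data lands in a size class at most half the original one. The names size_class() and worth_trimming() are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified size-class model: round a request up to the next power of
 * two, with a 32-byte minimum. This is an assumption for illustration,
 * not the actual slab layout of any kernel.
 */
static size_t size_class(size_t n)
{
    size_t c = 32;

    while (c < n && c <= SIZE_MAX / 2)
        c <<= 1;
    return c;
}

/*
 * Trimming only saves memory if the packet fits a smaller size class.
 * Following the 50% suggestion, require the new class to be at most
 * half of the class the original data buffer occupies.
 */
static bool worth_trimming(size_t orig_data_size, size_t pkt_len)
{
    return size_class(pkt_len) <= size_class(orig_data_size) / 2;
}
```

With these assumptions a 210-byte packet sitting in a 2048-byte buffer would be worth trimming, since it fits a 256-byte class. A 1500-byte packet in the same buffer would not, since it rounds back up to 2048 bytes and the copy would save nothing.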
>
>
>
> At 04:26 AM 6/16/2009 +0200, Eric Dumazet wrote:
>> 152691992335/724246449 = 210 bytes per rx packet on average
>>
>> It could make sense to add copybreak feature in this driver to
>> reduce memory needs, but that also would consume more cpu
>> cycles, and slow down forwarding setups.
>>
>> Maybe this packet trimming could be done generically in UDP
>> stack input path, before queueing packet into a receive queue,
>> if amount of available memory is under a given threshold.
>
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 2:26 ` Eric Dumazet
@ 2009-06-16 4:12 ` starlight
2009-06-16 6:12 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: starlight @ 2009-06-16 4:12 UTC
To: Eric Dumazet
Cc: linux-kernel, Mel Gorman, linux-mm, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, ebmunson, agl, apw, wli
Eric,
Great thought--thank you. Running a similar server with
82571/e1000e and it does not exhibit the problem. 'e1000e' has
default copybreak=256 while 'ixgbe' has no copybreak. Rationale
given is
http://osdir.com/ml/linux.drivers.e1000.devel/2008-01/msg00103.html
But the comparison is a bit apples-and-oranges since the 'e1000e'
system is dual Opteron 2354 while the 'ixgbe' system is Xeon
E5430 (a painful choice thus far). Also 'e1000e' system passes
data via a PACKET socket while the 'ixgbe' system passes data
via UDP (a configurable option).
I'm not fully up on how this all works: am I to understand that
the error could result from RX ring-queue buffers not freeing
quickly enough because they have a use-count held non-zero as
the packet travels the stack?
I've just doubled some SLAB tuneables that seem relevant, but
if the cause is the aforementioned, this won't help. Will
have the answer on the tweaks by the end of Tuesday.
David
At 04:26 AM 6/16/2009 +0200, Eric Dumazet wrote:
>
>152691992335/724246449 = 210 bytes per rx packet on average
>
>It could make sense to add copybreak feature in this driver to
>reduce memory needs, but that also would consume more cpu
>cycles, and slow down forwarding setups.
>
>Maybe this packet trimming could be done generically in UDP
>stack input path, before queueing packet into a receive queue,
>if amount of available memory is under a given threshold.
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 0:19 ` QUESTION: can netdev_alloc_skb() errors be reduced by tuning? starlight
@ 2009-06-16 2:26 ` Eric Dumazet
2009-06-16 4:12 ` starlight
2009-06-16 9:19 ` Mel Gorman
1 sibling, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2009-06-16 2:26 UTC
To: starlight
Cc: linux-kernel, Mel Gorman, linux-mm, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, ebmunson, agl, apw, wli
starlight@binnacle.cx wrote:
> Hello,
>
> I submitted testcase for a hugepages bug that has been
> successfully resolved. Have an apparently obscure question
> related to MM, and so I am asking anyone who might have some idea
> on this. Nothing much turned up via Google and digging into
> the KMEM code looks daunting.
>
> Running Intel 82598/ixgbe 10 gig Ethernet under heavy stress.
> Generally is working well after tuning IRQ affinities, but a
> fair number of buffer allocation failures are occurring in the
> 'ixgbe' device driver and are reported via 'ethtool' statistics.
> This may be causing data loss.
>
> The kernel primitive returning the error is netdev_alloc_skb().
>
> Are any tuneable parameters available that can reduce or
> eliminate these allocation failures? Have about eleven
> gigabytes of free memory, though most of that is consumed
> by non-dirty file cache data. Total system memory is 16GB with
> 4GB allocated to hugepages. Zero swap usage and activity though
> swap is enabled. Most application memory is hugepage or is
> 'mlock()'ed.
>
> Thank you.
>
>
>
>
>
> System rebooted before test run.
>
> Dual Xeon E5430, 16GB FB-DIMM RAM.
>
>
> $ cat /proc/meminfo
> MemTotal: 16443828 kB
> MemFree: 281176 kB
> Buffers: 53896 kB
> Cached: 11331924 kB
> SwapCached: 0 kB
> Active: 200740 kB
> Inactive: 11284312 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 16443828 kB
> LowFree: 281176 kB
> SwapTotal: 2031608 kB
> SwapFree: 2031400 kB
> Dirty: 4 kB
> Writeback: 0 kB
> AnonPages: 104464 kB
> Mapped: 14644 kB
> Slab: 440452 kB
> PageTables: 4032 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> CommitLimit: 8156368 kB
> Committed_AS: 122452 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 266872 kB
> VmallocChunk: 34359471043 kB
> HugePages_Total: 2048
> HugePages_Free: 735
> HugePages_Rsvd: 0
> Hugepagesize: 2048 kB
>
>
> # ethtool -S eth2 | egrep -v ': 0$'
> NIC statistics:
> rx_packets: 724246449
> tx_packets: 229847
> rx_bytes: 152691992335
> tx_bytes: 10573426
> multicast: 725997241
> broadcast: 6
> rx_csum_offload_good: 723051776
> alloc_rx_buff_failed: 7119
> tx_queue_0_packets: 229847
> tx_queue_0_bytes: 10573426
> rx_queue_0_packets: 340698332
> rx_queue_0_bytes: 70844299683
> rx_queue_1_packets: 385298923
> rx_queue_1_bytes: 82276167594
>
>
> ixgbe driver fragment
> =====================
> struct sk_buff *skb = netdev_alloc_skb(adapter->netdev, bufsz);
>
> if (!skb) {
>         adapter->alloc_rx_buff_failed++;
>         goto no_buffers;
> }
>
152691992335/724246449 = 210 bytes per rx packet on average
It could make sense to add a copybreak feature in this driver to reduce memory needs,
but that would also consume more cpu cycles, and slow down forwarding setups.
Maybe this packet trimming could be done generically in the UDP stack input path,
before queueing the packet into a receive queue, if the amount of available memory
is under a given threshold.
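The arithmetic and the copybreak idea can be sketched in userspace C. This is an illustrative model only, not code from ixgbe or e1000e: avg_bytes_per_packet() and rx_copybreak() are made-up names, and malloc() stands in for a kernel skb allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Average bytes per received packet, from ethtool-style counters. */
static double avg_bytes_per_packet(uint64_t rx_bytes, uint64_t rx_packets)
{
    return rx_packets ? (double)rx_bytes / (double)rx_packets : 0.0;
}

/*
 * Copybreak model: a frame shorter than `copybreak` is copied into a
 * right-sized buffer, so the full-sized receive buffer can be reused
 * immediately instead of travelling up the stack.
 * Returns a malloc()ed copy that the caller must free(), or NULL when the
 * frame should be passed up in its original buffer (it is too large, or
 * the small allocation failed).
 */
static unsigned char *rx_copybreak(const unsigned char *frame, size_t len,
                                   size_t copybreak)
{
    unsigned char *copy;

    if (len >= copybreak)
        return NULL;
    copy = malloc(len ? len : 1);
    if (!copy)
        return NULL;
    memcpy(copy, frame, len);
    return copy;
}
```

The extra memcpy() per small frame is the cpu cost mentioned above. The benefit is that small packets no longer pin a full-sized buffer while they wait in a socket queue.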
* QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-05-27 23:19 ` Ingo Molnar
@ 2009-06-16 0:19 ` starlight
2009-06-16 2:26 ` Eric Dumazet
2009-06-16 9:19 ` Mel Gorman
0 siblings, 2 replies; 8+ messages in thread
From: starlight @ 2009-06-16 0:19 UTC
To: linux-kernel, Mel Gorman, linux-mm, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, ebmunson, agl, apw, wli
Hello,
I submitted testcase for a hugepages bug that has been
successfully resolved. Have an apparently obscure question
related to MM, and so I am asking anyone who might have some idea
on this. Nothing much turned up via Google and digging into
the KMEM code looks daunting.
Running Intel 82598/ixgbe 10 gig Ethernet under heavy stress.
Generally is working well after tuning IRQ affinities, but a
fair number of buffer allocation failures are occurring in the
'ixgbe' device driver and are reported via 'ethtool' statistics.
This may be causing data loss.
The kernel primitive returning the error is netdev_alloc_skb().
Are any tuneable parameters available that can reduce or
eliminate these allocation failures? Have about eleven
gigabytes of free memory, though most of that is consumed
by non-dirty file cache data. Total system memory is 16GB with
4GB allocated to hugepages. Zero swap usage and activity though
swap is enabled. Most application memory is hugepage or is
'mlock()'ed.
Thank you.
System rebooted before test run.
Dual Xeon E5430, 16GB FB-DIMM RAM.
$ cat /proc/meminfo
MemTotal: 16443828 kB
MemFree: 281176 kB
Buffers: 53896 kB
Cached: 11331924 kB
SwapCached: 0 kB
Active: 200740 kB
Inactive: 11284312 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 16443828 kB
LowFree: 281176 kB
SwapTotal: 2031608 kB
SwapFree: 2031400 kB
Dirty: 4 kB
Writeback: 0 kB
AnonPages: 104464 kB
Mapped: 14644 kB
Slab: 440452 kB
PageTables: 4032 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 8156368 kB
Committed_AS: 122452 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 266872 kB
VmallocChunk: 34359471043 kB
HugePages_Total: 2048
HugePages_Free: 735
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
# ethtool -S eth2 | egrep -v ': 0$'
NIC statistics:
rx_packets: 724246449
tx_packets: 229847
rx_bytes: 152691992335
tx_bytes: 10573426
multicast: 725997241
broadcast: 6
rx_csum_offload_good: 723051776
alloc_rx_buff_failed: 7119
tx_queue_0_packets: 229847
tx_queue_0_bytes: 10573426
rx_queue_0_packets: 340698332
rx_queue_0_bytes: 70844299683
rx_queue_1_packets: 385298923
rx_queue_1_bytes: 82276167594
ixgbe driver fragment
=====================
struct sk_buff *skb = netdev_alloc_skb(adapter->netdev, bufsz);
if (!skb) {
        adapter->alloc_rx_buff_failed++;
        goto no_buffers;
}
Thread overview: 8+ messages
2009-06-16 17:24 QUESTION: can netdev_alloc_skb() errors be reduced by tuning? starlight
-- strict thread matches above, loose matches on Subject: below --
2009-05-27 11:12 [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory Mel Gorman
2009-05-27 20:14 ` Andrew Morton
2009-05-27 23:19 ` Ingo Molnar
2009-06-16 0:19 ` QUESTION: can netdev_alloc_skb() errors be reduced by tuning? starlight
2009-06-16 2:26 ` Eric Dumazet
2009-06-16 4:12 ` starlight
2009-06-16 6:12 ` Eric Dumazet
2009-07-05 3:44 ` Herbert Xu
2009-06-16 9:19 ` Mel Gorman
2009-06-16 15:25 ` starlight