From: Eric Dumazet <edumazet@google.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
"David S . Miller" <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
Tariq Toukan <tariqt@mellanox.com>,
Martin KaFai Lau <kafai@fb.com>,
Saeed Mahameed <saeedm@mellanox.com>,
Willem de Bruijn <willemb@google.com>,
Brenden Blanco <bblanco@plumgrid.com>,
Alexei Starovoitov <ast@kernel.org>,
Eric Dumazet <eric.dumazet@gmail.com>,
linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH v3 net-next 08/14] mlx4: use order-0 pages for RX
Date: Tue, 14 Feb 2017 05:45:01 -0800 [thread overview]
Message-ID: <CANn89i+udp6Y42D9wqmz7U6LGn1mtDRXpQGHAOAeX25eD0dGnQ@mail.gmail.com> (raw)
In-Reply-To: <20170214131206.44b644f6@redhat.com>
On Tue, Feb 14, 2017 at 4:12 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> It is important to understand that there are two cases for the cost of
> an atomic op, which depend on the cache-coherency state of the
> cacheline.
>
> Measured on Skylake CPU i7-6700K CPU @ 4.00GHz
>
> (1) Local CPU atomic op : 27 cycles(tsc) 6.776 ns
> (2) Remote CPU atomic op: 260 cycles(tsc) 64.964 ns
>
Okay, it seems you guys really want a patch that I said was not giving
good results.
Let me publish the numbers I get, with and without the last (and not
official) patch.
If I _force_ the user space process to run on the other node,
then the results are not the ones Alex or you are expecting.
With this patch I get about 2.7 Mpps on this silly single TCP flow,
and 3.5 Mpps without it.
lpaa24:~# sar -n DEV 1 10 | grep eth0 | grep Ave
Average:    eth0  2699243.20  16663.70  1354783.36  1079.95  0.00  0.00  4.50
Profile of the CPU on NUMA node 1 (netserver consuming data):
54.73% [kernel] [k] copy_user_enhanced_fast_string
31.07% [kernel] [k] skb_release_data
4.24% [kernel] [k] skb_copy_datagram_iter
1.35% [kernel] [k] copy_page_to_iter
0.98% [kernel] [k] _raw_spin_lock
0.90% [kernel] [k] skb_release_head_state
0.60% [kernel] [k] tcp_transmit_skb
0.51% [kernel] [k] mlx4_en_xmit
0.33% [kernel] [k] ___cache_free
0.28% [kernel] [k] tcp_rcv_established
Profile of the CPU handling mlx4 softirqs (NUMA node 0):
48.00% [kernel] [k] mlx4_en_process_rx_cq
12.92% [kernel] [k] napi_gro_frags
7.28% [kernel] [k] inet_gro_receive
7.17% [kernel] [k] tcp_gro_receive
5.10% [kernel] [k] dev_gro_receive
4.87% [kernel] [k] skb_gro_receive
2.45% [kernel] [k] mlx4_en_prepare_rx_desc
2.04% [kernel] [k] __build_skb
1.02% [kernel] [k] napi_reuse_skb.isra.95
1.01% [kernel] [k] tcp4_gro_receive
0.65% [kernel] [k] kmem_cache_alloc
0.45% [kernel] [k] _raw_spin_lock
Without the latest patch (i.e. with the exact v3 patch series I submitted),
thus with the atomic_inc() in mlx4_en_process_rx_cq instead of only reads:
lpaa24:~# sar -n DEV 1 10 | grep eth0 | grep Ave
Average:    eth0  3566768.50  25638.60  1790345.69  1663.51  0.00  0.00  4.50
Profiles of the two CPUs:
74.85% [kernel] [k] copy_user_enhanced_fast_string
6.42% [kernel] [k] skb_release_data
5.65% [kernel] [k] skb_copy_datagram_iter
1.83% [kernel] [k] copy_page_to_iter
1.59% [kernel] [k] _raw_spin_lock
1.48% [kernel] [k] skb_release_head_state
0.72% [kernel] [k] tcp_transmit_skb
0.68% [kernel] [k] mlx4_en_xmit
0.43% [kernel] [k] page_frag_free
0.38% [kernel] [k] ___cache_free
0.37% [kernel] [k] tcp_established_options
0.37% [kernel] [k] __ip_local_out
37.98% [kernel] [k] mlx4_en_process_rx_cq
26.47% [kernel] [k] napi_gro_frags
7.02% [kernel] [k] inet_gro_receive
5.89% [kernel] [k] tcp_gro_receive
5.17% [kernel] [k] dev_gro_receive
4.80% [kernel] [k] skb_gro_receive
2.61% [kernel] [k] __build_skb
2.45% [kernel] [k] mlx4_en_prepare_rx_desc
1.59% [kernel] [k] napi_reuse_skb.isra.95
0.95% [kernel] [k] tcp4_gro_receive
0.51% [kernel] [k] kmem_cache_alloc
0.42% [kernel] [k] __inet_lookup_established
0.34% [kernel] [k] swiotlb_sync_single_for_cpu
So this will probably need further analysis, outside the scope of
this patch series.
Could we now please ack this v3 and merge it?
Thanks.
> Notice the huge difference. And in case 2, it is enough that the remote
> CPU reads the cacheline and brings it into "Shared" (MESI) state, and
> the local CPU then does the atomic op.
>
> One key idea behind the page_pool is that remote CPUs read/detect
> refcnt==1 (Shared state) and store the page in a small per-CPU array.
> When the array is full, it gets bulk-returned to the shared ptr-ring pool.
> When the "local" CPU needs new pages from the shared ptr-ring, it
> prefetchw's them during its bulk refill, to latency-hide the MESI
> transitions needed.
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer