From: David Howells <dhowells@redhat.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: dhowells@redhat.com, "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
David Ahern <dsahern@kernel.org>,
Matthew Wilcox <willy@infradead.org>,
Al Viro <viro@zeniv.linux.org.uk>, Jens Axboe <axboe@kernel.dk>,
Jeff Layton <jlayton@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Chuck Lever III <chuck.lever@oracle.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jeroen de Borst <jeroendb@google.com>,
Catherine Sullivan <csully@google.com>,
Shailend Chand <shailend@google.com>,
Felix Fietkau <nbd@nbd.name>, John Crispin <john@phrozen.org>,
Sean Wang <sean.wang@mediatek.com>,
Mark Lee <Mark-MC.Lee@mediatek.com>,
Lorenzo Bianconi <lorenzo@kernel.org>,
Matthias Brugger <matthias.bgg@gmail.com>,
AngeloGioacchino Del Regno
<angelogioacchino.delregno@collabora.com>,
Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
Chaitanya Kulkarni <kch@nvidia.com>,
Andrew Morton <akpm@linux-foundation.org>,
netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-mediatek@lists.infradead.org,
linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v6 04/18] mm: Make the page_frag_cache allocator use per-cpu
Date: Thu, 13 Apr 2023 00:12:40 +0100
Message-ID: <399350.1681341160@warthog.procyon.org.uk>
In-Reply-To: <ZDbO3haK/1+7xdRC@infradead.org>
Christoph Hellwig <hch@infradead.org> wrote:
> On Tue, Apr 11, 2023 at 05:08:48PM +0100, David Howells wrote:
> > Make the page_frag_cache allocator have a separate allocation bucket for
> > each cpu to avoid racing. This means that no lock is required, other than
> > preempt disablement, to allocate from it, though if a softirq wants to
> > access it, then softirq disablement will need to be added.
> ...
> Let me ask a third time as I've not got an answer the last two times:
Sorry about that. I think the problem is that the copy of the message you send
directly to me arrives after the copy that comes via a mailing list, and Google
then deletes the direct one - as obviously no one could possibly want
duplicates, right? :-/ - so your replies usually get consigned to the
linux-kernel or linux-fsdevel mailing list folder.
> > Make the NVMe, mediatek and GVE drivers pass in NULL to page_frag_cache()
> > and use the default allocation buckets rather than defining their own.
>
> why are these callers treated different from the others?
There are only four users of struct page_frag_cache, the ones these patches
modify:
(1) GVE.
(2) Mediatek.
(3) NVMe.
(4) skbuff.
Note that things are slightly confused by there being three very similarly
named frag allocators (page_frag and page_frag_1k in addition to
page_frag_cache), and by the __page_frag_cache_drain() function being used for
things other than just page_frag_cache.
I've replaced the single allocation buckets with per-cpu allocation buckets
for (1), (2) and (3) so that no locking[*] is required other than pinning the
task to the cpu temporarily - but I can't test those drivers as I don't have
the hardware.
[*] Note that what's upstream doesn't have locking, and I'm not sure all the
users of it are SMP-safe.
That leaves (4).
Upstream, skbuff.c creates two separate per-cpu frag caches, and I've elected
to retain that, except that the per-cpu bits are now inside the frag allocator,
as I'm not entirely sure why there's a napi frag cache separate from the
netdev_alloc_cache.
The general page_frag_cache allocator is used by skb_splice_from_iter() if it
encounters a page it can't take a ref on, so it has been tested through that
path using sunrpc, sunrpc+siw and cifs+siw.
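Roughly, the decision that path has to make looks like the sketch below (not
the actual upstream code; ref_or_copy() and its parameters are invented for
illustration, though sendpage_ok() and page_frag_alloc() are the existing
helpers):

#include <linux/err.h>
#include <linux/gfp.h>
#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/net.h>
#include <linux/string.h>

/*
 * Sketch only: splice a source page into the skb by reference if a
 * ref can safely be taken on it, otherwise copy its bytes into a
 * frag taken from a page_frag_cache.
 */
static void *ref_or_copy(struct page_frag_cache *cache,
			 struct page *page, size_t off, size_t len)
{
	void *frag, *src;

	if (sendpage_ok(page)) {
		/* Safe to pin: take a ref and attach the page itself. */
		get_page(page);
		return NULL;
	}

	/* Unsafe to pin (e.g. a slab page): copy into a frag. */
	frag = page_frag_alloc(cache, len, GFP_KERNEL);
	if (!frag)
		return ERR_PTR(-ENOMEM);
	src = kmap_local_page(page);
	memcpy(frag, src + off, len);
	kunmap_local(src);
	return frag;
}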
> Can you show any performance numbers?
As far as I can tell, it doesn't make any obvious difference to pumping data
directly through TCP or TLS-over-TCP, or to transferring data over a network
filesystem such as sunrpc or cifs using siw/TCP. I've tested this between two
machines over both 1G and 10G links.
I can generate some actual numbers tomorrow.
Actually, I can probably drop patches 2-4 from this patchset and just use the
netdev_alloc_cache in skb_splice_from_iter() for now. Since that copies
unspliceable data, I no longer need to allocate frags in the next layer up.
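For reference, that fallback would amount to something like this (a sketch;
copy_to_netdev_frag() is a made-up name, and netdev_alloc_frag() is the
existing shared-cache allocator):

#include <linux/skbuff.h>
#include <linux/string.h>

/* Sketch: copy unspliceable bytes into a frag from the shared netdev
 * cache.  netdev_alloc_frag() does its own per-cpu handling and
 * softirq protection (local_bh_disable()) internally. */
static void *copy_to_netdev_frag(const void *src, unsigned int len)
{
	void *buf = netdev_alloc_frag(len);

	if (buf)
		memcpy(buf, src, len);
	return buf;
}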
David