From: Arnd Bergmann <arnd@arndb.de>
To: virtualization@lists.linux-foundation.org
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
netdev@vger.kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org,
akpm@linux-foundation.org, hpa@zytor.com,
gregory.haskins@gmail.com
Subject: Re: [PATCHv3 2/2] vhost_net: a kernel-level virtio server
Date: Fri, 14 Aug 2009 13:40:36 +0200 [thread overview]
Message-ID: <200908141340.36176.arnd@arndb.de> (raw)
In-Reply-To: <20090813182931.GC6585@redhat.com>
On Thursday 13 August 2009, Michael S. Tsirkin wrote:
> What it is: vhost net is a character device that can be used to reduce
> the number of system calls involved in virtio networking.
> Existing virtio net code is used in the guest without modification.
AFAICT, you have addressed all my comments, mostly by convincing me
that you got it right anyway ;-).
I hope this gets into 2.6.32, good work!
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
One idea though:
> + /* Parameter checking */
> + if (sock->sk->sk_type != SOCK_RAW) {
> + r = -ESOCKTNOSUPPORT;
> + goto done;
> + }
> +
> + r = sock->ops->getname(sock, (struct sockaddr *)&uaddr.sa,
> + &uaddr_len, 0);
> + if (r)
> + goto done;
> +
> + if (uaddr.sa.sll_family != AF_PACKET) {
> + r = -EPFNOSUPPORT;
> + goto done;
> + }
You currently limit the scope of the driver by only allowing raw packet
sockets to be passed into the network driver. In qemu, we currently support
some very similar transports:
* raw packet (not in a release yet)
* tcp connection
* UDP multicast
* tap character device
* VDE with Unix local sockets
My primary interest right now is the tap support, but I think it would
be interesting in general to allow different file descriptor types
in vhost_net_set_socket. AFAICT, there are two major differences
that we need to handle for this:
* most of the transports are sockets, tap uses a character device.
This could be dealt with by having both a struct socket * in
struct vhost_net *and* a struct file *, or by always keeping the
struct file and calling vfs_readv/vfs_writev for the data transport
in both cases.
* Each transport has a slightly different header, we have
- raw ethernet frames (raw, udp multicast, tap)
- 32-bit length + raw frames, possibly fragmented (tcp)
- 80-bit header + raw frames, possibly fragmented (tap with vnet_hdr)
To handle these three cases, we need either different ioctl numbers
so that vhost_net can choose the right one, or a flags field in
VHOST_NET_SET_SOCKET, like
#define VHOST_NET_RAW 1
#define VHOST_NET_LEN_HDR 2
#define VHOST_NET_VNET_HDR 4
struct vhost_net_socket {
unsigned int flags;
int fd;
};
#define VHOST_NET_SET_SOCKET _IOW(VHOST_VIRTIO, 0x30, struct vhost_net_socket)
If both of those are addressed, we can treat vhost_net as a generic
way to do network handling in the kernel independent of the qemu
model (raw, tap, ...) for it.
Your qemu patch would have to work differently, so instead of
qemu -net nic,vhost=eth0
you would do the same as today with the raw packet socket extension
qemu -net nic -net raw,ifname=eth0
Qemu could then automatically try to use vhost_net, if it's available
in the kernel, or just fall back on software vlan otherwise.
Does that make sense?
Arnd <>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-08-14 11:41 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <cover.1250187913.git.mst@redhat.com>
2009-08-13 18:29 ` [PATCHv3 1/2] mm: export use_mm/unuse_mm to modules Michael S. Tsirkin
2009-08-13 18:29 ` [PATCHv3 2/2] vhost_net: a kernel-level virtio server Michael S. Tsirkin
2009-08-14 11:40 ` Arnd Bergmann [this message]
2009-08-16 6:51 ` Michael S. Tsirkin
2009-08-19 9:04 ` Arnd Bergmann
2009-08-19 13:04 ` Michael S. Tsirkin
2009-08-19 13:46 ` Arnd Bergmann
2009-08-19 14:20 ` Michael S. Tsirkin
2009-08-19 15:27 ` Arnd Bergmann
2009-08-20 8:31 ` Michael S. Tsirkin
2009-08-20 13:10 ` Arnd Bergmann
2009-08-20 13:38 ` Michael S. Tsirkin
2009-08-20 14:31 ` Arnd Bergmann
2009-08-20 14:42 ` Michael S. Tsirkin
2009-08-20 15:10 ` Arnd Bergmann
2009-08-21 13:20 ` Gleb Natapov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200908141340.36176.arnd@arndb.de \
--to=arnd@arndb.de \
--cc=akpm@linux-foundation.org \
--cc=gregory.haskins@gmail.com \
--cc=hpa@zytor.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@elte.hu \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox