References: <20230920222231.686275-1-dhowells@redhat.com>
 <591a70bf016b4317add2d936696abc0f@AcuMS.aculab.com>
 <1173637.1695384067@warthog.procyon.org.uk>
In-Reply-To: <1173637.1695384067@warthog.procyon.org.uk>
From: Willem de Bruijn
Date: Sat, 23 Sep 2023 08:59:25 +0200
Subject: Re: [PATCH v5 00/11] iov_iter: Convert the iterator macros into inline funcs
To: David Howells
Cc: David Laight, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Jens Axboe, Al Viro, Linus Torvalds, Christoph Hellwig,
 Christian Brauner, Matthew Wilcox, Jeff Layton,
 linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-mm@kvack.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Fri, Sep 22, 2023 at 2:01 PM David Howells wrote:
>
> David Laight wrote:
>
> > >  (8) Move the copy-and-csum code to net/ where it can be in proximity with
> > >      the code that uses it.  This eliminates the code if CONFIG_NET=n and
> > >      allows for the slim possibility of it being inlined.
> > >
> > >  (9) Fold memcpy_and_csum() in to its two users.
> > >
> > > (10) Move csum_and_copy_from_iter_full() out of line and merge in
> > >      csum_and_copy_from_iter() since the former is the only caller of the
> > >      latter.
> >
> > I thought that the real idea behind these was to do the checksum
> > at the same time as the copy to avoid loading the data into the L1
> > data-cache twice - especially for long buffers.
> > I wonder how often there are multiple iov[] that actually make
> > it better than just check summing the linear buffer?
>
> It also reduces the overhead for finding the data to checksum in the case the
> packet gets split since we're doing the checksumming as we copy - but with a
> linear buffer, that's negligible.
>
> > I had a feeling that check summing of udp data was done during
> > copy_to/from_user, but the code can't be the copy-and-csum here
> > for that because it is missing support for odd-length buffers.
>
> Is there a bug there?
>
> > Intel x86 desktop chips can easily checksum at 8 bytes/clock
> > (But probably not with the current code!).
> > (I've got ~12 bytes/clock using adox and adcx but that loop
> > is entirely horrid and it would need run-time patching.
> > Especially since I think some AMD cpus execute them very slowly.)
> >
> > OTOH 'rep movs[bq]' copy will copy 16 bytes/clock (32 if the
> > destination is 32 byte aligned - it pretty much won't be).
> >
> > So you'd need a csum-and-copy loop that did 16 bytes every
> > three clocks to get the same throughput for long buffers.
> > In principle splitting the 'adc memory' into two instructions
> > is the same number of u-ops - but I'm sure I've tried to do
> > that and failed and the extra memory write can happen in
> > parallel with everything else.
> > So I don't think you'll get 16 bytes in two clocks - but you
> > might get it in three.
> >
> > OTOH for a cpu where memcpy is a code loop, summing the data in
> > the copy loop is likely to be a gain.
> >
> > But I suspect doing the checksum and copy at the same time
> > got 'all too complicated' to actually implement fully.
> > With most modern ethernet chips checksumming receive packets
> > does it really get used enough for the additional complexity?
>
> You may be right.  That's more a question for the networking folks than for
> me.  It's entirely possible that the checksumming code is just not used on
> modern systems these days.
>
> Maybe Willem can comment since he's the UDP maintainer?

Perhaps these days it is more relevant to embedded systems than high-end
servers.
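
For readers following the thread, the checksum-while-copying idea from the
quoted discussion can be illustrated with a minimal userspace sketch. It is
not the kernel's csum_and_copy_from_iter() code and says nothing about its
performance; the name csum_and_copy_sketch() is hypothetical, and the sum is
assumed to be the 16-bit one's-complement sum of RFC 1071 taken over
big-endian words. The point is only that each source byte is read once, and
that an odd trailing byte is padded with zero rather than dropped.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: copy len bytes from src to dst
 * and return the folded 16-bit one's-complement sum of the copied data
 * (RFC 1071 style, big-endian words, not yet complemented).
 */
static uint16_t csum_and_copy_sketch(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	uint64_t sum = 0;	/* wide accumulator: carries are folded at the end */

	/* Copy and sum two bytes at a time; each source byte is read once. */
	while (len >= 2) {
		uint8_t hi = s[0], lo = s[1];

		d[0] = hi;
		d[1] = lo;
		sum += ((uint32_t)hi << 8) | lo;
		s += 2;
		d += 2;
		len -= 2;
	}

	/* Odd-length buffer: the last byte is summed as if padded with zero. */
	if (len) {
		uint8_t hi = s[0];

		d[0] = hi;
		sum += (uint32_t)hi << 8;
	}

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

A caller would complement the returned value to obtain the checksum field.
A real implementation would work on wider words and carry the sum across
multiple segments, which this sketch does not attempt.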