From: Linus Torvalds
Date: Thu, 29 Jun 2023 11:53:13 -0700
Subject: Re: [RFC PATCH 0/4] splice: Fix corruption in data spliced to pipe
To: Matthew Wilcox
Cc: Matt Whitlock, David Howells, netdev@vger.kernel.org, Dave Chinner,
    Jens Axboe, linux-fsdevel@kvack.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
References: <20230629155433.4170837-1-dhowells@redhat.com>
    <4bd92932-c9d2-4cc8-b730-24c749087e39@mattwhitlock.name>

On Thu, 29 Jun 2023 at 11:34, Matthew Wilcox wrote:
>
> I think David muddied the waters by talking about vmsplice().  The
> problem encountered is with splice() from the page cache.
> Reading the documentation,
>
>        splice() moves data between two file descriptors without copying
>        between kernel address space and user address space.  It transfers
>        up to len bytes of data from the file descriptor fd_in to the file
>        descriptor fd_out, where one of the file descriptors must refer to
>        a pipe.

Well, the original intent really always was that it's about zero-copy.

So I do think that the answer to your test-program is that yes, it
really traditionally *should* output "new".

A splice from a file acts like a scatter-gather mmap() in the kernel.
It's the original intent, and it's the whole reason it's noticeably
faster than doing a write.

Now, do I then agree that splice() has turned out to be a nasty morass
of problems? Yes.

And I even agree that while I actually *think* that your test program
should output "new" (because that is the whole point of the exercise),
it also means that people who use splice() need to *understand* that,
and it's much too easy to get things wrong if you don't understand
that the whole point of splice is to act as a kind of ad-hoc in-kernel
mmap thing.

And to make matters worse, for mmap() we actually do have some
coherency helpers. For splice(), the page ref stays around. It's kind
of like GUP and page pinning - another area where we have had lots of
problems and lots of nasty semantics and complications with other VM
operations over the years.

So I really *really* don't want to complicate splice() even more to
give it some new semantics that it has never ever really had, because
people didn't understand it and used it wrong.

Quite the reverse. I'd be willing to *simplify* splice() by just
saying "it was all a mistake", and just turning it into wrappers
around read/write. But those patches would have to be radical
simplifications, not adding yet more crud on top of the pain that is
splice().

Because it will hurt performance. And I'm ok with that as long as it
comes with huge simplifications. What I'm *not* ok with is "I mis-used
splice, now I want splice to act differently, so let's make it even
more complicated".

               Linus
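
For readers without the earlier messages, here is a rough sketch of the kind of test program being discussed. It is not the program from the thread, only an assumed reconstruction: it splices "old" from a file into a pipe, overwrites the file with "new", and then reads the pipe. The function name splice_then_overwrite is made up for illustration, and os.splice() is only available on Linux with Python 3.10 or later. Whether the read returns "old" or "new" depends on the kernel and the filesystem, which is exactly the point under dispute.

```python
import os
import tempfile


def splice_then_overwrite():
    """Splice b"old" from a file into a pipe, overwrite the file, read the pipe.

    Returns the bytes read from the pipe, or None if splice() is unavailable.
    """
    if not hasattr(os, "splice"):  # needs Linux and Python 3.10+
        return None
    fd, path = tempfile.mkstemp()
    r, w = os.pipe()
    try:
        os.write(fd, b"old")
        # Queue the first 3 bytes of the file in the pipe without copying
        # them through user space (the pipe may just reference the page).
        n = os.splice(fd, w, 3, offset_src=0)
        # Modify the same bytes in the file while they still sit in the pipe.
        os.pwrite(fd, b"new", 0)
        return os.read(r, n)
    except OSError:
        # Some filesystems or sandboxes may refuse splice() on this file.
        return None
    finally:
        os.close(r)
        os.close(w)
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(splice_then_overwrite())
```

Under the zero-copy reading argued above, seeing b"new" is the traditional behaviour rather than corruption; seeing b"old" would mean the data was copied at splice time.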