From: Miklos Szeredi
Date: Wed, 19 Jul 2023 21:35:33 +0200
Subject: Re: [RFC PATCH 1/4] splice: Fix corruption of spliced data after splice() returns
To: Matt Whitlock
Cc: David Howells, netdev@vger.kernel.org, Matthew Wilcox, Dave Chinner, Linus Torvalds, Jens Axboe, linux-fsdevel@kvack.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Christoph Hellwig, linux-fsdevel@vger.kernel.org
References: <20230629155433.4170837-1-dhowells@redhat.com> <20230629155433.4170837-2-dhowells@redhat.com>

On Wed, 19 Jul 2023 at 19:59, Matt Whitlock wrote:
>
> On Wednesday, 19 July 2023 06:17:51 EDT, Miklos Szeredi wrote:
> > On Thu, 29 Jun 2023 at 17:56, David Howells wrote:
> >>
> >> Splicing data from, say, a file into a pipe currently leaves the source
> >> pages in the pipe after splice() returns - but this means that those pages
> >> can be subsequently modified by shared-writable mmap(), write(),
> >> fallocate(), etc. before they're consumed.
> >
> > What is this trying to fix? The above behavior is well known, so
> > it's not likely to be a problem.
>
> Respectfully, it's not well-known, as it's not documented. If the splice(2)
> man page had mentioned that pages can be mutated after they're already
> ostensibly at rest in the output pipe buffer, then my nightly backups
> wouldn't have been incurring corruption silently for many months.

splice(2):

       Though we talk of copying, actual copies are generally avoided.
       The kernel does this by implementing a pipe buffer as a set of
       reference-counted pointers to pages of kernel memory.  The kernel
       creates "copies" of pages in a buffer by creating new pointers
       (for the output buffer) referring to the pages, and increasing
       the reference counts for the pages: only pointers are copied, not
       the pages of the buffer.

While not explicitly stating that the contents of the pages can change
after being spliced, this can easily be inferred from the above
semantics.

Thanks,
Miklos
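
For illustration only, the semantics discussed in this thread can be probed
with a small sketch along the following lines. It assumes Linux and Python
3.10 or later (for os.splice); the helper name splice_then_overwrite, the
4096-byte page size and the "A"/"B" fill bytes are made up for the example
and do not come from the patch series. Whether the pipe ends up holding the
old or the new bytes depends on the kernel version and the filesystem, so
the sketch reports what it saw rather than assuming an outcome.

```python
import os
import tempfile

PAGE = 4096  # assumed page size, for illustration only


def splice_then_overwrite(directory=None):
    """Splice one page of a file into a pipe, overwrite the file, drain the pipe.

    Returns (bytes_in_file_at_splice_time, bytes_read_from_pipe).
    """
    original = b"A" * PAGE
    changed = b"B" * PAGE
    fd, path = tempfile.mkstemp(dir=directory)
    rfd, wfd = os.pipe()
    try:
        os.write(fd, original)
        # splice() may hand the pipe a reference to the page-cache page
        # instead of copying the bytes.
        n = os.splice(fd, wfd, PAGE, offset_src=0)
        # Modify the file after splice() has returned, before the pipe is read.
        os.pwrite(fd, changed, 0)
        os.close(wfd)
        wfd = -1
        seen = b""
        while len(seen) < n:
            chunk = os.read(rfd, n - len(seen))
            if not chunk:
                break
            seen += chunk
        return original[:n], seen
    finally:
        for d in (fd, rfd, wfd):
            if d >= 0:
                os.close(d)
        os.unlink(path)


if __name__ == "__main__":
    at_splice, seen = splice_then_overwrite()
    if seen == at_splice:
        print("pipe holds the bytes as they were when splice() returned")
    else:
        print("pipe contents changed after splice() returned")
```

If the second message is printed, the consumer of the pipe received data
that was written to the file after splice() had already returned, which is
the situation described in the quoted report.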