From: Miklos Szeredi
Date: Wed, 19 Jul 2023 21:56:44 +0200
Subject: Re: [RFC PATCH 1/4] splice: Fix corruption of spliced data after splice() returns
To: Matthew Wilcox
Cc: Matt Whitlock, David Howells, netdev@vger.kernel.org, Dave Chinner, Linus Torvalds, Jens Axboe, linux-fsdevel@kvack.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Christoph Hellwig, linux-fsdevel@vger.kernel.org
References: <20230629155433.4170837-1-dhowells@redhat.com> <20230629155433.4170837-2-dhowells@redhat.com>

On Wed, 19 Jul 2023 at 21:44, Matthew Wilcox wrote:
>
> On Wed, Jul 19, 2023 at 09:35:33PM +0200, Miklos Szeredi wrote:
> > On Wed, 19 Jul 2023 at 19:59, Matt Whitlock wrote:
> > >
> > > On Wednesday, 19 July 2023 06:17:51 EDT, Miklos Szeredi wrote:
> > > > On Thu, 29 Jun 2023 at 17:56, David Howells wrote:
> > > >>
> > > >> Splicing data from, say, a file into a pipe currently leaves the source
> > > >> pages in the pipe after splice() returns - but this means that those pages
> > > >> can be subsequently modified by shared-writable mmap(), write(),
> > > >> fallocate(), etc. before they're consumed.
> > > >
> > > > What is this trying to fix? The above behavior is well known, so
> > > > it's not likely to be a problem.
> > >
> > > Respectfully, it's not well-known, as it's not documented. If the splice(2)
> > > man page had mentioned that pages can be mutated after they're already
> > > ostensibly at rest in the output pipe buffer, then my nightly backups
> > > wouldn't have been incurring corruption silently for many months.
> >
> > splice(2):
> >
> >   Though we talk of copying, actual copies are generally avoided.
> >   The kernel does this by implementing a pipe buffer as a set of
> >   reference-counted pointers to pages of kernel memory. The
> >   kernel creates "copies" of pages in a buffer by creating new pointers
> >   (for the output buffer) referring to the pages, and increasing the
> >   reference counts for the pages: only pointers are copied, not the
> >   pages of the buffer.
> >
> > While not explicitly stating that the contents of the pages can change
> > after being spliced, this can easily be inferred from the above
> > semantics.
>
> So what's the API that provides the semantics of _copying_?

What's your definition of copying?

Thanks,
Miklos
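
To make the pointer-versus-page distinction in the quoted splice(2) text concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not kernel code: a bytearray stands in for a page of kernel memory, a list stands in for a pipe buffer, and the names Page, splice_by_reference, splice_by_copy, drain and demo are all hypothetical, invented for this illustration. It shows why a consumer of reference-counted pointers can observe a later modification of the source, while a consumer of a real copy cannot.

```python
# Toy model only: a bytearray stands in for a page-cache page and a list
# stands in for a pipe buffer. None of these names exist in the kernel.


class Page:
    """A stand-in for one page of kernel memory with a reference count."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.refcount = 1  # the reference held by the "page cache"


def splice_by_reference(page, pipe):
    """Queue a pointer to the page: the count goes up, no bytes are copied."""
    page.refcount += 1
    pipe.append(page)


def splice_by_copy(page, pipe):
    """Queue a private snapshot of the page contents."""
    pipe.append(Page(bytes(page.data)))


def drain(pipe):
    """Consume the pipe and return the bytes the reader would see."""
    out = b"".join(bytes(p.data) for p in pipe)
    for p in pipe:
        p.refcount -= 1  # the pipe drops its reference
    pipe.clear()
    return out


def demo():
    source = Page(b"backup-v1")
    zero_copy_pipe, copying_pipe = [], []
    splice_by_reference(source, zero_copy_pipe)
    splice_by_copy(source, copying_pipe)
    # The "file" is modified after both splice calls have returned,
    # but before either pipe has been read.
    source.data[-1:] = b"2"
    return drain(zero_copy_pipe), drain(copying_pipe)


if __name__ == "__main__":
    by_reference, by_copy = demo()
    print("reader of the pointer-based pipe sees:", by_reference)
    print("reader of the copying pipe sees:     ", by_copy)
```

In this model the pointer-based pipe delivers the modified bytes and the copying pipe delivers the bytes as they were when the splice call returned. Whether a real splice() consumer sees the change depends on the filesystem, the page cache state and timing, which this sketch does not attempt to model.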