From: Ritesh Harjani (IBM)
To: Christoph Hellwig, Theodore Ts'o
Cc: "Darrick J. Wong", Luis Chamberlain, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-block@vger.kernel.org,
    lsf-pc@lists.linux-foundation.org, david@fromorbit.com, leon@kernel.org,
    hch@lst.de, kbusch@kernel.org, sagi@grimberg.me, axboe@kernel.dk,
    joro@8bytes.org, brauner@kernel.org, hare@suse.de, willy@infradead.org,
    john.g.garry@oracle.com, p.raghav@samsung.com, gost.dev@samsung.com,
    da.gomez@samsung.com
Subject: Re: [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64
In-Reply-To: <20250321050023.GB1831@lst.de>
Date: Sat, 22 Mar 2025 00:09:09 +0530
Message-ID: <87ecyqrzxu.fsf@gmail.com>
References: <87o6xvsfp7.fsf@gmail.com> <20250320213034.GG2803730@frogsfrogsfrogs>
    <87jz8jrv0q.fsf@gmail.com> <20250321030526.GW89034@frogsfrogsfrogs>
    <20250321045604.GA1161423@mit.edu> <20250321050023.GB1831@lst.de>

Christoph Hellwig writes:

> On Fri, Mar 21, 2025 at 12:56:04AM -0400, Theodore Ts'o wrote:
>> As I recall, in the early days Linux's safety for DIO and Buffered I/O
>> was best efforts, and on other Unix systems the recommendation to "don't
>> mix the streams" was far stronger.  Even if it works reliably for
>> Linux, it's still something I recommend that people avoid if at all
>> possible.
>
> It still is a best effort, just a much better effort now.  It's still
> pretty easy to break the coherency.

Thanks Ted & Christoph for the info. Do you think we should document
this recommendation somewhere in the kernel Documentation, where we
could also list the known cases in which coherency can break?
(I am not very familiar with those cases myself, though.)

One case I recently came across was an application that was not setting
--setbsz properly on a block device on a system with a 64k page size.
If I understand correctly, this installs a single buffer_head per 64k
page for any buffered I/O operation. If someone then issues a 4k
buffered write right next to a 4k direct I/O write, things go wrong:
the 4k buffered write turns into a read-modify-write over the whole
64k page (one buffer_head), so the entire 64k page is dirty while a
direct I/O write is also happening in that region. The two writes
effectively overlap, which causes coherency issues.

Such cases, I believe, are easy to miss. And now, with large folios
being used for block devices, I am not sure there is much value in
applications mixing buffered I/O and direct I/O. A direct I/O write
will just end up invalidating the entire large folio, which could
negate any benefit of using buffered I/O alongside it on the same
block device.

-ritesh
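
P.S. To make the 64k-page case above a bit more concrete, here is a toy
Python model of it. This is only a sketch under stated assumptions, not
kernel code: ToyBlockDevice, run_scenario and the byte patterns are
hypothetical names made up for illustration, and the interleaving
(buffered write, then direct write, then writeback, with no flush or
invalidation in between) is assumed in order to show the failure. The
real kernel makes an effort to flush and invalidate the page cache around
direct writes, so this only models the racy ordering.

```python
PAGE_SIZE = 64 * 1024   # system page size from the example above
IO_SIZE = 4 * 1024      # size of each application write


class ToyBlockDevice:
    """Toy model: a byte array for the disk plus a single cached page."""

    def __init__(self, size=PAGE_SIZE):
        self.disk = bytearray(size)
        self.cached_page = None   # in-memory copy of the 64k page, if any
        self.dirty = False

    def buffered_write(self, offset, data):
        # One buffer_head covers the whole 64k page, so a 4k write first
        # reads the full page (read-modify-write) and then dirties all of it.
        if self.cached_page is None:
            self.cached_page = bytearray(self.disk[:PAGE_SIZE])
        self.cached_page[offset:offset + len(data)] = data
        self.dirty = True

    def direct_write(self, offset, data):
        # Goes straight to the disk. ASSUMPTION: in this toy model it does
        # not flush or invalidate the cached page (the racy interleaving).
        self.disk[offset:offset + len(data)] = data

    def writeback(self):
        # Writeback granularity is the whole dirty 64k page.
        if self.dirty:
            self.disk[:PAGE_SIZE] = self.cached_page
            self.dirty = False


def run_scenario():
    dev = ToyBlockDevice()
    dev.buffered_write(0, b"B" * IO_SIZE)        # 4k buffered write
    dev.direct_write(IO_SIZE, b"D" * IO_SIZE)    # adjacent 4k direct write
    dev.writeback()                              # flush the dirty 64k page
    return bytes(dev.disk[:IO_SIZE]), bytes(dev.disk[IO_SIZE:2 * IO_SIZE])


if __name__ == "__main__":
    buffered_block, direct_block = run_scenario()
    print("buffered block intact:", buffered_block == b"B" * IO_SIZE)
    print("direct block intact:  ", direct_block == b"D" * IO_SIZE)
```

In this model the buffered 4k block survives, but the adjacent 4k block
written with direct I/O is overwritten by the stale copy that the
read-modify-write pulled into the 64k page. With a 4k block size the
dirty range would cover only the first 4k and the two writes would not
overlap.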