Date: Thu, 2 Mar 2023 18:58:58 -0700
From: Keith Busch
To: Theodore Ts'o
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations

On Tue, Feb 28, 2023 at 10:52:15PM -0500, Theodore Ts'o wrote:
> Emulated block devices offered by cloud VM’s can provide functionality
> to guest kernels and applications that traditionally have not been
> available to users of consumer-grade HDD and SSD’s.
> For example,
> today it’s possible to create a block device in Google’s Persistent
> Disk with a 16k physical sector size, which promises that aligned 16k
> writes will be atomically.  With NVMe, it is possible for a storage
> device to promise this without requiring read-modify-write updates for
> sub-16k writes.

I'm not sure it does. The NVMe spec doesn't say that AWUN writes are
never an RMW operation. NVMe suggests that aligning to NPWA is the best
way to avoid RMW, but it doesn't guarantee that, nor does it require
that this limit align with atomic boundaries. NVMe provides a lot of
hints, but stops short of promises. Vendors can promise whatever they
want, but that's outside the spec.

> All that is necessary are some changes in the block
> layer so that the kernel does not inadvertently tear a write request
> when splitting a bio because it is too large (perhaps because it got
> merged with some other request, and then it gets split at an
> inconvenient boundary).

All the limits needed to optimally split on physical boundaries exist,
so I hope we're using them correctly via get_max_io_size().

That said, I was hoping you were going to suggest supporting 16k logical
block sizes. Not a problem on some arch's, but still problematic when
PAGE_SIZE is 4k. :)