Date: Wed, 8 Mar 2023 11:35:32 -0800
From: Luis Chamberlain
To: Hannes Reinecke
Cc: Matthew Wilcox, Keith Busch, Theodore Ts'o, Pankaj Raghav,
    Daniel Gomez, Javier González, lsf-pc@lists.linux-foundation.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-block@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
In-Reply-To: <0b70deae-9fc7-ca33-5737-85d7532b3d33@suse.de>

On Sun, Mar 05, 2023 at 12:22:15PM +0100, Hannes Reinecke wrote:
> One can view zones as really large LBAs.
>
> Indeed it might be suboptimal from the OS point of view.
> But from the device point of view it won't.
> And, in fact, with devices becoming faster and faster the question is
> whether sticking with relatively small sectors won't become a limiting
> factor eventually.
>
> My point being that zones are just there because the I/O stack can only deal
> with sectors up to 4k. If the I/O stack would be capable of dealing
> with larger LBAs one could identify a zone with an LBA, and the entire issue
> of append-only and sequential writes would be moot.
> Even the entire concept of zones becomes irrelevant as the OS would
> trivially only write entire zones.
>
> What I was saying is that 256M is not set in stone. It's just a compromise
> vendors used. Even if in the course of development we arrive
> at a lower number of max LBA we can handle (say, 2MB) I am pretty
> sure vendors will be quite interested in that.

So I'm re-reading this and I see what you're suggesting now, Hannes.

You are not suggesting that the reason why we may want larger block
sizes is zone storage support. Rather, you are suggesting that *if* we
support larger block sizes, they could effectively be used as a
replacement for smaller zone sizes. Your comment about 256 MiB zones is
just a target max assumption for existing known zones.

So in that sense, you seem to suggest that users of smaller zone sizes
could potentially look at using larger block sizes instead, as there
would be no new "feature" needed other than the existing efforts to
ensure higher order folio support is in place and buffer heads are
addressed.

But this misses the gains of zone storage on the FTL. The strong
semantics of sequential writes and a write pointer differ from how an
existing storage controller may deal with writing to *one* block.
You are not forbidden from modifying just a bit in non-zone storage.
Behind the scenes the FTL would do whatever it thinks it has to, very
likely a read-modify-write, and it may just splash the write into one
fresh block for you, so the write appears to happen in a flash but in
reality it used a bit of the over-provisioning blocks. But with zone
storage you have a considerable reduction in over-provisioning, which
we don't get with simple larger block size support for non-zone drives.

  Luis
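
To make the difference in semantics concrete, here is a toy sketch. It
is not how any real controller or the kernel works; the class names
(ConventionalBlock, Zone) and the bytes_programmed counter are made up
for illustration, and the read-modify-write behaviour is only the
"very likely" case described above. The conventional block accepts a
small in-place update and pays for it by rewriting the whole block into
a spare one, while the zone only accepts writes at its write pointer.

```python
class ConventionalBlock:
    """Toy model of one flash block behind a conventional FTL (hypothetical)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.bytes_programmed = 0  # bytes physically written to flash

    def write(self, offset, payload):
        # The interface allows in-place updates anywhere in the block.
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside block")
        # Assumed FTL behaviour: read-modify-write into a fresh spare block.
        fresh = bytearray(self.data)                    # read
        fresh[offset:offset + len(payload)] = payload   # modify
        self.data = fresh                               # write the spare block
        self.bytes_programmed += len(fresh)


class Zone:
    """Toy model of a sequential-write-required zone (hypothetical)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.write_pointer = 0
        self.bytes_programmed = 0

    def write(self, offset, payload):
        # Strong semantics: writes must land exactly at the write pointer.
        if offset != self.write_pointer:
            raise ValueError("write must start at the write pointer")
        if offset + len(payload) > len(self.data):
            raise ValueError("write past end of zone")
        self.data[offset:offset + len(payload)] = payload
        self.write_pointer += len(payload)
        self.bytes_programmed += len(payload)

    def reset(self):
        # Rewriting earlier data requires resetting the whole zone first.
        self.data = bytearray(len(self.data))
        self.write_pointer = 0


block = ConventionalBlock(4096)
block.write(10, b"x")        # one byte changed, whole block reprogrammed

zone = Zone(4096)
zone.write(0, b"abcd")       # sequential write at the write pointer
```

In this model the one-byte update costs the conventional block a full
4096 bytes of programming in a spare block, which is the kind of work
over-provisioning exists to absorb. The zone rejects the equivalent
in-place update outright, so the device never needs spare space for it.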