Date: Thu, 20 Mar 2025 08:37:05 -0700
From: Bart Van Assche
Subject: Re: [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64
To: Christoph Hellwig, Luis Chamberlain
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, lsf-pc@lists.linux-foundation.org, david@fromorbit.com, leon@kernel.org, kbusch@kernel.org, sagi@grimberg.me, axboe@kernel.dk, joro@8bytes.org, brauner@kernel.org, hare@suse.de, willy@infradead.org, djwong@kernel.org, john.g.garry@oracle.com, ritesh.list@gmail.com, p.raghav@samsung.com, gost.dev@samsung.com, da.gomez@samsung.com
In-Reply-To: <20250320141846.GA11512@lst.de>
References: <20250320141846.GA11512@lst.de>

On 3/20/25 7:18 AM, Christoph Hellwig wrote:
> On Thu, Mar 20, 2025 at 04:41:11AM -0700, Luis Chamberlain wrote:
>> We've been constrained to a max single 512 KiB IO for a while now on x86_64.
>
> No, we absolutely haven't. I'm regularly seeing multi-MB I/O on both
> SCSI and NVMe setup.

Is NVME_MAX_KB_SZ the current maximum I/O size for PCIe NVMe controllers?
From drivers/nvme/host/pci.c:

/*
 * These can be higher, but we need to ensure that any command doesn't
 * require an sg allocation that needs more than a page of data.
 */
#define NVME_MAX_KB_SZ 8192
#define NVME_MAX_SEGS 128
#define NVME_MAX_META_SEGS 15
#define NVME_MAX_NR_ALLOCATIONS 5

>> This is due to the number of DMA segments and the segment size.
>
> In nvme the max_segment_size is UINT_MAX, and for most SCSI HBAs it is
> fairly large as well.

I have a question for NVMe device manufacturers. It has long been known
that submitting large I/Os with the NVMe SGL format requires less CPU
time than submitting them with the NVMe PRP format. Is this sufficient
to motivate NVMe device manufacturers to implement the SGL format? All
SCSI controllers I know of, including UFS controllers, support something
that is much closer to the NVMe SGL format than to the NVMe PRP format.

Thanks,

Bart.
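
To put rough numbers on the PRP versus SGL difference, here is a minimal sketch that counts descriptors for one transfer. It is not kernel code: prp_entries(), sgl_descriptors() and CTRL_PAGE_SIZE are hypothetical names, the 4 KiB controller page size is an assumption, and the PRP count ignores the extra list-chaining pointers. The two NVME_* limits are copied from the pci.c excerpt quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the drivers/nvme/host/pci.c excerpt quoted in this mail. */
#define NVME_MAX_KB_SZ 8192
#define NVME_MAX_SEGS  128

/* Assumption for illustration only: 4 KiB controller memory page size. */
#define CTRL_PAGE_SIZE 4096u

/*
 * PRP: one entry per controller page touched by the buffer, no matter
 * how physically contiguous the buffer is. first_page_offset is the
 * byte offset of the buffer start within its first page.
 */
static uint64_t prp_entries(uint64_t first_page_offset, uint64_t len)
{
	assert(len > 0 && first_page_offset < CTRL_PAGE_SIZE);
	return (first_page_offset + len + CTRL_PAGE_SIZE - 1) / CTRL_PAGE_SIZE;
}

/*
 * SGL: one data block descriptor per physically contiguous segment.
 * contig_bytes models a buffer made of equally sized contiguous chunks,
 * which is a simplification of a real scatterlist.
 */
static uint64_t sgl_descriptors(uint64_t len, uint64_t contig_bytes)
{
	assert(len > 0 && contig_bytes > 0);
	return (len + contig_bytes - 1) / contig_bytes;
}
```

Under these assumptions, a page-aligned transfer of NVME_MAX_KB_SZ KiB (8 MiB) gives prp_entries(0, 8 MiB) = 2048, while the same buffer split into 64 KiB contiguous chunks gives sgl_descriptors(8 MiB, 64 KiB) = 128, which matches NVME_MAX_SEGS. A fully contiguous 8 MiB buffer would need a single SGL descriptor.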