Date: Tue, 22 Apr 2025 06:39:56 +0200
From: Christoph Hellwig
To: Leon Romanovsky
Cc: Marek Szyprowski, Jens Axboe, Christoph Hellwig, Keith Busch,
	Kanchan Joshi, Jake Edge, Jonathan Corbet, Jason Gunthorpe,
	Zhu Yanjun, Robin Murphy, Joerg Roedel, Will Deacon, Sagi Grimberg,
	Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas, Shameer Kolothum,
	Kevin Tian, Alex Williamson, Jérôme Glisse, Andrew Morton,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	Niklas Schnelle, Chuck Lever, Luis Chamberlain, Matthew Wilcox,
	Dan Williams, Chaitanya Kulkarni, Nitesh Shetty, Leon Romanovsky
Subject: Re: [PATCH v8 24/24] nvme-pci: optimize single-segment handling
Message-ID: <20250422043955.GA28077@lst.de>
References: <670389227a033bd5b7c5aa55191aac9943244028.1744825142.git.leon@kernel.org>
In-Reply-To: <670389227a033bd5b7c5aa55191aac9943244028.1744825142.git.leon@kernel.org>

On Fri, Apr 18, 2025 at 09:47:54AM +0300, Leon Romanovsky wrote:
> From: Kanchan Joshi
>
> blk_rq_dma_map API is costly for single-segment requests.
> Avoid using it and map the bio_vec directly.

This needs to be folded into the earlier patches or split prep patches
instead of undoing work done earlier, preferably combined with a bit of
code movement so that the new nvme_try_setup_prp_simple stays in the same
place as before and the diff shows it reusing code.  E.g.
change "nvme-pci: use a better encoding for small prp pool allocations"
to already use the flags instead of my boolean, and maybe include abort
in the flags instead of using a separate bool so that we don't increase
the iod size.  Slot in a new patch after that which drops the single SGL
segment fastpath if we think we don't need that, although if we need the
PRP one I suspect that one would still be very helpful as well.  Add a
patch if we want the try_ version, although when keeping the
optimization for SGLs as well that area will look a bit different.

I'm happy to give away my patch authorship credits if that helps with the
folding.
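
To illustrate the point about folding the abort state into the flags
rather than adding another bool, here is a minimal userspace sketch.  All
names in it (demo_iod_*, DEMO_IOD_*) are hypothetical and made up for this
example; it is not the actual struct nvme_iod layout, only a rough picture
of why one flags byte does not grow with each new state bit while separate
bools do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only (not the driver's names). */
enum demo_iod_flag {
	DEMO_IOD_LARGE_DESCRIPTORS = 1U << 0, /* descriptors from the large pool */
	DEMO_IOD_SINGLE_SEGMENT    = 1U << 1, /* request mapped as one segment */
	DEMO_IOD_ABORTED           = 1U << 2, /* abort state folded into flags */
};

/* Layout with one bool per state: every new state adds a byte. */
struct demo_iod_bools {
	uint8_t nr_descriptors;
	bool large_descriptors;
	bool single_segment;
	bool aborted;
};

/* Layout with a single flags byte: new states only consume a bit. */
struct demo_iod_flags {
	uint8_t nr_descriptors;
	uint8_t flags;
};

/* The flags layout is never larger than the per-bool layout. */
_Static_assert(sizeof(struct demo_iod_flags) <= sizeof(struct demo_iod_bools),
	       "flags layout should not be larger than the bool layout");

/* Set one flag bit without disturbing the others. */
static inline void demo_iod_set(struct demo_iod_flags *iod, uint8_t flag)
{
	iod->flags |= flag;
}

/* Return true if the given flag bit is set. */
static inline bool demo_iod_test(const struct demo_iod_flags *iod, uint8_t flag)
{
	return (iod->flags & flag) != 0;
}
```

Whether the real driver can treat the abort state as a plain flag bit
depends on how that field is accessed there, which this sketch does not
try to model.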