From: lizhe.67@bytedance.com
To: alex.williamson@redhat.com, akpm@linux-foundation.org, david@redhat.com, peterx@redhat.com, jgg@ziepe.ca
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lizhe.67@bytedance.com
Subject: [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote()
Date: Fri, 4 Jul 2025 14:26:02 +0800
Message-ID: <20250704062602.33500-6-lizhe.67@bytedance.com>
In-Reply-To: <20250704062602.33500-1-lizhe.67@bytedance.com>
References: <20250704062602.33500-1-lizhe.67@bytedance.com>

From: Li Zhe

When
vfio_unpin_pages_remote() is called with a range of addresses that
includes large folios, the function currently performs individual
put_pfn() operations for each page. This can lead to significant
performance overheads, especially when dealing with large ranges of
pages.

It would be very rare for reserved and non-reserved PFNs to be mixed
within the same range. So this patch utilizes the has_rsvd variable
introduced in the previous patch to determine whether batch put_pfn()
operations can be performed. Moreover, compared to put_pfn(),
unpin_user_page_range_dirty_lock() is capable of handling large folio
scenarios more efficiently.

The performance test results for completing the 16G VFIO IOMMU DMA
unmapping are as follows.

Base(v6.16-rc4):
./vfio-pci-mem-dma-map 0000:03:00.0 16
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.135 s (118.6 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.136 s (117.3 GB/s)

With this patchset:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.045 s (357.0 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.288 s (55.6 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.045 s (353.9 GB/s)

For large folios, the time for the VFIO UNMAP DMA item drops by over
66% (0.135 s to 0.045 s). For small folios, the performance test
results appear to show a slight improvement (0.312 s to 0.288 s).

Suggested-by: Jason Gunthorpe
Signed-off-by: Li Zhe
---
 drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 13c5667d431c..3971539b0d67 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -792,17 +792,29 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	return pinned;
 }
 
+static inline void put_valid_unreserved_pfns(unsigned long start_pfn,
+		unsigned long npage, int prot)
+{
+	unpin_user_page_range_dirty_lock(pfn_to_page(start_pfn), npage,
+					 prot & IOMMU_WRITE);
+}
+
 static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
 	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
-	long i;
 
-	for (i = 0; i < npage; i++)
-		if (put_pfn(pfn++, dma->prot))
-			unlocked++;
+	if (dma->has_rsvd) {
+		long i;
+		for (i = 0; i < npage; i++)
+			if (put_pfn(pfn++, dma->prot))
+				unlocked++;
+	} else {
+		put_valid_unreserved_pfns(pfn, npage, dma->prot);
+		unlocked = npage;
+	}
 
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
 
-- 
2.20.1
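
For readers who want a feel for where the saving comes from, the
following is a userspace toy model, not kernel code. It assumes that
the batched helper touches each folio's pin count once per folio
rather than once per page, which is my understanding of how
unpin_user_page_range_dirty_lock() walks a range. Every name prefixed
with toy_ is hypothetical and exists only for this illustration. The
"updates" counter is a rough proxy for the number of atomic pin-count
operations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: nr_pages contiguous PFNs, one pin count. */
struct toy_folio {
	unsigned long start_pfn;
	unsigned long nr_pages;
	long pincount;
	bool reserved;	/* models a reserved PFN range that was never pinned */
};

/* Number of separate pin-count updates, a proxy for atomic operations. */
struct toy_stats {
	unsigned long updates;
};

/* Linear search for the folio that contains pfn; NULL if none does. */
static struct toy_folio *toy_lookup(struct toy_folio *folios, size_t nr,
				    unsigned long pfn)
{
	for (size_t i = 0; i < nr; i++)
		if (pfn >= folios[i].start_pfn &&
		    pfn - folios[i].start_pfn < folios[i].nr_pages)
			return &folios[i];
	return NULL;
}

/* Per-page path, analogous to the put_pfn() loop: one update per page. */
static long toy_unpin_per_page(struct toy_folio *folios, size_t nr,
			       unsigned long pfn, unsigned long npage,
			       struct toy_stats *stats)
{
	long unlocked = 0;

	for (unsigned long i = 0; i < npage; i++, pfn++) {
		struct toy_folio *f = toy_lookup(folios, nr, pfn);

		if (!f || f->reserved)
			continue;	/* nothing was pinned for this PFN */
		f->pincount--;
		stats->updates++;
		unlocked++;
	}
	return unlocked;
}

/* Batched path: one update per folio that the range overlaps. */
static long toy_unpin_range(struct toy_folio *folios, size_t nr,
			    unsigned long pfn, unsigned long npage,
			    struct toy_stats *stats)
{
	long unlocked = 0;

	while (npage) {
		struct toy_folio *f = toy_lookup(folios, nr, pfn);
		unsigned long n;

		if (!f)
			break;	/* toy model: stop at the first unknown PFN */
		n = f->start_pfn + f->nr_pages - pfn;	/* pages left in folio */
		if (n > npage)
			n = npage;
		f->pincount -= (long)n;
		stats->updates++;
		unlocked += (long)n;
		pfn += n;
		npage -= n;
	}
	return unlocked;
}

/* Dispatch mirroring the patch: batch only when no reserved PFNs exist. */
static long toy_unpin_pages_remote(struct toy_folio *folios, size_t nr,
				   unsigned long pfn, unsigned long npage,
				   bool has_rsvd, struct toy_stats *stats)
{
	if (has_rsvd)
		return toy_unpin_per_page(folios, nr, pfn, npage, stats);
	return toy_unpin_range(folios, nr, pfn, npage, stats);
}
```

With two 512-page folios and a 1024-page range, calling
toy_unpin_pages_remote() with has_rsvd set to false performs 2
pin-count updates, while the per-page path performs 1024. The model
ignores dirtying, locking and accounting, so it only illustrates the
difference in update counts, not the measured timings above.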