From mboxrd@z Thu Jan 1 00:00:00 1970
From: lizhe.67@bytedance.com
To: alex.williamson@redhat.com, akpm@linux-foundation.org, david@redhat.com,
	jgg@ziepe.ca, peterx@redhat.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	lizhe.67@bytedance.com
Subject: [PATCH v3 5/5] vfio/type1: optimize vfio_unpin_pages_remote()
Date: Mon, 7 Jul 2025 14:49:50 +0800
Message-ID: <20250707064950.72048-6-lizhe.67@bytedance.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20250707064950.72048-1-lizhe.67@bytedance.com>
References: <20250707064950.72048-1-lizhe.67@bytedance.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Li Zhe

When vfio_unpin_pages_remote()
is called with a range of addresses that includes large folios, the
function currently performs an individual put_pfn() operation for each
page. This can lead to significant performance overhead, especially
when dealing with large ranges of pages.

It would be very rare for reserved and non-reserved PFNs to be mixed
within the same range. So this patch uses the has_rsvd variable
introduced in the previous patch to determine whether batch put_pfn()
operations can be performed. Moreover, compared to put_pfn(),
unpin_user_page_range_dirty_lock() is capable of handling large folio
scenarios more efficiently.

The performance test results for completing the 16G VFIO IOMMU DMA
unmapping are as follows.

Base(v6.16-rc4):
./vfio-pci-mem-dma-map 0000:03:00.0 16
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.135 s (118.6 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.136 s (117.3 GB/s)

With this patchset:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.045 s (357.0 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.288 s (55.6 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.045 s (353.9 GB/s)

For large folios, we achieve an over 66% performance improvement in
the VFIO UNMAP DMA item. For small folios, the performance test
results appear to show a slight improvement.

Suggested-by: Jason Gunthorpe
Signed-off-by: Li Zhe
---
 drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 13c5667d431c..208576bd5ac3 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -792,17 +792,29 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	return pinned;
 }
 
+static inline void put_valid_unreserved_pfns(unsigned long start_pfn,
+		unsigned long npage, int prot)
+{
+	unpin_user_page_range_dirty_lock(pfn_to_page(start_pfn), npage,
+					 prot & IOMMU_WRITE);
+}
+
 static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
 	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
-	long i;
 
-	for (i = 0; i < npage; i++)
-		if (put_pfn(pfn++, dma->prot))
-			unlocked++;
+	if (dma->has_rsvd) {
+		unsigned long i;
+		for (i = 0; i < npage; i++)
+			if (put_pfn(pfn++, dma->prot))
+				unlocked++;
+	} else {
+		put_valid_unreserved_pfns(pfn, npage, dma->prot);
+		unlocked = npage;
+	}
 
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
 
-- 
2.20.1
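
The has_rsvd branch in the diff above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not
kernel code: struct model_page, model_mem[] and every model_*() function
are hypothetical stand-ins invented for illustration. It assumes that
put_pfn() returns 1 only for a pfn that is not reserved, and it models
the batched unpin as a single pass over the range. The point it tries
to show is that when a range holds no reserved pages, the batched path
reports the same unlocked count as the per-page loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct page; not a kernel type. */
struct model_page {
	bool reserved;	/* models a reserved pfn that is never pinned */
	int pins;	/* models the pin taken when the range was mapped */
};

#define MODEL_NPAGES 8
static struct model_page model_mem[MODEL_NPAGES];

/* Pin every non-reserved page once, as a DMA mapping would. */
static void model_pin_all(void)
{
	for (size_t i = 0; i < MODEL_NPAGES; i++)
		model_mem[i].pins = model_mem[i].reserved ? 0 : 1;
}

/* Sum of outstanding pins, used to check that nothing is leaked. */
static int model_total_pins(void)
{
	int total = 0;

	for (size_t i = 0; i < MODEL_NPAGES; i++)
		total += model_mem[i].pins;
	return total;
}

/* Models put_pfn(): unpin one page, return 1 only if it was unreserved. */
static int model_put_pfn(unsigned long pfn)
{
	if (model_mem[pfn].reserved)
		return 0;
	model_mem[pfn].pins--;
	return 1;
}

/* Models the batched call: one pass over a range with no reserved pages. */
static void model_unpin_range(unsigned long start, unsigned long npage)
{
	for (unsigned long i = 0; i < npage; i++)
		model_mem[start + i].pins--;
}

/* Models the patched vfio_unpin_pages_remote() branch on has_rsvd. */
static long model_unpin_remote(bool has_rsvd, unsigned long pfn,
			       unsigned long npage)
{
	long unlocked = 0;

	assert(pfn + npage <= MODEL_NPAGES);
	if (has_rsvd) {
		/* Slow path: reserved pfns may be present, check each one. */
		for (unsigned long i = 0; i < npage; i++)
			if (model_put_pfn(pfn++))
				unlocked++;
	} else {
		/* Fast path: every page is known to be unreserved. */
		model_unpin_range(pfn, npage);
		unlocked = (long)npage;
	}
	return unlocked;
}
```

Calling model_pin_all() and then model_unpin_remote(false, 0,
MODEL_NPAGES) on an array with no reserved entries should return
MODEL_NPAGES and leave model_total_pins() at zero, the same result as
the has_rsvd == true path. The model says nothing about the speedup
itself, which in the real patch comes from handling large folios in the
batched call.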