From: lizhe.67@bytedance.com
To: david@redhat.com
Cc: akpm@linux-foundation.org, alex.williamson@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lizhe.67@bytedance.com, peterx@redhat.com
Subject: Re: [PATCH v4 3/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio
Date: Wed, 18 Jun 2025 14:11:43 +0800
Message-ID: <20250618061143.6470-1-lizhe.67@bytedance.com>

On Tue, 17 Jun 2025 15:47:09 +0200, david@redhat.com wrote:

> > How do you think of this implementation?
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 242b05671502..eb91f99ea973 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2165,6 +2165,23 @@ static inline long folio_nr_pages(const struct folio *folio)
> >  	return folio_large_nr_pages(folio);
> >  }
> >
> > +/*
> > + * folio_remaining_pages - Counts the number of pages from a given
> > + * start page to the end of the folio.
> > + *
> > + * @folio: Pointer to folio
> > + * @start_page: The starting page from which to begin counting.
> > + *
> > + * Returned number includes the provided start page.
> > + *
> > + * The caller must ensure that @start_page belongs to @folio.
> > + */
> > +static inline unsigned long folio_remaining_pages(struct folio *folio,
> > +		struct page *start_page)
> > +{
> > +	return folio_nr_pages(folio) - folio_page_idx(folio, start_page);
> > +}
> > +
> >  /* Only hugetlbfs can allocate folios larger than MAX_ORDER */
> >  #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
> >  #define MAX_FOLIO_NR_PAGES	(1UL << PUD_ORDER)
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 15debead5f5b..14ae2e3088b4 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -242,7 +242,7 @@ static inline struct folio *gup_folio_range_next(struct page *start,
> >
> >  	if (folio_test_large(folio))
> >  		nr = min_t(unsigned int, npages - i,
> > -			   folio_nr_pages(folio) - folio_page_idx(folio, next));
> > +			   folio_remaining_pages(folio, next));
> >
> >  	*ntails = nr;
> >  	return folio;
> > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > index b2fc5266e3d2..34e85258060c 100644
> > --- a/mm/page_isolation.c
> > +++ b/mm/page_isolation.c
> > @@ -96,7 +96,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
> >  				return page;
> >  			}
> >
> > -			skip_pages = folio_nr_pages(folio) - folio_page_idx(folio, page);
> > +			skip_pages = folio_remaining_pages(folio, page);
> >  			pfn += skip_pages - 1;
> >  			continue;
> >  		}
> > ---
>
> Guess I would have pulled the "min" in there,
> but passing something like ULONG_MAX for the page_isolation case also looks
> rather ugly.

Yes, the page_isolation case does not require the 'min' logic. Since the
kernel already has call sites that need the remaining-page count without
a min, we could keep folio_remaining_pages() as it is and add a custom
wrapper around it only in the specific places where the min is
necessary.

Following this line of thinking, the wrapper function in vfio would look
something like this:

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -801,16 +801,40 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	return pinned;
 }
 
+static inline unsigned long vfio_folio_remaining_pages(
+		struct folio *folio, struct page *start_page,
+		unsigned long max_pages)
+{
+	if (!folio_test_large(folio))
+		return 1;
+	return min(max_pages,
+		   folio_remaining_pages(folio, start_page));
+}
+
---

Does this approach seem acceptable to you?

Thanks,
Zhe
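
For illustration only, the arithmetic behind the two helpers discussed
above can be modeled in user space. This is a rough sketch, not kernel
code: struct model_folio and every model_* function are hypothetical
stand-ins, and start_idx plays the role of folio_page_idx(folio,
start_page).

```c
#include <assert.h>

/* Hypothetical stand-in for a folio: only the page count matters here. */
struct model_folio {
	unsigned long nr_pages;	/* 1 for a small folio, >1 for a large one */
};

/* Models folio_test_large(): "large" means more than one page. */
static inline int model_folio_test_large(const struct model_folio *folio)
{
	return folio->nr_pages > 1;
}

/*
 * Models folio_remaining_pages(): pages from start_idx to the end of
 * the folio, including the start page itself.
 */
static inline unsigned long
model_folio_remaining_pages(const struct model_folio *folio,
			    unsigned long start_idx)
{
	/* Mirrors the documented contract: the start page is in the folio. */
	assert(start_idx < folio->nr_pages);
	return folio->nr_pages - start_idx;
}

/* Models the proposed vfio wrapper: clamp the result to max_pages. */
static inline unsigned long
model_vfio_folio_remaining_pages(const struct model_folio *folio,
				 unsigned long start_idx,
				 unsigned long max_pages)
{
	unsigned long remaining;

	if (!model_folio_test_large(folio))
		return 1;
	remaining = model_folio_remaining_pages(folio, start_idx);
	return max_pages < remaining ? max_pages : remaining;
}
```

With a 512-page folio, starting at index 500 leaves 12 pages, so a
max_pages of 100 still yields 12; starting at index 10 leaves 502 pages,
so the same max_pages clamps the result to 100. The unclamped helper
covers callers such as the page_isolation case, which have no limit to
pass.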