From: lizhe.67@bytedance.com
To: akpm@linux-foundation.org
Cc: david@redhat.com, dev.jain@arm.com, jgg@ziepe.ca, jhubbard@nvidia.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, muchun.song@linux.dev,
	peterx@redhat.com, lizhe.67@bytedance.com
Subject: Re: [PATCH v4] gup: optimize longterm pin_user_pages() for large folio
Date: Fri, 6 Jun 2025 17:19:17 +0800
Message-ID: <20250606091917.91384-1-lizhe.67@bytedance.com>
In-Reply-To: <20250606023742.58344-1-lizhe.67@bytedance.com>
References: <20250606023742.58344-1-lizhe.67@bytedance.com>

On Fri, 6 Jun 2025 10:37:42 +0800, lizhe.67@bytedance.com wrote:

> In the current implementation of the longterm pin_user_pages() function,
> we invoke the collect_longterm_unpinnable_folios() function. This function
> iterates through the list to check whether each folio belongs to the
> "longterm_unpinnabled" category. The folios in this list essentially
> correspond to a contiguous region of user-space addresses, with each folio
> representing a physical address in increments of PAGESIZE. If this
> user-space address range is mapped with large folio, we can optimize the
> performance of function collect_longterm_unpinnable_folios() by reducing
> the using of READ_ONCE() invoked in
> pofs_get_folio()->page_folio()->_compound_head(). Also, we can simplify
> the logic of collect_longterm_unpinnable_folios(). Instead of comparing
> with prev_folio after calling pofs_get_folio(), we can check whether the
> next page is within the same folio.
>
> The performance test results, based on v6.15, obtained through the
> gup_test tool from the kernel source tree are as follows. We achieve an
> improvement of over 66% for large folio with pagesize=2M. For small folio,
> we have only observed a very slight degradation in performance.
>
> Without this patch:
>
> [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> TAP version 13
> 1..1
> # PIN_LONGTERM_BENCHMARK: Time: get:14391 put:10858 us#
> ok 1 ioctl status 0
> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> TAP version 13
> 1..1
> # PIN_LONGTERM_BENCHMARK: Time: get:130538 put:31676 us#
> ok 1 ioctl status 0
> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>
> With this patch:
>
> [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> TAP version 13
> 1..1
> # PIN_LONGTERM_BENCHMARK: Time: get:4867 put:10516 us#
> ok 1 ioctl status 0
> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> TAP version 13
> 1..1
> # PIN_LONGTERM_BENCHMARK: Time: get:131798 put:31328 us#
> ok 1 ioctl status 0
> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>
> Signed-off-by: Li Zhe
> ---
> Changelogs:
>
> v3->v4:
> - Fix some issues of code formatting.
>
> v2->v3:
> - Update performance test data based on v6.15.
> - Refine the description of the optimization approach in commit message.
> - Fix some issues of code formatting.
> - Fine-tune the conditions for entering the optimization path.
>
> v1->v2:
> - Modify some unreliable code.
> - Update performance test data.
>
> v3 patch: https://lore.kernel.org/all/20250605033430.83142-1-lizhe.67@bytedance.com/
> v2 patch: https://lore.kernel.org/all/20250604031536.9053-1-lizhe.67@bytedance.com/
> v1 patch: https://lore.kernel.org/all/20250530092351.32709-1-lizhe.67@bytedance.com/
>
>  mm/gup.c | 37 +++++++++++++++++++++++++++++--------
>  1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 84461d384ae2..be968640b935 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2317,6 +2317,31 @@ static void pofs_unpin(struct pages_or_folios *pofs)
>  		unpin_user_pages(pofs->pages, pofs->nr_entries);
>  }
>
> +static struct folio *pofs_next_folio(struct folio *folio,
> +		struct pages_or_folios *pofs, long *index_ptr)
> +{
> +	long i = *index_ptr + 1;
> +
> +	if (!pofs->has_folios && folio_test_large(folio)) {
> +		const unsigned long start_pfn = folio_pfn(folio);
> +		const unsigned long end_pfn = start_pfn + folio_nr_pages(folio);
> +
> +		for (; i < pofs->nr_entries; i++) {
> +			unsigned long pfn = page_to_pfn(pofs->pages[i]);
> +
> +			/* Is this page part of this folio? */
> +			if (pfn < start_pfn || pfn >= end_pfn)
> +				break;
> +		}
> +	}
> +
> +	if (unlikely(i == pofs->nr_entries))
> +		return NULL;
> +	*index_ptr = i;
> +
> +	return pofs_get_folio(pofs, i);
> +}
> +
>  /*
>   * Returns the number of collected folios. Return value is always >= 0.
>   */
> @@ -2324,16 +2349,12 @@ static void collect_longterm_unpinnable_folios(
>  		struct list_head *movable_folio_list,
>  		struct pages_or_folios *pofs)
>  {
> -	struct folio *prev_folio = NULL;
>  	bool drain_allow = true;
> -	unsigned long i;
> -
> -	for (i = 0; i < pofs->nr_entries; i++) {
> -		struct folio *folio = pofs_get_folio(pofs, i);
> +	struct folio *folio;
> +	long i = 0;
>
> -		if (folio == prev_folio)
> -			continue;
> -		prev_folio = folio;
> +	for (folio = pofs_get_folio(pofs, i); folio;
> +			folio = pofs_next_folio(folio, pofs, &i)) {
>
>  		if (folio_is_longterm_pinnable(folio))
>  			continue;

Hi Andrew,

I apologize for the inconvenience I've caused. It seems that there is
still one formatting issue with the patch (thanks to David for pointing
it out). We need to apply the following fixup.

Thank you for your time and patience!

diff --git a/mm/gup.c b/mm/gup.c
index be968640b935..85112c904a4d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2354,7 +2354,7 @@ static void collect_longterm_unpinnable_folios(
 	long i = 0;
 
 	for (folio = pofs_get_folio(pofs, i); folio;
-			folio = pofs_next_folio(folio, pofs, &i)) {
+	     folio = pofs_next_folio(folio, pofs, &i)) {
 
 		if (folio_is_longterm_pinnable(folio))
 			continue;

Thanks,
Zhe
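
For anyone who wants to play with the skip logic outside the kernel, here
is a minimal userspace sketch of the PFN-range check that
pofs_next_folio() performs. It is only a model, not the kernel code:
struct demo_folio, demo_next_index() and demo_count_lookups() are
hypothetical names, the page array is reduced to a plain array of PFNs,
and every folio is assumed to have the same size and to be aligned to
that size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a run of nr_pages contiguous PFNs. */
struct demo_folio {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

/*
 * Return the index of the first entry after 'i' whose PFN lies outside
 * 'folio', or nr_entries if there is no such entry.
 */
static size_t demo_next_index(const struct demo_folio *folio,
			      const unsigned long *pfns, size_t nr_entries,
			      size_t i)
{
	const unsigned long end_pfn = folio->start_pfn + folio->nr_pages;

	for (i++; i < nr_entries; i++) {
		/* Is this page part of this folio? */
		if (pfns[i] < folio->start_pfn || pfns[i] >= end_pfn)
			break;
	}
	return i;
}

/*
 * Count how many folio lookups a walk over pfns[] needs when every folio
 * spans folio_pages pages and is aligned to folio_pages (model assumption).
 */
static size_t demo_count_lookups(const unsigned long *pfns, size_t nr_entries,
				 unsigned long folio_pages)
{
	size_t lookups = 0;
	size_t i = 0;

	assert(folio_pages > 0);

	while (i < nr_entries) {
		struct demo_folio folio = {
			.start_pfn = pfns[i] - pfns[i] % folio_pages,
			.nr_pages = folio_pages,
		};

		lookups++;	/* models one pofs_get_folio() call */
		i = demo_next_index(&folio, pfns, nr_entries, i);
	}
	return lookups;
}
```

In this model, a contiguous run of 1024 PFNs backed by 512-page folios
needs two lookups instead of 1024, which is where the saving on the
large-folio path comes from. With one page per folio the loop degenerates
to one lookup per entry, the same as before.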