From: lizhe.67@bytedance.com
To: akpm@linux-foundation.org, david@redhat.com, jgg@ziepe.ca, jhubbard@nvidia.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, muchun.song@linux.dev, dev.jain@arm.com, lizhe.67@bytedance.com
Subject: [PATCH v4] gup: optimize longterm pin_user_pages() for large folio
Date: Fri, 6 Jun 2025 10:37:42 +0800
Message-ID: <20250606023742.58344-1-lizhe.67@bytedance.com>
X-Mailer: git-send-email 2.45.2

From: Li Zhe <lizhe.67@bytedance.com>

In the current implementation of the longterm pin_user_pages() function,
we invoke the collect_longterm_unpinnable_folios() function. This function
iterates through the list of pinned pages or folios to check whether each
folio belongs to the "longterm unpinnable" category.

The folios in this list essentially correspond to a contiguous region of
user-space addresses, with each entry representing a physical address in
increments of PAGESIZE.

If this user-space address range is mapped with a large folio, we can
optimize the performance of collect_longterm_unpinnable_folios() by
reducing the number of READ_ONCE() calls made through
pofs_get_folio()->page_folio()->_compound_head().

Also, we can simplify the logic of collect_longterm_unpinnable_folios().
Instead of comparing with prev_folio after calling pofs_get_folio(), we
can check whether the next page is within the same folio.

The performance test results, based on v6.15 and obtained with the
gup_test tool from the kernel source tree, are as follows. For large
folios with pagesize=2M, the get time drops from 14391 us to 4867 us, an
improvement of over 66%. For small folios, we have only observed a very
slight degradation in performance (130538 us to 131798 us).

Without this patch:

[root@localhost ~]# ./gup_test -HL -m 8192 -n 512
TAP version 13
1..1
# PIN_LONGTERM_BENCHMARK: Time: get:14391 put:10858 us#
ok 1 ioctl status 0
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

[root@localhost ~]# ./gup_test -LT -m 8192 -n 512
TAP version 13
1..1
# PIN_LONGTERM_BENCHMARK: Time: get:130538 put:31676 us#
ok 1 ioctl status 0
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

With this patch:

[root@localhost ~]# ./gup_test -HL -m 8192 -n 512
TAP version 13
1..1
# PIN_LONGTERM_BENCHMARK: Time: get:4867 put:10516 us#
ok 1 ioctl status 0
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

[root@localhost ~]# ./gup_test -LT -m 8192 -n 512
TAP version 13
1..1
# PIN_LONGTERM_BENCHMARK: Time: get:131798 put:31328 us#
ok 1 ioctl status 0
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
Changelogs:

v3->v4:
- Fix some issues of code formatting.

v2->v3:
- Update performance test data based on v6.15.
- Refine the description of the optimization approach in commit message.
- Fix some issues of code formatting.
- Fine-tune the conditions for entering the optimization path.

v1->v2:
- Modify some unreliable code.
- Update performance test data.

v3 patch: https://lore.kernel.org/all/20250605033430.83142-1-lizhe.67@bytedance.com/
v2 patch: https://lore.kernel.org/all/20250604031536.9053-1-lizhe.67@bytedance.com/
v1 patch: https://lore.kernel.org/all/20250530092351.32709-1-lizhe.67@bytedance.com/

 mm/gup.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 84461d384ae2..be968640b935 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2317,6 +2317,31 @@ static void pofs_unpin(struct pages_or_folios *pofs)
 		unpin_user_pages(pofs->pages, pofs->nr_entries);
 }
 
+static struct folio *pofs_next_folio(struct folio *folio,
+		struct pages_or_folios *pofs, long *index_ptr)
+{
+	long i = *index_ptr + 1;
+
+	if (!pofs->has_folios && folio_test_large(folio)) {
+		const unsigned long start_pfn = folio_pfn(folio);
+		const unsigned long end_pfn = start_pfn + folio_nr_pages(folio);
+
+		for (; i < pofs->nr_entries; i++) {
+			unsigned long pfn = page_to_pfn(pofs->pages[i]);
+
+			/* Is this page part of this folio? */
+			if (pfn < start_pfn || pfn >= end_pfn)
+				break;
+		}
+	}
+
+	if (unlikely(i == pofs->nr_entries))
+		return NULL;
+	*index_ptr = i;
+
+	return pofs_get_folio(pofs, i);
+}
+
 /*
  * Returns the number of collected folios. Return value is always >= 0.
  */
@@ -2324,16 +2349,12 @@ static void collect_longterm_unpinnable_folios(
 		struct list_head *movable_folio_list,
 		struct pages_or_folios *pofs)
 {
-	struct folio *prev_folio = NULL;
 	bool drain_allow = true;
-	unsigned long i;
-
-	for (i = 0; i < pofs->nr_entries; i++) {
-		struct folio *folio = pofs_get_folio(pofs, i);
+	struct folio *folio;
+	long i = 0;
 
-		if (folio == prev_folio)
-			continue;
-		prev_folio = folio;
+	for (folio = pofs_get_folio(pofs, i); folio;
+	     folio = pofs_next_folio(folio, pofs, &i)) {
 
 		if (folio_is_longterm_pinnable(folio))
 			continue;
-- 
2.20.1
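
For readers who want to see the effect of the pfn-range skip outside the
kernel, the following is a rough user-space sketch and not part of the
patch. The names model_folio, model_page, model_page_folio(),
walk_compare_prev(), walk_skip_by_pfn() and fill_pages() are hypothetical
stand-ins invented for illustration, and the sketch assumes every folio is
large and the array holds pages rather than folios, so it omits the
!pofs->has_folios && folio_test_large() check the patch performs. It only
counts how often the modelled page_folio() lookup would run under each
scheme.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, not the kernel's struct folio/page. */
struct model_folio {
	unsigned long start_pfn;	/* first pfn covered by the folio */
	unsigned long nr_pages;		/* number of base pages in the folio */
};

struct model_page {
	unsigned long pfn;
	struct model_folio *folio;	/* what page_folio() would resolve to */
};

/* Counts calls to the modelled page_folio() lookup. */
static unsigned long nr_lookups;

static struct model_folio *model_page_folio(const struct model_page *page)
{
	nr_lookups++;
	return page->folio;
}

/* Pre-patch shape: one lookup per entry, repeats of prev are skipped. */
static size_t walk_compare_prev(const struct model_page *pages, size_t n)
{
	const struct model_folio *prev = NULL;
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		const struct model_folio *folio = model_page_folio(&pages[i]);

		if (folio == prev)
			continue;
		prev = folio;
		visited++;
	}
	return visited;
}

/* Post-patch shape: one lookup per folio, then skip entries by pfn range. */
static size_t walk_skip_by_pfn(const struct model_page *pages, size_t n)
{
	size_t visited = 0;
	size_t i = 0;

	while (i < n) {
		const struct model_folio *folio = model_page_folio(&pages[i]);
		const unsigned long end_pfn = folio->start_pfn + folio->nr_pages;

		assert(folio->nr_pages > 0);
		visited++;
		/* Advance past every following entry inside this folio. */
		for (i++; i < n; i++) {
			if (pages[i].pfn < folio->start_pfn ||
			    pages[i].pfn >= end_pfn)
				break;
		}
	}
	return visited;
}

/* Fills pages[] with one entry per base page of each folio, in order. */
static size_t fill_pages(struct model_page *pages, struct model_folio *folios,
			 size_t nr_folios)
{
	size_t n = 0;

	for (size_t f = 0; f < nr_folios; f++)
		for (unsigned long p = 0; p < folios[f].nr_pages; p++) {
			pages[n].pfn = folios[f].start_pfn + p;
			pages[n].folio = &folios[f];
			n++;
		}
	return n;
}
```

With two 512-page folios filled in by fill_pages(), both walks visit 2
folios, but walk_compare_prev() performs 1024 modelled lookups while
walk_skip_by_pfn() performs 2. The real saving depends on the cost of
READ_ONCE() in _compound_head() relative to page_to_pfn(), which this
model does not measure.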