From: lizhe.67@bytedance.com
To: akpm@linux-foundation.org, david@redhat.com, jgg@ziepe.ca,
	jhubbard@nvidia.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	muchun.song@linux.dev, lizhe.67@bytedance.com
Subject: [PATCH] gup: optimize longterm pin_user_pages() for large folio
Date: Fri, 30 May 2025 17:23:51 +0800
Message-ID: <20250530092351.32709-1-lizhe.67@bytedance.com>

From: Li Zhe <lizhe.67@bytedance.com>

In the current implementation of longterm pin_user_pages(), we invoke
collect_longterm_unpinnable_folios(). This function iterates through the
list to check whether each folio belongs to the "longterm unpinnable"
category.
The folios in this list essentially correspond to a contiguous region of
user-space addresses, with each entry covering one PAGE_SIZE step of that
region. If this user-space address range is mapped with large folios, we
can optimize the performance of pin_user_pages() by reducing the number of
if-else branches and the frequency of memory accesses using READ_ONCE.
This patch leverages this approach to achieve performance improvements.

The performance test results obtained through the gup_test tool from the
kernel source tree are as follows. The get time drops by over 75%
(13623 us to 3386 us) for large folios with pagesize=2M. For normal pages,
we have only observed a very slight degradation in performance
(129733 us to 131652 us).

Without this patch:

    [root@localhost ~]# ./gup_test -HL -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:13623 put:10799 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
    [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:129733 put:31753 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

With this patch:

    [root@localhost ~]# ./gup_test -HL -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:3386 put:10844 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
    [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:131652 put:31393 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 mm/gup.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 84461d384ae2..8c11418036e2 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2317,6 +2317,25 @@ static void pofs_unpin(struct pages_or_folios *pofs)
 		unpin_user_pages(pofs->pages, pofs->nr_entries);
 }
 
+static struct folio *pofs_next_folio(struct folio *folio,
+		struct pages_or_folios *pofs, long *index_ptr)
+{
+	long i = *index_ptr + 1;
+	unsigned long nr_pages = folio_nr_pages(folio);
+
+	if (!pofs->has_folios)
+		while ((i < pofs->nr_entries) &&
+			/* Is this page part of this folio? */
+			(folio_page_idx(folio, pofs->pages[i]) < nr_pages))
+			i++;
+
+	if (unlikely(i == pofs->nr_entries))
+		return NULL;
+	*index_ptr = i;
+
+	return pofs_get_folio(pofs, i);
+}
+
 /*
  * Returns the number of collected folios. Return value is always >= 0.
  */
@@ -2324,16 +2343,12 @@ static void collect_longterm_unpinnable_folios(
 		struct list_head *movable_folio_list,
 		struct pages_or_folios *pofs)
 {
-	struct folio *prev_folio = NULL;
 	bool drain_allow = true;
-	unsigned long i;
-
-	for (i = 0; i < pofs->nr_entries; i++) {
-		struct folio *folio = pofs_get_folio(pofs, i);
+	long i = 0;
+	struct folio *folio;
 
-		if (folio == prev_folio)
-			continue;
-		prev_folio = folio;
+	for (folio = pofs_get_folio(pofs, 0); folio;
+			folio = pofs_next_folio(folio, pofs, &i)) {
 
 		if (folio_is_longterm_pinnable(folio))
 			continue;
-- 
2.20.1
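
For readers who want to see the effect of the skip loop outside the
kernel, here is a minimal userspace sketch in C. It is not the patch
itself: struct model_folio, model_page_idx(), model_lookup() and
model_count_folio_visits() are hypothetical names made up for this
illustration, pages are modelled as bare PFNs, and a folio is assumed to
be a run of contiguous PFNs. It only counts how often the per-folio check
would run when entries belonging to the folio just visited are skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a folio: a run of contiguous PFNs. */
struct model_folio {
	unsigned long first_pfn;
	unsigned long nr_pages;
};

/*
 * Offset of pfn inside the folio. The subtraction is unsigned, so a pfn
 * below first_pfn wraps to a huge value and a single "< nr_pages" test
 * rejects pages on either side of the folio.
 */
static unsigned long model_page_idx(const struct model_folio *folio,
				    unsigned long pfn)
{
	return pfn - folio->first_pfn;
}

/* Linear search for the folio that contains pfn (illustration only). */
static const struct model_folio *model_lookup(const struct model_folio *folios,
					      size_t nr_folios,
					      unsigned long pfn)
{
	for (size_t i = 0; i < nr_folios; i++)
		if (model_page_idx(&folios[i], pfn) < folios[i].nr_pages)
			return &folios[i];
	return NULL;
}

/*
 * Walk the pfn array and count how many times a folio is actually
 * examined. After each visit, skip every following entry that still
 * falls inside the same folio.
 */
static size_t model_count_folio_visits(const unsigned long *pfns,
				       size_t nr_entries,
				       const struct model_folio *folios,
				       size_t nr_folios)
{
	size_t visits = 0;
	size_t i = 0;

	while (i < nr_entries) {
		const struct model_folio *folio =
			model_lookup(folios, nr_folios, pfns[i]);

		/* Assumption of this sketch: every pfn belongs to a folio. */
		assert(folio != NULL);
		visits++;	/* the expensive per-folio check would go here */

		i++;
		while (i < nr_entries &&
		       model_page_idx(folio, pfns[i]) < folio->nr_pages)
			i++;
	}
	return visits;
}
```

With 1024 consecutive PFNs covered by two 512-page folios (2M each with
4K pages), model_count_folio_visits() returns 2, whereas a loop that
visits every entry would run 1024 iterations. With single-page folios the
count equals the number of entries, which is consistent with the normal
page case seeing no gain.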