From: lizhe.67@bytedance.com
To: akpm@linux-foundation.org, david@redhat.com, jgg@ziepe.ca, jhubbard@nvidia.com, peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, dev.jain@arm.com, muchun.song@linux.dev, lizhe.67@bytedance.com
Subject: [PATCH v2] gup: optimize longterm pin_user_pages() for large folio
Date: Wed, 4 Jun 2025 11:15:36 +0800
Message-ID: <20250604031536.9053-1-lizhe.67@bytedance.com>

From: Li Zhe <lizhe.67@bytedance.com>

In the current implementation of longterm pin_user_pages(), we invoke
collect_longterm_unpinnable_folios(). This function iterates through the
list to check whether each folio belongs to the "longterm_unpinnable"
category.
The folios in this list essentially correspond to a contiguous region of
user-space addresses, with each entry covering one PAGE_SIZE step of that
region. If this user-space address range is mapped with large folios,
many consecutive entries belong to the same folio, so we can speed up
pin_user_pages() by reducing the number of memory accesses made through
READ_ONCE(). This patch leverages this approach to achieve performance
improvements.

Performance results from the gup_test tool in the kernel source tree
follow. For large folios with pagesize=2M, the get time improves by over
70% (13623 us to 4075 us). For normal pages, we observe only a very
slight degradation (129733 us to 130727 us).

Without this patch:

    [root@localhost ~]# ./gup_test -HL -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:13623 put:10799 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
    [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:129733 put:31753 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

With this patch:

    [root@localhost ~]# ./gup_test -HL -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:4075 put:10792 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
    [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
    TAP version 13
    1..1
    # PIN_LONGTERM_BENCHMARK: Time: get:130727 put:31763 us#
    ok 1 ioctl status 0
    # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
Changelogs:

v1->v2:
- Modify some unreliable code.
- Update performance test data.

v1 patch: https://lore.kernel.org/all/20250530092351.32709-1-lizhe.67@bytedance.com/

 mm/gup.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 84461d384ae2..57fd324473a1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2317,6 +2317,31 @@ static void pofs_unpin(struct pages_or_folios *pofs)
 		unpin_user_pages(pofs->pages, pofs->nr_entries);
 }
 
+static struct folio *pofs_next_folio(struct folio *folio,
+				struct pages_or_folios *pofs, long *index_ptr)
+{
+	long i = *index_ptr + 1;
+
+	if (!pofs->has_folios) {
+		unsigned long start_pfn = folio_pfn(folio);
+		unsigned long end_pfn = start_pfn + folio_nr_pages(folio);
+
+		for (; i < pofs->nr_entries; i++) {
+			unsigned long pfn = page_to_pfn(pofs->pages[i]);
+
+			/* Is this page part of this folio? */
+			if ((pfn < start_pfn) || (pfn >= end_pfn))
+				break;
+		}
+	}
+
+	if (unlikely(i == pofs->nr_entries))
+		return NULL;
+	*index_ptr = i;
+
+	return pofs_get_folio(pofs, i);
+}
+
 /*
  * Returns the number of collected folios. Return value is always >= 0.
  */
@@ -2324,16 +2349,12 @@ static void collect_longterm_unpinnable_folios(
 		struct list_head *movable_folio_list,
 		struct pages_or_folios *pofs)
 {
-	struct folio *prev_folio = NULL;
 	bool drain_allow = true;
-	unsigned long i;
-
-	for (i = 0; i < pofs->nr_entries; i++) {
-		struct folio *folio = pofs_get_folio(pofs, i);
+	long i = 0;
+	struct folio *folio;
 
-		if (folio == prev_folio)
-			continue;
-		prev_folio = folio;
+	for (folio = pofs_get_folio(pofs, 0); folio;
+			folio = pofs_next_folio(folio, pofs, &i)) {
 
 		if (folio_is_longterm_pinnable(folio))
 			continue;
-- 
2.20.1
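
For readers who want to see the effect of the pfn-range skip outside the
kernel, here is a minimal user-space sketch. It is not the kernel code:
struct model_folio, model_page_folio(), model_lookups, visit_per_page()
and visit_per_folio() are hypothetical names invented for this
illustration, and the lookup counter only stands in for the per-page
page-to-folio resolution that the patch avoids. It assumes every pfn in
the input belongs to one of the modeled folios.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a folio: a run of contiguous pfns. */
struct model_folio {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

/* Counts simulated page-to-folio lookups (the cost being reduced). */
static unsigned long model_lookups;

/* Models a page-to-folio lookup: return the folio containing @pfn, or NULL. */
static const struct model_folio *
model_page_folio(const struct model_folio *folios, size_t nr_folios,
		 unsigned long pfn)
{
	model_lookups++;
	for (size_t i = 0; i < nr_folios; i++) {
		unsigned long end = folios[i].start_pfn + folios[i].nr_pages;

		if (pfn >= folios[i].start_pfn && pfn < end)
			return &folios[i];
	}
	return NULL;
}

/* Old scheme: one lookup per entry, repeats of the previous folio skipped. */
static size_t visit_per_page(const struct model_folio *folios,
			     size_t nr_folios, const unsigned long *pfns,
			     size_t nr)
{
	const struct model_folio *prev = NULL;
	size_t visited = 0;

	for (size_t i = 0; i < nr; i++) {
		const struct model_folio *folio =
			model_page_folio(folios, nr_folios, pfns[i]);

		if (folio == prev)
			continue;
		prev = folio;
		visited++;
	}
	return visited;
}

/* New scheme: one lookup per folio, then skip entries by pfn range. */
static size_t visit_per_folio(const struct model_folio *folios,
			      size_t nr_folios, const unsigned long *pfns,
			      size_t nr)
{
	size_t i = 0, visited = 0;

	while (i < nr) {
		const struct model_folio *folio =
			model_page_folio(folios, nr_folios, pfns[i]);
		unsigned long end;

		assert(folio); /* the model requires every pfn to be covered */
		end = folio->start_pfn + folio->nr_pages;
		visited++;

		/* Is the next entry still part of this folio? */
		for (i++; i < nr; i++)
			if (pfns[i] < folio->start_pfn || pfns[i] >= end)
				break;
	}
	return visited;
}
```

With two 512-page folios described by 1024 consecutive entries, both
functions visit each folio once, but visit_per_page() performs 1024
simulated lookups while visit_per_folio() performs 2. With single-page
folios the two schemes perform the same number of lookups, and the new
one adds a range check per entry, which fits the slight slowdown
reported for normal pages.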