From: lizhe.67@bytedance.com
To: david@redhat.com
Cc: akpm@linux-foundation.org, dev.jain@arm.com, jgg@ziepe.ca,
 jhubbard@nvidia.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 lizhe.67@bytedance.com, muchun.song@linux.dev, peterx@redhat.com
Subject: Re: [PATCH v3] gup: optimize longterm pin_user_pages() for large folio
Date: Thu, 5 Jun 2025 16:23:05 +0800
Message-ID: <20250605082305.5280-1-lizhe.67@bytedance.com>
In-Reply-To: <9b04871b-33fb-42cb-840b-88fdb6b93c48@redhat.com>
References: <9b04871b-33fb-42cb-840b-88fdb6b93c48@redhat.com>

On Thu, 5 Jun 2025 09:51:06 +0200, david@redhat.com wrote:

> On 05.06.25 05:34, lizhe.67@bytedance.com wrote:
> > From: Li Zhe
> >
> > In the current implementation of the longterm pin_user_pages() function,
> > we invoke the collect_longterm_unpinnable_folios() function. This function
> > iterates through the list to check whether each folio belongs to the
> > "longterm_unpinnabled" category. The folios in this list essentially
> > correspond to a contiguous region of user-space addresses, with each folio
> > representing a physical address in increments of PAGESIZE. If this
> > user-space address range is mapped with large folio, we can optimize the
> > performance of function collect_longterm_unpinnable_folios() by reducing
> > the using of READ_ONCE() invoked in
> > pofs_get_folio()->page_folio()->_compound_head(). Also, we can simplify
> > the logic of collect_longterm_unpinnable_folios(). Instead of comparing
> > with prev_folio after calling pofs_get_folio(), we can check whether the
> > next page is within the same folio.
> >
> > The performance test results, based on v6.15, obtained through the
> > gup_test tool from the kernel source tree are as follows. We achieve an
> > improvement of over 66% for large folio with pagesize=2M. For small folio,
> > we have only observed a very slight degradation in performance.
> >
> > Without this patch:
> >
> >     [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> >     TAP version 13
> >     1..1
> >     # PIN_LONGTERM_BENCHMARK: Time: get:14391 put:10858 us#
> >     ok 1 ioctl status 0
> >     # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> >     [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> >     TAP version 13
> >     1..1
> >     # PIN_LONGTERM_BENCHMARK: Time: get:130538 put:31676 us#
> >     ok 1 ioctl status 0
> >     # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> >
> > With this patch:
> >
> >     [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> >     TAP version 13
> >     1..1
> >     # PIN_LONGTERM_BENCHMARK: Time: get:4867 put:10516 us#
> >     ok 1 ioctl status 0
> >     # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> >     [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> >     TAP version 13
> >     1..1
> >     # PIN_LONGTERM_BENCHMARK: Time: get:131798 put:31328 us#
> >     ok 1 ioctl status 0
> >     # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> >
> > Signed-off-by: Li Zhe
> > ---
> > Changelogs:
> >
> > v2->v3:
> > - Update performance test data based on v6.15.
> > - Refine the description of the optimization approach in commit message.
> > - Fix some issues of code formatting.
> > - Fine-tune the conditions for entering the optimization path.
> >
> > v1->v2:
> > - Modify some unreliable code.
> > - Update performance test data.
> >
> > v2 patch: https://lore.kernel.org/all/20250604031536.9053-1-lizhe.67@bytedance.com/
> > v1 patch: https://lore.kernel.org/all/20250530092351.32709-1-lizhe.67@bytedance.com/
> >
> >  mm/gup.c | 37 +++++++++++++++++++++++++++++--------
> >  1 file changed, 29 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 84461d384ae2..9fbe3592b5fc 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -2317,6 +2317,31 @@ static void pofs_unpin(struct pages_or_folios *pofs)
> >  		unpin_user_pages(pofs->pages, pofs->nr_entries);
> >  }
> >
> > +static struct folio *pofs_next_folio(struct folio *folio,
> > +				struct pages_or_folios *pofs, long *index_ptr)
>
> ^ use two tabs here
>
> > +{
> > +	long i = *index_ptr + 1;
> > +
> > +	if (!pofs->has_folios && folio_test_large(folio)) {
> > +		const unsigned long start_pfn = folio_pfn(folio);
> > +		const unsigned long end_pfn = start_pfn + folio_nr_pages(folio);
> > +
> > +		for (; i < pofs->nr_entries; i++) {
> > +			unsigned long pfn = page_to_pfn(pofs->pages[i]);
> > +
> > +			/* Is this page part of this folio? */
> > +			if (pfn < start_pfn || pfn >= end_pfn)
> > +				break;
> > +		}
> > +	}
> > +
> > +	if (unlikely(i == pofs->nr_entries))
> > +		return NULL;
> > +	*index_ptr = i;
> > +
> > +	return pofs_get_folio(pofs, i);
> > +}
> > +
> >  /*
> >   * Returns the number of collected folios. Return value is always >= 0.
> >   */
> > @@ -2324,16 +2349,12 @@ static void collect_longterm_unpinnable_folios(
> >  		struct list_head *movable_folio_list,
> >  		struct pages_or_folios *pofs)
> >  {
> > -	struct folio *prev_folio = NULL;
> >  	bool drain_allow = true;
> > -	unsigned long i;
> > -
> > -	for (i = 0; i < pofs->nr_entries; i++) {
> > -		struct folio *folio = pofs_get_folio(pofs, i);
> > +	struct folio *folio;
> > +	long i = 0;
> >
> > -		if (folio == prev_folio)
> > -			continue;
> > -		prev_folio = folio;
> > +	for (folio = pofs_get_folio(pofs, i); folio;
> > +			folio = pofs_next_folio(folio, pofs, &i)) {
>
> As discussed, align both "folios" (using tabs and then spaces)
>
> Acked-by: David Hildenbrand

Thank you very much for your review. I will fix the issue in the v4 patch.

Thanks,
Zhe
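
For anyone who wants to try the skip logic of the quoted pofs_next_folio()
outside the kernel, below is a minimal user-space sketch. It is not the patch
itself and makes several assumptions: struct toy_folio, toy_pfn_folio(),
toy_next_index() and toy_count_folio_visits() are hypothetical names, pages
are represented only by their PFNs, the folio lookup is a plain linear scan
standing in for page_folio(), and the PFN range check is applied to small
folios as well, which the quoted code does not do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a run of physically contiguous pages. */
struct toy_folio {
	unsigned long start_pfn;	/* first page frame number covered */
	unsigned long nr_pages;		/* 1 for a small folio, >1 for a large one */
};

/* Find the folio containing @pfn. Linear scan, for illustration only. */
static const struct toy_folio *toy_pfn_folio(const struct toy_folio *folios,
		size_t nr_folios, unsigned long pfn)
{
	for (size_t k = 0; k < nr_folios; k++) {
		if (pfn >= folios[k].start_pfn &&
		    pfn < folios[k].start_pfn + folios[k].nr_pages)
			return &folios[k];
	}
	return NULL;
}

/*
 * Return the index of the first entry after @i whose PFN lies outside
 * @folio, or @nr_entries if every remaining entry is inside it.
 */
static size_t toy_next_index(const struct toy_folio *folio,
		const unsigned long *pfns, size_t nr_entries, size_t i)
{
	const unsigned long start_pfn = folio->start_pfn;
	const unsigned long end_pfn = start_pfn + folio->nr_pages;

	for (i++; i < nr_entries; i++) {
		/* Is this page part of this folio? */
		if (pfns[i] < start_pfn || pfns[i] >= end_pfn)
			break;
	}
	return i;
}

/* Walk @pfns and count how many folio lookups the walk performs. */
static size_t toy_count_folio_visits(const struct toy_folio *folios,
		size_t nr_folios, const unsigned long *pfns, size_t nr_entries)
{
	size_t visits = 0;
	size_t i = 0;

	while (i < nr_entries) {
		const struct toy_folio *folio =
			toy_pfn_folio(folios, nr_folios, pfns[i]);

		/* The sketch assumes every PFN belongs to some folio. */
		assert(folio != NULL);
		visits++;
		/* Skip all following entries that fall inside this folio. */
		i = toy_next_index(folio, pfns, nr_entries, i);
	}
	return visits;
}
```

With an array holding the 512 consecutive PFNs of one 2M folio followed by
two small folios, toy_count_folio_visits() performs three lookups instead of
514, which is the effect the patch is after for the large folio case.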