From: lizhe.67@bytedance.com
To: david@redhat.com
Cc: akpm@linux-foundation.org, jgg@ziepe.ca, jhubbard@nvidia.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lizhe.67@bytedance.com, muchun.song@linux.dev, peterx@redhat.com
Subject: Re: [PATCH] gup: optimize longterm pin_user_pages() for large folio
Date: Fri, 30 May 2025 20:20:03 +0800
Message-ID: <20250530122003.44555-1-lizhe.67@bytedance.com>

On Fri, 30 May 2025 13:31:26 +0200, david@redhat.com wrote:

> On 30.05.25 11:23, lizhe.67@bytedance.com wrote:
> > From: Li Zhe
> >
> > In the current implementation of the longterm pin_user_pages() function,
> > we
invoke the collect_longterm_unpinnable_folios() function. This function
> > iterates through the list to check whether each folio belongs to the
> > "longterm_unpinnabled" category. The folios in this list essentially
> > correspond to a contiguous region of user-space addresses, with each folio
> > representing a physical address in increments of PAGESIZE. If this
> > user-space address range is mapped with large folio, we can optimize the
> > performance of function pin_user_pages() by reducing the number of if-else
> > branches and the frequency of memory accesses using READ_ONCE. This patch
> > leverages this approach to achieve performance improvements.
> >
> > The performance test results obtained through the gup_test tool from the
> > kernel source tree are as follows. We achieve an improvement of over 75%
> > for large folio with pagesize=2M. For normal page, we have only observed
> > a very slight degradation in performance.
> >
> > Without this patch:
> >
> > [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:13623 put:10799 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> > [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:129733 put:31753 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> >
> > With this patch:
> >
> > [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:3386 put:10844 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> > [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:131652 put:31393 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> >
> > Signed-off-by: Li Zhe
> > ---
> >  mm/gup.c | 31 +++++++++++++++++++++++--------
> >  1 file changed, 23 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 84461d384ae2..8c11418036e2 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -2317,6 +2317,25 @@ static void pofs_unpin(struct pages_or_folios *pofs)
> >  		unpin_user_pages(pofs->pages, pofs->nr_entries);
> >  }
> >
> > +static struct folio *pofs_next_folio(struct folio *folio,
> > +				struct pages_or_folios *pofs, long *index_ptr)
> > +{
> > +	long i = *index_ptr + 1;
> > +	unsigned long nr_pages = folio_nr_pages(folio);
> > +
> > +	if (!pofs->has_folios)
> > +		while ((i < pofs->nr_entries) &&
> > +			/* Is this page part of this folio? */
> > +			(folio_page_idx(folio, pofs->pages[i]) < nr_pages))
>
> passing in a page that does not belong to the folio looks shaky and not
> future proof.
>
> folio_page() == folio
>
> is cleaner

Yes, this approach is cleaner. However, when obtaining a folio
corresponding to a page through the page_folio() interface,
READ_ONCE() is used internally to read from memory, which results
in the performance of pin_user_pages() being worse than before.

Could you please suggest an alternative approach to address this
problem?

Thanks,
Zhe
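
For illustration only, the two membership checks discussed above can be modelled in user space. This is a rough sketch, not kernel code: struct model_page, struct model_folio, model_memmap and every model_* helper are hypothetical stand-ins invented for this example, and the volatile load only approximates the READ_ONCE()-style read mentioned above. The index check is arithmetic on the page position, while the owner check needs one load per page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES 16

struct model_folio;

/* Hypothetical stand-in for struct page: only records its owning folio. */
struct model_page {
	struct model_folio *owner;
};

/* Hypothetical stand-in for struct folio: a run of pages in model_memmap. */
struct model_folio {
	size_t first;		/* index of the first page in model_memmap */
	size_t nr_pages;	/* number of pages the folio spans */
};

static struct model_page model_memmap[MODEL_NR_PAGES];

/* Attach pages [first, first + nr_pages) of model_memmap to a folio. */
static void model_folio_init(struct model_folio *folio, size_t first,
			     size_t nr_pages)
{
	assert(first + nr_pages <= MODEL_NR_PAGES);
	folio->first = first;
	folio->nr_pages = nr_pages;
	for (size_t i = 0; i < nr_pages; i++)
		model_memmap[first + i].owner = folio;
}

/*
 * Range check in the style of the quoted hunk: arithmetic on the page index.
 * A page located before the folio wraps to a huge unsigned value and is
 * rejected too.
 */
static bool model_in_folio_by_index(const struct model_folio *folio,
				    const struct model_page *page)
{
	size_t idx = (size_t)(page - model_memmap);

	return idx - folio->first < folio->nr_pages;
}

/*
 * Ownership check in the style suggested in the review: one load per page.
 * The volatile access only models a READ_ONCE()-like read.
 */
static bool model_in_folio_by_owner(const struct model_folio *folio,
				    const struct model_page *page)
{
	struct model_folio *owner =
		*(struct model_folio *const volatile *)&page->owner;

	return owner == folio;
}

typedef bool (*model_check_fn)(const struct model_folio *,
			       const struct model_page *);

/* Count how many consecutive entries from 'start' belong to 'folio'. */
static size_t model_count_run(const struct model_folio *folio,
			      struct model_page *const *pages,
			      size_t nr_entries, size_t start,
			      model_check_fn check)
{
	size_t i = start;

	while (i < nr_entries && check(folio, pages[i]))
		i++;
	return i - start;
}

/* Two 8-page folios; the pinned range covers model pages 2..11. */
static size_t model_demo_run(model_check_fn check)
{
	static struct model_folio a, b;
	struct model_page *pages[10];

	model_folio_init(&a, 0, 8);
	model_folio_init(&b, 8, 8);
	for (size_t i = 0; i < 10; i++)
		pages[i] = &model_memmap[2 + i];

	/* Length of the run that belongs to folio 'a', from entry 0. */
	return model_count_run(&a, pages, 10, 0, check);
}
```

Calling model_demo_run() with either check function should give the same run length. The sketch says nothing about real timings; those depend on the actual struct page layout and have to be measured with gup_test as in the quoted results.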