From: lizhe.67@bytedance.com
To: akpm@linux-foundation.org
Cc: david@redhat.com, dev.jain@arm.com, jgg@ziepe.ca, jhubbard@nvidia.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lizhe.67@bytedance.com, muchun.song@linux.dev, peterx@redhat.com
Subject: Re: [PATCH v2] gup: optimize longterm pin_user_pages() for large folio
Date: Wed, 4 Jun 2025 15:58:30 +0800
Message-ID: <20250604075830.27751-1-lizhe.67@bytedance.com>
In-Reply-To: <20250603204414.f2963e4a094e360cad7f966e@linux-foundation.org>
References: <20250603204414.f2963e4a094e360cad7f966e@linux-foundation.org>

On Tue, 3 Jun 2025 20:44:14 -0700, akpm@linux-foundation.org wrote:

> On Wed, 4 Jun 2025 11:15:36 +0800 lizhe.67@bytedance.com wrote:
>
> > From: Li Zhe
> >
> > In the current implementation of the longterm pin_user_pages() function,
> > we invoke the collect_longterm_unpinnable_folios() function. This function
> > iterates through the list to check whether each folio belongs to the
> > "longterm_unpinnabled" category. The folios in this list essentially
> > correspond to a contiguous region of user-space addresses, with each folio
> > representing a physical address in increments of PAGESIZE. If this
> > user-space address range is mapped with large folio, we can optimize the
> > performance of function pin_user_pages() by reducing the frequency of
> > memory accesses using READ_ONCE. This patch leverages this approach to
> > achieve performance improvements.
> >
> > The performance test results obtained through the gup_test tool from the
> > kernel source tree are as follows. We achieve an improvement of over 70%
> > for large folio with pagesize=2M. For normal page, we have only observed
> > a very slight degradation in performance.
> >
> > Without this patch:
> >
> > [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:13623 put:10799 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> > [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:129733 put:31753 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> >
> > With this patch:
> >
> > [root@localhost ~] ./gup_test -HL -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:4075 put:10792 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> > [root@localhost ~]# ./gup_test -LT -m 8192 -n 512
> > TAP version 13
> > 1..1
> > # PIN_LONGTERM_BENCHMARK: Time: get:130727 put:31763 us#
> > ok 1 ioctl status 0
> > # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>
> I see no READ_ONCE()s in the patch and I had to go off and read the v1
> review to discover that the READ_ONCE is invoked in
> page_folio()->_compound_head(). Please help us out by including such
> details in the changelogs.

Sorry for the inconvenience. I will refine the wording of this part in
the next version.

> Is it credible that a humble READ_ONCE could yield a 3x improvement in
> one case? Why would this happen?

Sorry for the incomplete description. I believe the improvement comes
from several factors working together. Besides reducing the number of
READ_ONCE() accesses, for a large folio we also replace the per-page
check, which calls pofs_get_folio() and compares the result against
prev_folio, with a simpler test of whether the next page still lies
within the current folio. This reduces the number of branches and
increases the cache hit rate. The overall gain is the combined effect of
these changes.

I will incorporate these details into the commit message in the next
version.

Thanks,
Zhe
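
For illustration only, the difference between the two loop shapes described above can be modelled in user space. The sketch below is not the patch and not kernel code: `Folio`, `PageMap`, `collect_per_page()` and `collect_per_folio()` are hypothetical names, and `PageMap.page_folio()` merely counts calls to stand in for the READ_ONCE() performed in page_folio()->_compound_head(). It assumes the page list covers a contiguous, ascending range of page frame numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Folio:
    """Hypothetical stand-in for a folio: nr_pages contiguous pages."""
    first_pfn: int
    nr_pages: int


class PageMap:
    """Maps each page frame number to its folio and counts the lookups."""

    def __init__(self, folios):
        self.head = {}
        for folio in folios:
            for pfn in range(folio.first_pfn, folio.first_pfn + folio.nr_pages):
                self.head[pfn] = folio
        self.lookups = 0

    def page_folio(self, pfn):
        # Each call models one head-page lookup (one READ_ONCE in the kernel).
        self.lookups += 1
        return self.head[pfn]


def collect_per_page(page_map, pfns):
    """Look up the folio of every page and compare it with the previous one."""
    seen = []
    prev = None
    for pfn in pfns:
        folio = page_map.page_folio(pfn)
        if folio is prev:
            continue
        prev = folio
        seen.append(folio)
    return seen


def collect_per_folio(page_map, pfns):
    """Look up a folio once, then skip the following pages that lie inside it."""
    seen = []
    i = 0
    while i < len(pfns):
        folio = page_map.page_folio(pfns[i])
        seen.append(folio)
        end = folio.first_pfn + folio.nr_pages
        i += 1
        # Pure arithmetic: no further lookups while the next entry is the
        # next contiguous page of the same folio.
        while i < len(pfns) and pfns[i] == pfns[i - 1] + 1 and pfns[i] < end:
            i += 1
    return seen


if __name__ == "__main__":
    # Four folios of 512 pages each (512 x 4 KiB = 2 MiB per folio).
    folios = [Folio(n * 512, 512) for n in range(4)]
    pfns = list(range(4 * 512))

    per_page = PageMap(folios)
    collect_per_page(per_page, pfns)
    per_folio = PageMap(folios)
    collect_per_folio(per_folio, pfns)
    print("lookups per page: ", per_page.lookups)
    print("lookups per folio:", per_folio.lookups)
```

In this model the per-page walk performs 2048 lookups for the four 512-page folios, while the per-folio walk performs 4; with single-page folios both walks perform one lookup per page. The model only counts lookups and says nothing about the measured timings quoted above.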