From: Dev Jain
To: Baolin Wang, akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, 21cnbao@gmail.com,
    ryan.roberts@arm.com, ziy@nvidia.com, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm: mincore: use folio_pte_batch() to batch process large folios
Date: Wed, 7 May 2025 10:42:14 +0530
Message-ID: <17289428-894a-4397-9d61-c8500d032b28@arm.com>
In-Reply-To: <7ad05bc9299de5d954fb21a2da57f46dd6ec59d0.1742960003.git.baolin.wang@linux.alibaba.com>
References: <7ad05bc9299de5d954fb21a2da57f46dd6ec59d0.1742960003.git.baolin.wang@linux.alibaba.com>

On 26/03/25 9:08 am, Baolin Wang wrote:
> When I tested the mincore() syscall, I observed that it takes longer with
> 64K mTHP enabled on my Arm64 server. The reason is the mincore_pte_range()
> still checks each PTE individually, even when the PTEs are contiguous,
> which is not efficient.
>
> Thus we can use folio_pte_batch() to get the batch number of the present
> contiguous PTEs, which can improve the performance.
> I tested the mincore() syscall with 1G anonymous memory populated with 64K
> mTHP, and observed an obvious performance improvement:
>
> w/o patch       w/ patch        changes
> 6022us          1115us          +81%
>
> Moreover, I also tested mincore() with disabling mTHP/THP, and did not
> see any obvious regression.
>
> Signed-off-by: Baolin Wang
> ---
>  mm/mincore.c | 27 ++++++++++++++++++++++-----
>  1 file changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mincore.c b/mm/mincore.c
> index 832f29f46767..88be180b5550 100644
> --- a/mm/mincore.c
> +++ b/mm/mincore.c
> @@ -21,6 +21,7 @@
>
>  #include
>  #include "swap.h"
> +#include "internal.h"
>
>  static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr,
>  			unsigned long end, struct mm_walk *walk)
> @@ -105,6 +106,7 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>  	pte_t *ptep;
>  	unsigned char *vec = walk->private;
>  	int nr = (end - addr) >> PAGE_SHIFT;
> +	int step, i;
>
>  	ptl = pmd_trans_huge_lock(pmd, vma);
>  	if (ptl) {
> @@ -118,16 +120,31 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>  		walk->action = ACTION_AGAIN;
>  		return 0;
>  	}
> -	for (; addr != end; ptep++, addr += PAGE_SIZE) {
> +	for (; addr != end; ptep += step, addr += step * PAGE_SIZE) {
>  		pte_t pte = ptep_get(ptep);
>
> +		step = 1;
>  		/* We need to do cache lookup too for pte markers */
>  		if (pte_none_mostly(pte))
>  			__mincore_unmapped_range(addr, addr + PAGE_SIZE,
>  						 vma, vec);
> -		else if (pte_present(pte))
> -			*vec = 1;
> -		else { /* pte is a swap entry */
> +		else if (pte_present(pte)) {
> +			if (pte_batch_hint(ptep, pte) > 1) {
> +				struct folio *folio = vm_normal_folio(vma, addr, pte);
> +
> +				if (folio && folio_test_large(folio)) {
> +					const fpb_t fpb_flags = FPB_IGNORE_DIRTY |
> +								FPB_IGNORE_SOFT_DIRTY;
> +					int max_nr = (end - addr) / PAGE_SIZE;
> +
> +					step = folio_pte_batch(folio, addr, ptep, pte,
> +							max_nr, fpb_flags, NULL, NULL, NULL);
> +				}
> +			}

Can we go ahead with this along with [1]? That will help us generalize
for all arches.

[1] https://lore.kernel.org/all/20250506050056.59250-3-dev.jain@arm.com/
(Please replace PAGE_SIZE with 1)

> +
> +			for (i = 0; i < step; i++)
> +				vec[i] = 1;
> +		} else { /* pte is a swap entry */
>  			swp_entry_t entry = pte_to_swp_entry(pte);
>
>  			if (non_swap_entry(entry)) {
> @@ -146,7 +163,7 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>  #endif
>  			}
>  		}
> -		vec++;
> +		vec += step;
>  	}
>  	pte_unmap_unlock(ptep - 1, ptl);
>  out:
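
For anyone following the thread without a kernel tree handy, here is a
minimal userspace sketch of the stepping logic in the hunk quoted above.
It is only a model: struct sim_pte, sim_pte_batch() and
sim_mincore_range() are hypothetical names made up for illustration, and
sim_pte_batch() merely stands in for folio_pte_batch() by counting
adjacent present entries that share a folio id. It does not reflect the
real helper's flags or semantics, and non-present entries are simply
reported as 0 here, whereas the kernel does a page cache or swap lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PTE: a present bit plus the id of the
 * folio it maps. Not a kernel type. */
struct sim_pte {
	int present;
	int folio_id;
};

/* Hypothetical stand-in for folio_pte_batch(): count consecutive present
 * entries, starting at ptes[0], that map the same folio. The result is
 * capped at max_nr and is always at least 1. */
static int sim_pte_batch(const struct sim_pte *ptes, int max_nr)
{
	int nr = 1;

	assert(max_nr >= 1);
	while (nr < max_nr && ptes[nr].present &&
	       ptes[nr].folio_id == ptes[0].folio_id)
		nr++;
	return nr;
}

/* Fill vec[] one batch at a time, mirroring the "step" variable in the
 * quoted loop. Returns the number of loop iterations taken, so the
 * saving over a per-entry walk is visible. */
static int sim_mincore_range(const struct sim_pte *ptes, int nr,
			     unsigned char *vec)
{
	int iterations = 0;
	int step;

	for (int i = 0; i < nr; i += step) {
		step = 1;
		if (ptes[i].present) {
			/* Never batch past the end of the range. */
			step = sim_pte_batch(&ptes[i], nr - i);
			for (int j = 0; j < step; j++)
				vec[i + j] = 1;
		} else {
			/* Simplification: treat non-present as not resident. */
			vec[i] = 0;
		}
		iterations++;
	}
	return iterations;
}
```

For example, with eight entries laid out as four from one folio, one
non-present entry, two from a second folio and one from a third,
sim_mincore_range() takes 4 iterations rather than 8.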