From: Barry Song <21cnbao@gmail.com>
Date: Fri, 9 May 2025 13:49:35 +1200
Subject: Re: [PATCH v3] mm: mincore: use pte_batch_hint() to batch process large folios
To: Baolin Wang
Cc: akpm@linux-foundation.org, david@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, ziy@nvidia.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <99cb00ee626ceb6e788102ca36821815cd832237.1746697240.git.baolin.wang@linux.alibaba.com>
In-Reply-To: <99cb00ee626ceb6e788102ca36821815cd832237.1746697240.git.baolin.wang@linux.alibaba.com>

On Fri, May 9, 2025 at 12:45 PM Baolin Wang wrote:
>
> When I tested the mincore() syscall, I observed that it takes longer with
> 64K mTHP enabled on my Arm64 server. The reason is the mincore_pte_range()
> still checks each PTE individually, even when the PTEs are contiguous,
> which is not efficient.
>
> Thus we can use pte_batch_hint() to get the batch number of the present
> contiguous PTEs, which can improve the performance. I tested the mincore()
> syscall with 1G anonymous memory populated with 64K mTHP, and observed an
> obvious performance improvement:
>
> w/o patch        w/ patch        changes
> 6022us           549us           +91%
>
> Moreover, I also tested mincore() with disabling mTHP/THP, and did not
> see any obvious regression for base pages.
>
> Signed-off-by: Baolin Wang

Reviewed-by: Barry Song

> ---
> Changes from v2:
>  - Re-calculate the max_nr, per Barry.
> Changes from v1:
>  - Change to use pte_batch_hint() to get the batch number, per Ryan.
>
> Note: I observed the min_t() can introduce a slight performance regression
> for base pages, so I change to add a batch size check for base pages,
> which can resolve the performance regression issue.
> ---
>  mm/mincore.c | 22 +++++++++++++++++-----
>  1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mincore.c b/mm/mincore.c
> index 832f29f46767..42d6c9c8da86 100644
> --- a/mm/mincore.c
> +++ b/mm/mincore.c
> @@ -21,6 +21,7 @@
>
>  #include
>  #include "swap.h"
> +#include "internal.h"
>
>  static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr,
>  			unsigned long end, struct mm_walk *walk)
> @@ -105,6 +106,7 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>  	pte_t *ptep;
>  	unsigned char *vec = walk->private;
>  	int nr = (end - addr) >> PAGE_SHIFT;
> +	int step, i;
>
>  	ptl = pmd_trans_huge_lock(pmd, vma);
>  	if (ptl) {
> @@ -118,16 +120,26 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>  		walk->action = ACTION_AGAIN;
>  		return 0;
>  	}
> -	for (; addr != end; ptep++, addr += PAGE_SIZE) {
> +	for (; addr != end; ptep += step, addr += step * PAGE_SIZE) {
>  		pte_t pte = ptep_get(ptep);
>
> +		step = 1;
>  		/* We need to do cache lookup too for pte markers */
>  		if (pte_none_mostly(pte))
>  			__mincore_unmapped_range(addr, addr + PAGE_SIZE,
>  						 vma, vec);
> -		else if (pte_present(pte))
> -			*vec = 1;
> -		else { /* pte is a swap entry */
> +		else if (pte_present(pte)) {
> +			unsigned int batch = pte_batch_hint(ptep, pte);
> +
> +			if (batch > 1) {
> +				unsigned int max_nr = (end - addr) >> PAGE_SHIFT;
> +
> +				step = min_t(unsigned int, batch, max_nr);
> +			}
> +
> +			for (i = 0; i < step; i++)
> +				vec[i] = 1;
> +		} else { /* pte is a swap entry */
>  			swp_entry_t entry = pte_to_swp_entry(pte);
>
>  			if (non_swap_entry(entry)) {
> @@ -146,7 +158,7 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>  #endif
>  			}
>  		}
> -		vec++;
> +		vec += step;
>  	}
>  	pte_unmap_unlock(ptep - 1, ptl);
>  out:
> --
> 2.43.5
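
For anyone who wants to reason about the max_nr clamp outside the kernel, here is a minimal userspace sketch of the stepping logic in the present-PTE branch. It is only a model under stated assumptions: the hint[] array is a hypothetical stand-in for what pte_batch_hint() would report for each entry, and mincore_mark_present_model() is an illustrative name that does not exist in the tree. The pte_none_mostly() and swap-entry branches are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of the batched loop (hypothetical helper, not kernel code).
 *
 * hint[i] - assumed stand-in for pte_batch_hint() at entry i: 1 for a base
 *           page, or the number of entries left in a contiguous block.
 * nr      - number of entries in the range, like (end - addr) >> PAGE_SHIFT.
 * vec     - residency vector, one byte per entry.
 *
 * Returns the number of loop iterations that were needed.
 */
static size_t mincore_mark_present_model(const unsigned int *hint, size_t nr,
					 unsigned char *vec)
{
	size_t i = 0, iterations = 0;

	assert(hint && vec);

	while (i < nr) {
		size_t step = 1;
		unsigned int batch = hint[i];

		/* Only pay for the clamp when there is a real batch. */
		if (batch > 1) {
			size_t max_nr = nr - i;	/* entries left in range */

			step = batch < max_nr ? batch : max_nr;
		}

		/* Mark every entry covered by this step as resident. */
		memset(vec + i, 1, step);
		i += step;
		iterations++;
	}

	return iterations;
}
```

With 16 entries per contiguous block and a 20-entry range, the first block is covered in one step of 16 and the second is clamped to the 4 entries left in the range, so the model loops twice rather than 20 times. An all-base-page range still takes one iteration per entry and never evaluates the clamp.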