From: Barry Song
Date: Mon, 16 Sep 2024 13:12:50 +0800
Subject: Re: [PATCH] mm: Compute mTHP order efficiently
To: Dev Jain
Cc: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 ryan.roberts@arm.com, anshuman.khandual@arm.com, hughd@google.com,
 ioworker0@gmail.com, wangkefeng.wang@huawei.com,
 baolin.wang@linux.alibaba.com, gshan@redhat.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240913091902.1160520-1-dev.jain@arm.com>
In-Reply-To: <20240913091902.1160520-1-dev.jain@arm.com>

On Fri, Sep 13, 2024 at 5:19 PM Dev Jain wrote:
>
> We use pte_range_none() to determine whether contiguous PTEs are empty
> for an mTHP allocation. Instead of iterating the while loop for every
> order, use some information, which is the first set PTE found, from the
> previous iteration, to eliminate some cases.
> The key to understanding
> the correctness of the patch is that the ranges we want to examine
> form a strictly decreasing sequence of nested intervals.
>
> Suggested-by: Ryan Roberts
> Signed-off-by: Dev Jain

I like this patch, but could we come up with a better subject for
pte_range_none()? The subject is really incorrect.

Also, I'd prefer the change for alloc_anon_folio() to be separated
into its own patch. So, one patchset with two patches, please.

> ---
>  mm/memory.c | 30 +++++++++++++++++++++++-------
>  1 file changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 3c01d68065be..ffc24a48ef15 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4409,26 +4409,27 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	return ret;
>  }
>
> -static bool pte_range_none(pte_t *pte, int nr_pages)
> +static int pte_range_none(pte_t *pte, int nr_pages)
>  {
>  	int i;
>
>  	for (i = 0; i < nr_pages; i++) {
>  		if (!pte_none(ptep_get_lockless(pte + i)))
> -			return false;
> +			return i;
>  	}
>
> -	return true;
> +	return nr_pages;
>  }
>
>  static struct folio *alloc_anon_folio(struct vm_fault *vmf)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	pte_t *first_set_pte = NULL, *align_pte, *pte;
>  	unsigned long orders;
>  	struct folio *folio;
>  	unsigned long addr;
> -	pte_t *pte;
> +	int max_empty;
>  	gfp_t gfp;
>  	int order;
>
> @@ -4463,8 +4464,23 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
>  	order = highest_order(orders);
>  	while (orders) {
>  		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
> -		if (pte_range_none(pte + pte_index(addr), 1 << order))
> +		align_pte = pte + pte_index(addr);
> +
> +		/* Range to be scanned known to be empty */
> +		if (align_pte + (1 << order) <= first_set_pte)
>  			break;
> +
> +		/* Range to be scanned contains first_set_pte */
> +		if (align_pte <= first_set_pte)
> +			goto repeat;
> +
> +		/* align_pte > first_set_pte, so need to check properly */
> +		max_empty = pte_range_none(align_pte, 1 << order);
> +		if (max_empty == 1 << order)
> +			break;
> +
> +		first_set_pte = align_pte + max_empty;
> +repeat:
>  		order = next_order(&orders, order);
>  	}
>
> @@ -4579,7 +4595,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	if (nr_pages == 1 && vmf_pte_changed(vmf)) {
>  		update_mmu_tlb(vma, addr, vmf->pte);
>  		goto release;
> -	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> +	} else if (nr_pages > 1 && pte_range_none(vmf->pte, nr_pages) != nr_pages) {
>  		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
>  		goto release;
>  	}
> @@ -4915,7 +4931,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
>  		update_mmu_tlb(vma, addr, vmf->pte);
>  		ret = VM_FAULT_NOPAGE;
>  		goto unlock;
> -	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> +	} else if (nr_pages > 1 && pte_range_none(vmf->pte, nr_pages) != nr_pages) {
>  		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
>  		ret = VM_FAULT_NOPAGE;
>  		goto unlock;
> --
> 2.30.2
>

Thanks
Barry
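
For readers following the quoted alloc_anon_folio() hunk, here is a
user-space sketch of the skip logic. It is a toy model, not kernel code:
the int table standing in for the PTE page, the helpers range_none() and
pick_order(), the max_order parameter, and the examined counter are all
assumptions made up for illustration. It uses array indices with -1 as
the "nothing found yet" value where the patch uses a NULL first_set_pte.

```c
#include <assert.h>

/* Toy model, not kernel code: table[i] == 0 stands for pte_none(). */

/* Like the patched pte_range_none(): index of the first set entry,
 * or nr if every entry in the range is empty. */
static int range_none(const int *table, int nr)
{
	int i;

	assert(nr > 0);
	for (i = 0; i < nr; i++) {
		if (table[i])
			return i;
	}
	return nr;
}

/*
 * Walk the enabled orders from high to low and return the first one whose
 * naturally aligned range around fault_idx is entirely empty, or -1 if
 * none is. *examined accumulates how many entries were actually read.
 */
static int pick_order(const int *table, int fault_idx, unsigned long orders,
		      int max_order, int *examined)
{
	int first_set = -1;	/* first set entry seen so far; -1 = none yet */
	int order;

	for (order = max_order; order >= 0; order--) {
		int nr = 1 << order;
		int start = fault_idx & ~(nr - 1);	/* align down to nr entries */
		int max_empty;

		if (!(orders & (1UL << order)))
			continue;

		/* Range ends at or before first_set: already known empty. */
		if (start + nr <= first_set)
			return order;

		/* Range contains first_set: cannot be empty, skip the scan. */
		if (start <= first_set)
			continue;

		/* Range starts after first_set: it has to be scanned. */
		max_empty = range_none(table + start, nr);
		*examined += (max_empty < nr) ? max_empty + 1 : nr;
		if (max_empty == nr)
			return order;

		first_set = start + max_empty;
	}
	return -1;
}
```

With 16 entries, entry 9 set, a fault at index 13 and orders 4 down to 1
enabled, the order-4 scan stops at entry 9, order 3 (entries 8..15) is
rejected without reading anything because it contains entry 9, and order
2 (entries 12..15) is scanned and chosen. That reads 14 entries in this
model; rescanning each order from its own start would read 16.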