From: Barry Song
Date: Mon, 16 Sep 2024 13:58:32 +0800
Subject: Re: [PATCH] mm: Compute mTHP order efficiently
To: Dev Jain
Cc: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 ryan.roberts@arm.com, anshuman.khandual@arm.com, hughd@google.com,
 ioworker0@gmail.com, wangkefeng.wang@huawei.com,
 baolin.wang@linux.alibaba.com, gshan@redhat.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240913091902.1160520-1-dev.jain@arm.com>

On Mon, Sep 16, 2024 at 1:20 PM Dev Jain wrote:
>
>
> On 9/16/24 10:42, Barry Song wrote:
> > On Fri, Sep 13, 2024 at 5:19 PM Dev Jain wrote:
> >> We use pte_range_none() to determine whether contiguous PTEs are empty
> >> for an mTHP allocation. Instead of iterating the while loop for every
> >> order, use some information, which is the first set PTE found, from the
> >> previous iteration, to eliminate some cases. The key to understanding
> >> the correctness of the patch is that the ranges we want to examine
> >> form a strictly decreasing sequence of nested intervals.
> >>
> >> Suggested-by: Ryan Roberts
> >> Signed-off-by: Dev Jain
> > I like this patch, but could we come up with a better subject for
> > pte_range_none()?
> > The subject is really incorrect.
>
> Are you asking me to change "Compute mTHP order efficiently" to
> something else?

Right. Adjust the subject to more accurately reflect the specific changes
being made.

>
> >
> > Also, I'd prefer the change for alloc_anon_folio() to be separated
> > into its own patch.
> > So, one patchset with two patches, please.
>
> Fine by me.
>
> >
> >> ---
> >> mm/memory.c | 30 +++++++++++++++++++++++-------
> >> 1 file changed, 23 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/mm/memory.c b/mm/memory.c
> >> index 3c01d68065be..ffc24a48ef15 100644
> >> --- a/mm/memory.c
> >> +++ b/mm/memory.c
> >> @@ -4409,26 +4409,27 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >>  	return ret;
> >>  }
> >>
> >> -static bool pte_range_none(pte_t *pte, int nr_pages)
> >> +static int pte_range_none(pte_t *pte, int nr_pages)
> >>  {
> >>  	int i;
> >>
> >>  	for (i = 0; i < nr_pages; i++) {
> >>  		if (!pte_none(ptep_get_lockless(pte + i)))
> >> -			return false;
> >> +			return i;
> >>  	}
> >>
> >> -	return true;
> >> +	return nr_pages;
> >>  }
> >>
> >>  static struct folio *alloc_anon_folio(struct vm_fault *vmf)
> >>  {
> >>  	struct vm_area_struct *vma = vmf->vma;
> >>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >> +	pte_t *first_set_pte = NULL, *align_pte, *pte;
> >>  	unsigned long orders;
> >>  	struct folio *folio;
> >>  	unsigned long addr;
> >> -	pte_t *pte;
> >> +	int max_empty;
> >>  	gfp_t gfp;
> >>  	int order;
> >>
> >> @@ -4463,8 +4464,23 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
> >>  	order = highest_order(orders);
> >>  	while (orders) {
> >>  		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
> >> -		if (pte_range_none(pte + pte_index(addr), 1 << order))
> >> +		align_pte = pte + pte_index(addr);
> >> +
> >> +		/* Range to be scanned known to be empty */
> >> +		if (align_pte + (1 << order) <= first_set_pte)
> >>  			break;
> >> +
> >> +		/* Range to be scanned contains first_set_pte */
> >> +		if (align_pte <= first_set_pte)
> >> +			goto repeat;
> >> +
> >> +		/* align_pte > first_set_pte, so need to check properly */
> >> +		max_empty = pte_range_none(align_pte, 1 << order);
> >> +		if (max_empty == 1 << order)
> >> +			break;
> >> +
> >> +		first_set_pte = align_pte + max_empty;
> >> +repeat:
> >>  		order = next_order(&orders, order);
> >>  	}
> >>
> >> @@ -4579,7 +4595,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >>  	if (nr_pages == 1 && vmf_pte_changed(vmf)) {
> >>  		update_mmu_tlb(vma, addr, vmf->pte);
> >>  		goto release;
> >> -	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> >> +	} else if (nr_pages > 1 && pte_range_none(vmf->pte, nr_pages) != nr_pages) {
> >>  		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
> >>  		goto release;
> >>  	}
> >> @@ -4915,7 +4931,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
> >>  		update_mmu_tlb(vma, addr, vmf->pte);
> >>  		ret = VM_FAULT_NOPAGE;
> >>  		goto unlock;
> >> -	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> >> +	} else if (nr_pages > 1 && pte_range_none(vmf->pte, nr_pages) != nr_pages) {
> >>  		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
> >>  		ret = VM_FAULT_NOPAGE;
> >>  		goto unlock;
> >> --
> >> 2.30.2
> >>
> > Thanks
> > Barry
>
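
For anyone who wants to try the nested-interval argument outside the kernel,
here is a minimal userspace sketch of the idea in the quoted
alloc_anon_folio() hunk. It is an illustration under assumptions, not the
kernel code: the page table is modelled as a plain int array (0 means
"none"), indices replace pte_t pointers, and NR_PTES, SKETCH_MAX_ORDER,
range_none() and pick_order() are hypothetical names made up for this
example.

```c
#include <assert.h>

#define NR_PTES			16	/* hypothetical table size */
#define SKETCH_MAX_ORDER	4	/* 1 << 4 == NR_PTES */

/*
 * Stand-in for the patched pte_range_none(): returns the offset of the
 * first non-empty slot in [start, start + nr), or nr if all are empty.
 */
static int range_none(const int *pte, int start, int nr)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (pte[start + i] != 0)
			return i;
	}

	return nr;
}

/*
 * Returns the highest order set in 'orders' whose naturally aligned range
 * around fault_idx is empty, or -1 if there is none. *scans counts how
 * many ranges were actually walked.
 */
static int pick_order(const int *pte, int fault_idx, unsigned long orders,
		      int *scans)
{
	int first_set = -1;	/* index of the first set slot seen so far */
	int order;

	assert(fault_idx >= 0 && fault_idx < NR_PTES);

	for (order = SKETCH_MAX_ORDER; order >= 0; order--) {
		int nr = 1 << order;
		int start = fault_idx & ~(nr - 1);
		int empty;

		if (!(orders & (1UL << order)))
			continue;

		if (first_set >= 0) {
			/* Range ends at or before first_set: known empty. */
			if (start + nr <= first_set)
				return order;
			/* Range contains first_set: skip this order. */
			if (start <= first_set)
				continue;
		}

		/* Range starts after first_set, so it must be scanned. */
		(*scans)++;
		empty = range_none(pte, start, nr);
		if (empty == nr)
			return order;

		first_set = start + empty;
	}

	return -1;
}
```

The shortcut is valid only because each candidate range is nested inside the
previous one: everything between the start of the last scanned range and
first_set is already known to be empty, so a smaller range that ends at or
before first_set needs no scan at all.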