Date: Wed, 14 May 2025 16:42:02 -0700
X-Mailer: git-send-email 2.49.0.1045.g170613ef41-goog
Subject: [RFC PATCH v2 23/51] mm: hugetlb: Refactor out hugetlb_alloc_folio()
From: Ackerley Tng <ackerleytng@google.com>
To: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	x86@kernel.org, linux-fsdevel@vger.kernel.org
Cc: ackerleytng@google.com, aik@amd.com, ajones@ventanamicro.com,
	akpm@linux-foundation.org, amoorthy@google.com, anthony.yznaga@oracle.com,
	anup@brainfault.org, aou@eecs.berkeley.edu, bfoster@redhat.com,
	binbin.wu@linux.intel.com, brauner@kernel.org, catalin.marinas@arm.com,
	chao.p.peng@intel.com, chenhuacai@kernel.org, dave.hansen@intel.com,
	david@redhat.com, dmatlack@google.com, dwmw@amazon.co.uk,
	erdemaktas@google.com, fan.du@intel.com, fvdl@google.com, graf@amazon.com,
	haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
	ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz,
	james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com,
	jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com,
	jun.miao@intel.com, kai.huang@intel.com, keirf@google.com,
	kent.overstreet@linux.dev, kirill.shutemov@intel.com,
	liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
	mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net,
	michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev,
	nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev,
	palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com,
	pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
	pgonda@google.com, pvorel@suse.cz, qperret@google.com,
	quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
	quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com,
	quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
	rick.p.edgecombe@intel.com, rientjes@google.com,
	roypat@amazon.co.uk, rppt@kernel.org, seanjc@google.com, shuah@kernel.org,
	steven.price@arm.com, steven.sistare@oracle.com, suzuki.poulose@arm.com,
	tabba@google.com, thomas.lendacky@amd.com, usama.arif@bytedance.com,
	vannapurve@google.com, vbabka@suse.cz, viro@zeniv.linux.org.uk,
	vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org,
	willy@infradead.org, xiaoyao.li@intel.com, yan.y.zhao@intel.com,
	yilun.xu@intel.com, yuzenghui@huawei.com, zhiquan1.li@intel.com

Refactor hugetlb_alloc_folio() out of alloc_hugetlb_folio(). The new helper
handles allocation of a folio and cgroup charging. In addition to flags that
control charging during allocation, hugetlb_alloc_folio() also takes memory
policy parameters.

This refactoring decouples hugetlb page allocation from hugetlbfs, where
(1) the subpool is stored at the fs mount, (2) reservations are made during
mmap and stored in the vma, (3) mpol must be stored at vma->vm_policy, and
(4) a vma must be used for allocation even if the pages are not meant to be
used by a host process.

This decoupling will allow hugetlb_alloc_folio() to be used by guest_memfd
in later patches. In guest_memfd, (1) a subpool is created per-fd and stored
on the inode, (2) no vma-related reservations are used, (3) mpol may not be
associated with a vma, since (4) private pages will not be mappable to
userspace and hence have no associated vmas.
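
To illustrate the intended call pattern (this sketch is not part of this
patch), a VMA-less user such as guest_memfd could wrap the refactored helper
roughly as follows. The name gmem_alloc_hugetlb_folio() and the per-inode
subpool are placeholders, and this assumes the subpool helpers are visible to
such a caller:

  /*
   * Illustrative sketch only, not part of this patch: a VMA-less caller.
   * gmem_alloc_hugetlb_folio() and the per-inode subpool are assumed names.
   */
  static struct folio *gmem_alloc_hugetlb_folio(struct hstate *h,
  						struct hugepage_subpool *spool,
  						struct mempolicy *mpol,
  						pgoff_t ilx)
  {
  	struct folio *folio;

  	/* Debit a per-fd subpool instead of a hugetlbfs mount subpool. */
  	if (hugepage_subpool_get_pages(spool, 1) < 0)
  		return ERR_PTR(-ENOSPC);

  	/*
  	 * No VMA reservation exists, so charge the cgroup reservation and
  	 * do not consume an existing hstate reservation.
  	 */
  	folio = hugetlb_alloc_folio(h, mpol, ilx,
  				    /* charge_cgroup_rsvd */ true,
  				    /* use_existing_reservation */ false);
  	if (IS_ERR_OR_NULL(folio)) {
  		hugepage_subpool_put_pages(spool, 1);
  		return folio ? folio : ERR_PTR(-ENOSPC);
  	}

  	hugetlb_set_folio_subpool(folio, spool);
  	return folio;
  }

hugetlbfs, by contrast, keeps handling the VMA reservation and resv_map
bookkeeping around such a call, as alloc_hugetlb_folio() continues to do
below.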

This could hopefully also open hugetlb up as a more generic source of hugetlb
pages that are not bound to hugetlbfs, with the complexities of
userspace/mmap/vma-related reservations confined to hugetlbfs.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Change-Id: I60528f246341268acbf0ed5de7752ae2cacbef93
---
 include/linux/hugetlb.h |  12 +++
 mm/hugetlb.c            | 192 ++++++++++++++++++++++------------
 2 files changed, 118 insertions(+), 86 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 8f3ac832ee7f..8ba941d88956 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -698,6 +698,9 @@ bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 void wait_for_freed_hugetlb_folios(void);
+struct folio *hugetlb_alloc_folio(struct hstate *h, struct mempolicy *mpol,
+				  pgoff_t ilx, bool charge_cgroup_rsvd,
+				  bool use_existing_reservation);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				  unsigned long addr, bool cow_from_owner);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1099,6 +1102,15 @@ static inline void wait_for_freed_hugetlb_folios(void)
 {
 }
 
+static inline struct folio *hugetlb_alloc_folio(struct hstate *h,
+						struct mempolicy *mpol,
+						pgoff_t ilx,
+						bool charge_cgroup_rsvd,
+						bool use_existing_reservation)
+{
+	return NULL;
+}
+
 static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 						unsigned long addr,
 						bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 29d1a3fb10df..5b088fe002a2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2954,6 +2954,101 @@ void wait_for_freed_hugetlb_folios(void)
 	flush_work(&free_hpage_work);
 }
 
+/**
+ * hugetlb_alloc_folio() - Allocates a hugetlb folio.
+ *
+ * @h: struct hstate to allocate from.
+ * @mpol: struct mempolicy to apply for this folio allocation.
+ * @ilx: Interleave index for interpretation of @mpol.
+ * @charge_cgroup_rsvd: Set to true to charge cgroup reservation.
+ * @use_existing_reservation: Set to true if this allocation should use an
+ *	existing hstate reservation.
+ *
+ * This function handles cgroup and global hstate reservations. VMA-related
+ * reservations and subpool debiting must be handled by the caller if
+ * necessary.
+ *
+ * Return: folio on success or negated error otherwise.
+ */
+struct folio *hugetlb_alloc_folio(struct hstate *h, struct mempolicy *mpol,
+				  pgoff_t ilx, bool charge_cgroup_rsvd,
+				  bool use_existing_reservation)
+{
+	unsigned int nr_pages = pages_per_huge_page(h);
+	struct hugetlb_cgroup *h_cg = NULL;
+	struct folio *folio = NULL;
+	nodemask_t *nodemask;
+	gfp_t gfp_mask;
+	int nid;
+	int idx;
+	int ret;
+
+	idx = hstate_index(h);
+
+	if (charge_cgroup_rsvd) {
+		if (hugetlb_cgroup_charge_cgroup_rsvd(idx, nr_pages, &h_cg))
+			goto out;
+	}
+
+	if (hugetlb_cgroup_charge_cgroup(idx, nr_pages, &h_cg))
+		goto out_uncharge_cgroup_reservation;
+
+	gfp_mask = htlb_alloc_mask(h);
+	nid = policy_node_nodemask(mpol, gfp_mask, ilx, &nodemask);
+
+	spin_lock_irq(&hugetlb_lock);
+
+	if (use_existing_reservation || available_huge_pages(h))
+		folio = dequeue_hugetlb_folio(h, gfp_mask, mpol, nid, nodemask);
+
+	if (!folio) {
+		spin_unlock_irq(&hugetlb_lock);
+		folio = alloc_surplus_hugetlb_folio(h, gfp_mask, mpol, nid, nodemask);
+		if (!folio)
+			goto out_uncharge_cgroup;
+		spin_lock_irq(&hugetlb_lock);
+		list_add(&folio->lru, &h->hugepage_activelist);
+		folio_ref_unfreeze(folio, 1);
+		/* Fall through */
+	}
+
+	if (use_existing_reservation) {
+		folio_set_hugetlb_restore_reserve(folio);
+		h->resv_huge_pages--;
+	}
+
+	hugetlb_cgroup_commit_charge(idx, nr_pages, h_cg, folio);
+
+	if (charge_cgroup_rsvd)
+		hugetlb_cgroup_commit_charge_rsvd(idx, nr_pages, h_cg, folio);
+
+	spin_unlock_irq(&hugetlb_lock);
+
+	gfp_mask = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
+	ret = mem_cgroup_charge_hugetlb(folio, gfp_mask);
+	/*
+	 * Unconditionally increment NR_HUGETLB here. If it turns out that
+	 * mem_cgroup_charge_hugetlb failed, then immediately free the page and
+	 * decrement NR_HUGETLB.
+	 */
+	lruvec_stat_mod_folio(folio, NR_HUGETLB, pages_per_huge_page(h));
+
+	if (ret == -ENOMEM) {
+		free_huge_folio(folio);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	return folio;
+
+out_uncharge_cgroup:
+	hugetlb_cgroup_uncharge_cgroup(idx, nr_pages, h_cg);
+out_uncharge_cgroup_reservation:
+	if (charge_cgroup_rsvd)
+		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, nr_pages, h_cg);
+out:
+	folio = ERR_PTR(-ENOSPC);
+	return folio;
+}
+
 /*
  * NOTE! "cow_from_owner" represents a very hacky usage only used in CoW
  * faults of hugetlb private mappings on top of a non-page-cache folio (in
@@ -2971,16 +3066,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	bool reservation_exists;
 	bool charge_cgroup_rsvd;
 	struct folio *folio;
-	int ret, idx;
-	struct hugetlb_cgroup *h_cg = NULL;
-	gfp_t gfp = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
 	struct mempolicy *mpol;
-	nodemask_t *nodemask;
-	gfp_t gfp_mask;
 	pgoff_t ilx;
-	int nid;
-
-	idx = hstate_index(h);
 
 	if (cow_from_owner) {
 		/*
@@ -3020,69 +3107,22 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	}
 	reservation_exists = vma_reservation_exists || subpool_reservation_exists;
 
-	/*
-	 * If a vma_reservation_exists, we can skip charging hugetlb
-	 * reservations since that was charged in hugetlb_reserve_pages() when
-	 * the reservation was recorded on the resv_map.
-	 */
-	charge_cgroup_rsvd = !vma_reservation_exists;
-	if (charge_cgroup_rsvd) {
-		ret = hugetlb_cgroup_charge_cgroup_rsvd(
-			idx, pages_per_huge_page(h), &h_cg);
-		if (ret)
-			goto out_subpool_put;
-	}
-
 	mpol = get_vma_policy(vma, addr, h->order, &ilx);
-	ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
-	if (ret) {
-		mpol_cond_put(mpol);
-		goto out_uncharge_cgroup_reservation;
-	}
-
-	gfp_mask = htlb_alloc_mask(h);
-	nid = policy_node_nodemask(mpol, gfp_mask, ilx, &nodemask);
-
-	spin_lock_irq(&hugetlb_lock);
-
-	folio = NULL;
-	if (reservation_exists || available_huge_pages(h))
-		folio = dequeue_hugetlb_folio(h, gfp_mask, mpol, nid, nodemask);
-
-	if (!folio) {
-		spin_unlock_irq(&hugetlb_lock);
-		folio = alloc_surplus_hugetlb_folio(h, gfp_mask, mpol, nid, nodemask);
-		if (!folio) {
-			mpol_cond_put(mpol);
-			goto out_uncharge_cgroup;
-		}
-		spin_lock_irq(&hugetlb_lock);
-		list_add(&folio->lru, &h->hugepage_activelist);
-		folio_ref_unfreeze(folio, 1);
-		/* Fall through */
-	}
-
 	/*
-	 * Either dequeued or buddy-allocated folio needs to add special
-	 * mark to the folio when it consumes a global reservation.
+	 * If a vma_reservation_exists, we can skip charging cgroup reservations
+	 * since that was charged during vma reservation. Use a reservation as
+	 * long as it exists.
 	 */
-	if (reservation_exists) {
-		folio_set_hugetlb_restore_reserve(folio);
-		h->resv_huge_pages--;
-	}
-
-	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, folio);
-
-	if (charge_cgroup_rsvd) {
-		hugetlb_cgroup_commit_charge_rsvd(idx, pages_per_huge_page(h),
-						  h_cg, folio);
-	}
-
-	spin_unlock_irq(&hugetlb_lock);
+	charge_cgroup_rsvd = !vma_reservation_exists;
+	folio = hugetlb_alloc_folio(h, mpol, ilx, charge_cgroup_rsvd,
+				    reservation_exists);
 
 	mpol_cond_put(mpol);
 
+	if (IS_ERR_OR_NULL(folio))
+		goto out_subpool_put;
+
 	hugetlb_set_folio_subpool(folio, spool);
 
 	/* If vma accounting wasn't bypassed earlier, follow up with commit. */
@@ -3091,9 +3131,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		/*
 		 * If there is a discrepancy in reservation status between the
 		 * time of vma_needs_reservation() and vma_commit_reservation(),
-		 * then there the page must have been added to the reservation
-		 * map between vma_needs_reservation() and
-		 * vma_commit_reservation().
+		 * then the page must have been added to the reservation map
+		 * between vma_needs_reservation() and vma_commit_reservation().
 		 *
 		 * Adjust for the subpool count incremented above AND
 		 * in hugetlb_reserve_pages for the same page. Also,
@@ -3115,27 +3154,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		}
 	}
 
-	ret = mem_cgroup_charge_hugetlb(folio, gfp);
-	/*
-	 * Unconditionally increment NR_HUGETLB here. If it turns out that
-	 * mem_cgroup_charge_hugetlb failed, then immediately free the page and
-	 * decrement NR_HUGETLB.
-	 */
-	lruvec_stat_mod_folio(folio, NR_HUGETLB, pages_per_huge_page(h));
-
-	if (ret == -ENOMEM) {
-		free_huge_folio(folio);
-		return ERR_PTR(-ENOMEM);
-	}
-
 	return folio;
 
-out_uncharge_cgroup:
-	hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h), h_cg);
-out_uncharge_cgroup_reservation:
-	if (charge_cgroup_rsvd)
-		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, pages_per_huge_page(h),
-						    h_cg);
 out_subpool_put:
 	if (!vma_reservation_exists)
 		hugepage_subpool_put_pages(spool, 1);
-- 
2.49.0.1045.g170613ef41-goog