From: Vlastimil Babka
Date: Fri, 22 Mar 2024 11:19:34 +0100
Subject: Re: [PATCH 5/9] mm: Turn folio_test_hugetlb into a PageType
To: "Matthew Wilcox (Oracle)", Andrew Morton
Cc: linux-mm@kvack.org, David Hildenbrand, Miaohe Lin, Muchun Song, Oscar Salvador, Luis Chamberlain
References: <20240321142448.1645400-1-willy@infradead.org> <20240321142448.1645400-6-willy@infradead.org>
In-Reply-To: <20240321142448.1645400-6-willy@infradead.org>

On 3/21/24 15:24, Matthew Wilcox (Oracle) wrote:
> The current folio_test_hugetlb() can be fooled by a concurrent folio split
> into returning true for a folio which has never belonged to hugetlbfs.
> This can't happen if the caller holds a refcount on it, but we have a
> few places (memory-failure, compaction, procfs) which do not and should
> not take a speculative reference.

Should we add metadata wrt closing the bug report from Luis?

https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/

I assume this wouldn't be fun wrt stable...

> Since hugetlb pages do not use individual page mapcounts (they are always
> fully mapped and use the entire_mapcount field to record the number

Weren't there some discussions about allowing partial mappings of hugetlb?
What would be the implications?

> of mappings), the PageType field is available now that page_mapcount()
> ignores the value in this field.
>
> Signed-off-by: Matthew Wilcox (Oracle)
> Reviewed-by: David Hildenbrand

Other than that,
Acked-by: Vlastimil Babka

> ---
>  include/linux/page-flags.h     | 70 ++++++++++++++++------------------
>  include/trace/events/mmflags.h |  1 +
>  mm/hugetlb.c                   | 22 ++---------
>  3 files changed, 37 insertions(+), 56 deletions(-)
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 5852f967c640..6fb3cd42ee59 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -190,7 +190,6 @@ enum pageflags {
>
>  	/* At least one page in this folio has the hwpoison flag set */
>  	PG_has_hwpoisoned = PG_error,
> -	PG_hugetlb = PG_active,
>  	PG_large_rmappable = PG_workingset, /* anon or file-backed */
>  };
>
> @@ -876,29 +875,6 @@ FOLIO_FLAG_FALSE(large_rmappable)
>
>  #define PG_head_mask ((1UL << PG_head))
>
> -#ifdef CONFIG_HUGETLB_PAGE
> -int PageHuge(const struct page *page);
> -SETPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
> -CLEARPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
> -
> -/**
> - * folio_test_hugetlb - Determine if the folio belongs to hugetlbfs
> - * @folio: The folio to test.
> - *
> - * Context: Any context. Caller should have a reference on the folio to
> - * prevent it from being turned into a tail page.
> - * Return: True for hugetlbfs folios, false for anon folios or folios
> - * belonging to other filesystems.
> - */
> -static inline bool folio_test_hugetlb(const struct folio *folio)
> -{
> -	return folio_test_large(folio) &&
> -		test_bit(PG_hugetlb, const_folio_flags(folio, 1));
> -}
> -#else
> -TESTPAGEFLAG_FALSE(Huge, hugetlb)
> -#endif
> -
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  /*
>   * PageHuge() only returns true for hugetlbfs pages, but not for
> @@ -954,18 +930,6 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
>  	TESTSCFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
>  #endif
>
> -/*
> - * Check if a page is currently marked HWPoisoned. Note that this check is
> - * best effort only and inherently racy: there is no way to synchronize with
> - * failing hardware.
> - */
> -static inline bool is_page_hwpoison(struct page *page)
> -{
> -	if (PageHWPoison(page))
> -		return true;
> -	return PageHuge(page) && PageHWPoison(compound_head(page));
> -}
> -
>  /*
>   * For pages that are never mapped to userspace (and aren't PageSlab),
>   * page_type may be used. Because it is initialised to -1, we invert the
> @@ -982,6 +946,7 @@ static inline bool is_page_hwpoison(struct page *page)
>  #define PG_offline	0x00000100
>  #define PG_table	0x00000200
>  #define PG_guard	0x00000400
> +#define PG_hugetlb	0x00000800
>
>  #define PageType(page, flag) \
>  	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> @@ -1076,6 +1041,37 @@ PAGE_TYPE_OPS(Table, table, pgtable)
>   */
>  PAGE_TYPE_OPS(Guard, guard, guard)
>
> +#ifdef CONFIG_HUGETLB_PAGE
> +FOLIO_TYPE_OPS(hugetlb, hugetlb)
> +#else
> +FOLIO_TEST_FLAG_FALSE(hugetlb)
> +#endif
> +
> +/**
> + * PageHuge - Determine if the page belongs to hugetlbfs
> + * @page: The page to test.
> + *
> + * Context: Any context.
> + * Return: True for hugetlbfs pages, false for anon pages or pages
> + * belonging to other filesystems.
> + */
> +static inline bool PageHuge(const struct page *page)
> +{
> +	return folio_test_hugetlb(page_folio(page));
> +}
> +
> +/*
> + * Check if a page is currently marked HWPoisoned. Note that this check is
> + * best effort only and inherently racy: there is no way to synchronize with
> + * failing hardware.
> + */
> +static inline bool is_page_hwpoison(struct page *page)
> +{
> +	if (PageHWPoison(page))
> +		return true;
> +	return PageHuge(page) && PageHWPoison(compound_head(page));
> +}
> +
>  extern bool is_free_buddy_page(struct page *page);
>
>  PAGEFLAG(Isolated, isolated, PF_ANY);
> @@ -1142,7 +1138,7 @@ static __always_inline void __ClearPageAnonExclusive(struct page *page)
>   */
>  #define PAGE_FLAGS_SECOND \
>  	(0xffUL /* order */ | 1UL << PG_has_hwpoisoned | \
> -	 1UL << PG_hugetlb | 1UL << PG_large_rmappable)
> +	 1UL << PG_large_rmappable)
>
>  #define PAGE_FLAGS_PRIVATE \
>  	(1UL << PG_private | 1UL << PG_private_2)
> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> index d801409b33cf..d55e53ac91bd 100644
> --- a/include/trace/events/mmflags.h
> +++ b/include/trace/events/mmflags.h
> @@ -135,6 +135,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
>  #define DEF_PAGETYPE_NAME(_name) { PG_##_name, __stringify(_name) }
>
>  #define __def_pagetype_names \
> +	DEF_PAGETYPE_NAME(hugetlb), \
>  	DEF_PAGETYPE_NAME(offline), \
>  	DEF_PAGETYPE_NAME(guard), \
>  	DEF_PAGETYPE_NAME(table), \
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 7e9a766059aa..bdcbb62096cf 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1624,7 +1624,7 @@ static inline void __clear_hugetlb_destructor(struct hstate *h,
>  {
>  	lockdep_assert_held(&hugetlb_lock);
>
> -	folio_clear_hugetlb(folio);
> +	__folio_clear_hugetlb(folio);
>  }
>
>  /*
> @@ -1711,7 +1711,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
>  		h->surplus_huge_pages_node[nid]++;
>  	}
>
> -	folio_set_hugetlb(folio);
> +	__folio_set_hugetlb(folio);
>  	folio_change_private(folio, NULL);
>  	/*
>  	 * We have to set hugetlb_vmemmap_optimized again as above
> @@ -2050,7 +2050,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
>
>  static void init_new_hugetlb_folio(struct hstate *h, struct folio *folio)
>  {
> -	folio_set_hugetlb(folio);
> +	__folio_set_hugetlb(folio);
>  	INIT_LIST_HEAD(&folio->lru);
>  	hugetlb_set_folio_subpool(folio, NULL);
>  	set_hugetlb_cgroup(folio, NULL);
> @@ -2160,22 +2160,6 @@ static bool prep_compound_gigantic_folio_for_demote(struct folio *folio,
>  	return __prep_compound_gigantic_folio(folio, order, true);
>  }
>
> -/*
> - * PageHuge() only returns true for hugetlbfs pages, but not for normal or
> - * transparent huge pages. See the PageTransHuge() documentation for more
> - * details.
> - */
> -int PageHuge(const struct page *page)
> -{
> -	const struct folio *folio;
> -
> -	if (!PageCompound(page))
> -		return 0;
> -	folio = page_folio(page);
> -	return folio_test_hugetlb(folio);
> -}
> -EXPORT_SYMBOL_GPL(PageHuge);
> -
>  /*
>   * Find and lock address space (mapping) in write mode.
>   *