Message-ID: <7ee2bb8c-441a-418b-ba3a-d305f69d31c8@suse.cz>
Date: Mon, 25 Mar 2024 08:57:52 +0100
Subject: Re: [PATCH 5/9] mm: Turn folio_test_hugetlb into a PageType
To: "Matthew Wilcox (Oracle)", Andrew Morton, Luis Chamberlain
Cc: linux-mm@kvack.org, David Hildenbrand, Miaohe Lin, Muchun Song, Oscar Salvador
References: <20240321142448.1645400-1-willy@infradead.org> <20240321142448.1645400-6-willy@infradead.org>
From: Vlastimil Babka
In-Reply-To: <20240321142448.1645400-6-willy@infradead.org>

On 3/21/24 15:24, Matthew Wilcox (Oracle) wrote:
> The current folio_test_hugetlb() can be fooled by a concurrent folio split
> into returning true for a folio which has never belonged to hugetlbfs.
> This can't happen if the caller holds a refcount on it, but we have a
> few places (memory-failure, compaction, procfs) which do not and should
> not take a speculative reference.

In compaction and with CONFIG_DEBUG_VM enabled, the current implementation
can result in an oops, as reported by Luis. This happens since 9c5ccf2db04b
("mm: remove HUGETLB_PAGE_DTOR") effectively added some VM_BUG_ON() checks
in the PageHuge() testing path.

> Since hugetlb pages do not use individual page mapcounts (they are always
> fully mapped and use the entire_mapcount field to record the number
> of mappings), the PageType field is available now that page_mapcount()
> ignores the value in this field.
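
For readers unfamiliar with the PageType encoding that the paragraph above
relies on, here is a minimal userspace sketch of the inverted-bit test. It
is a model, not kernel code: struct toy_page and the toy_* helpers are
hypothetical names, and the PAGE_TYPE_BASE value is an assumption (it is not
shown in the quoted hunks). Only PG_table, PG_hugetlb and the PageType()
expression are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model only. PG_table and PG_hugetlb are the values quoted in
 * the patch; PAGE_TYPE_BASE (0xf0000000) is an assumption and is not shown
 * in the quoted hunks.
 */
#define PAGE_TYPE_BASE	0xf0000000u
#define PG_table	0x00000200u
#define PG_hugetlb	0x00000800u

/* Hypothetical stand-in for the page_type word in struct page. */
struct toy_page {
	uint32_t page_type;
};

/* Same expression as the PageType() macro in the quoted hunk. */
static bool toy_page_type(const struct toy_page *page, uint32_t flag)
{
	return (page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE;
}

/* The bits are inverted: setting a type clears its bit. */
static void toy_set_type(struct toy_page *page, uint32_t flag)
{
	page->page_type &= ~flag;
}

/* Clearing a type sets its bit again. */
static void toy_clear_type(struct toy_page *page, uint32_t flag)
{
	page->page_type |= flag;
}

/* Walks one page through set/clear and returns 0 if the model behaves. */
int toy_selfcheck(void)
{
	struct toy_page page = { .page_type = UINT32_MAX };	/* initialised to -1 */

	assert(!toy_page_type(&page, PG_hugetlb));	/* no type set yet */

	toy_set_type(&page, PG_hugetlb);
	assert(toy_page_type(&page, PG_hugetlb));
	assert(!toy_page_type(&page, PG_table));	/* other types unaffected */

	toy_clear_type(&page, PG_hugetlb);
	assert(!toy_page_type(&page, PG_hugetlb));

	/* A small counter-style value in the same word never matches a type. */
	page.page_type = 0;
	assert(!toy_page_type(&page, PG_hugetlb));
	return 0;
}
```

Calling toy_selfcheck() from a test harness exercises the whole sequence.
The point of the sketch is that the test reads a single word, so in this
model there is no second condition that could be observed out of step with
the first.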

Reported-by: Luis Chamberlain
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218227
Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR")
Cc:

> Signed-off-by: Matthew Wilcox (Oracle)
> Reviewed-by: David Hildenbrand
> ---
>  include/linux/page-flags.h     | 70 ++++++++++++++++------------------
>  include/trace/events/mmflags.h |  1 +
>  mm/hugetlb.c                   | 22 ++---------
>  3 files changed, 37 insertions(+), 56 deletions(-)
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 5852f967c640..6fb3cd42ee59 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -190,7 +190,6 @@ enum pageflags {
>
>  	/* At least one page in this folio has the hwpoison flag set */
>  	PG_has_hwpoisoned = PG_error,
> -	PG_hugetlb = PG_active,
>  	PG_large_rmappable = PG_workingset, /* anon or file-backed */
>  };
>
> @@ -876,29 +875,6 @@ FOLIO_FLAG_FALSE(large_rmappable)
>
>  #define PG_head_mask ((1UL << PG_head))
>
> -#ifdef CONFIG_HUGETLB_PAGE
> -int PageHuge(const struct page *page);
> -SETPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
> -CLEARPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
> -
> -/**
> - * folio_test_hugetlb - Determine if the folio belongs to hugetlbfs
> - * @folio: The folio to test.
> - *
> - * Context: Any context. Caller should have a reference on the folio to
> - * prevent it from being turned into a tail page.
> - * Return: True for hugetlbfs folios, false for anon folios or folios
> - * belonging to other filesystems.
> - */
> -static inline bool folio_test_hugetlb(const struct folio *folio)
> -{
> -	return folio_test_large(folio) &&
> -		test_bit(PG_hugetlb, const_folio_flags(folio, 1));
> -}
> -#else
> -TESTPAGEFLAG_FALSE(Huge, hugetlb)
> -#endif
> -
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  /*
>   * PageHuge() only returns true for hugetlbfs pages, but not for
> @@ -954,18 +930,6 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
>  	TESTSCFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
>  #endif
>
> -/*
> - * Check if a page is currently marked HWPoisoned. Note that this check is
> - * best effort only and inherently racy: there is no way to synchronize with
> - * failing hardware.
> - */
> -static inline bool is_page_hwpoison(struct page *page)
> -{
> -	if (PageHWPoison(page))
> -		return true;
> -	return PageHuge(page) && PageHWPoison(compound_head(page));
> -}
> -
>  /*
>   * For pages that are never mapped to userspace (and aren't PageSlab),
>   * page_type may be used. Because it is initialised to -1, we invert the
> @@ -982,6 +946,7 @@ static inline bool is_page_hwpoison(struct page *page)
>  #define PG_offline	0x00000100
>  #define PG_table	0x00000200
>  #define PG_guard	0x00000400
> +#define PG_hugetlb	0x00000800
>
>  #define PageType(page, flag) \
>  	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> @@ -1076,6 +1041,37 @@ PAGE_TYPE_OPS(Table, table, pgtable)
>   */
>  PAGE_TYPE_OPS(Guard, guard, guard)
>
> +#ifdef CONFIG_HUGETLB_PAGE
> +FOLIO_TYPE_OPS(hugetlb, hugetlb)
> +#else
> +FOLIO_TEST_FLAG_FALSE(hugetlb)
> +#endif
> +
> +/**
> + * PageHuge - Determine if the page belongs to hugetlbfs
> + * @page: The page to test.
> + *
> + * Context: Any context.
> + * Return: True for hugetlbfs pages, false for anon pages or pages
> + * belonging to other filesystems.
> + */
> +static inline bool PageHuge(const struct page *page)
> +{
> +	return folio_test_hugetlb(page_folio(page));
> +}
> +
> +/*
> + * Check if a page is currently marked HWPoisoned. Note that this check is
> + * best effort only and inherently racy: there is no way to synchronize with
> + * failing hardware.
> + */
> +static inline bool is_page_hwpoison(struct page *page)
> +{
> +	if (PageHWPoison(page))
> +		return true;
> +	return PageHuge(page) && PageHWPoison(compound_head(page));
> +}
> +
>  extern bool is_free_buddy_page(struct page *page);
>
>  PAGEFLAG(Isolated, isolated, PF_ANY);
> @@ -1142,7 +1138,7 @@ static __always_inline void __ClearPageAnonExclusive(struct page *page)
>   */
>  #define PAGE_FLAGS_SECOND \
>  	(0xffUL /* order */ | 1UL << PG_has_hwpoisoned | \
> -	 1UL << PG_hugetlb | 1UL << PG_large_rmappable)
> +	 1UL << PG_large_rmappable)
>
>  #define PAGE_FLAGS_PRIVATE \
>  	(1UL << PG_private | 1UL << PG_private_2)
> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> index d801409b33cf..d55e53ac91bd 100644
> --- a/include/trace/events/mmflags.h
> +++ b/include/trace/events/mmflags.h
> @@ -135,6 +135,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
>  #define DEF_PAGETYPE_NAME(_name) { PG_##_name, __stringify(_name) }
>
>  #define __def_pagetype_names \
> +	DEF_PAGETYPE_NAME(hugetlb), \
>  	DEF_PAGETYPE_NAME(offline), \
>  	DEF_PAGETYPE_NAME(guard), \
>  	DEF_PAGETYPE_NAME(table), \
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 7e9a766059aa..bdcbb62096cf 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1624,7 +1624,7 @@ static inline void __clear_hugetlb_destructor(struct hstate *h,
>  {
>  	lockdep_assert_held(&hugetlb_lock);
>
> -	folio_clear_hugetlb(folio);
> +	__folio_clear_hugetlb(folio);
>  }
>
>  /*
> @@ -1711,7 +1711,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
>  		h->surplus_huge_pages_node[nid]++;
>  	}
>
> -	folio_set_hugetlb(folio);
> +	__folio_set_hugetlb(folio);
>  	folio_change_private(folio, NULL);
>  	/*
>  	 * We have to set hugetlb_vmemmap_optimized again as above
> @@ -2050,7 +2050,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
>
>  static void init_new_hugetlb_folio(struct hstate *h, struct folio *folio)
>  {
> -	folio_set_hugetlb(folio);
> +	__folio_set_hugetlb(folio);
>  	INIT_LIST_HEAD(&folio->lru);
>  	hugetlb_set_folio_subpool(folio, NULL);
>  	set_hugetlb_cgroup(folio, NULL);
> @@ -2160,22 +2160,6 @@ static bool prep_compound_gigantic_folio_for_demote(struct folio *folio,
>  	return __prep_compound_gigantic_folio(folio, order, true);
>  }
>
> -/*
> - * PageHuge() only returns true for hugetlbfs pages, but not for normal or
> - * transparent huge pages. See the PageTransHuge() documentation for more
> - * details.
> - */
> -int PageHuge(const struct page *page)
> -{
> -	const struct folio *folio;
> -
> -	if (!PageCompound(page))
> -		return 0;
> -	folio = page_folio(page);
> -	return folio_test_hugetlb(folio);
> -}
> -EXPORT_SYMBOL_GPL(PageHuge);
> -
>  /*
>   * Find and lock address space (mapping) in write mode.
>   *