Date: Mon, 3 Mar 2025 16:00:08 +0000
From: Brendan Jackman
To: David Hildenbrand
Cc: Andrew Morton, Oscar Salvador, Johannes Weiner, Vlastimil Babka,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/page_alloc: Add lockdep assertion for pageblock type change
References: <20250303-pageblock-lockdep-v2-1-3fc0c37e9532@google.com>
 <4d0f0bca-3096-4fb4-9e8b-d4dcdf7eeb92@redhat.com>
 <3e66875e-a4d5-4802-85b3-f873b0aa3b06@redhat.com>
In-Reply-To: <3e66875e-a4d5-4802-85b3-f873b0aa3b06@redhat.com>

On Mon, Mar 03, 2025 at 03:06:54PM +0100, David Hildenbrand wrote:
> On 03.03.25 14:55, Brendan Jackman wrote:
> > On Mon, Mar 03, 2025 at 02:11:23PM +0100, David Hildenbrand wrote:
> > > On 03.03.25 13:13, Brendan Jackman wrote:
> > > > Since the migratetype hygiene patches [0], the locking here is
> > > > a bit more formalised.
> > > >
> > > > For other stuff, it's pretty obvious that it would be protected by the
> > > > zone lock. But it didn't seem totally self-evident that it should
> > > > protect the pageblock type. So it seems particularly helpful to have it
> > > > written in the code.
> > >
> > > [...]
> > >
> > > > +
> > > >  u64 max_mem_size = U64_MAX;
> > > >  /* add this memory to iomem resource */
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index 579789600a3c7bfb7b0d847d51af702a9d4b139a..1ed21179676d05c66f77f9dbebf88e36bbe402e9 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -417,6 +417,10 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
> > > >  void set_pageblock_migratetype(struct page *page, int migratetype)
> > > >  {
> > > > +	lockdep_assert_once(system_state == SYSTEM_BOOTING ||
> > > > +		in_mem_hotplug() ||
> > > > +		lockdep_is_held(&page_zone(page)->lock));
> > > > +
> > > >
> > > I assume the call chain on the memory hotplug path is mostly
> > >
> > > move_pfn_range_to_zone()->memmap_init_range()->set_pageblock_migratetype()
> > >
> > > either when onlining a memory block, or from pagemap_range() while holding
> > > the hotplug lock.
> > >
> > > But there is also the memmap_init_zone_device()->memmap_init_compound()->__init_zone_device_page()->set_pageblock_migratetype()
> > > one, called from pagemap_range() *without* holding the hotplug lock, and you
> > > assertion would be missing that.
> > >
> > > I'm not too happy about that assertion in general.
> >
> > Hmm, thanks for pointing that out.
> >
> > I guess if we really wanted the assertion the approach would be to
> > replace in_mem_hotplug() with some more fine-grained logic about the
> > state of the pageblock? But that seems like it would require rework
> > that isn't really justified.
>
> I was wondering if we could just grab the zone lock while initializing, then
> assert that we either hold that or are in boot.

Would that be because you want to avoid creating in_mem_hotplug()? Or
is it more about just simplifying the synchronization in general?
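
To make the two candidate rules concrete, here is a rough userspace
sketch, not kernel code. Everything in it (struct zone_sketch, struct
ctx_sketch, both predicate functions) is made up for illustration, and
the lock_held field merely stands in for what lockdep_is_held() would
report:

```c
#include <assert.h>   /* assert(), only needed when exercising the sketch */
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are real kernel types. */
enum system_state_sketch { STATE_BOOTING, STATE_RUNNING };

struct zone_sketch {
	bool lock_held;		/* models lockdep_is_held(&zone->lock) */
};

struct ctx_sketch {
	enum system_state_sketch state;
	bool in_mem_hotplug;	/* models the proposed in_mem_hotplug() */
};

/* Rule from the v2 patch: boot, or memory hotplug, or zone lock held. */
static bool may_set_migratetype_v2(const struct ctx_sketch *ctx,
				   const struct zone_sketch *zone)
{
	return ctx->state == STATE_BOOTING ||
	       ctx->in_mem_hotplug ||
	       zone->lock_held;
}

/* Suggested alternative: boot, or zone lock held (hotplug takes the lock). */
static bool may_set_migratetype_alt(const struct ctx_sketch *ctx,
				    const struct zone_sketch *zone)
{
	return ctx->state == STATE_BOOTING || zone->lock_held;
}
```

With the state set to running, the hotplug flag clear and the lock not
held (roughly the memmap_init_zone_device() case described above), both
predicates come out false, so either form of the assertion would fire
there. The two only differ for a hotplug caller that does not take the
zone lock.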

FWIW I don't think in_mem_hotplug() is really that bad in the
assertion; it feels natural to me that memory hotplug would be an
exception to the locking rules in the same way that startup would be.

> In move_pfn_range_to_zone() it should likely not cause too much harm, and we
> could just grab it around all zone modification stuff.
>
> memmap_init_zone_device() might take longer and be more problematic.
>
> But I am not sure why memmap_init_zone_device() would have to set the
> migratetype at all? Because migratetype is a buddy concept, and
> ZONE_DEVICE does not interact with the buddy to that degree.
>
> The comment in __init_zone_device_page states:
>
> "Mark the block movable so that blocks are reserved for movable at
> startup. This will force kernel allocations to reserve their blocks
> rather than leaking throughout the address space during boot when
> many long-lived kernel allocations are made."

Uh, yeah I was pretty mystified by that. It would certainly be nice if
we could just get rid of this modification path.

> But that just dates back to 966cf44f637e where we copy-pasted that code.
>
> So I wonder if we could just
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 57933683ed0d1..b95f545846e6e 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1002,19 +1002,11 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
>  	page->zone_device_data = NULL;
>  	/*
> -	 * Mark the block movable so that blocks are reserved for
> -	 * movable at startup. This will force kernel allocations
> -	 * to reserve their blocks rather than leaking throughout
> -	 * the address space during boot when many long-lived
> -	 * kernel allocations are made.
> -	 *
> -	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
> -	 * because this is done early in section_activate()
> +	 * Note that we leave pageblock migratetypes uninitialized, because
> +	 * they don't apply to ZONE_DEVICE.
>  	 */
> -	if (pageblock_aligned(pfn)) {
> -		set_pageblock_migratetype(page, MIGRATE_MOVABLE);
> +	if (pageblock_aligned(pfn))
>  		cond_resched();
> -	}
>  	/*
>  	 * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC are released

memory-model.rst says:

> Since the
> page reference count never drops below 1 the page is never tracked as
> free memory and the page's `struct list_head lru` space is repurposed
> for back referencing to the host device / driver that mapped the memory.

And this code seems to assume that the whole pageblock is part of the
ZONE_DEVICE dance; it would certainly make sense to me...
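
For what it's worth, here is a rough userspace sketch of how the init
loop would behave with that change. Again this is not kernel code:
PAGEBLOCK_NR_PAGES_SKETCH = 512 is an assumed value, and the function
names are invented. The pageblock boundary is only used as a point to
yield, and no migratetype gets written:

```c
#include <assert.h>   /* assert(), only needed when exercising the sketch */
#include <stdbool.h>

/* Assumed pageblock size in pages; the real value depends on the config. */
#define PAGEBLOCK_NR_PAGES_SKETCH 512UL

/* Models pageblock_aligned(): true on the first pfn of each pageblock. */
static bool pageblock_aligned_sketch(unsigned long pfn)
{
	return (pfn % PAGEBLOCK_NR_PAGES_SKETCH) == 0;
}

/*
 * Hypothetical init loop over [start_pfn, end_pfn). Returns how many
 * times it would yield, i.e. where the kernel would call cond_resched().
 * It deliberately writes no migratetype.
 */
static unsigned long init_device_range_sketch(unsigned long start_pfn,
					      unsigned long end_pfn)
{
	unsigned long yields = 0;

	for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++) {
		/* per-page initialisation would go here */
		if (pageblock_aligned_sketch(pfn))
			yields++;
	}
	return yields;
}
```

Under those assumptions, a range covering pfns 0 to 2047 yields four
times, once at the start of each pageblock, and pageblock state is
never touched.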