From: Fan Ni
Date: Wed, 20 Nov 2024 19:20:26 +0000
To: Gregory Price
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
 linux-mm@kvack.org, linux-cxl@vger.kernel.org, kernel-team@meta.com,
 Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, rrichter@amd.com,
 Terry.Bowman@amd.com, dave.jiang@intel.com, ira.weiny@intel.com,
 alison.schofield@intel.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 hpa@zytor.com, rafael@kernel.org, lenb@kernel.org, david@redhat.com,
 osalvador@suse.de, gregkh@linuxfoundation.org, akpm@linux-foundation.org,
 rppt@kernel.org
Subject: Re: [PATCH v6 0/3] memory,x86,acpi: hotplug memory alignment advisement
References: <20241106155847.7985-1-gourry@gourry.net>
In-Reply-To: <20241106155847.7985-1-gourry@gourry.net>

On Wed, Nov 06, 2024 at 10:58:44AM
-0500, Gregory Price wrote:
> When physical address regions are not aligned to memory block size,
> the misaligned portion is lost (stranded capacity).
>
> Block size (min/max/selected) is architecture defined. Most architectures
> tend to use the minimum block size or some simplistic heuristic. On x86,
> memory block size increases up to 2GB, and is otherwise fitted to the
> alignment of non-hotplug (i.e. not special purpose) memory.
>
> CXL exposes its memory for management through the ACPI CEDT (CXL Early
> Discovery Table) in a field called the CXL Fixed Memory Window. Per
> the CXL specification, this memory must be aligned to at least 256MB.
>
> When a CFMW aligns on a size less than the block size, this causes a
> loss of up to 2GB per CFMW on x86. It is not uncommon for CFMW to be
> allocated per-device - though this behavior is BIOS defined.
>
> This patch set provides 3 things:
>  1) implement advise/query functions in drivers/base/memory.c to
>     report/query architecture agnostic hotplug block alignment advice.
>  2) update x86 memblock size logic to consider the hotplug advice
>  3) add code in acpi/numa/srat.c to report CFMW alignment advice
>
> The advisement interfaces are designed to be called during arch_init
> code prior to allocator and smp_init. start_kernel will call these
> through setup_arch() (via acpi and mm/init_64.c on x86), which occurs
> prior to mm_core_init and smp_init - so no need for atomics.
>
> There's an attempt to signal callers to advise() that query has already
> occurred, but this is predicated on the notion that query actually
> occurs (which presently only happens on the x86 arch). This is to
> assist debugging future users. Otherwise, the advise() call has
> been marked __init to help static discovery of bad call times.
>
> Once query is called the first time, it will always return the same value.
>
> Interfaces return -EBUSY and 0 respectively on systems without hotplug.
>
> v6:
> - boot_cpu_has -> cpu_feature_enabled() in x86 code
> - ack tags
>
> Suggested-by: Ira Weiny
> Suggested-by: David Hildenbrand
> Suggested-by: Dan Williams
> Signed-off-by: Gregory Price
>

Tested on a CXL server with a directly attached CXL device; works as expected.

Tested-by: Fan Ni

Fan

> Gregory Price (3):
>   memory: implement memory_block_advise/probe_max_size
>   x86: probe memory block size advisement value during mm init
>   acpi,srat: give memory block size advice based on CFMWS alignment
>
>  arch/x86/mm/init_64.c    | 15 ++++++++----
>  drivers/acpi/numa/srat.c | 12 ++++++++-
>  drivers/base/memory.c    | 53 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/memory.h   | 10 ++++++++
>  4 files changed, 84 insertions(+), 6 deletions(-)
>
> --
> 2.43.0
>

--
Fan Ni (From gmail)
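
For readers following the thread, the stranded-capacity arithmetic from the
cover letter can be sketched in userspace C. This is an illustration only and
is not taken from the patches: stranded_bytes() and largest_fitting_block()
are hypothetical helper names, and the assumption that a good advisory value
is the largest power-of-two block size dividing both the window base and its
size is mine, not something the cover letter states.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_256M (UINT64_C(256) << 20)
#define SZ_2G   (UINT64_C(2) << 30)

/* Round x down / up to a multiple of align (align must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: bytes of the window [base, base + size) that cannot
 * be covered by whole memory blocks of the given size, i.e. the capacity
 * that would be stranded.
 */
static uint64_t stranded_bytes(uint64_t base, uint64_t size, uint64_t block)
{
	assert(block != 0 && (block & (block - 1)) == 0);

	uint64_t first = align_up(base, block);          /* first whole block */
	uint64_t last = align_down(base + size, block);  /* end of last whole block */
	uint64_t usable = last > first ? last - first : 0;

	return size - usable;
}

/*
 * Hypothetical helper: largest power-of-two block size, capped at max_block
 * (itself a power of two), that divides both base and size, so that nothing
 * in the window is stranded.
 */
static uint64_t largest_fitting_block(uint64_t base, uint64_t size,
				      uint64_t max_block)
{
	uint64_t bits = base | size | max_block;

	return bits & (~bits + 1);  /* isolate the lowest set bit */
}
```

With a 4GB window based at 4GB + 256MB (0x110000000), a 2GB block size
strands 2GB of the window, a 256MB block size strands nothing, and
largest_fitting_block() with a 2GB cap returns 256MB.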