From: "Huang, Ying"
To: Vishal Verma
Cc: "Rafael J. Wysocki", Len Brown, Andrew Morton, David Hildenbrand,
 Oscar Salvador, Dan Williams, Dave Jiang, Dave Hansen
Subject: Re: [PATCH 1/3] mm/memory_hotplug: Allow an override for the memmap_on_memory param
References: <20230613-vv-kmem_memmap-v1-0-f6de9c6af2c6@intel.com>
 <20230613-vv-kmem_memmap-v1-1-f6de9c6af2c6@intel.com>
Date: Fri, 16 Jun 2023 14:35:32 +0800
In-Reply-To: <20230613-vv-kmem_memmap-v1-1-f6de9c6af2c6@intel.com> (Vishal Verma's message of "Thu, 15 Jun 2023 16:00:23 -0600")
Message-ID: <874jn7gbij.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii

Hi, Vishal,

Thanks for your patch!

Vishal Verma writes:

> For memory hotplug to consider MHP_MEMMAP_ON_MEMORY behavior, the
> 'memmap_on_memory' module parameter was a hard requirement.
>
> In preparation for the dax/kmem driver to use memmap_on_memory
> semantics, arrange for the module parameter check to be bypassed via the
> appropriate mhp_flag.
>
> Recall that the kmem driver could contribute huge amounts of hotplugged
> memory originating from special purposes devices such as CXL memory
> expanders. In some cases memmap_on_memory may be the /only/ way this new
> memory can be hotplugged. Hence it makes sense for kmem to have a way to
> force memmap_on_memory without depending on a module param, if all the
> other conditions for it are met.
>
> The only other user of this interface is acpi/acpi_memoryhotplug.c,
> which only enables the mhp_flag if an initial
> mhp_supports_memmap_on_memory() test passes. Maintain the existing
> behavior and semantics for this by performing the initial check from
> acpi without the MHP_MEMMAP_ON_MEMORY flag, so its decision falls back
> to the module parameter.
>
> Cc: "Rafael J. Wysocki"
> Cc: Len Brown
> Cc: Andrew Morton
> Cc: David Hildenbrand
> Cc: Oscar Salvador
> Cc: Dan Williams
> Cc: Dave Jiang
> Cc: Dave Hansen
> Cc: Huang Ying
> Signed-off-by: Vishal Verma
> ---
>  include/linux/memory_hotplug.h |  2 +-
>  drivers/acpi/acpi_memhotplug.c |  2 +-
>  mm/memory_hotplug.c            | 24 ++++++++++++++++--------
>  3 files changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 9fcbf5706595..c9ddcd3cad70 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -358,7 +358,7 @@ extern struct zone *zone_for_pfn_range(int online_type, int nid,
>  extern int arch_create_linear_mapping(int nid, u64 start, u64 size,
>  				      struct mhp_params *params);
>  void arch_remove_linear_mapping(u64 start, u64 size);
> -extern bool mhp_supports_memmap_on_memory(unsigned long size);
> +extern bool mhp_supports_memmap_on_memory(unsigned long size, mhp_t mhp_flags);
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>
>  #endif /* __LINUX_MEMORY_HOTPLUG_H */
> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> index 24f662d8bd39..119d3bb49753 100644
> --- a/drivers/acpi/acpi_memhotplug.c
> +++ b/drivers/acpi/acpi_memhotplug.c
> @@ -211,7 +211,7 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
>  		if (!info->length)
>  			continue;
>
> -		if (mhp_supports_memmap_on_memory(info->length))
> +		if (mhp_supports_memmap_on_memory(info->length, 0))
>  			mhp_flags |= MHP_MEMMAP_ON_MEMORY;
>  		result = __add_memory(mgid, info->start_addr, info->length,
>  				      mhp_flags);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 8e0fa209d533..bb3845830922 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1283,15 +1283,21 @@ static int online_memory_block(struct memory_block *mem, void *arg)
>  	return device_online(&mem->dev);
>  }
>
> -bool mhp_supports_memmap_on_memory(unsigned long size)
> +bool mhp_supports_memmap_on_memory(unsigned long size, mhp_t mhp_flags)
>  {
>  	unsigned long nr_vmemmap_pages = size / PAGE_SIZE;
>  	unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
>  	unsigned long remaining_size = size - vmemmap_size;
>
>  	/*
> -	 * Besides having arch support and the feature enabled at runtime, we
> -	 * need a few more assumptions to hold true:
> +	 * The MHP_MEMMAP_ON_MEMORY flag indicates a caller that wants to force
> +	 * memmap_on_memory (if other conditions are met), regardless of the
> +	 * module parameter. drivers/dax/kmem.c is an example, where large
> +	 * amounts of hotplug memory may come from, and the only option to
> +	 * successfully online all of it is to place the memmap on this memory.
> +	 *
> +	 * Besides having arch support and the feature enabled at runtime or
> +	 * via the mhp_flag, we need a few more assumptions to hold true:
>  	 *
>  	 * a) We span a single memory block: memory onlining/offlining happens
>  	 *    in memory block granularity. We don't want the vmemmap of online
> @@ -1315,10 +1321,12 @@ bool mhp_supports_memmap_on_memory(unsigned long size)
>  	 *       altmap as an alternative source of memory, and we do not exactly
>  	 *       populate a single PMD.
>  	 */
> -	return mhp_memmap_on_memory() &&
> -	       size == memory_block_size_bytes() &&
> -	       IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
> -	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
> +
> +	if ((mhp_flags & MHP_MEMMAP_ON_MEMORY) || mhp_memmap_on_memory())
> +		return size == memory_block_size_bytes() &&
> +		       IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
> +		       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
> +	return false;
>  }
>
>  /*
> @@ -1375,7 +1383,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
>  	 * Self hosted memmap array
>  	 */
>  	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
> -		if (!mhp_supports_memmap_on_memory(size)) {
> +		if (!mhp_supports_memmap_on_memory(size, mhp_flags)) {
>  			ret = -EINVAL;
>  			goto error;
>  		}

It appears that we need to deal with the hot-remove path too.
try_remove_memory() calls mhp_memmap_on_memory() and only handles
MHP_MEMMAP_ON_MEMORY properly if it returns true. So memory that was
added with the flag forced while the module parameter is disabled would
not get that handling when it is removed.

Best Regards,
Huang, Ying
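
To make the add/remove asymmetry concrete, here is a small user-space
sketch of the logic discussed above. It is not kernel code: every
model_* name, the MODEL_* constants (4 KiB pages, 2 MiB PMD and
pageblock, 64-byte struct page, 128 MiB memory block) and the flag bit
value are assumptions picked for illustration, and the remove-side
helper only mirrors the mhp_memmap_on_memory() gate described for
try_remove_memory(), not its real body.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real ones depend on arch and config. */
#define MODEL_PAGE_SIZE            4096UL
#define MODEL_PMD_SIZE             (2UL << 20)
#define MODEL_PAGEBLOCK_BYTES      (2UL << 20)   /* pageblock_nr_pages << PAGE_SHIFT */
#define MODEL_STRUCT_PAGE_SIZE     64UL          /* sizeof(struct page) */
#define MODEL_MEMORY_BLOCK_SIZE    (128UL << 20) /* memory_block_size_bytes() */
#define MODEL_MHP_MEMMAP_ON_MEMORY (1U << 1)     /* placeholder bit, not taken from the patch */

/* Stands in for the 'memmap_on_memory' module parameter. */
static bool model_memmap_on_memory_param;

static bool model_is_aligned(unsigned long x, unsigned long a)
{
	return (x & (a - 1)) == 0;
}

/* Add side after the patch: the flag OR the parameter opens the gate. */
static bool model_supports_memmap_on_memory(unsigned long size,
					    unsigned int mhp_flags)
{
	unsigned long nr_vmemmap_pages = size / MODEL_PAGE_SIZE;
	unsigned long vmemmap_size = nr_vmemmap_pages * MODEL_STRUCT_PAGE_SIZE;
	unsigned long remaining_size = size - vmemmap_size;

	if (!((mhp_flags & MODEL_MHP_MEMMAP_ON_MEMORY) ||
	      model_memmap_on_memory_param))
		return false;

	return size == MODEL_MEMORY_BLOCK_SIZE &&
	       model_is_aligned(vmemmap_size, MODEL_PMD_SIZE) &&
	       model_is_aligned(remaining_size, MODEL_PAGEBLOCK_BYTES);
}

/* Remove side as described in the review: gated on the parameter alone. */
static bool model_remove_path_handles_memmap(void)
{
	return model_memmap_on_memory_param;
}

/* Parameter off, flag forced: add succeeds, remove skips the handling. */
static void model_selfcheck(void)
{
	bool saved = model_memmap_on_memory_param;

	model_memmap_on_memory_param = false;
	assert(model_supports_memmap_on_memory(MODEL_MEMORY_BLOCK_SIZE,
					       MODEL_MHP_MEMMAP_ON_MEMORY));
	assert(!model_remove_path_handles_memmap());
	model_memmap_on_memory_param = saved;
}
```

With the parameter off, a caller that passes the flag gets its memmap
placed on the hot-added block, while the remove-side gate still reports
false. That is the mismatch the hot-remove path would need to close,
for example by keying the remove-side decision on how the block was
added rather than on the parameter.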