Date: Tue, 21 Nov 2023 14:13:20 +0100
From: Sumanth Korikkar <sumanthk@linux.ibm.com>
To: David Hildenbrand
Cc: Gerald Schaefer, linux-mm, Andrew Morton, Oscar Salvador, Michal Hocko,
    "Aneesh Kumar K.V", Anshuman Khandual, Alexander Gordeev, Heiko Carstens,
    Vasily Gorbik, linux-s390, LKML
Subject: Re: [PATCH 0/8] implement "memmap on memory" feature on s390
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
 <20231117140009.5d8a509c@thinkpad-T15>

On Fri, Nov 17, 2023 at 04:37:29PM +0100, David Hildenbrand wrote:
> >
> > Maybe there is also already a common code bug with that, s390 might be
> > special but that is often also good for finding bugs in common code ...
>
> If it's only the page_init_poison() as noted by Sumanth, we could disable
> that on s390x with an altmap some way or the other; should be possible.
>
> I mean, you effectively have your own poisoning if the altmap is effectively
> inaccessible and makes your CPU angry on access :)
>
> Last but not least, support for an inaccessible altmap might come in handy
> for virtio-mem eventually, and make altmap support eventually simpler. So
> added bonus points.

We tried out two possibilities for dealing with vmemmap altmap
inaccessibility.

Approach 1: Add MHP_ALTMAP_INACCESSIBLE flag and pass it in add_memory()
===========

diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c
index 075094ca59b4..ab2dfcc7e9e4 100644
--- a/drivers/s390/char/sclp_cmd.c
+++ b/drivers/s390/char/sclp_cmd.c
@@ -358,6 +358,13 @@ static int sclp_mem_notifier(struct notifier_block *nb,
 		 * buddy allocator later.
 		 */
 		__arch_set_page_nodat((void *)__va(start), memory_block->altmap->free);
+		/*
+		 * Poison the struct pages after the memory block is accessible.
+		 * This is needed only for altmap. Without altmap, the struct
+		 * pages are poisoned in sparse_add_section().
+		 */
+		if (memory_block->altmap->inaccessible)
+			page_init_poison(pfn_to_page(arg->start_pfn), sizeof(struct page) * memory_block->altmap->free);
 		break;
 	case MEM_FINISH_OFFLINE:
 		sclp_mem_change_state(start, size, 0);
@@ -412,7 +419,7 @@ static void __init add_memory_merged(u16 rn)
 		goto skip_add;
 	for (addr = start; addr < start + size; addr += block_size)
 		add_memory(0, addr, block_size,
-			   MACHINE_HAS_EDAT1 ? MHP_MEMMAP_ON_MEMORY : MHP_NONE);
+			   MACHINE_HAS_EDAT1 ? MHP_MEMMAP_ON_MEMORY | MHP_ALTMAP_INACCESSIBLE : MHP_NONE);
 skip_add:
 	first_rn = rn;
 	num = 1;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7d2076583494..5c70707e706f 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -106,6 +106,11 @@ typedef int __bitwise mhp_t;
  * implies the node id (nid).
  */
 #define MHP_NID_IS_MGID		((__force mhp_t)BIT(2))
+/*
+ * Mark memmap on memory (struct pages array) as inaccessible during memory
+ * hotplug addition phase.
+ */
+#define MHP_ALTMAP_INACCESSIBLE	((__force mhp_t)BIT(3))
 
 /*
  * Extended parameters for memory hotplug:
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 744c830f4b13..9837f3e6fb95 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -25,6 +25,7 @@ struct vmem_altmap {
 	unsigned long free;
 	unsigned long align;
 	unsigned long alloc;
+	bool inaccessible;
 };
 
 /*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7a5fc89a8652..d8299853cdcc 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1439,6 +1439,8 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
 		if (mhp_supports_memmap_on_memory(size)) {
 			mhp_altmap.free = memory_block_memmap_on_memory_pages();
+			if (mhp_flags & MHP_ALTMAP_INACCESSIBLE)
+				mhp_altmap.inaccessible = true;
 			params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL);
 			if (!params.altmap) {
 				ret = -ENOMEM;
diff --git a/mm/sparse.c b/mm/sparse.c
index 77d91e565045..3991c717b769 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -907,7 +907,8 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
 	 */
-	page_init_poison(memmap, sizeof(struct page) * nr_pages);
+	if (!altmap || !altmap->inaccessible)
+		page_init_poison(memmap, sizeof(struct page) * nr_pages);
 
 	ms = __nr_to_section(section_nr);
 	set_section_nid(section_nr, nid);
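
Just to illustrate the intended ordering of approach 1 (this snippet is not
part of the patch, and the my_dev_* names are made up): a hypothetical
non-s390 user would request the self-hosted memmap together with the new
flag at add_memory() time, and then poison the memmap itself once the range
has become accessible, similar to what the sclp_cmd.c hunk above does:

#include <linux/memory.h>		/* struct memory_block */
#include <linux/memory_hotplug.h>	/* add_memory(), MHP_* flags */
#include <linux/mm.h>			/* page_init_poison(), pfn_to_page() */

/* Hot-add a range whose self-hosted memmap is not accessible yet. */
static int my_dev_add_range(int nid, u64 start, u64 size)
{
	return add_memory(nid, start, size,
			  MHP_MEMMAP_ON_MEMORY | MHP_ALTMAP_INACCESSIBLE);
}

/*
 * Later, e.g. from the driver's memory notifier, once the range has been
 * made accessible: poison the struct pages living in the altmap.
 */
static void my_dev_poison_memmap(struct memory_block *memory_block,
				 unsigned long start_pfn)
{
	page_init_poison(pfn_to_page(start_pfn),
			 sizeof(struct page) * memory_block->altmap->free);
}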

Approach 2:
===========

Shouldn't the kasan zero shadow mapping be performed first, before
accessing/initializing the memmap via page_init_poison()? If that is true,
then it is a problem for all architectures and could be fixed like this:

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7a5fc89a8652..eb3975740537 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1093,6 +1093,7 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
 	if (ret)
 		return ret;
 
+	page_init_poison(pfn_to_page(pfn), sizeof(struct page) * nr_pages);
 	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_UNMOVABLE);
 
 	for (i = 0; i < nr_pages; i++)
diff --git a/mm/sparse.c b/mm/sparse.c
index 77d91e565045..4ddf53f52075 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -906,8 +906,11 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 	/*
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
+	 * For altmap, do this later when onlining the memory, as it might
+	 * not be accessible at this point.
 	 */
-	page_init_poison(memmap, sizeof(struct page) * nr_pages);
+	if (!altmap)
+		page_init_poison(memmap, sizeof(struct page) * nr_pages);
 
 	ms = __nr_to_section(section_nr);
 	set_section_nid(section_nr, nid);

Also, if this approach is taken, should page_init_poison() be performed with
cond_resched(), as mentioned in commit d33695b16a9f ("mm/memory_hotplug:
poison memmap in remove_pfn_range_from_zone()")? A rough sketch of what that
could look like is appended below.

Opinions?

Thank you
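
P.S.: A rough, untested sketch of the chunked poisoning mentioned above (the
memmap_init_poison_chunked() helper name is made up, and pageblock-sized
chunks are only an assumption about a reasonable chunk size):

#include <linux/minmax.h>	/* min() */
#include <linux/mm.h>		/* page_init_poison(), pfn_to_page() */
#include <linux/mmzone.h>	/* pageblock_nr_pages */
#include <linux/sched.h>	/* cond_resched() */

/* Poison nr_pages worth of struct pages, rescheduling between chunks. */
static void memmap_init_poison_chunked(unsigned long pfn,
				       unsigned long nr_pages)
{
	unsigned long cur_nr_pages;

	for (; nr_pages; pfn += cur_nr_pages, nr_pages -= cur_nr_pages) {
		cur_nr_pages = min(nr_pages, pageblock_nr_pages);
		page_init_poison(pfn_to_page(pfn),
				 sizeof(struct page) * cur_nr_pages);
		cond_resched();
	}
}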