Date: Mon, 1 Aug 2022 10:10:39 +0530
From: Aneesh Kumar K V
Subject: Re: [PATCH v11 4/8] mm/demotion/dax/kmem: Set node's abstract distance to MEMTIER_ADISTANCE_PMEM
To: "Huang, Ying"
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, Wei Xu, Yang Shi,
 Davidlohr Bueso, Tim C Chen, Michal Hocko, Linux Kernel Mailing List,
 Hesham Almatary, Dave Hansen, Jonathan Cameron, Alistair Popple,
 Dan Williams, Johannes Weiner, jvgediya.oss@gmail.com
In-Reply-To: <87k07slnt7.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <20220728190436.858458-1-aneesh.kumar@linux.ibm.com>
 <20220728190436.858458-5-aneesh.kumar@linux.ibm.com>
 <875yjgmocg.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87bkt8s7w9.fsf@linux.ibm.com>
 <87k07slnt7.fsf@yhuang6-desk2.ccr.corp.intel.com>

On 8/1/22 7:36 AM, Huang, Ying wrote:
> "Aneesh Kumar K.V" writes:
>
>> "Huang, Ying" writes:
>>
>>> "Aneesh Kumar K.V" writes:
>>>
>>>> By default, all nodes are assigned to the default memory tier which
>>>> is the memory tier designated for nodes with DRAM
>>>>
>>>> Set dax kmem device node's tier to slower memory tier by assigning
>>>> abstract distance to MEMTIER_ADISTANCE_PMEM. PMEM tier
>>>> appears below the default memory tier in demotion order.
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V
>>>> ---
>>>>  drivers/dax/kmem.c           |  9 +++++++++
>>>>  include/linux/memory-tiers.h | 19 ++++++++++++++++++-
>>>>  mm/memory-tiers.c            | 28 ++++++++++++++++------------
>>>>  3 files changed, 43 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
>>>> index a37622060fff..6b0d5de9a3e9 100644
>>>> --- a/drivers/dax/kmem.c
>>>> +++ b/drivers/dax/kmem.c
>>>> @@ -11,6 +11,7 @@
>>>>  #include
>>>>  #include
>>>>  #include
>>>> +#include
>>>>  #include "dax-private.h"
>>>>  #include "bus.h"
>>>>
>>>> @@ -41,6 +42,12 @@ struct dax_kmem_data {
>>>>  	struct resource *res[];
>>>>  };
>>>>
>>>> +static struct memory_dev_type default_pmem_type = {
>>>
>>> Why is this named as default_pmem_type? We will not change the memory
>>> type of a node usually.
>>>
>>
>> Any other suggestion? pmem_dev_type?
>
> Or dax_pmem_type?
>
> DAX is used to enumerate the memory device.
>
>>
>>>> +	.adistance = MEMTIER_ADISTANCE_PMEM,
>>>> +	.tier_sibiling = LIST_HEAD_INIT(default_pmem_type.tier_sibiling),
>>>> +	.nodes = NODE_MASK_NONE,
>>>> +};
>>>> +
>>>>  static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
>>>>  {
>>>>  	struct device *dev = &dev_dax->dev;
>>>> @@ -62,6 +69,8 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
>>>>  		return -EINVAL;
>>>>  	}
>>>>
>>>> +	init_node_memory_type(numa_node, &default_pmem_type);
>>>> +
>>>
>>> The memory hot-add below may fail. So the error handling needs to be
>>> added.
>>>
>>> And, it appears that the memory type and memory tier of a node may be
>>> fully initialized here before NUMA hot-adding started. So I suggest to
>>> set node_memory_types[] here only. And set memory_dev_type->nodes in
>>> node hot-add callback. I think there is the proper place to complete
>>> the initialization.
>>>
>>> And, in theory dax/kmem.c can be unloaded. So we need to clear
>>> node_memory_types[] for nodes somewhere.
>>>
>>
>> I guess by module exit we can be sure that all the memory managed
>> by dax/kmem is hotplugged out. How about something like below?
>
> Because we set node_memorty_types[] in dev_dax_kmem_probe(), it's
> natural to clear it in dev_dax_kmem_remove().
>

Most of the required reset/clear is done as part of memory hot-unplug.
So if we did manage to unplug the memory successfully, everything except
node_memory_types[node] should already be reset. That makes
clear_node_memory_type() look like the below.

void clear_node_memory_type(int node, struct memory_dev_type *memtype)
{
	mutex_lock(&memory_tier_lock);
	/*
	 * Only clear the entry if memory unplug already removed the node
	 * from the memtype and dax/kmem was the one that initialized this
	 * node's memory type.
	 */
	if (!node_isset(node, memtype->nodes) &&
	    node_memory_types[node] == memtype)
		node_memory_types[node] = NULL;
	mutex_unlock(&memory_tier_lock);
}

Module unload effectively force-removes all usage of the specific
memtype. Considering that module unload removes the usage of that memtype
from other parts of the kernel, and that we already do all the required
reset in memory hot-unplug, do we need the clear_node_memory_type()
above at all?

-aneesh
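
For illustration only, the condition in clear_node_memory_type() can be exercised in a small userspace model. This is a sketch under stated assumptions, not kernel code: the struct layout, MAX_NODES, the bitmask standing in for nodemask_t, and the try_clear() helper are all hypothetical, and the memory_tier_lock locking is left out. It only shows that the slot is cleared when the node is no longer in memtype->nodes and the slot still points at the caller's memtype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8	/* hypothetical; the kernel sizes this by MAX_NUMNODES */

/* Hypothetical userspace stand-in for the kernel's struct memory_dev_type. */
struct memory_dev_type {
	unsigned long nodes;	/* bitmask standing in for nodemask_t */
};

static struct memory_dev_type *node_memory_types[MAX_NODES];

static bool node_isset(int node, unsigned long mask)
{
	return (mask >> node) & 1UL;
}

/*
 * Same condition as the function in the mail. The kernel version runs
 * under memory_tier_lock, which this single-threaded model omits.
 */
static void clear_node_memory_type(int node, struct memory_dev_type *memtype)
{
	assert(node >= 0 && node < MAX_NODES);
	if (!node_isset(node, memtype->nodes) &&
	    node_memory_types[node] == memtype)
		node_memory_types[node] = NULL;
}

/*
 * Hypothetical helper: set up one scenario and report whether the slot
 * for @node is NULL after the call.
 */
static bool try_clear(int node, struct memory_dev_type *owner,
		      struct memory_dev_type *caller, bool still_plugged)
{
	node_memory_types[node] = owner;
	caller->nodes = still_plugged ? (1UL << node) : 0;
	clear_node_memory_type(node, caller);
	return node_memory_types[node] == NULL;
}
```

In this model the slot is cleared only after a successful unplug by the memtype that owns it. A node that is still plugged in, or a slot that belongs to a different memtype, is left untouched.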