From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: linux-mm@kvack.org
Cc: Lorenzo Pieralisi, Bjorn Helgaas, Ingo Molnar, Thomas Gleixner,
	Dan Williams, Brice Goglin, Sean V Kelley, Jonathan Cameron
Subject: [PATCH v8 5/6] node: Add access1 class to represent CPU to memory characteristics
Date: Wed, 19 Aug 2020 01:04:16 +0800
Message-ID: <20200818170417.1515975-6-Jonathan.Cameron@huawei.com>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20200818170417.1515975-1-Jonathan.Cameron@huawei.com>
References: <20200818170417.1515975-1-Jonathan.Cameron@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain

The new access1 class is nearly the same as access0, but always provides
characteristics for CPUs to memory.  The existing access0 class provides
characteristics to the nearest or directly connected initiator, which may
be a Generic Initiator such as a GPU or network adapter.
This new class allows thread placement on CPUs to be performed so as to
give optimal access characteristics to memory, even if that memory is,
for example, attached to a GPU or similar and only accessible to the CPU
via an appropriate bus.

Suggested-by: Dan Williams
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/acpi/numa/hmat.c | 87 +++++++++++++++++++++++++++++++---------
 1 file changed, 68 insertions(+), 19 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 07cfe50136e0..00b4cdbefb5e 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -56,7 +56,7 @@ struct memory_target {
 	unsigned int memory_pxm;
 	unsigned int processor_pxm;
 	struct resource memregions;
-	struct node_hmem_attrs hmem_attrs;
+	struct node_hmem_attrs hmem_attrs[2];
 	struct list_head caches;
 	struct node_cache_attrs cache_attrs;
 	bool registered;
@@ -65,6 +65,7 @@ struct memory_target {
 struct memory_initiator {
 	struct list_head node;
 	unsigned int processor_pxm;
+	bool has_cpu;
 };
 
 struct memory_locality {
@@ -108,6 +109,7 @@ static __init void alloc_memory_initiator(unsigned int cpu_pxm)
 		return;
 
 	initiator->processor_pxm = cpu_pxm;
+	initiator->has_cpu = node_state(pxm_to_node(cpu_pxm), N_CPU);
 	list_add_tail(&initiator->node, &initiators);
 }
 
@@ -215,28 +217,28 @@ static u32 hmat_normalize(u16 entry, u64 base, u8 type)
 }
 
 static void hmat_update_target_access(struct memory_target *target,
-				      u8 type, u32 value)
+				      u8 type, u32 value, int access)
 {
 	switch (type) {
 	case ACPI_HMAT_ACCESS_LATENCY:
-		target->hmem_attrs.read_latency = value;
-		target->hmem_attrs.write_latency = value;
+		target->hmem_attrs[access].read_latency = value;
+		target->hmem_attrs[access].write_latency = value;
 		break;
 	case ACPI_HMAT_READ_LATENCY:
-		target->hmem_attrs.read_latency = value;
+		target->hmem_attrs[access].read_latency = value;
 		break;
 	case ACPI_HMAT_WRITE_LATENCY:
-		target->hmem_attrs.write_latency = value;
+		target->hmem_attrs[access].write_latency = value;
 		break;
 	case ACPI_HMAT_ACCESS_BANDWIDTH:
-		target->hmem_attrs.read_bandwidth = value;
-		target->hmem_attrs.write_bandwidth = value;
+		target->hmem_attrs[access].read_bandwidth = value;
+		target->hmem_attrs[access].write_bandwidth = value;
 		break;
 	case ACPI_HMAT_READ_BANDWIDTH:
-		target->hmem_attrs.read_bandwidth = value;
+		target->hmem_attrs[access].read_bandwidth = value;
 		break;
 	case ACPI_HMAT_WRITE_BANDWIDTH:
-		target->hmem_attrs.write_bandwidth = value;
+		target->hmem_attrs[access].write_bandwidth = value;
 		break;
 	default:
 		break;
@@ -329,8 +331,12 @@ static __init int hmat_parse_locality(union acpi_subtable_headers *header,
 
 			if (mem_hier == ACPI_HMAT_MEMORY) {
 				target = find_mem_target(targs[targ]);
-				if (target && target->processor_pxm == inits[init])
-					hmat_update_target_access(target, type, value);
+				if (target && target->processor_pxm == inits[init]) {
+					hmat_update_target_access(target, type, value, 0);
+					/* If the node has a CPU, update access 1 */
+					if (node_state(pxm_to_node(inits[init]), N_CPU))
+						hmat_update_target_access(target, type, value, 1);
+				}
 			}
 		}
 	}
@@ -566,6 +572,7 @@ static void hmat_register_target_initiators(struct memory_target *target)
 	unsigned int mem_nid, cpu_nid;
 	struct memory_locality *loc = NULL;
 	u32 best = 0;
+	bool access0done = false;
 	int i;
 
 	mem_nid = pxm_to_node(target->memory_pxm);
@@ -577,7 +584,11 @@ static void hmat_register_target_initiators(struct memory_target *target)
 	if (target->processor_pxm != PXM_INVAL) {
 		cpu_nid = pxm_to_node(target->processor_pxm);
 		register_memory_node_under_compute_node(mem_nid, cpu_nid, 0);
-		return;
+		access0done = true;
+		if (node_state(cpu_nid, N_CPU)) {
+			register_memory_node_under_compute_node(mem_nid, cpu_nid, 1);
+			return;
+		}
 	}
 
 	if (list_empty(&localities))
@@ -591,6 +602,40 @@ static void hmat_register_target_initiators(struct memory_target *target)
 	 */
 	bitmap_zero(p_nodes, MAX_NUMNODES);
 	list_sort(p_nodes, &initiators, initiator_cmp);
+	if (!access0done) {
+		for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) {
+			loc = localities_types[i];
+			if (!loc)
+				continue;
+
+			best = 0;
+			list_for_each_entry(initiator, &initiators, node) {
+				u32 value;
+
+				if (!test_bit(initiator->processor_pxm, p_nodes))
+					continue;
+
+				value = hmat_initiator_perf(target, initiator,
+							    loc->hmat_loc);
+				if (hmat_update_best(loc->hmat_loc->data_type, value, &best))
+					bitmap_clear(p_nodes, 0, initiator->processor_pxm);
+				if (value != best)
+					clear_bit(initiator->processor_pxm, p_nodes);
+			}
+			if (best)
+				hmat_update_target_access(target, loc->hmat_loc->data_type, best, 0);
+		}
+
+		for_each_set_bit(i, p_nodes, MAX_NUMNODES) {
+			cpu_nid = pxm_to_node(i);
+			register_memory_node_under_compute_node(mem_nid, cpu_nid, 0);
+		}
+	}
+
+	/* Access 1 ignores Generic Initiators */
+	bitmap_zero(p_nodes, MAX_NUMNODES);
+	list_sort(p_nodes, &initiators, initiator_cmp);
+	best = 0;
 	for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) {
 		loc = localities_types[i];
 		if (!loc)
@@ -600,6 +645,10 @@ static void hmat_register_target_initiators(struct memory_target *target)
 		list_for_each_entry(initiator, &initiators, node) {
 			u32 value;
 
+			if (!initiator->has_cpu) {
+				clear_bit(initiator->processor_pxm, p_nodes);
+				continue;
+			}
 			if (!test_bit(initiator->processor_pxm, p_nodes))
 				continue;
 
@@ -610,12 +659,11 @@ static void hmat_register_target_initiators(struct memory_target *target)
 			clear_bit(initiator->processor_pxm, p_nodes);
 		}
 		if (best)
-			hmat_update_target_access(target, loc->hmat_loc->data_type, best);
+			hmat_update_target_access(target, loc->hmat_loc->data_type, best, 1);
 	}
-
 	for_each_set_bit(i, p_nodes, MAX_NUMNODES) {
 		cpu_nid = pxm_to_node(i);
-		register_memory_node_under_compute_node(mem_nid, cpu_nid, 0);
+		register_memory_node_under_compute_node(mem_nid, cpu_nid, 1);
 	}
 }
 
@@ -628,10 +676,10 @@ static void hmat_register_target_cache(struct memory_target *target)
 		node_add_cache(mem_nid, &tcache->cache_attrs);
 }
 
-static void hmat_register_target_perf(struct memory_target *target)
+static void hmat_register_target_perf(struct memory_target *target, int access)
 {
 	unsigned mem_nid = pxm_to_node(target->memory_pxm);
-	node_set_perf_attrs(mem_nid, &target->hmem_attrs, 0);
+	node_set_perf_attrs(mem_nid, &target->hmem_attrs[access], access);
 }
 
 static void hmat_register_target_device(struct memory_target *target,
@@ -733,7 +781,8 @@ static void hmat_register_target(struct memory_target *target)
 	if (!target->registered) {
 		hmat_register_target_initiators(target);
 		hmat_register_target_cache(target);
-		hmat_register_target_perf(target);
+		hmat_register_target_perf(target, 0);
+		hmat_register_target_perf(target, 1);
 		target->registered = true;
 	}
 	mutex_unlock(&target_lock);
-- 
2.19.1