From: Jonathan Cameron
CC: Lorenzo Pieralisi, Bjorn Helgaas, Ingo Molnar, Tony Luck, Fenghua Yu,
    Thomas Gleixner, Dan Williams, Song Bao Hua, Jonathan Cameron
Subject: [PATCH v3 0/6] ACPI: Only create NUMA nodes from entries in SRAT or SRAT emulation.
Date: Tue, 18 Aug 2020 22:24:24 +0800
Message-ID: <20200818142430.1156547-1-Jonathan.Cameron@huawei.com>
X-Mailer: git-send-email 2.19.1
Sender: owner-linux-mm@kvack.org

This is a trivial rebase and resend of V2 now that the merge window has
closed.

Here, I will use the term Proximity Domains for the ACPI description and
NUMA Nodes for the in-kernel representation.

ACPI 6.3 included a clarification that only Static Resource Allocation
Structures in SRAT may define the existence of proximity domains
(sec 5.2.16) [1]. This clarification closed a possible interpretation
that other parts of ACPI (e.g.
DSDT _PXM, NFIT etc) could define new proximity domains that were not
also mentioned in SRAT structures.

In practice the kernel has never allowed this alternative interpretation,
as such nodes are only partially initialized. This is architecture
specific but, to take an example, on x86 alloc_node_data has not been
called. Any use of them for node-specific allocation will result in a
crash, as the infrastructure to fall back to a node with memory is not
set up.

We ran into a problem when enabling _PXM handling for PCI devices and
found there were boards out there advertising devices in proximity
domains that didn't exist [2].

The fix suggested in this series is to replace instances that should not
'create' new nodes with pxm_to_node. This function needs some additional
hardening against invalid inputs to make sure it is safe for use in
these new callers.

Patch 1 hardens pxm_to_node() against numa_off, and pxm entry being too
large.

Patches 2-4 change the various callers not related to SRAT entries so
that they use pxm_to_node(), and so do not attempt to initialize a new
NUMA node if the relevant one does not already exist.

Patch 5 is a function rename to reflect change in functionality of
acpi_map_pxm_to_online_node() as it no longer creates a new map, but just
does a lookup of existing maps.

Patch 6 covers the one place we do not allow the full flexibility defined
in the ACPI spec. For SRAT GIC Interrupt Translation Service (ITS)
Affinity Structures, on ARM64, the driver currently makes an additional
pass of SRAT later in the boot than the one used to identify NUMA
domains. Note, this currently means that an ITS placed in a proximity
domain that is not defined by another SRAT structure will result in a
crash.

To avoid this crash with minimal changes we do not create new NUMA nodes
based on this particular entry type. Any current platform trying to do
this will not boot, so this is an improvement, if perhaps not a perfect
solution.
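
For readers who want the shape of the hardening described for patch 1, a
minimal userspace sketch follows. It is not the code from the patch: the
_sketch names, the table size and the table contents are all illustrative
assumptions, and only the idea (refuse to translate when NUMA is off or
the proximity domain is out of range, and never create a node as a side
effect) is taken from the description above.

```c
#include <stdbool.h>

/* Illustrative values only; the kernel's real limits differ. */
#define MAX_PXM_DOMAINS_SKETCH 8
#define NUMA_NO_NODE_SKETCH    (-1)

/* Hypothetical stand-in for the kernel's "NUMA disabled" flag. */
bool numa_off_sketch;

/*
 * Hypothetical proximity-domain to node table. In this sketch only
 * domains 0 and 1 were described by SRAT; the rest have no node.
 */
int pxm_to_node_map_sketch[MAX_PXM_DOMAINS_SKETCH] = {
	0, 1,
	NUMA_NO_NODE_SKETCH, NUMA_NO_NODE_SKETCH, NUMA_NO_NODE_SKETCH,
	NUMA_NO_NODE_SKETCH, NUMA_NO_NODE_SKETCH, NUMA_NO_NODE_SKETCH,
};

/*
 * Pure lookup: rejects negative or too-large proximity domains and
 * returns "no node" when NUMA is off. It never allocates or maps a
 * new node, so callers outside SRAT parsing cannot partially create one.
 */
int pxm_to_node_sketch(int pxm)
{
	if (pxm < 0 || pxm >= MAX_PXM_DOMAINS_SKETCH || numa_off_sketch)
		return NUMA_NO_NODE_SKETCH;

	return pxm_to_node_map_sketch[pxm];
}
```

With this table, a device advertising proximity domain 5 gets
NUMA_NO_NODE_SKETCH back from pxm_to_node_sketch(5) rather than a
half-initialized node, which is the behaviour the non-SRAT callers rely
on.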

[1] Note in ACPI Specification 6.3 5.2.16 System Resource Affinity Table
    (SRAT)
[2] https://patchwork.kernel.org/patch/10597777/

Thanks to Bjorn Helgaas for review of v1 and Barry Song for internal
reviews that led to a slightly different approach for this v2.

Changes since v2.
* Trivial rebase to v5.9-rc1
* Collect up tags.

Changes since v1.
* Use pxm_to_node for what was previously the path using
  acpi_map_pxm_to_node with create==false. (Barry)
* Broke patch up into an initial noop stage followed by patches (Bjorn)
  to update each type of case in which partial creation of NUMA nodes is
  prevented.
* Added patch 5 to rename function to reflect change of functionality.
* Updated descriptions (now mostly in individual patches) in line with
  Bjorn's comments.

Jonathan Cameron (6):
  ACPI: Add out of bounds and numa_off protections to pxm_to_node
  ACPI: Do not create new NUMA domains from ACPI static tables that are
    not SRAT
  ACPI: Remove side effect of partly creating a node in
    acpi_map_pxm_to_online_node
  ACPI: Rename acpi_map_pxm_to_online_node to pxm_to_online_node
  ACPI: Remove side effect of partly creating a node in acpi_get_node
  irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without
    processor or memory

 drivers/acpi/arm64/iort.c        |  2 +-
 drivers/acpi/nfit/core.c         |  6 ++----
 drivers/acpi/numa/hmat.c         |  4 ++--
 drivers/acpi/numa/srat.c         |  4 ++--
 drivers/iommu/intel/dmar.c       |  2 +-
 drivers/irqchip/irq-gic-v3-its.c |  7 ++++++-
 include/linux/acpi.h             | 15 +++++++--------
 7 files changed, 21 insertions(+), 19 deletions(-)

-- 
2.19.1