From: Marco Elver
Date: Mon, 15 Jan 2024 21:34:35 +0100
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
To: Alexander Potapenko
Cc: quic_charante@quicinc.com, akpm@linux-foundation.org,
 aneesh.kumar@linux.ibm.com, dan.j.williams@intel.com, david@redhat.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 mgorman@techsingularity.net, osalvador@suse.de, vbabka@suse.cz,
 "Paul E. McKenney", Dmitry Vyukov, kasan-dev@googlegroups.com,
 Ilya Leoshkevich, Nicholas Miehlbradt
References: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com>
 <20240115184430.2710652-1-glider@google.com>
In-Reply-To: <20240115184430.2710652-1-glider@google.com>

On Mon, 15 Jan 2024 at 19:44, Alexander Potapenko wrote:
>
> Cc: "Paul E. McKenney"
> Cc: Marco Elver
> Cc: Dmitry Vyukov
> Cc: kasan-dev@googlegroups.com
> Cc: Ilya Leoshkevich
> Cc: Nicholas Miehlbradt
>
> Hi folks,
>
> (adding KMSAN reviewers and IBM people who are currently porting KMSAN to other
> architectures, plus Paul for his opinion on refactoring RCU)
>
> this patch broke x86 KMSAN in a subtle way.
>
> For every memory access in the code instrumented by KMSAN we call
> kmsan_get_metadata() to obtain the metadata for the memory being accessed. For
> virtual memory the metadata pointers are stored in the corresponding `struct
> page`, therefore we need to call virt_to_page() to get them.
>
> According to the comment in arch/x86/include/asm/page.h, virt_to_page(kaddr)
> returns a valid pointer iff virt_addr_valid(kaddr) is true, so KMSAN needs to
> call virt_addr_valid() as well.
>
> To avoid recursion, kmsan_get_metadata() must not call instrumented code,
> therefore ./arch/x86/include/asm/kmsan.h forks parts of arch/x86/mm/physaddr.c
> to check whether a virtual address is valid or not.
>
> But the introduction of rcu_read_lock() to pfn_valid() added instrumented RCU
> API calls to virt_to_page_or_null(), which is called by kmsan_get_metadata(),
> so there is an infinite recursion now. I do not think it is correct to stop that
> recursion by doing kmsan_enter_runtime()/kmsan_exit_runtime() in
> kmsan_get_metadata(): that would prevent instrumented functions called from
> within the runtime from tracking the shadow values, which might introduce false
> positives.
>
> I am currently looking into inlining __rcu_read_lock()/__rcu_read_unlock(), into
> KMSAN code to prevent it from being instrumented, but that might require factoring
> out parts of kernel/rcu/tree_plugin.h into a non-private header. Do you think this
> is feasible?

__rcu_read_lock/unlock() is only outlined in PREEMPT_RCU. Not sure that helps.

Otherwise, there is rcu_read_lock_sched_notrace() which does the bare
minimum and is static inline.

Does that help?

> Another option is to cut some edges in the code calling virt_to_page(). First,
> my observation is that virt_addr_valid() is quite rare in the kernel code, i.e.
> not all cases of calling virt_to_page() are covered with it. Second, every
> memory access to KMSAN metadata residing in virt_to_page(kaddr)->shadow always
> accompanies an access to `kaddr` itself, so if there is a race on a PFN then
> the access to `kaddr` will probably also trigger a fault. Third, KMSAN metadata
> accesses are inherently non-atomic, and even if we ensure pfn_valid() is
> returning a consistent value for a single memory access, calling it twice may
> already return different results.
>
> Considering the above, how bad would it be to drop synchronization for KMSAN's
> version of pfn_valid() called from kmsan_virt_addr_valid()?
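
Coming back to the recursion described earlier in the quoted mail: below is a
userspace toy model of why a static inline, non-instrumented helper would break
the cycle. It is only a sketch, not kernel code. All model_* names and
DEPTH_CAP are hypothetical, and the cap exists only so that the "instrumented"
case terminates instead of overflowing the stack.

```c
#include <assert.h>

/* Toy userspace model; every name here is hypothetical, not kernel API. */

#define DEPTH_CAP 8	/* stop the demo instead of overflowing the stack */

static int depth;	/* current nesting of the metadata hook */
static int max_depth;	/* deepest nesting seen in one run */

static void model_get_metadata(int use_inline_lock);

/*
 * Stands in for an instrumented, out-of-line lock call: instrumentation
 * would invoke the metadata hook for the memory it touches.
 */
static void model_instrumented_lock(int use_inline_lock)
{
	model_get_metadata(use_inline_lock);
}

/*
 * Stands in for a static inline, non-instrumented helper in the spirit of
 * rcu_read_lock_sched_notrace(): it never re-enters the hook.
 */
static inline void model_notrace_lock(void)
{
}

/* Stands in for kmsan_get_metadata(): it takes the lock before the lookup. */
static void model_get_metadata(int use_inline_lock)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;

	if (use_inline_lock)
		model_notrace_lock();
	else if (depth < DEPTH_CAP)
		model_instrumented_lock(use_inline_lock);

	depth--;
}

/* Run one modelled memory access and report how deep the hook nested. */
static int model_max_nesting(int use_inline_lock)
{
	depth = 0;
	max_depth = 0;
	model_get_metadata(use_inline_lock);
	assert(depth == 0);
	return max_depth;
}
```

In this model, model_max_nesting(0) nests until it hits DEPTH_CAP, while
model_max_nesting(1) enters the hook exactly once. Whether the real
rcu_read_lock_sched_notrace() is sufficient for KMSAN is the open question
above; the model only shows the shape of the problem.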