From: David Hildenbrand
Organization: Red Hat
To: Andrew Morton, Charan Teja Kalla
Cc: osalvador@suse.de, dan.j.williams@intel.com, vbabka@suse.cz, mgorman@techsingularity.net, aneesh.kumar@linux.ibm.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
Date: Mon, 16 Oct 2023 10:23:16 +0200
References: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com> <20231014152532.5f3dca7838c2567a1a9ca9c6@linux-foundation.org>
In-Reply-To: <20231014152532.5f3dca7838c2567a1a9ca9c6@linux-foundation.org>

On 15.10.23 00:25, Andrew Morton wrote:
> On Fri, 13 Oct 2023 18:34:27 +0530 Charan Teja Kalla wrote:
>
>> The below race is observed on a PFN which falls into the device memory
>> region with the system memory configuration where PFN's are such that
>> [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL].
>> Since normal zone start and
>> end pfn contains the device memory PFN's as well, the compaction
>> triggered will try on the device memory PFN's too though they end up in
>> NOP(because pfn_to_online_page() returns NULL for ZONE_DEVICE memory
>> sections). When from other core, the section mappings are being removed
>> for the ZONE_DEVICE region, that the PFN in question belongs to,
>> on which compaction is currently being operated is resulting into the
>> kernel crash with CONFIG_SPARSEMEM_VMEMAP enabled.
>
> Seems this bug is four years old, yes?  It must be quite hard to hit.

From the description, it's not quite clear to me if this was actually
hit -- usually people include the dmesg bug/crash info.

>
> When people review this, please offer opinions on whether a fix should
> be backported into -stable kernels, thanks.
>
>> compact_zone()                  memunmap_page
>> -------------                   ---------------
>> __pageblock_pfn_to_page
>>    ......
>>  (a)pfn_valid():
>>      valid_section()//return true
>>                               (b)__remove_pages()->
>>                                   sparse_remove_section()->
>>                                     section_deactivate():
>>                                     [Free the array ms->usage and set
>>                                      ms->usage = NULL]
>>      pfn_section_valid()
>>      [Access ms->usage which
>>      is NULL]
>>
>> NOTE: From the above it can be said that the race is reduced to between
>> the pfn_valid()/pfn_section_valid() and the section deactivate with
>> SPARSEMEM_VMEMAP enabled.
>>
>> The commit b943f045a9af("mm/sparse: fix kernel crash with
>> pfn_section_valid check") tried to address the same problem by clearing
>> the SECTION_HAS_MEM_MAP with the expectation of valid_section() returns
>> false thus ms->usage is not accessed.
>>
>> Fix this issue by the below steps:
>> a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage.
>> b) RCU protected read side critical section will either return NULL when
>>    SECTION_HAS_MEM_MAP is cleared or can successfully access ->usage.
>> c) Synchronize the rcu on the write side and free the ->usage.
>>    No attempt will be made to access ->usage after this as the
>>    SECTION_HAS_MEM_MAP is cleared thus valid_section() return false.

This affects any kind of memory hotunplug. When hotunplugging memory we
will end up calling synchronize_rcu() for each and every memory section,
which sounds extremely wasteful.

Can't we find a way to kfree_rcu() that thing and read/write the pointer
using READ_ONCE/WRITE_ONCE instead?

-- 
Cheers,

David / dhildenb
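
For illustration, the kfree_rcu() plus READ_ONCE()/WRITE_ONCE() idea could
look roughly like the untested userspace sketch below. It is not kernel
code: every name with a _model suffix is hypothetical, the single-threaded
deferred-free list only stands in for an RCU grace period, and the 64-bit
subsection_map is an assumption chosen to keep the example small. In the
kernel the reader would additionally run under rcu_read_lock().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

#define SUBSECTIONS_MODEL 64 /* assumption: one bit per subsection */

struct mem_section_usage {
	struct mem_section_usage *next_deferred; /* stand-in for struct rcu_head */
	unsigned long long subsection_map;
};

struct mem_section {
	struct mem_section_usage *usage;
};

/* Objects waiting for the (modelled) grace period to end. */
static struct mem_section_usage *deferred_list;

/* Models kfree_rcu(): queue the object instead of freeing it right away. */
static void kfree_rcu_model(struct mem_section_usage *u)
{
	u->next_deferred = deferred_list;
	deferred_list = u;
}

/* Models the end of a grace period; returns how many objects were freed. */
static int grace_period_elapsed_model(void)
{
	int freed = 0;

	while (deferred_list) {
		struct mem_section_usage *u = deferred_list;

		deferred_list = u->next_deferred;
		free(u);
		freed++;
	}
	return freed;
}

static struct mem_section_usage *usage_alloc_model(unsigned long long map)
{
	struct mem_section_usage *u = calloc(1, sizeof(*u));

	if (u)
		u->subsection_map = map;
	return u;
}

/* Reader: load the pointer exactly once, then test it before use. */
static bool pfn_section_valid_model(struct mem_section *ms, unsigned int idx)
{
	struct mem_section_usage *usage = READ_ONCE(ms->usage);

	assert(idx < SUBSECTIONS_MODEL);
	if (!usage)
		return false;
	return (usage->subsection_map >> idx) & 1ULL;
}

/* Writer: unpublish the pointer first, then defer the free. */
static void section_deactivate_model(struct mem_section *ms)
{
	struct mem_section_usage *old = ms->usage;

	WRITE_ONCE(ms->usage, NULL);
	if (old)
		kfree_rcu_model(old);
}
```

The point of the sketch is that the writer never blocks: it only clears
the pointer and queues the old object, so removing many sections would
not cost one synchronize_rcu() per section.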