Date: Tue, 19 Jul 2022 17:37:14 +0200
From: Michal Hocko
To: David Hildenbrand
Cc: Charan Teja Kalla , akpm@linux-foundation.org,
 pasha.tatashin@soleen.com, sjpark@amazon.de, sieberf@amazon.com,
 shakeelb@google.com, dhowells@redhat.com, willy@infradead.org,
 vbabka@suse.cz, minchan@kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, "iamjoonsoo.kim@lge.com"
Subject: Re: [PATCH] mm: fix use-after free of page_ext after race with memory-offline
References: <1657810063-28938-1-git-send-email-quic_charante@quicinc.com>
 <6fa6b7aa-731e-891c-3efb-a03d6a700efa@redhat.com>
In-Reply-To: <6fa6b7aa-731e-891c-3efb-a03d6a700efa@redhat.com>

On Tue 19-07-22 17:19:34, David Hildenbrand wrote:
> On 18.07.22 16:54, Michal Hocko wrote:
> > On Mon 18-07-22 19:28:13, Charan Teja Kalla wrote:
[...]
> >>> 3) Change the design where the page_ext is valid as long as the struct
> >>> page is alive.
> >>
> >> :/ Doesn't spark joy.
> >
> > I would be wondering why. It should only take to move the callback to
> > happen at hotremove. So it shouldn't be very involved of a change. I can
> > imagine somebody would be relying on releasing resources when offlining
> > memory but is that really the case?
>
> Various reasons:
>
> 1) There was a discussion in the past to eventually also use rcu
> protection for handling pfn_to_online_page(). So doing it cleanly here
> is certainly an improvement.

Call me skeptical on that.

> 2) I really dislike having to scatter section online checks all over the
> place in page ext code. Once there is a difference between active vs.
> stale page ext data things get a bit messy and error prone. This is
> already ugly enough in our generic memmap handling code IMHO.

They should represent a free page in any case, so even if they are stale
they shouldn't really be dangerous, right?

> 3) Having on-demand allocations, such as KASAN or page ext from the
> memory online notifier is at least currently cleaner, because we don't
> have to handle each and every subsystem that hooks into that during the
> core memory hotadd/remove phase, which primarily only sets up the
> vmemmap, direct map and memory block devices.

Can't this hook into __add_pages(), which is the real implementation of
the arch-independent way to allocate the vmemmap? Or at the sparsemem
level, because we do not (and very likely won't) support memory hotplug
on any other memory model.

> Personally, I think what we have in this patch is quite nice and clean.
> But I won't object if it can be similarly done in a clean way from
> hot(un)plug code.

Well, if the scheme can be done without a synchronize_rcu() for each
section, which can backfire, and if it doesn't add too much complexity
to achieve that, then sure, I won't object. I just do not get why
page_ext should have a different allocation lifetime than a real page.
Quite confusing if you ask me.

-- 
Michal Hocko
SUSE Labs