From mboxrd@z Thu Jan  1 00:00:00 1970
Sender: owner-linux-mm@kvack.org
Date: Wed, 20 Jul 2022 16:13:19 +0530
Subject: Re: [PATCH] mm: fix use-after free of page_ext after race with memory-offline
To: Michal Hocko, Pavan Kondeti
CC: "iamjoonsoo.kim@lge.com"
References: <1657810063-28938-1-git-send-email-quic_charante@quicinc.com> <20220720082112.GA14437@hu-pkondeti-hyd.qualcomm.com>
From: Charan Teja Kalla
Content-Type: text/plain; charset="UTF-8"

Thanks Michal & Pavan,

On 7/20/2022 2:40 PM, Michal Hocko wrote:
>>>> Thanks! The most important part is how the exclusion is actually achieved
>>>> because that is not really clear at first sight
>>>>
>>>> CPU1                                 CPU2
>>>> lookup_page_ext(PageA)               offlining
>>>>                                        offline_page_ext
>>>>                                          __free_page_ext(addrA)
>>>>                                            get_entry(addrA)
>>>>                                          ms->page_ext = NULL
>>>>                                          synchronize_rcu()
>>>>                                          free_page_ext
>>>>                                            free_pages_exact (now addrA is unusable)
>>>>
>>>>   rcu_read_lock()
>>>>   entryA = get_entry(addrA)
>>>>     base + page_ext_size * index # an address not invalidated by the freeing path
>>>>   do_something(entryA)
>>>>   rcu_read_unlock()
>>>>
>>>> CPU1 never checks ms->page_ext so it cannot bail out early when the
>>>> thing is torn down. Or maybe I am missing something. I am not familiar
>>>> with page_ext much.
>>>
>>> Thanks a lot for catching this Michal. You are correct that the proposed
>>> code from me is still racy. I will correct this along with the proper
>>> commit message in the next version of this patch.
>>>
>> Trying to understand your discussion with Michal. What part is still racy? We
>> do check for mem_section::page_ext and bail out early from lookup_page_ext(),
>> no?
>>
>> Also to make this scheme explicit, we can annotate page_ext member with __rcu
>> and use rcu_assign_pointer() on the writer side.

Annotating with __rcu requires all the reads and writes of ms->page_ext
to go through rcu_[access|assign]_pointer, which is a big patch. I think
READ_ONCE and WRITE_ONCE, mentioned by Michal below, should do the job.

>>
>> struct page_ext *lookup_page_ext(const struct page *page)
>> {
>>         unsigned long pfn = page_to_pfn(page);
>>         struct mem_section *section = __pfn_to_section(pfn);
>>         /*
>>          * The sanity checks the page allocator does upon freeing a
>>          * page can reach here before the page_ext arrays are
>>          * allocated when feeding a range of pages to the allocator
>>          * for the first time during bootup or memory hotplug.
>>          */
>>         if (!section->page_ext)
>>                 return NULL;
>>         return get_entry(section->page_ext, pfn);
>> }
> You are right. I was looking at the wrong implementation and misread
> ifdef vs. ifndef CONFIG_SPARSEMEM. My bad.
>

There is still a small race window between setting ms->page_ext to NULL
and its access, even under CONFIG_SPARSEMEM. In the above mentioned example:

 CPU1                                           CPU2
 rcu_read_lock()
 lookup_page_ext(PageA):                        offlining
                                                  offline_page_ext
                                                    __free_page_ext(addrA)
                                                      get_entry(addrA)
   if (!section->page_ext)
      turns out to be false.
                                                    ms->page_ext = NULL

   addrA = get_entry(base=section->page_ext):
     base + page_ext_size * index;
     **Since base is NULL here, the caller
     can still dereference an invalid
     pointer address.**

                                                    synchronize_rcu()
                                                    free_page_ext
                                                      free_pages_exact (now )

> Memory hotplug is not supported outside of CONFIG_SPARSEMEM so the
> scheme should really work. I would use READ_ONCE for ms->page_ext and
> WRITE_ONCE on the initialization side.

Yes, I should be using the READ_ONCE() and WRITE_ONCE() here.

Thanks,
Charan
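
To make the window above concrete, here is a rough userspace sketch of the
"read the base pointer once" pattern that READ_ONCE()/WRITE_ONCE() would give.
It is not the actual patch: the cut-down struct page_ext and struct
mem_section, the fixed PAGE_EXT_SIZE, the volatile-cast macros, and the names
lookup_page_ext_once(), clear_page_ext() and self_check() are all assumptions
made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros (assumption: the real ones
 * do more checking; __typeof__ is a GCC/Clang extension). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical, cut-down versions of the kernel structures. */
struct page_ext { unsigned long flags; };
struct mem_section { struct page_ext *page_ext; };

/* Assumed fixed entry size; the kernel computes page_ext_size at boot. */
#define PAGE_EXT_SIZE sizeof(struct page_ext)

static struct page_ext *get_entry(void *base, unsigned long index)
{
	return (struct page_ext *)((char *)base + PAGE_EXT_SIZE * index);
}

/* Reader: load the base pointer exactly once, then check and index the
 * same snapshot, so a concurrent store of NULL cannot slip in between
 * the NULL check and the address computation. */
static struct page_ext *lookup_page_ext_once(struct mem_section *section,
					     unsigned long index)
{
	struct page_ext *base = READ_ONCE(section->page_ext);

	if (!base)
		return NULL;
	return get_entry(base, index);
}

/* Writer: publish NULL; in the scheme discussed above the array is only
 * freed after a later synchronize_rcu(). */
static void clear_page_ext(struct mem_section *section)
{
	WRITE_ONCE(section->page_ext, NULL);
}

/* Single-threaded check of the two outcomes a reader can observe. */
static void self_check(void)
{
	struct page_ext entries[4] = {{0}};
	struct mem_section ms = { .page_ext = entries };

	assert(lookup_page_ext_once(&ms, 2) == &entries[2]);
	clear_page_ext(&ms);
	assert(lookup_page_ext_once(&ms, 2) == NULL);
}
```

With a single load, the reader sees either the old base or NULL, never a mix
of the two. The lifetime of the old array still depends on the reader being
inside rcu_read_lock() while the writer waits in synchronize_rcu() before
freeing.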