From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B589C433F5 for ; Wed, 9 Mar 2022 21:55:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DB8EA8D0003; Wed, 9 Mar 2022 16:55:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D68308D0001; Wed, 9 Mar 2022 16:55:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C30128D0003; Wed, 9 Mar 2022 16:55:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.26]) by kanga.kvack.org (Postfix) with ESMTP id B5DE88D0001 for ; Wed, 9 Mar 2022 16:55:43 -0500 (EST) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 8AF6D2071C for ; Wed, 9 Mar 2022 21:55:43 +0000 (UTC) X-FDA: 79226205366.11.4B47496 Received: from mail-pj1-f50.google.com (mail-pj1-f50.google.com [209.85.216.50]) by imf30.hostedemail.com (Postfix) with ESMTP id 11EA580014 for ; Wed, 9 Mar 2022 21:55:42 +0000 (UTC) Received: by mail-pj1-f50.google.com with SMTP id m11-20020a17090a7f8b00b001beef6143a8so3496582pjl.4 for ; Wed, 09 Mar 2022 13:55:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=9fPg1ReAlI/OxffQjTQJbSAN97su7fVq3uGIZ7B9a6Y=; b=dfcw9olikZDzL0RqPH+JgDmeVPHx6RrEQgbK9G8tLLiJYTYE85J0XqT81GfAmLNWFl b9/bZ/TcOIfxVYS8OfhwgIOMAsiPjoWXhMrmIpZP72BFomKtuYs+x9l0dW6Kx5UJSNCz JQ5kvdiOEx/TBkgfi4X3qkrC08wFIEK+9wCR2r0/Db2rvYLO77FTE2WfcQ+v3svao1TL o1a8cBeHW82aobG4iLL25DXnLf2yl1vZ8GFkqCFz4JHYnPrlQCYqa5/y5WbmgOyVUfdv MiPwfKOOmtv9gHLAz6hoqJl3Awx2cWS/21hU9TXW/eu4K6QgXKnJGTxHBrNHJjwcdd11 YF/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=9fPg1ReAlI/OxffQjTQJbSAN97su7fVq3uGIZ7B9a6Y=; b=vkJT/8sc56mLg8zPE3r/fl0FC1fF9dZHUl+dizszx0Ttgv1Yl2GDt0W236vqUqgHq/ kv6iWYAFIr4KXLVUkeMtjlZGC6tW4ZA8eTVEQer+df7aF+I9sObCx23Ylh/1DBEv/d+z t3T3nKsmu99O8O8+wjpqzPTb+OkRBDMvrsM2L/QePwUx7A9GefXEARrURb2p8SRPVXoQ nyZugLKpidIkJ5XqzxpDDTulGbXsM9RNhY7uvqnrw7APa9AKER8GxmzN/jAkjB5wEU5+ xETXzIOqtMABT3rycgPhAoeL+Cf6fYy8Ls6D0309hbfP57sR1CnHjDrsTzKkYF3BR4pp Zgig== X-Gm-Message-State: AOAM5336EO+ZMDInH+IBkQDlCQaOXlFJlZqXnCdHdrJonLH4JW2fatpY 7I2IUefZgCuXg4T49yWogsYw6pc+mEZw9RakiqE= X-Google-Smtp-Source: ABdhPJw/uu83dqkESnkZ3J+RbKaZtxKKPVYcVYk7dSEOud/wim9hdUDZdD6zA1MdcVglmVbO0KXqa4XIAcnxXqcCSJ0= X-Received: by 2002:a17:90a:7181:b0:1bf:a024:ab61 with SMTP id i1-20020a17090a718100b001bfa024ab61mr1637893pjk.200.1646862942135; Wed, 09 Mar 2022 13:55:42 -0800 (PST) MIME-Version: 1.0 References: <20220309091449.2753904-1-naoya.horiguchi@linux.dev> In-Reply-To: <20220309091449.2753904-1-naoya.horiguchi@linux.dev> From: Yang Shi Date: Wed, 9 Mar 2022 13:55:30 -0800 Message-ID: Subject: Re: [PATCH v1] mm/hwpoison: set PageHWPoison after taking page lock in memory_failure_hugetlb() To: Naoya Horiguchi Cc: Linux MM , Andrew Morton , Mike Kravetz , Miaohe Lin , Naoya Horiguchi , Linux Kernel Mailing List Content-Type: text/plain; charset="UTF-8" X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 11EA580014 X-Stat-Signature: pycfoqdm3f7o59bqk96bw8jop8srpwi1 Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=dfcw9oli; spf=pass (imf30.hostedemail.com: domain of shy828301@gmail.com designates 209.85.216.50 as permitted sender) smtp.mailfrom=shy828301@gmail.com; dmarc=pass (policy=none) header.from=gmail.com X-Rspam-User: X-HE-Tag: 1646862942-241035 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Mar 9, 2022 at 1:15 AM Naoya Horiguchi wrote: > > From: Naoya Horiguchi > > There is a race condition between memory_failure_hugetlb() and hugetlb > free/demotion, which causes setting PageHWPoison flag on the wrong page > (which was a hugetlb when memory_failrue() was called, but was removed > or demoted when memory_failure_hugetlb() is called). This results in > killing wrong processes. So set PageHWPoison flag with holding page lock, > > Signed-off-by: Naoya Horiguchi > --- > mm/memory-failure.c | 27 ++++++++++++--------------- > 1 file changed, 12 insertions(+), 15 deletions(-) > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index ac6492e36978..fe25eee8f9d6 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1494,24 +1494,11 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) > int res; > unsigned long page_flags; > > - if (TestSetPageHWPoison(head)) { > - pr_err("Memory failure: %#lx: already hardware poisoned\n", > - pfn); > - res = -EHWPOISON; > - if (flags & MF_ACTION_REQUIRED) > - res = kill_accessing_process(current, page_to_pfn(head), flags); > - return res; > - } > - > - num_poisoned_pages_inc(); > - > if (!(flags & MF_COUNT_INCREASED)) { > res = get_hwpoison_page(p, flags); I'm not an expert of hugetlb, I may be wrong. I'm wondering how this could solve the race? Is the below race still possible? __get_hwpoison_page() head = compound_head(page) hugetlb demotion (1G --> 2M) get_hwpoison_huge_page(head, &hugetlb); Then the head may point to a 2M page, but the hwpoisoned subpage is not in that 2M range? > if (!res) { > lock_page(head); > if (hwpoison_filter(p)) { > - if (TestClearPageHWPoison(head)) > - num_poisoned_pages_dec(); > unlock_page(head); > return -EOPNOTSUPP; > } > @@ -1544,13 +1531,16 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) > page_flags = head->flags; > > if (hwpoison_filter(p)) { > - if (TestClearPageHWPoison(head)) > - num_poisoned_pages_dec(); > put_page(p); > res = -EOPNOTSUPP; > goto out; > } > > + if (TestSetPageHWPoison(head)) And I don't think "head" is still the head you expected if the race happened. I think we need to re-retrieve the head once the page refcount is bumped and locked. > + goto already_hwpoisoned; > + > + num_poisoned_pages_inc(); > + > /* > * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so > * simply disable it. In order to make it work properly, we need > @@ -1576,6 +1566,13 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) > out: > unlock_page(head); > return res; > +already_hwpoisoned: > + unlock_page(head); > + pr_err("Memory failure: %#lx: already hardware poisoned\n", pfn); > + res = -EHWPOISON; > + if (flags & MF_ACTION_REQUIRED) > + res = kill_accessing_process(current, page_to_pfn(head), flags); > + return res; > } > > static int memory_failure_dev_pagemap(unsigned long pfn, int flags, > -- > 2.25.1 >