From: Muchun Song
Date: Wed, 21 Apr 2021 16:15:00 +0800
Subject: Re: [External] Re: [PATCH] mm: hugetlb: fix a race between memory-failure/soft_offline and gather_surplus_pages
To: Michal Hocko
Cc: Mike Kravetz, Andrew Morton, Oscar Salvador, Linux Memory Management List, LKML, Naoya Horiguchi
References: <20210421060259.67554-1-songmuchun@bytedance.com>

On Wed, Apr 21, 2021 at 4:03 PM Michal Hocko wrote:
>
> [Cc Naoya]
>
> On Wed 21-04-21 14:02:59, Muchun Song wrote:
> > The possible bad scenario:
> >
> >   CPU0:                           CPU1:
> >
> >                                   gather_surplus_pages()
> >                                     page = alloc_surplus_huge_page()
> >   memory_failure_hugetlb()
> >     get_hwpoison_page(page)
> >       __get_hwpoison_page(page)
> >         get_page_unless_zero(page)
> >                                     zero = put_page_testzero(page)
> >                                     VM_BUG_ON_PAGE(!zero, page)
> >                                     enqueue_huge_page(h, page)
> >     put_page(page)
> >
> > The refcount can possibly be increased by memory-failure or soft_offline
> > handlers, we can trigger VM_BUG_ON_PAGE and wrongly add the page to the
> > hugetlb pool list.
>
> The hwpoison side of this looks really suspicious to me. It shouldn't
> really touch the reference count of hugetlb pages without being very
> careful (and having hugetlb_lock held). What would happen if the
> reference count was increased after the page has been enqueued into the
> pool? This can just blow up later.

If the page has been enqueued into the pool, it can then be allocated
to another user. The page reference count is reset to 1 in
dequeue_huge_page_node_exact(). The memory-failure handler's put_page()
then frees the page, which is wrong because the page now has another
user.
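
To make that sequence concrete, here is a minimal userspace sketch of the
refcount transitions. It is only a model, not kernel code: struct
model_page, the model_* helpers and replay_race() are hypothetical names
made up for illustration, and the "patched" flag stands in for the
likely(put_page_testzero(page)) check from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not kernel code. */
struct model_page {
	int refcount;	/* models the page reference count */
	bool in_pool;	/* true while on the hugetlb free list */
	bool has_user;	/* true once another user has dequeued the page */
	bool freed;	/* set when the last reference is dropped */
};

/* Models get_page_unless_zero(): take a reference unless the count is 0. */
static bool model_get_unless_zero(struct model_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Models put_page_testzero(): drop a reference, report whether it hit 0. */
static bool model_put_testzero(struct model_page *p)
{
	assert(p->refcount > 0);
	return --p->refcount == 0;
}

/* Models put_page(): the page is freed when the last reference goes away. */
static void model_put_page(struct model_page *p)
{
	if (model_put_testzero(p))
		p->freed = true;
}

/*
 * Replays the interleaving from the scenario above on a single thread.
 * Returns true if the page ends up freed while another user still owns it.
 */
static bool replay_race(bool patched)
{
	/* Fresh surplus page: one reference held by the allocator. */
	struct model_page page = { .refcount = 1 };
	bool pinned, zero;

	/* hwpoison side: get_page_unless_zero() raises the count 1 -> 2. */
	pinned = model_get_unless_zero(&page);

	/* gather_surplus_pages(): the drop goes 2 -> 1, so zero is false. */
	zero = model_put_testzero(&page);

	/* Old code enqueues regardless; the patched check enqueues only on zero. */
	if (!patched || zero)
		page.in_pool = true;

	/* A later dequeue hands the page to a user and resets the count to 1. */
	if (page.in_pool) {
		page.in_pool = false;
		page.refcount = 1;
		page.has_user = true;
	}

	/* hwpoison side drops its pin: 1 -> 0, so the page is freed. */
	if (pinned)
		model_put_page(&page);

	return page.freed && page.has_user;
}
```

In this model replay_race(false) returns true (the page is freed under its
new user), while replay_race(true) returns false because the page is never
enqueued with the extra pin outstanding. What happens to the page after the
final put in the patched case is outside this model.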

>
> > Signed-off-by: Muchun Song
> > ---
> >  mm/hugetlb.c | 11 ++++-------
> >  1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 3476aa06da70..6c96332db34b 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -2145,17 +2145,14 @@ static int gather_surplus_pages(struct hstate *h, long delta)
> >
> >  	/* Free the needed pages to the hugetlb pool */
> >  	list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
> > -		int zeroed;
> > -
> >  		if ((--needed) < 0)
> >  			break;
> >  		/*
> > -		 * This page is now managed by the hugetlb allocator and has
> > -		 * no users -- drop the buddy allocator's reference.
> > +		 * The refcount can possibly be increased by memory-failure or
> > +		 * soft_offline handlers.
> >  		 */
> > -		zeroed = put_page_testzero(page);
> > -		VM_BUG_ON_PAGE(!zeroed, page);
> > -		enqueue_huge_page(h, page);
> > +		if (likely(put_page_testzero(page)))
> > +			enqueue_huge_page(h, page);
> >  	}
> >  free:
> >  	spin_unlock_irq(&hugetlb_lock);
> > --
> > 2.11.0
> >
>
> --
> Michal Hocko
> SUSE Labs