References: <20210114103515.12955-1-songmuchun@bytedance.com>
 <20210114103515.12955-4-songmuchun@bytedance.com>
 <20210114132036.GA27777@dhcp22.suse.cz>
In-Reply-To: <20210114132036.GA27777@dhcp22.suse.cz>
From: Muchun Song
Date: Thu, 14 Jan 2021 21:47:36 +0800
Message-ID:
Subject: Re: [External] Re: [PATCH v5 3/5] mm: hugetlb: fix a race between freeing and dissolving the page
To: Michal Hocko
Cc: Mike Kravetz, Andrew Morton, Naoya Horiguchi, Andi Kleen, Linux Memory Management List, LKML, linux-stable
Content-Type: text/plain; charset="UTF-8"

On Thu, Jan 14, 2021 at 9:20 PM Michal Hocko wrote:
>
> On Thu 14-01-21 18:35:13, Muchun Song wrote:
> > There is a race condition between __free_huge_page()
> > and dissolve_free_huge_page().
> >
> > CPU0:                         CPU1:
> >
> > // page_count(page) == 1
> > put_page(page)
> >   __free_huge_page(page)
> >                               dissolve_free_huge_page(page)
> >                                 spin_lock(&hugetlb_lock)
> >                                 // PageHuge(page) && !page_count(page)
> >                                 update_and_free_page(page)
> >                                 // page is freed to the buddy
> >                                 spin_unlock(&hugetlb_lock)
> >     spin_lock(&hugetlb_lock)
> >     clear_page_huge_active(page)
> >     enqueue_huge_page(page)
> >     // It is wrong, the page is already freed
> >     spin_unlock(&hugetlb_lock)
> >
> > The race window is between put_page() and dissolve_free_huge_page().
> >
> > We should make sure that the page is already on the free list
> > when it is dissolved.
>
> Please describe user visible effects as suggested in
> http://lkml.kernel.org/r/20210113093134.GU22493@dhcp22.suse.cz

Sorry, I forgot to update this.

> > Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> > Signed-off-by: Muchun Song
> > Reviewed-by: Mike Kravetz
> > Cc: stable@vger.kernel.org
> > ---
> >  mm/hugetlb.c | 41 +++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 41 insertions(+)
> [...]
> > +retry:
> >      /* Not to disrupt normal path by vainly holding hugetlb_lock */
> >      if (!PageHuge(page))
> >          return 0;
> > @@ -1770,6 +1789,28 @@ int dissolve_free_huge_page(struct page *page)
> >          int nid = page_to_nid(head);
> >          if (h->free_huge_pages - h->resv_huge_pages == 0)
> >              goto out;
> > +
> > +        /*
> > +         * We should make sure that the page is already on the free list
> > +         * when it is dissolved.
> > +         */
> > +        if (unlikely(!PageHugeFreed(head))) {
> > +            spin_unlock(&hugetlb_lock);
> > +
> > +            /*
> > +             * Theoretically, we should return -EBUSY when we
> > +             * encounter this race. In fact, we have a chance
> > +             * to successfully dissolve the page if we do a
> > +             * retry. Because the race window is quite small.
> > +             * If we seize this opportunity, it is an optimization
> > +             * for increasing the success rate of dissolving page.
> > +             */
> > +            while (PageHeadHuge(head) && !PageHugeFreed(head))
> > +                cond_resched();
>
> Sorry, I should have raised that when replying to the previous version
> already but we have focused more on other things. Is there any special
> reason that you didn't simply
>     if (!PageHugeFreed(head)) {
>         spin_unlock(&hugetlb_lock);
>         cond_resched();
>         goto retry;
>     }
>
> This would be less code and a very slight advantage would be that the
> waiter might get blocked on the spin lock while the concurrent freeing
> is happening. But maybe you wanted to avoid exactly this contention?
> Please put your thinking into the changelog.

I want to avoid the lock contention. I will add this reason
to the changelog. Thanks.

> > +
> > +            goto retry;
> > +        }
> > +
> >          /*
> >           * Move PageHWPoison flag from head page to the raw error page,
> >           * which makes any subpages rather than the error page reusable.
> > --
> > 2.11.0
>
> --
> Michal Hocko
> SUSE Labs
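
To make the contention argument concrete, here is a small user-space
model of the two waiting strategies discussed above. It is only a
sketch and not kernel code: struct toy_page, toy_lock(), toy_unlock(),
toy_yield() and both dissolve_*() helpers are hypothetical names made
up for illustration. The "freeing CPU" is simulated by a counter that
sets the freed flag after a fixed number of yields, and the lock is
modelled only as an acquisition counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a huge page; every name here is hypothetical. */
struct toy_page {
    bool huge;        /* models PageHuge() / PageHeadHuge() */
    bool freed;       /* models PageHugeFreed(): page is on the free list */
    int yields_left;  /* yields until the simulated freeing side finishes */
    int lock_taken;   /* how many times the model lock was acquired */
};

/* The lock is not a real lock; it only counts acquisitions. */
static void toy_lock(struct toy_page *p)   { p->lock_taken++; }
static void toy_unlock(struct toy_page *p) { (void)p; }

/* Models cond_resched(): lets the simulated freeing side make progress. */
static void toy_yield(struct toy_page *p)
{
    if (p->yields_left > 0)
        p->yields_left--;
    if (p->yields_left == 0)
        p->freed = true;
}

/* Strategy in the patch: poll the flag without the lock, then retry. */
static int dissolve_poll_unlocked(struct toy_page *p)
{
    assert(p);
retry:
    if (!p->huge)
        return 0;
    toy_lock(p);
    if (!p->freed) {
        toy_unlock(p);
        while (p->huge && !p->freed)
            toy_yield(p);
        goto retry;
    }
    p->huge = false;  /* "dissolve" the page */
    toy_unlock(p);
    return 0;
}

/* Simpler strategy from the review: yield once, then retake the lock. */
static int dissolve_retry_locked(struct toy_page *p)
{
    assert(p);
retry:
    if (!p->huge)
        return 0;
    toy_lock(p);
    if (!p->freed) {
        toy_unlock(p);
        toy_yield(p);
        goto retry;
    }
    p->huge = false;  /* "dissolve" the page */
    toy_unlock(p);
    return 0;
}
```

In this model, if the freeing side needs N yields to finish, the
polling variant takes the lock twice regardless of N, while the
simpler variant takes it N + 1 times. The model ignores real
scheduling and memory ordering, so it only illustrates the lock
traffic, not the actual cost on a live system.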