From: Liang Li
Date: Mon, 11 Jan 2021 12:41:40 +0800
Subject: Re: [PATCH 4/6] hugetlb: avoid allocation failed when page reporting is on going
To: Alexander Duyck
Cc: Mel Gorman, Andrew Morton, Andrea Arcangeli, Dan Williams, "Michael S. Tsirkin", David Hildenbrand, Jason Wang, Dave Hansen, Michal Hocko, Liang Li, Mike Kravetz, linux-mm, LKML, virtualization@lists.linux-foundation.org
References: <20210106035027.GA1160@open-light-1.localdomain>

> > > Please don't use this email address for me anymore. Either use
> > > alexander.duyck@gmail.com or alexanderduyck@fb.com. I am getting
> > > bounces when I reply to this thread because of the old address.
> >
> > No problem.
> >
> > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > > index eb533995cb49..0fccd5f96954 100644
> > > > --- a/mm/hugetlb.c
> > > > +++ b/mm/hugetlb.c
> > > > @@ -2320,6 +2320,12 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
> > > >                 goto out_uncharge_cgroup_reservation;
> > > >
> > > >         spin_lock(&hugetlb_lock);
> > > > +       while (h->free_huge_pages <= 1 && h->isolated_huge_pages) {
> > > > +               spin_unlock(&hugetlb_lock);
> > > > +               mutex_lock(&h->mtx_prezero);
> > > > +               mutex_unlock(&h->mtx_prezero);
> > > > +               spin_lock(&hugetlb_lock);
> > > > +       }
> > >
> > > This seems like a bad idea. It kind of defeats the whole point of
> > > doing the page zeroing outside of the hugetlb_lock. Also it is
> > > operating on the assumption that the only way you might get a page is
> > > from the page zeroing logic.
> > >
> > > With the page reporting code we wouldn't drop the count to zero. We
> > > had checks that were going through and monitoring the watermarks and
> > > if we started to hit the low watermark we would stop page reporting
> > > and just assume there aren't enough pages to report.
> > > You might need to look at doing something similar here so that you can
> > > avoid colliding with the allocator.
> >
> > For hugetlb, things are a little different. Just as Mike points out:
> > "On some systems, hugetlb pages are a precious resource and
> > the sysadmin carefully configures the number needed by
> > applications. Removing a hugetlb page (even for a very short
> > period of time) could cause serious application failure."
> >
> > Just keeping some pages in the freelist is not enough to prevent that from
> > happening, because these pages may be allocated while zeroing is
> > ongoing, and the application may still run into a situation where no free
> > pages are available.
>
> I get what you are saying. However I don't know if it is acceptable
> for the allocating thread to be put to sleep in this situation. There
> are two scenarios where I can see this being problematic.
>
> One is a setup where you put the page allocator to sleep and while it
> is sleeping another thread is then freeing a page and your thread
> cannot respond to that newly freed page and is stuck waiting on the
> zeroed page.
>
> The second issue is that users may want a different option of just
> breaking up the request into smaller pages rather than waiting on the
> page zeroing, or to do something else while waiting on the page. So
> instead of sitting on the request and waiting it might make more sense
> to return an error pointer like EAGAIN or EBUSY to indicate that there
> is a page there, but it is momentarily tied up.

It seems that returning EAGAIN or EBUSY would still change the
application's behavior; I am not sure whether that is acceptable.

Thanks
Liang
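
For illustration only, the two alternatives raised in this thread (stopping
isolation at a low watermark, and failing fast instead of sleeping) can be
modelled in a few lines of userspace C. This is a sketch under stated
assumptions, not code from the patch series: struct pool, pool_try_isolate(),
pool_putback() and pool_alloc() are hypothetical names, locking is left out,
and a plain negative errno stands in for the error pointer a kernel
allocator would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of a huge page pool (not struct hstate). */
struct pool {
	unsigned long free_pages;     /* pages currently on the free list */
	unsigned long isolated_pages; /* pages taken off the list for zeroing */
};

/*
 * Watermark check: isolate a page for zeroing only while more than
 * low_watermark pages are free, so allocators are not starved.
 */
static bool pool_try_isolate(struct pool *p, unsigned long low_watermark)
{
	if (p->free_pages <= low_watermark)
		return false;
	p->free_pages--;
	p->isolated_pages++;
	return true;
}

/* Zeroing has finished: put the isolated page back on the free list. */
static void pool_putback(struct pool *p)
{
	assert(p->isolated_pages > 0);
	p->isolated_pages--;
	p->free_pages++;
}

/*
 * Non-blocking allocation: 0 on success, -EAGAIN when the only remaining
 * pages are momentarily isolated, -ENOMEM when the pool is truly empty.
 */
static int pool_alloc(struct pool *p)
{
	if (p->free_pages > 0) {
		p->free_pages--;
		return 0;
	}
	return p->isolated_pages ? -EAGAIN : -ENOMEM;
}
```

In this model a caller that sees -EAGAIN is free to retry, fall back to
smaller pages, or do other work. Whether such a change in return value is
acceptable to existing hugetlb users is exactly the open question above.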