From: Muchun Song
Date: Tue, 24 Nov 2020 20:45:30 +0800
Subject: Re: [External] Re: [PATCH v6 09/16] mm/hugetlb: Defer freeing of HugeTLB pages
To: Michal Hocko
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com,
    bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com,
    luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk, Andrew Morton,
    paulmck@kernel.org, mchehab+huawei@kernel.org,
    pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com,
    anshuman.khandual@arm.com, jroedel@suse.de, Mina Almasry, David Rientjes,
    Matthew Wilcox, Oscar Salvador, "Song Bao Hua (Barry Song)",
    Xiongchun duan, linux-doc@vger.kernel.org, LKML,
    Linux Memory Management List, linux-fsdevel

On Tue, Nov 24, 2020 at 7:51 PM Michal Hocko wrote:
>
> On Tue 24-11-20 17:52:52, Muchun Song wrote:
> > In the subsequent patch, we will allocate the vmemmap pages when free
> > HugeTLB pages. But update_and_free_page() is called from a non-task
> > context(and hold hugetlb_lock), so we can defer the actual freeing in
> > a workqueue to prevent use GFP_ATOMIC to allocate the vmemmap pages.
>
> This has been brought up earlier without any satisfying answer. Do we
> really have bother with the freeing from the pool and reconstructing the
> vmemmap page tables? Do existing usecases really require such a dynamic
> behavior?
> In other words, wouldn't it be much simpler to allow to use

If we do not allow this behavior, there is no way to free a HugeTLB page
back to the buddy allocator. When do we need this? On our servers, we
allocate a lot of HugeTLB pages for SPDK or virtualization. Sometimes we
want to debug an issue and need to apt install some debug tools, but if
the host has little free memory, the install can fail for lack of memory.
In that case, we can free some HugeTLB pages to the buddy allocator in
order to continue debugging. So maybe we need this.

> hugetlb pages with sparse vmemmaps only for the boot time reservations
> and never allow them to be freed back to the allocator. This is pretty
> restrictive, no question about that, but it would drop quite some code

Yeah, if we do not allow freeing HugeTLB pages to the buddy allocator, we
can drop some code. But I think it would only drop this patch and the
next one, which does not seem like a lot. And if we drop this patch, we
need to add other code to do the boot time reservations and to disallow
freeing HugeTLB pages. So why not support freeing now?

> AFAICS and the resulting series would be much easier to review really
> carefully. Additional enhancements can be done on top with specifics
> about usecases which require more flexibility.

The code that allocates vmemmap pages for a HugeTLB page is very similar
to the code that frees them; the two operations are opposites. I think
that anyone who understands the freeing path will also find the
allocating path easy to understand. If you look closely at this patch,
I believe it will be easy for you.

>
> > Signed-off-by: Muchun Song
> > ---
> >  mm/hugetlb.c         | 96 ++++++++++++++++++++++++++++++++++++++++++++++------
> >  mm/hugetlb_vmemmap.c |  5 ---
> >  mm/hugetlb_vmemmap.h | 10 ++++++
> >  3 files changed, 95 insertions(+), 16 deletions(-)
> --
> Michal Hocko
> SUSE Labs

--
Yours,
Muchun
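
Note for readers of this thread: the deferral idea in the quoted patch
description (queue the page while hugetlb_lock is held, do the
allocation-heavy freeing later from a context that may sleep) can be
sketched as a small user-space C model. This is only an illustration
under stated assumptions, not code from the patch: struct fake_hpage,
pool_lock(), defer_free(), drain_deferred() and queue_pages() are all
hypothetical names, a boolean stands in for hugetlb_lock, and malloc()
stands in for a non-atomic vmemmap allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a HugeTLB page; not the kernel's struct page. */
struct fake_hpage {
    int id;
    struct fake_hpage *next;
};

static bool pool_locked;             /* models "hugetlb_lock is held" */
static struct fake_hpage *deferred;  /* pages queued for later freeing */

static void pool_lock(void)   { assert(!pool_locked); pool_locked = true; }
static void pool_unlock(void) { assert(pool_locked);  pool_locked = false; }

/* Locked (atomic-like) side: only link the page in, never allocate. */
static void defer_free(struct fake_hpage *page)
{
    assert(page != NULL);
    assert(pool_locked);
    page->next = deferred;
    deferred = page;
}

/*
 * Worker side: detach the whole list under the lock, then do the
 * allocating work with the lock dropped. Returns the number of pages
 * released.
 */
static int drain_deferred(void)
{
    struct fake_hpage *list;
    int released = 0;

    pool_lock();
    list = deferred;
    deferred = NULL;
    pool_unlock();

    while (list != NULL) {
        struct fake_hpage *next = list->next;
        /* Stands in for a sleeping (non-atomic) vmemmap allocation. */
        void *vmemmap = malloc(4096);

        if (vmemmap == NULL) {
            /* Out of memory: requeue what is left and retry later. */
            pool_lock();
            while (list != NULL) {
                next = list->next;
                defer_free(list);
                list = next;
            }
            pool_unlock();
            break;
        }
        free(vmemmap);
        free(list);
        list = next;
        released++;
    }
    return released;
}

/* Helper for trying the model: queue n pages as a locked free path would. */
static int queue_pages(int n)
{
    int queued = 0;

    for (int i = 0; i < n; i++) {
        struct fake_hpage *page = malloc(sizeof(*page));

        if (page == NULL)
            break;
        page->id = i;
        pool_lock();
        defer_free(page);
        pool_unlock();
        queued++;
    }
    return queued;
}
```

Calling queue_pages() and then drain_deferred() from a small main()
exercises both halves. The point of the split is that nothing on the
locked side can fail for lack of memory, while the side that can fail
runs where retrying is cheap.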