From: Muchun Song
Date: Wed, 16 Sep 2020 10:45:04 +0800
Subject: Re: [External] Re: [RFC PATCH 00/24] mm/hugetlb: Free some vmemmap pages of hugetlb page
To: Matthew Wilcox
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com,
    bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com,
    luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk, Andrew Morton,
    paulmck@kernel.org, mchehab+huawei@kernel.org,
    pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com,
    anshuman.khandual@arm.com, jroedel@suse.de, almasrymina@google.com,
    David Rientjes, linux-doc@vger.kernel.org, LKML,
    Linux Memory Management List, linux-fsdevel@vger.kernel.org
In-Reply-To: <20200915181530.GL5449@casper.infradead.org>
References: <20200915125947.26204-1-songmuchun@bytedance.com>
    <20200915143241.GH5449@casper.infradead.org>
    <20200915154213.GI5449@casper.infradead.org>
    <20200915173948.GK5449@casper.infradead.org>
    <20200915181530.GL5449@casper.infradead.org>

On Wed, Sep 16, 2020 at 2:15 AM Matthew Wilcox wrote:
>
> On Wed, Sep 16, 2020 at 02:03:15AM +0800, Muchun Song wrote:
> > On Wed, Sep 16, 2020 at 1:39 AM Matthew Wilcox wrote:
> > >
> > > On Wed, Sep 16, 2020 at 01:32:46AM +0800, Muchun Song wrote:
> > > > On Tue, Sep 15, 2020 at 11:42 PM Matthew Wilcox wrote:
> > > > >
> > > > > On Tue, Sep 15, 2020 at 11:28:01PM +0800, Muchun Song wrote:
> > > > > > On Tue, Sep 15, 2020 at 10:32 PM Matthew Wilcox wrote:
> > > > > > >
> > > > > > > On Tue, Sep 15, 2020 at 08:59:23PM +0800, Muchun Song wrote:
> > > > > > > > This patch series will free some vmemmap pages(struct page structures)
> > > > > > > > associated with each hugetlbpage when preallocated to save memory.
> > > > > > >
> > > > > > > It would be lovely to be able to do this. Unfortunately, it's completely
> > > > > > > impossible right now. Consider, for example, get_user_pages() called
> > > > > > > on the fifth page of a hugetlb page.
> > > > > >
> > > > > > Can you elaborate on the problem? Thanks so much.
> > > > >
> > > > > OK, let's say you want to do a 2kB I/O to offset 0x5000 of a 2MB page
> > > > > on a 4kB base page system. Today, that results in a bio_vec containing
> > > > > {head+5, 0, 0x800}. Then we call page_to_phys() on that (head+5) struct
> > > > > page to get the physical address of the I/O, and we turn it into a struct
> > > > > scatterlist, which similarly has a reference to the page (head+5).
> > > >
> > > > As I know, in this case, the get_user_pages() will get a reference
> > > > to the head page (head+0) before returning such that the hugetlb
> > > > page can not be freed. Although get_user_pages() returns the
> > > > page (head+5) and the scatterlist has a reference to the page
> > > > (head+5), this patch series can handle this situation. I can not
> > > > figure out where the problem is. What I missed? Thanks.
> > >
> > > You freed pages 4-511 from the vmemmap so they could be used for
> > > something else. Page 5 isn't there any more. So if you return head+5,
> > > then when we complete the I/O, we'll look for the compound_head() of
> > > head+5 and we won't find head.
> >
> > We do not free pages 4-511 from the vmemmap. Actually, we only
> > free pages 128-511 from the vmemmap.
> >
> > The 512 struct pages occupy 8 pages of physical memory. We only
> > free 6 physical page frames to the buddy. But we will create a new
> > mapping just like below. The virtual address of the freed pages will
> > remap to the second page frame. So the second page frame is
> > reused.
>
> Oh! I get what you're doing now.
>
> For the vmemmap case, you free the last N-2 physical pages but map the
> second physical page multiple times. So for the 512 pages case, we
> see pages:
>
> abcdefgh | ijklmnop | ijklmnop | ijklmnop | ijklmnop | ijklmnop | ijklmnop ...

Yeah, great. You are right.

>
> Huh. I think that might work, except for PageHWPoison. I'll go back
> to your patch series and look at that some more.
>

The PageHWPoison also is considered in the patch series. Looking
forward to your suggestions. Thanks.

--
Yours,
Muchun
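
For readers following the arithmetic in the thread, here is a minimal user-space sketch of the remapping described above. It is not code from the patch series. It assumes 4kB base pages and a 64-byte struct page (which is what "512 struct pages occupy 8 pages" implies), and the names backing_frame(), subpage_index() and first_aliased_index() are hypothetical helpers invented for illustration.

```c
#include <assert.h>

/* Assumed sizes, not taken from the patch series itself. */
#define BASE_PAGE_SIZE          4096u                  /* 4kB base page */
#define STRUCT_PAGE_SIZE        64u                    /* assumed sizeof(struct page) */
#define HPAGE_SIZE              (2u * 1024u * 1024u)   /* 2MB hugetlb page */
#define KEPT_FRAMES             2u                     /* vmemmap frames left allocated */

#define STRUCT_PAGES_PER_FRAME  (BASE_PAGE_SIZE / STRUCT_PAGE_SIZE)          /* 64 */
#define NR_STRUCT_PAGES         (HPAGE_SIZE / BASE_PAGE_SIZE)                /* 512 */
#define NR_VMEMMAP_FRAMES       (NR_STRUCT_PAGES / STRUCT_PAGES_PER_FRAME)   /* 8 */

/* "The 512 struct pages occupy 8 pages of physical memory." */
_Static_assert(NR_VMEMMAP_FRAMES == 8, "expected 8 vmemmap frames per 2MB page");

/* Hypothetical helper: index of the base page that contains a byte offset. */
static unsigned int subpage_index(unsigned int offset)
{
    assert(offset < HPAGE_SIZE);
    return offset / BASE_PAGE_SIZE;
}

/*
 * Hypothetical helper: which of the kept physical vmemmap frames backs
 * the struct page at index idx after remapping. Frames 0 and 1 keep
 * their own contents; every later virtual frame aliases frame 1.
 */
static unsigned int backing_frame(unsigned int idx)
{
    assert(idx < NR_STRUCT_PAGES);
    unsigned int virt_frame = idx / STRUCT_PAGES_PER_FRAME;
    return virt_frame < KEPT_FRAMES ? virt_frame : KEPT_FRAMES - 1;
}

/* Hypothetical helper: first struct page index whose frame is an alias. */
static unsigned int first_aliased_index(void)
{
    return KEPT_FRAMES * STRUCT_PAGES_PER_FRAME;
}
```

Under these assumptions, backing_frame(subpage_index(0x5000)) is frame 0, so the head+5 struct page from the I/O example still sits in an untouched frame, while indexes 128-511 all resolve to the reused second frame (frame 1), matching the "abcdefgh | ijklmnop | ijklmnop ..." picture.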