Date: Mon, 21 Dec 2020 10:11:23 +0100
From: Oscar Salvador
To: Muchun Song
Cc: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
	dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
	viro@zeniv.linux.org.uk, akpm@linux-foundation.org, paulmck@kernel.org,
	mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
	rdunlap@infradead.org, oneukum@suse.com, anshuman.khandual@arm.com,
	jroedel@suse.de, almasrymina@google.com, rientjes@google.com,
	willy@infradead.org, mhocko@suse.com, song.bao.hua@hisilicon.com,
	david@redhat.com, naoya.horiguchi@nec.com, duanxiongchun@bytedance.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v10 03/11] mm/hugetlb: Free the vmemmap pages associated with each HugeTLB page
Message-ID: <20201221091123.GB14343@linux>
References: <20201217121303.13386-1-songmuchun@bytedance.com>
	<20201217121303.13386-4-songmuchun@bytedance.com>
In-Reply-To: <20201217121303.13386-4-songmuchun@bytedance.com>
On Thu, Dec 17, 2020 at 08:12:55PM +0800, Muchun Song wrote:
> +static inline void free_bootmem_page(struct page *page)
> +{
> +	unsigned long magic = (unsigned long)page->freelist;
> +
> +	/*
> +	 * The reserve_bootmem_region sets the reserved flag on bootmem
> +	 * pages.
> +	 */
> +	VM_WARN_ON(page_ref_count(page) != 2);
> +
> +	if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
> +		put_page_bootmem(page);
> +	else
> +		VM_WARN_ON(1);

Ideally, I think we want to see how the page looks since its state is not
what we expected, so maybe join both conditions and use dump_page()
(rough sketch further down).

> + * By removing redundant page structs for HugeTLB pages, memory can returned to
                                                                     ^^ be
> + * the buddy allocator for other uses.

[...]

> +void free_huge_page_vmemmap(struct hstate *h, struct page *head)
> +{
> +	unsigned long vmemmap_addr = (unsigned long)head;
> +
> +	if (!free_vmemmap_pages_per_hpage(h))
> +		return;
> +
> +	vmemmap_remap_free(vmemmap_addr + RESERVE_VMEMMAP_SIZE,
> +			   free_vmemmap_pages_size_per_hpage(h));

I am not sure what others think, but I would like to see vmemmap_remap_free
taking three arguments: start, end, and reuse addr, e.g.:

void free_huge_page_vmemmap(struct hstate *h, struct page *head)
{
	unsigned long vmemmap_addr = (unsigned long)head;
	unsigned long vmemmap_end, vmemmap_reuse;

	if (!free_vmemmap_pages_per_hpage(h))
		return;

	vmemmap_addr += RESERVE_VMEMMAP_SIZE;
	vmemmap_end = vmemmap_addr + free_vmemmap_pages_size_per_hpage(h);
	vmemmap_reuse = vmemmap_addr - PAGE_SIZE;

	vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse);
}

The reason for me to do this is to let the callers of vmemmap_remap_free
decide __what__ they want to remap. More on this below.

> +static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
> +			      unsigned long end,
> +			      struct vmemmap_remap_walk *walk)
> +{
> +	pte_t *pte;
> +
> +	pte = pte_offset_kernel(pmd, addr);
> +
> +	if (walk->reuse_addr == addr) {
> +		BUG_ON(pte_none(*pte));
> +		walk->reuse_page = pte_page(*pte++);
> +		addr += PAGE_SIZE;
> +	}

Although it is quite obvious, a brief comment here pointing out what we are
doing and that this is meant to be set only once would be nice.

> +static void vmemmap_remap_range(unsigned long start, unsigned long end,
> +				struct vmemmap_remap_walk *walk)
> +{
> +	unsigned long addr = start - PAGE_SIZE;
> +	unsigned long next;
> +	pgd_t *pgd;
> +
> +	VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
> +	VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
> +
> +	walk->reuse_page = NULL;
> +	walk->reuse_addr = addr;

With the change I suggested above, struct vmemmap_remap_walk should be
initialized at once in vmemmap_remap_free, so this should no longer be
needed.  (And btw, you do not need to set reuse_page to NULL: the way you
init the struct in vmemmap_remap_free makes sure to null any field you do
not explicitly set.)

> +static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
> +			      struct vmemmap_remap_walk *walk)
> +{
> +	/*
> +	 * Make the tail pages are mapped with read-only to catch
> +	 * illegal write operation to the tail pages.

"Remap the tail pages as read-only to ..."

> +	 */
> +	pgprot_t pgprot = PAGE_KERNEL_RO;
> +	pte_t entry = mk_pte(walk->reuse_page, pgprot);
> +	struct page *page;
> +
> +	page = pte_page(*pte);

	struct page *page = pte_page(*pte);

since you did the same for the other two.
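Coming back to the free_bootmem_page() comment above, in case it helps,
here is a rough and untested sketch of what I had in mind when joining
both conditions and using dump_page().  The reason string, and whether to
bail out early instead of still calling put_page_bootmem() in the
unexpected case, are just illustrative, not a requirement:

static inline void free_bootmem_page(struct page *page)
{
	unsigned long magic = (unsigned long)page->freelist;

	/*
	 * The reserve_bootmem_region sets the reserved flag on bootmem
	 * pages.
	 */
	if (page_ref_count(page) != 2 ||
	    (magic != SECTION_INFO && magic != MIX_SECTION_INFO)) {
		/* The page is not in the state we expect, so show it. */
		dump_page(page, "unexpected bootmem page");
		return;
	}

	put_page_bootmem(page);
}

Anyway, back to the vmemmap_remap_pte() hunk: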
> +	list_add(&page->lru, walk->vmemmap_pages);
> +
> +	set_pte_at(&init_mm, addr, pte, entry);
> +}
> +
> +/**
> + * vmemmap_remap_free - remap the vmemmap virtual address range
> + *                      [start, start + size) to the page which
> + *                      [start - PAGE_SIZE, start) is mapped,
> + *                      then free vmemmap pages.
> + * @start:	start address of the vmemmap virtual address range
> + * @size:	size of the vmemmap virtual address range
> + */
> +void vmemmap_remap_free(unsigned long start, unsigned long size)
> +{
> +	unsigned long end = start + size;
> +	LIST_HEAD(vmemmap_pages);
> +
> +	struct vmemmap_remap_walk walk = {
> +		.remap_pte	= vmemmap_remap_pte,
> +		.vmemmap_pages	= &vmemmap_pages,
> +	};

As stated above, this would become:

void vmemmap_remap_free(unsigned long start, unsigned long end,
			unsigned long reuse)
{
	LIST_HEAD(vmemmap_pages);
	struct vmemmap_remap_walk walk = {
		.reuse_addr	= reuse,
		.remap_pte	= vmemmap_remap_pte,
		.vmemmap_pages	= &vmemmap_pages,
	};

You might have had your reasons to do it this way, but this looks more
natural to me, with the plus that callers of vmemmap_remap_free can
specify what they want to remap.

-- 
Oscar Salvador
SUSE L3