From: Muchun Song
Date: Sat, 12 Feb 2022 18:08:38 +0800
Subject: Re: [PATCH v5 4/5] mm/sparse-vmemmap: improve memory savings for compound devmaps
To: Joao Martins
Cc: Linux Memory Management List, Dan Williams, Vishal Verma, Matthew Wilcox, Jason Gunthorpe, Jane Chu, Mike Kravetz, Andrew Morton, Jonathan Corbet, Christoph Hellwig, nvdimm@lists.linux.dev, Linux Doc Mailing List
References: <20220210193345.23628-1-joao.m.martins@oracle.com> <20220210193345.23628-5-joao.m.martins@oracle.com>

On Fri, Feb 11, 2022 at 8:37 PM Joao Martins wrote:
>
> On 2/11/22 07:54, Muchun Song wrote:
> > On Fri, Feb 11, 2022 at 3:34 AM Joao Martins wrote:
> > [...]
> >>  pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
> >> -                                       struct vmem_altmap *altmap)
> >> +                                       struct vmem_altmap *altmap,
> >> +                                       struct page *block)
> >
> > Why not use the name "reuse" instead of "block"?
> > Seems like "reuse" is clearer.
> >
> Good idea, let me rename that to @reuse.
>
> >>  {
> >>          pte_t *pte = pte_offset_kernel(pmd, addr);
> >>          if (pte_none(*pte)) {
> >>                  pte_t entry;
> >>                  void *p;
> >>
> >> -                p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
> >> -                if (!p)
> >> -                        return NULL;
> >> +                if (!block) {
> >> +                        p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
> >> +                        if (!p)
> >> +                                return NULL;
> >> +                } else {
> >> +                        /*
> >> +                         * When a PTE/PMD entry is freed from the init_mm
> >> +                         * there's a free_pages() call to this page allocated
> >> +                         * above. Thus this get_page() is paired with the
> >> +                         * put_page_testzero() on the freeing path.
> >> +                         * This can only be called by certain ZONE_DEVICE paths,
> >> +                         * and through vmemmap_populate_compound_pages() when
> >> +                         * slab is available.
> >> +                         */
> >> +                        get_page(block);
> >> +                        p = page_to_virt(block);
> >> +                }
> >>                  entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
> >>                  set_pte_at(&init_mm, addr, pte, entry);
> >>          }
> >> @@ -609,7 +624,8 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node)
> >>  }
> >>
> >>  static int __meminit vmemmap_populate_address(unsigned long addr, int node,
> >> -                                              struct vmem_altmap *altmap)
> >> +                                              struct vmem_altmap *altmap,
> >> +                                              struct page *reuse, struct page **page)
> >
> > We can remove the last argument (struct page **page) if we change
> > the return type to "pte_t *". Simpler, don't you think?
> >
>
> Hmmm, perhaps it is simpler, especially given that the only error code is ENOMEM.
>
> Albeit perhaps what we want is a `struct page *` rather than a pte.

The caller can extract the `struct page` from a pte.

[...]

> >> -        if (vmemmap_populate(start, end, nid, altmap))
> >> +        if (pgmap && pgmap_vmemmap_nr(pgmap) > 1 && !altmap)
> >
> > Should we add a check like "is_power_of_2(sizeof(struct page))", since
> > this optimization is only applied when the size of the struct page does not
> > cross page boundaries?
>
> Totally missed that -- let me make that adjustment.
>
> Can I ask under which architectures/conditions this happens?

E.g. arm64 when !CONFIG_MEMCG.

Thanks.
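
To illustrate the is_power_of_2(sizeof(struct page)) point, here is a small userspace sketch, not kernel code. It models a densely packed array of fixed-size elements starting at a page-aligned address and reports whether any element straddles a page boundary. The names prefixed with sketch_ and the 4096-byte page size are assumptions made for this example; the 56-byte size mentioned afterwards is only an illustrative non-power-of-2 value, not a figure taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative page size; the real value is architecture-dependent. */
#define SKETCH_PAGE_SIZE 4096UL

/* Userspace stand-in for the kernel's is_power_of_2() helper. */
static bool sketch_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Return true if any element of a densely packed array of
 * elem_size-byte structures, starting at a page-aligned address,
 * straddles a page boundary.
 */
static bool sketch_elem_crosses_page(unsigned long elem_size)
{
	unsigned long off;

	assert(elem_size > 0 && elem_size <= SKETCH_PAGE_SIZE);

	/* The layout repeats after at most elem_size pages, so stop there. */
	for (off = 0; off < elem_size * SKETCH_PAGE_SIZE; off += elem_size) {
		unsigned long first = off / SKETCH_PAGE_SIZE;
		unsigned long last = (off + elem_size - 1) / SKETCH_PAGE_SIZE;

		if (first != last)
			return true;
	}
	return false;
}
```

With these assumptions, a 64-byte element never crosses a page, while a 56-byte element does (the one starting at offset 4088 spills into the next page). Since every divisor of 4096 is a power of 2, the power-of-2 test is equivalent to "no element crosses a page" for sizes up to one page in this model.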
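
The suggestion to return a pointer instead of an int plus an out-parameter can be sketched in userspace as follows. This is only an illustration of the calling convention under discussion, not the patch itself: struct sketch_pte, sketch_populate_address() and sketch_populate_range() are hypothetical stand-ins, and the simulate_oom flag exists purely to exercise the failure path.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for pte_t; not the kernel type. */
struct sketch_pte {
	unsigned long val;
};

/*
 * Hypothetical populate helper: returns the entry on success and
 * NULL on allocation failure, so no out-parameter is needed.
 */
static struct sketch_pte *sketch_populate_address(int simulate_oom)
{
	if (simulate_oom)
		return NULL;
	return calloc(1, sizeof(struct sketch_pte));
}

/* The caller maps the single failure mode (NULL) back to -ENOMEM. */
static int sketch_populate_range(int nr, int fail_at)
{
	int i;

	for (i = 0; i < nr; i++) {
		struct sketch_pte *pte = sketch_populate_address(i == fail_at);

		if (!pte)
			return -ENOMEM;
		/* The sketch owns the entry; kernel code would not free it here. */
		free(pte);
	}
	return 0;
}
```

Because the only error in this model is an allocation failure, a NULL return carries the same information as an explicit -ENOMEM, which is the simplification the review comment points at.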
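
The reference pairing described in the quoted comment (get_page() when a page is reused, put_page_testzero() on the freeing path) can be modelled with a minimal userspace sketch. struct sketch_page and both helpers are hypothetical analogues that only track a plain integer count; the real kernel helpers operate on atomic reference counts and do more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct sketch_page {
	int refcount;
};

/* Analogue of get_page(): take one extra reference on a live page. */
static void sketch_get_page(struct sketch_page *page)
{
	assert(page->refcount > 0);
	page->refcount++;
}

/*
 * Analogue of put_page_testzero(): drop one reference and report
 * whether it was the last one, i.e. whether the page may be freed.
 */
static bool sketch_put_page_testzero(struct sketch_page *page)
{
	assert(page->refcount > 0);
	return --page->refcount == 0;
}
```

In this model, a page mapped once at allocation and reused by two further entries holds three references, so only the third drop reports that the page can actually be freed.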