References: <20210827145819.16471-1-joao.m.martins@oracle.com> <20210827145819.16471-8-joao.m.martins@oracle.com>
In-Reply-To: <20210827145819.16471-8-joao.m.martins@oracle.com>
From: Dan Williams
Date: Thu, 4 Nov 2021 17:38:19 -0700
Subject: Re: [PATCH v4 07/14] device-dax: compound devmap support
To: Joao Martins
Cc: Linux MM, Vishal Verma, Dave Jiang, Naoya Horiguchi, Matthew Wilcox, Jason Gunthorpe, John Hubbard, Jane Chu, Muchun Song, Mike Kravetz, Andrew Morton, Jonathan Corbet,
 Christoph Hellwig, Linux NVDIMM, Linux Doc Mailing List

On Fri, Aug 27, 2021 at 7:59 AM Joao Martins wrote:
>
> Use the newly added compound devmap facility which maps the assigned dax
> ranges as compound pages at a page size of @align. Currently, this means,
> that region/namespace bootstrap would take considerably less, given that
> you would initialize considerably less pages.
>
> On setups with 128G NVDIMMs the initialization with DRAM stored struct
> pages improves from ~268-358 ms to ~78-100 ms with 2M pages, and to less
> than a 1msec with 1G pages.
>
> dax devices are created with a fixed @align (huge page size) which is
> enforced through as well at mmap() of the device. Faults, consequently
> happen too at the specified @align specified at the creation, and those
> don't change through out dax device lifetime.

s/through out/throughout/

> MCEs poisons a whole dax huge page, as well as splits occurring at the configured page size.

A clarification here: MCEs trigger memory_failure() to *unmap* a whole
dax huge page; the poison stays limited to a single cacheline.

Otherwise the patch looks good to me.

>
> Signed-off-by: Joao Martins
> ---
>  drivers/dax/device.c | 56 ++++++++++++++++++++++++++++++++++----------
>  1 file changed, 43 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
> index 6e348b5f9d45..5d23128f9a60 100644
> --- a/drivers/dax/device.c
> +++ b/drivers/dax/device.c
> @@ -192,6 +192,42 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
>  }
>  #endif /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>
> +static void set_page_mapping(struct vm_fault *vmf, pfn_t pfn,
> +			     unsigned long fault_size,
> +			     struct address_space *f_mapping)
> +{
> +	unsigned long i;
> +	pgoff_t pgoff;
> +
> +	pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size));
> +
> +	for (i = 0; i < fault_size / PAGE_SIZE; i++) {
> +		struct page *page;
> +
> +		page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
> +		if (page->mapping)
> +			continue;
> +		page->mapping = f_mapping;
> +		page->index = pgoff + i;
> +	}
> +}
> +
> +static void set_compound_mapping(struct vm_fault *vmf, pfn_t pfn,
> +				 unsigned long fault_size,
> +				 struct address_space *f_mapping)
> +{
> +	struct page *head;
> +
> +	head = pfn_to_page(pfn_t_to_pfn(pfn));
> +	head = compound_head(head);
> +	if (head->mapping)
> +		return;
> +
> +	head->mapping = f_mapping;
> +	head->index = linear_page_index(vmf->vma,
> +			ALIGN(vmf->address, fault_size));
> +}
> +
>  static vm_fault_t dev_dax_huge_fault(struct vm_fault *vmf,
>  		enum page_entry_size pe_size)
>  {
> @@ -225,8 +261,7 @@ static vm_fault_t dev_dax_huge_fault(struct vm_fault *vmf,
>  	}
>
>  	if (rc == VM_FAULT_NOPAGE) {
> -		unsigned long i;
> -		pgoff_t pgoff;
> +		struct dev_pagemap *pgmap = dev_dax->pgmap;
>
>  		/*
>  		 * In the device-dax case the only possibility for a
> @@ -234,17 +269,10 @@ static vm_fault_t dev_dax_huge_fault(struct vm_fault *vmf,
>  		 * mapped. No need to consider the zero page, or racing
>  		 * conflicting mappings.
>  		 */
> -		pgoff = linear_page_index(vmf->vma,
> -				ALIGN(vmf->address, fault_size));
> -		for (i = 0; i < fault_size / PAGE_SIZE; i++) {
> -			struct page *page;
> -
> -			page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
> -			if (page->mapping)
> -				continue;
> -			page->mapping = filp->f_mapping;
> -			page->index = pgoff + i;
> -		}
> +		if (pgmap_geometry(pgmap) > 1)
> +			set_compound_mapping(vmf, pfn, fault_size, filp->f_mapping);
> +		else
> +			set_page_mapping(vmf, pfn, fault_size, filp->f_mapping);
>  	}
>  	dax_read_unlock(id);
>
> @@ -426,6 +454,8 @@ int dev_dax_probe(struct dev_dax *dev_dax)
>  	}
>
>  	pgmap->type = MEMORY_DEVICE_GENERIC;
> +	if (dev_dax->align > PAGE_SIZE)
> +		pgmap->geometry = dev_dax->align >> PAGE_SHIFT;
>  	dev_dax->pgmap = pgmap;
>
>  	addr = devm_memremap_pages(dev, pgmap);
> --
> 2.17.1
>
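
For readers following the geometry arithmetic in the dev_dax_probe() hunk, here is a rough userspace sketch of what "pgmap->geometry = dev_dax->align >> PAGE_SHIFT" works out to. It is not kernel code. The sketch_* names are hypothetical, 4K base pages (PAGE_SHIFT == 12) are assumed, and treating the unset geometry as 1 is an assumption about how pgmap_geometry() behaves, since that helper is defined elsewhere in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages (PAGE_SHIFT == 12), as on x86-64. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper modelled on the probe-time logic in the quoted
 * hunk: the geometry (base pages per compound page) is derived from
 * the device alignment only when that alignment exceeds PAGE_SIZE.
 * Returning 1 for the PAGE_SIZE case is an assumption of this sketch.
 */
static uint64_t sketch_geometry(uint64_t align_bytes)
{
	assert(align_bytes >= SKETCH_PAGE_SIZE);

	if (align_bytes > SKETCH_PAGE_SIZE)
		return align_bytes >> SKETCH_PAGE_SHIFT;
	return 1;
}

/*
 * Hypothetical helper: number of compound pages (one head page each)
 * needed to cover a region that is a multiple of the alignment.
 */
static uint64_t sketch_compound_pages(uint64_t region_bytes,
				      uint64_t align_bytes)
{
	return region_bytes / align_bytes;
}
```

Under those assumptions, a 2M @align gives a geometry of 512 base pages and a 1G @align gives 262144. A 128G region is then covered by 65536 compound pages at 2M, or 128 at 1G, which is the reduction in initialization work the commit message refers to.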