linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Tarun Sahu <tsahu@linux.ibm.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org
Subject: Re: [PATCH v2 1/2] mm/vmemmap/devdax: Fix kernel crash when probing devdax devices
Date: Tue, 11 Apr 2023 20:50:59 +0530	[thread overview]
Message-ID: <87y1mybg2s.fsf@linux.ibm.com> (raw)
In-Reply-To: <1d9e9377-d835-4b83-8770-adc3d1313228@oracle.com>

Joao Martins <joao.m.martins@oracle.com> writes:

> On 11/04/2023 15:18, Aneesh Kumar K.V wrote:
>> commit 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound
>> devmaps") added support for using optimized vmmemap for devdax devices. But how
>> vmemmap mappings are created are architecture specific. For example, powerpc
>> with hash translation doesn't have vmemmap mappings in init_mm page table
>> instead they are bolted table entries in the hardware page table
>> 
>> vmemmap_populate_compound_pages() used by vmemmap optimization code is not aware
>> of these architecture-specific mapping. Hence allow architecture to opt for this
>> feature. I selected architectures supporting HUGETLB_PAGE_OPTIMIZE_VMEMMAP
>> option as also supporting this feature.
>> 
> Perhaps rephrase last sentence to be in imperative form e.g.:
>
> Architectures supporting HUGETLB_PAGE_OPTIMIZE_VMEMMAP option are selected when
> supporting this feature.
>
>> This patch fixes the below crash on ppc64.
>> 
> Avoid the 'This patch' per submission guidelines e.g.
>
> 'On ppc64 (pmem) where this isn't supported, it fixes below crash:'

ok will update in the next version.

>
>> BUG: Unable to handle kernel data access on write at 0xc00c000100400038
>> Faulting instruction address: 0xc000000001269d90
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> Modules linked in:
> 

...

>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 716d30d93616..c47f2186d2c2 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -3442,6 +3442,22 @@ void vmemmap_populate_print_last(void);
>>  void vmemmap_free(unsigned long start, unsigned long end,
>>  		struct vmem_altmap *altmap);
>>  #endif
>> +
>> +#ifdef CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMA
>
> You are missing a 'P'

I noticed that after sending v2 and sent v2-updated with that change. 

>
>> +static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
>> +					   struct dev_pagemap *pgmap)
>> +{
>> +	return is_power_of_2(sizeof(struct page)) &&
>> +		pgmap && (pgmap_vmemmap_nr(pgmap) > 1) && !altmap;
>> +}
>> +#else
>> +static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
>> +					   struct dev_pagemap *pgmap)
>> +{
>> +	return false;
>> +}
>> +#endif
>> +
>>  void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
>>  				  unsigned long nr_pages);
>>  
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3bb3484563ed..292411d8816f 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6844,10 +6844,13 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
>>   * of an altmap. See vmemmap_populate_compound_pages().
>>   */
>>  static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
>> +					      struct dev_pagemap *pgmap,
>>  					      unsigned long nr_pages)
>>  {
>> -	return is_power_of_2(sizeof(struct page)) &&
>> -		!altmap ? 2 * (PAGE_SIZE / sizeof(struct page)) : nr_pages;
>> +	if (vmemmap_can_optimize(altmap, pgmap))
>> +		return 2 * (PAGE_SIZE / sizeof(struct page));
>> +	else
>> +		return nr_pages;
>>  }
>>  
>
> Keep the ternary operator as already is the case for compound_nr_pages to avoid
> doing too much in one patch:
>
> 	return vmemmap_can_optimize(altmap, pgmap) ?
> 		2 * (PAGE_SIZE / sizeof(struct page)) : nr_pages;
>
> Or you really want to remove the ternary operator perhaps take the unnecessary
> else and make the long line be less indented:
>
> 	if (!vmemmap_can_optimize(altmap, pgmap))
> 		return nr_pages;
>
> 	return  2 * (PAGE_SIZE / sizeof(struct page));
>
> I don't think the latter is a significant improvement over the ternary one. But
> I guess that's a matter of preferred style.

How about

static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
					      struct dev_pagemap *pgmap)
{

	if (!vmemmap_can_optimize(altmap, pgmap))
		return pgmap_vmemmap_nr(pgmap);

	return 2 * (PAGE_SIZE / sizeof(struct page));
}



>
>>  static void __ref memmap_init_compound(struct page *head,
>> @@ -6912,7 +6915,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
>>  			continue;
>>  
>>  		memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
>> -				     compound_nr_pages(altmap, pfns_per_compound));
>> +				     compound_nr_pages(altmap, pgmap, pfns_per_compound));
>>  	}
>>  
>>  	pr_info("%s initialised %lu pages in %ums\n", __func__,
>> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
>> index c5398a5960d0..10d73a0dfcec 100644
>> --- a/mm/sparse-vmemmap.c
>> +++ b/mm/sparse-vmemmap.c
>> @@ -458,8 +458,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
>>  		!IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
>>  		return NULL;
>>  
>> -	if (is_power_of_2(sizeof(struct page)) &&
>> -	    pgmap && pgmap_vmemmap_nr(pgmap) > 1 && !altmap)
>> +	if (vmemmap_can_optimize(altmap, pgmap))
>>  		r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
>>  	else
>>  		r = vmemmap_populate(start, end, nid, altmap);

-aneesh


  reply	other threads:[~2023-04-11 15:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-11 14:18 Aneesh Kumar K.V
2023-04-11 14:18 ` [PATCH v2 2/2] m/hugetlb: Rename ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP Aneesh Kumar K.V
2023-04-11 15:52   ` Joao Martins
2023-04-11 16:14     ` Aneesh Kumar K V
2023-04-11 17:34       ` Joao Martins
2023-04-12  3:43   ` [External] " Muchun Song
2023-04-11 14:35 ` [PATCH v2 1/2] mm/vmemmap/devdax: Fix kernel crash when probing devdax devices Joao Martins
2023-04-11 15:20   ` Aneesh Kumar K.V [this message]
2023-04-11 15:50     ` Joao Martins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y1mybg2s.fsf@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=joao.m.martins@oracle.com \
    --cc=linux-mm@kvack.org \
    --cc=songmuchun@bytedance.com \
    --cc=tsahu@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox