From mboxrd@z Thu Jan 1 00:00:00 1970
X-Mailer: emacs 29.0.60 (via feedmail 11-beta-1 I)
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Joao Martins
Cc: Muchun Song, Dan Williams, Tarun Sahu, linux-mm@kvack.org, akpm@linux-foundation.org
Subject: Re: [PATCH v2 1/2] mm/vmemmap/devdax: Fix kernel crash when probing devdax devices
In-Reply-To: <1d9e9377-d835-4b83-8770-adc3d1313228@oracle.com>
References: <20230411141818.62152-1-aneesh.kumar@linux.ibm.com> <1d9e9377-d835-4b83-8770-adc3d1313228@oracle.com>
Date: Tue, 11 Apr 2023 20:50:59 +0530
Message-ID: <87y1mybg2s.fsf@linux.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain

Joao Martins writes:

> On 11/04/2023 15:18, Aneesh Kumar K.V wrote:
>> commit 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound
>> devmaps") added support for using optimized vmemmap for devdax devices. But how
>> vmemmap mappings are created is architecture specific.
>> For example, powerpc
>> with hash translation doesn't have vmemmap mappings in the init_mm page table;
>> instead they are bolted table entries in the hardware page table.
>>
>> vmemmap_populate_compound_pages() used by the vmemmap optimization code is not aware
>> of these architecture-specific mappings. Hence allow architectures to opt in to this
>> feature. I selected architectures supporting the HUGETLB_PAGE_OPTIMIZE_VMEMMAP
>> option as also supporting this feature.
>>
> Perhaps rephrase the last sentence to be in imperative form e.g.:
>
> Architectures supporting HUGETLB_PAGE_OPTIMIZE_VMEMMAP option are selected when
> supporting this feature.
>
>> This patch fixes the below crash on ppc64.
>>
> Avoid the 'This patch' per submission guidelines e.g.
>
> 'On ppc64 (pmem) where this isn't supported, it fixes below crash:'

ok will update in the next version.

>
>> BUG: Unable to handle kernel data access on write at 0xc00c000100400038
>> Faulting instruction address: 0xc000000001269d90
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> Modules linked in:
> ...
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 716d30d93616..c47f2186d2c2 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -3442,6 +3442,22 @@ void vmemmap_populate_print_last(void);
>>  void vmemmap_free(unsigned long start, unsigned long end,
>>  		struct vmem_altmap *altmap);
>>  #endif
>> +
>> +#ifdef CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMA
>
> You are missing a 'P'

I noticed that after sending v2 and sent v2-updated with that change.
>
>> +static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
>> +					struct dev_pagemap *pgmap)
>> +{
>> +	return is_power_of_2(sizeof(struct page)) &&
>> +		pgmap && (pgmap_vmemmap_nr(pgmap) > 1) && !altmap;
>> +}
>> +#else
>> +static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
>> +					struct dev_pagemap *pgmap)
>> +{
>> +	return false;
>> +}
>> +#endif
>> +
>>  void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
>>  				  unsigned long nr_pages);
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3bb3484563ed..292411d8816f 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6844,10 +6844,13 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
>>   * of an altmap. See vmemmap_populate_compound_pages().
>>   */
>>  static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
>> +					      struct dev_pagemap *pgmap,
>>  					      unsigned long nr_pages)
>>  {
>> -	return is_power_of_2(sizeof(struct page)) &&
>> -		!altmap ? 2 * (PAGE_SIZE / sizeof(struct page)) : nr_pages;
>> +	if (vmemmap_can_optimize(altmap, pgmap))
>> +		return 2 * (PAGE_SIZE / sizeof(struct page));
>> +	else
>> +		return nr_pages;
>>  }
>>
>
> Keep the ternary operator as already is the case for compound_nr_pages to avoid
> doing too much in one patch:
>
> 	return vmemmap_can_optimize(altmap, pgmap) ?
> 		2 * (PAGE_SIZE / sizeof(struct page)) : nr_pages;
>
> Or if you really want to remove the ternary operator, perhaps take the unnecessary
> else and make the long line less indented:
>
> 	if (!vmemmap_can_optimize(altmap, pgmap))
> 		return nr_pages;
>
> 	return 2 * (PAGE_SIZE / sizeof(struct page));
>
> I don't think the latter is a significant improvement over the ternary one. But
> I guess that's a matter of preferred style.
How about:

static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
					      struct dev_pagemap *pgmap)
{
	if (!vmemmap_can_optimize(altmap, pgmap))
		return pgmap_vmemmap_nr(pgmap);

	return 2 * (PAGE_SIZE / sizeof(struct page));
}

>
>>  static void __ref memmap_init_compound(struct page *head,
>> @@ -6912,7 +6915,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
>>  			continue;
>>
>>  		memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
>> -				     compound_nr_pages(altmap, pfns_per_compound));
>> +				     compound_nr_pages(altmap, pgmap, pfns_per_compound));
>>  	}
>>
>>  	pr_info("%s initialised %lu pages in %ums\n", __func__,
>> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
>> index c5398a5960d0..10d73a0dfcec 100644
>> --- a/mm/sparse-vmemmap.c
>> +++ b/mm/sparse-vmemmap.c
>> @@ -458,8 +458,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
>>  		!IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
>>  		return NULL;
>>
>> -	if (is_power_of_2(sizeof(struct page)) &&
>> -	    pgmap && pgmap_vmemmap_nr(pgmap) > 1 && !altmap)
>> +	if (vmemmap_can_optimize(altmap, pgmap))
>>  		r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
>>  	else
>>  		r = vmemmap_populate(start, end, nid, altmap);

-aneesh