Date: Thu, 6 Nov 2025 16:33:26 +0100
From: Alexander Gordeev
To: Kevin Brodsky
Cc: Ritesh Harjani, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Andreas Larsson, Andrew Morton, Boris Ostrovsky, Borislav Petkov,
 Catalin Marinas, Christophe Leroy, Dave Hansen, David Hildenbrand,
 "David S. Miller", David Woodhouse, "H. Peter Anvin", Ingo Molnar,
 Jann Horn, Juergen Gross, "Liam R. Howlett", Lorenzo Stoakes,
 Madhavan Srinivasan, Michael Ellerman, Michal Hocko, Mike Rapoport,
 Nicholas Piggin, Peter Zijlstra, Ryan Roberts, Suren Baghdasaryan,
 Thomas Gleixner, Vlastimil Babka, Will Deacon, Yeoreum Yun,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 sparclinux@vger.kernel.org, xen-devel@lists.xenproject.org, x86@kernel.org
Subject: Re: [PATCH v4 07/12] mm: enable lazy_mmu sections to nest
References: <20251029100909.3381140-1-kevin.brodsky@arm.com>
 <20251029100909.3381140-8-kevin.brodsky@arm.com>
 <87ms5050g0.ritesh.list@gmail.com>
 <50d1b63a-88d7-4484-82c0-3bde96e3207d-agordeev@linux.ibm.com>
 <48a4ecb5-3412-4d3f-9e43-535f8bee505f@arm.com>
In-Reply-To: <48a4ecb5-3412-4d3f-9e43-535f8bee505f@arm.com>

On Thu, Nov 06, 2025 at 10:51:43AM +0000, Kevin Brodsky wrote:
> On 05/11/2025 16:12, Alexander Gordeev wrote:
> > On Wed, Nov 05, 2025 at 02:19:03PM +0530, Ritesh Harjani wrote:
> >>> + * in_lazy_mmu_mode() can be used to check whether the lazy MMU mode is
> >>> + * currently enabled.
> >>>   */
> >>>  #ifdef CONFIG_ARCH_HAS_LAZY_MMU_MODE
> >>>  static inline void lazy_mmu_mode_enable(void)
> >>>  {
> >>> -        arch_enter_lazy_mmu_mode();
> >>> +        struct lazy_mmu_state *state = &current->lazy_mmu_state;
> >>> +
> >>> +        VM_WARN_ON_ONCE(state->nesting_level == U8_MAX);
> >>> +        /* enable() must not be called while paused */
> >>> +        VM_WARN_ON(state->nesting_level > 0 && !state->active);
> >>> +
> >>> +        if (state->nesting_level++ == 0) {
> >>> +                state->active = true;
> >>> +                arch_enter_lazy_mmu_mode();
> >>> +        }
> >>>  }
> >> Some architectures disables preemption in their
> >> arch_enter_lazy_mmu_mode(). So shouldn't the state->active = true should
> >> happen after arch_enter_lazy_mmu_mode() has disabled preemption()? i.e.
> > Do you have some scenario in mind that could cause an issue?
> > IOW, what could go wrong if the process is scheduled to another
> > CPU before preempt_disable() is called?
>
> I'm not sure I understand the issue either.
>
> >>  static inline void lazy_mmu_mode_enable(void)
> >>  {
> >> -        arch_enter_lazy_mmu_mode();
> >> +        struct lazy_mmu_state *state = &current->lazy_mmu_state;
> >> +
> >> +        VM_WARN_ON_ONCE(state->nesting_level == U8_MAX);
> >> +        /* enable() must not be called while paused */
> >> +        VM_WARN_ON(state->nesting_level > 0 && !state->active);
> >> +
> >> +        if (state->nesting_level++ == 0) {
> >> +                arch_enter_lazy_mmu_mode();
> >> +                state->active = true;
> >> +        }
> >>  }
> >>
> >> ... I think it make more sense to enable the state after the arch_**
> >> call right.
> > But then in_lazy_mmu_mode() would return false if called from
> > arch_enter_lazy_mmu_mode(). Not big problem, but still..
>
> The ordering of nesting_level/active was the way you expected in v3, but
> the conclusion of the discussion with David H [1] is that it doesn't
> really matter so I simplified the ordering in v4 - the arch hooks
> shouldn't call in_lazy_mmu_mode() or inspect lazy_mmu_state.
> arch_enter()/arch_leave() shouldn't need it anyway since they're called
> once per outer section (not in nested sections). arch_flush() could
> potentially do something different when nested, but that seems unlikely.
>
> - Kevin
>
> [1]
> https://lore.kernel.org/all/af4414b6-617c-4dc8-bddc-3ea00d1f6f3b@redhat.com/

I might be misunderstanding this conversation, but it looked to me like a
discussion about the lazy_mmu_state::nesting_level value, not
lazy_mmu_state::active.

I do use the in_lazy_mmu_mode() (lazy_mmu_state::active) check from the arch
callbacks. Here is the example (and likely the only case so far) where it hits:

static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
                                      void *_data)
{
        lazy_mmu_mode_pause();
        ...
        if (likely(pte_none(ptep_get(ptep)))) {
                /* Here set_pte() checks whether we are in lazy_mmu mode */
                set_pte_at(&init_mm, addr, ptep, pte);  <--- calls set_pte()
                data->pages[index] = NULL;
        }
        ...
        lazy_mmu_mode_resume();
        ...
}

So without the in_lazy_mmu_mode() check noted in the comment above, the
arch-specific set_pte() implementation takes the wrong branch, which ends up in:

[ 394.503134] Call Trace:
[ 394.503137] [<00007fffe01333f4>] dump_stack_lvl+0xbc/0xf0
[ 394.503143] [<00007fffe010298c>] vpanic+0x1cc/0x418
[ 394.503149] [<00007fffe0102c7a>] panic+0xa2/0xa8
[ 394.503154] [<00007fffe01e7a8a>] check_panic_on_warn+0x8a/0xb0
[ 394.503160] [<00007fffe082d122>] end_report+0x72/0x110
[ 394.503166] [<00007fffe082d3e6>] kasan_report+0xc6/0x100
[ 394.503171] [<00007fffe01b9556>] ipte_batch_ptep_get+0x146/0x150
[ 394.503176] [<00007fffe0830096>] kasan_populate_vmalloc_pte+0xe6/0x1e0
[ 394.503183] [<00007fffe0718050>] apply_to_pte_range+0x1a0/0x570
[ 394.503189] [<00007fffe07260fa>] __apply_to_page_range+0x3ca/0x8f0
[ 394.503195] [<00007fffe0726648>] apply_to_page_range+0x28/0x40
[ 394.503201] [<00007fffe082fe34>] __kasan_populate_vmalloc+0x324/0x340
[ 394.503207] [<00007fffe076954e>] alloc_vmap_area+0x31e/0xbf0
[ 394.503213] [<00007fffe0770106>] __get_vm_area_node+0x1a6/0x2d0
[ 394.503218] [<00007fffe07716fa>] __vmalloc_node_range_noprof+0xba/0x260
[ 394.503224] [<00007fffe0771970>] __vmalloc_node_noprof+0xd0/0x110
[ 394.503229] [<00007fffe0771a22>] vmalloc_noprof+0x32/0x40
[ 394.503234] [<00007fff604eaa42>] full_fit_alloc_test+0xb2/0x3e0 [test_vmalloc]
[ 394.503241] [<00007fff604eb478>] test_func+0x488/0x760 [test_vmalloc]
[ 394.503247] [<00007fffe025ad68>] kthread+0x368/0x630
[ 394.503253] [<00007fffe01391e0>] __ret_from_fork+0xd0/0x490
[ 394.503259] [<00007fffe24e468a>] ret_from_fork+0xa/0x30

I could have cached lazy_mmu_state::active as arch-specific data and checked
it, but then what is the point of having it generalized?

Thanks!
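
P.S. To make the dependency concrete, here is a minimal user-space sketch of
the pattern, not the kernel code. Everything in it is an assumption made for
illustration: cur_state stands in for current->lazy_mmu_state, the bodies of
lazy_mmu_mode_disable(), lazy_mmu_mode_pause() and lazy_mmu_mode_resume() are
simplified guesses rather than what the series implements, and model_set_pte()
with its two counters is a hypothetical stand-in for an arch set_pte() that
has a batched path and a direct path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the per-task state discussed in this thread. */
struct lazy_mmu_state {
	uint8_t nesting_level;
	bool active;
};

/* Assumption: stands in for current->lazy_mmu_state. */
static struct lazy_mmu_state cur_state;

/* Hypothetical counters for the two branches of an arch set_pte(). */
static unsigned int batched_updates;
static unsigned int direct_updates;

static bool in_lazy_mmu_mode(void)
{
	return cur_state.active;
}

static void lazy_mmu_mode_enable(void)
{
	assert(cur_state.nesting_level != UINT8_MAX);
	if (cur_state.nesting_level++ == 0)
		cur_state.active = true;	/* arch_enter hook would run here */
}

/* Assumed counterpart of enable(); not shown in the quoted hunk. */
static void lazy_mmu_mode_disable(void)
{
	assert(cur_state.nesting_level > 0);
	if (--cur_state.nesting_level == 0)
		cur_state.active = false;	/* arch_leave hook would run here */
}

/* Simplified guess: pausing only clears 'active', nesting is kept. */
static void lazy_mmu_mode_pause(void)
{
	cur_state.active = false;
}

/* Simplified guess: resuming re-activates only inside a section. */
static void lazy_mmu_mode_resume(void)
{
	cur_state.active = cur_state.nesting_level > 0;
}

/* Hypothetical arch set_pte(): picks its branch from the generic state. */
static void model_set_pte(void)
{
	if (in_lazy_mmu_mode())
		batched_updates++;	/* PTE update goes through the batch */
	else
		direct_updates++;	/* PTE update is applied immediately */
}
```

In this model, a model_set_pte() call made between lazy_mmu_mode_pause() and
lazy_mmu_mode_resume() takes the direct path even though nesting_level is
still non-zero, which is the behaviour the KASAN callback above relies on.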