From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20231115163018.1303287-2-ryan.roberts@arm.com>
 <20231127055414.9015-1-v-songbaohua@oppo.com>
 <755343a1-ce94-4d38-8317-0925e2dae3bc@arm.com>
 <02d85331-eaa0-4d76-a3d6-ea5eb18b683c@arm.com>
In-Reply-To:
From: Barry Song <21cnbao@gmail.com>
Date: Thu, 30 Nov 2023 13:34:13 +1300
Message-ID:
Subject: Re: [PATCH v2 01/14] mm: Batch-copy PTE ranges during fork()
To: Ryan Roberts
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, anshuman.khandual@arm.com,
 ardb@kernel.org, catalin.marinas@arm.com, david@redhat.com, dvyukov@google.com,
 glider@google.com, james.morse@arm.com, jhubbard@nvidia.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, mark.rutland@arm.com, maz@kernel.org,
 oliver.upton@linux.dev, ryabinin.a.a@gmail.com, suzuki.poulose@arm.com,
 vincenzo.frascino@arm.com, wangkefeng.wang@huawei.com, will@kernel.org,
 willy@infradead.org, yuzenghui@huawei.com, yuzhao@google.com, ziy@nvidia.com
Content-Type: text/plain; charset="UTF-8"

On Thu, Nov 30, 2023 at 3:07 AM Ryan Roberts wrote:
>
> On 29/11/2023 13:09, Barry Song wrote:
> > On Thu, Nov 30, 2023 at 1:29 AM Ryan Roberts wrote:
> >>
> >> On 28/11/2023 19:00, Barry Song wrote:
> >>> On Wed, Nov 29, 2023 at 12:00 AM Ryan Roberts wrote:
> >>>>
> >>>> On 28/11/2023 00:11, Barry Song wrote:
> >>>>> On Mon, Nov 27, 2023 at 10:24 PM Ryan Roberts wrote:
> >>>>>>
> >>>>>> On 27/11/2023 05:54, Barry Song wrote:
> >>>>>>>> +copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
> >>>>>>>> +                  pte_t *dst_pte, pte_t *src_pte,
> >>>>>>>> +                  unsigned long addr, unsigned long end,
> >>>>>>>> +                  int *rss, struct folio **prealloc)
> >>>>>>>>  {
> >>>>>>>>         struct mm_struct *src_mm = src_vma->vm_mm;
> >>>>>>>>         unsigned long vm_flags = src_vma->vm_flags;
> >>>>>>>>         pte_t pte = ptep_get(src_pte);
> >>>>>>>>         struct page *page;
> >>>>>>>>         struct folio *folio;
> >>>>>>>> +       int nr = 1;
> >>>>>>>> +       bool anon;
> >>>>>>>> +       bool any_dirty = pte_dirty(pte);
> >>>>>>>> +       int i;
> >>>>>>>>
> >>>>>>>>         page = vm_normal_page(src_vma, addr, pte);
> >>>>>>>> -       if (page)
> >>>>>>>> +       if (page) {
> >>>>>>>>                 folio = page_folio(page);
> >>>>>>>> -       if (page && folio_test_anon(folio)) {
> >>>>>>>> -               /*
> >>>>>>>> -                * If this page may have been pinned by the parent process,
> >>>>>>>> -                * copy the page immediately for the child so that we'll always
> >>>>>>>> -                * guarantee the pinned page won't be randomly replaced in the
> >>>>>>>> -                * future.
> >>>>>>>> -                */
> >>>>>>>> -               folio_get(folio);
> >>>>>>>> -               if (unlikely(page_try_dup_anon_rmap(page, false, src_vma))) {
> >>>>>>>> -                       /* Page may be pinned, we have to copy. */
> >>>>>>>> -                       folio_put(folio);
> >>>>>>>> -                       return copy_present_page(dst_vma, src_vma, dst_pte, src_pte,
> >>>>>>>> -                                                addr, rss, prealloc, page);
> >>>>>>>> +               anon = folio_test_anon(folio);
> >>>>>>>> +               nr = folio_nr_pages_cont_mapped(folio, page, src_pte, addr,
> >>>>>>>> +                                               end, pte, &any_dirty);
> >>>>>>>
> >>>>>>> in case we have a large folio with 16 CONTPTE basepages, and userspace
> >>>>>>> do madvise(addr + 4KB * 5, DONTNEED);
> >>>>>>
> >>>>>> nit: if you are offsetting by 5 pages from addr, then below I think you mean
> >>>>>> page0~page4 and page6~page15?
> >>>>>>
> >>>>>>>
> >>>>>>> thus, the 4th basepage of PTE becomes PTE_NONE and folio_nr_pages_cont_mapped()
> >>>>>>> will return 15. in this case, we should copy page0~page3 and page5~page15.
> >>>>>>
> >>>>>> No, I don't think folio_nr_pages_cont_mapped() will return 15; that's certainly
> >>>>>> not how it's intended to work. The function is scanning forwards from the current
> >>>>>> pte until it finds the first pte that does not fit in the batch - either because
> >>>>>> it maps a PFN that is not contiguous, or because the permissions are different
> >>>>>> (although this is being relaxed a bit; see conversation with DavidH against this
> >>>>>> same patch).
> >>>>>>
> >>>>>> So the first time through this loop, folio_nr_pages_cont_mapped() will return 5
> >>>>>> (page0~page4), then the next time through the loop we will go through the
> >>>>>> !present path and process the single swap marker. Then the 3rd time through the
> >>>>>> loop folio_nr_pages_cont_mapped() will return 10.
> >>>>>
> >>>>> one case we have met by running hundreds of real phones is as below,
> >>>>>
> >>>>>
> >>>>> static int
> >>>>> copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
> >>>>>                pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
> >>>>>                unsigned long end)
> >>>>> {
> >>>>> ...
> >>>>>         dst_pte = pte_alloc_map_lock(dst_mm, dst_pmd, addr, &dst_ptl);
> >>>>>         if (!dst_pte) {
> >>>>>                 ret = -ENOMEM;
> >>>>>                 goto out;
> >>>>>         }
> >>>>>         src_pte = pte_offset_map_nolock(src_mm, src_pmd, addr, &src_ptl);
> >>>>>         if (!src_pte) {
> >>>>>                 pte_unmap_unlock(dst_pte, dst_ptl);
> >>>>>                 /* ret == 0 */
> >>>>>                 goto out;
> >>>>>         }
> >>>>>         spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> >>>>>         orig_src_pte = src_pte;
> >>>>>         orig_dst_pte = dst_pte;
> >>>>>         arch_enter_lazy_mmu_mode();
> >>>>>
> >>>>>         do {
> >>>>>                 /*
> >>>>>                  * We are holding two locks at this point - either of them
> >>>>>                  * could generate latencies in another task on another CPU.
> >>>>>                  */
> >>>>>                 if (progress >= 32) {
> >>>>>                         progress = 0;
> >>>>>                         if (need_resched() ||
> >>>>>                             spin_needbreak(src_ptl) || spin_needbreak(dst_ptl))
> >>>>>                                 break;
> >>>>>                 }
> >>>>>                 ptent = ptep_get(src_pte);
> >>>>>                 if (pte_none(ptent)) {
> >>>>>                         progress++;
> >>>>>                         continue;
> >>>>>                 }
> >>>>>
> >>>>> the above iteration can break when progress >= 32. for example, at the
> >>>>> beginning, if all PTEs are none, we break when progress >= 32, and we might
> >>>>> break while we are at the 8th pte of 16 PTEs which could become CONTPTE
> >>>>> after we release the PTL.
> >>>>>
> >>>>> since we are releasing PTLs, next time when we get the PTL, those pte_none()
> >>>>> entries might have become pte_cont(). then are you going to copy CONTPTE from
> >>>>> the 8th pte, thus immediately breaking the consistent-CONTPTE rule of the
> >>>>> hardware?
> >>>>>
> >>>>> pte0 - pte_none
> >>>>> pte1 - pte_none
> >>>>> ...
> >>>>> pte7 - pte_none
> >>>>>
> >>>>> pte8 - pte_cont
> >>>>> ...
> >>>>> pte15 - pte_cont
> >>>>>
> >>>>> so we did some modification to avoid a break in the middle of PTEs
> >>>>> which can potentially become CONTPTE.
> >>>>>
> >>>>> do {
> >>>>>         /*
> >>>>>          * We are holding two locks at this point - either of them
> >>>>>          * could generate latencies in another task on another CPU.
> >>>>>          */
> >>>>>         if (progress >= 32) {
> >>>>>                 progress = 0;
> >>>>> #ifdef CONFIG_CONT_PTE_HUGEPAGE
> >>>>>                 /*
> >>>>>                  * XXX: don't release the ptl at an unaligned address, as
> >>>>>                  * cont_pte might form while the ptl is released; this
> >>>>>                  * causes a double-map
> >>>>>                  */
> >>>>>                 if (!vma_is_chp_anonymous(src_vma) ||
> >>>>>                     (vma_is_chp_anonymous(src_vma) && IS_ALIGNED(addr,
> >>>>>                      HPAGE_CONT_PTE_SIZE)))
> >>>>> #endif
> >>>>>                 if (need_resched() ||
> >>>>>                     spin_needbreak(src_ptl) || spin_needbreak(dst_ptl))
> >>>>>                         break;
> >>>>>         }
> >>>>>
> >>>>> We could only reproduce the above issue by running thousands of phones.
> >>>>>
> >>>>> Does your code survive this problem?
> >>>>
> >>>> Yes, I'm confident my code is safe against this; as I said before, the CONT_PTE
> >>>> bit is not blindly "copied" from parent to child pte. As far as the core-mm is
> >>>> concerned, there is no CONT_PTE bit; they are just regular PTEs. So the code
> >>>> will see some pte_none() entries followed by some pte_present() entries. And
> >>>> when calling set_ptes() on the child, the arch code will evaluate the current
> >>>> state of the pgtable along with the new set_ptes() request and determine where
> >>>> it should insert the CONT_PTE bit.
> >>>
> >>> yep, I have read it very carefully and think your code is safe here. The
> >>> only problem is that the code can randomly unfold the parent process's
> >>> CONTPTE while setting wrprotect in the middle of a large folio, while it
> >>> actually should keep the CONT bit, as all PTEs can still be consistent
> >>> if we set the protection from the 1st PTE.
> >>>
> >>> while A forks B, progress >= 32 might interrupt in the middle of a
> >>> new CONTPTE folio which is forming; as we have to set wrprotect on parent A,
> >>> this parent immediately loses the CONT bit. this is sad. but I can't find a
> >>> good way to resolve it unless CONT is exposed to the mm-core. any idea on
> >>> this?
> >>
> >> No, this is not the case; copy_present_ptes() will copy as many ptes as are
> >> physically contiguous and belong to the same folio (which usually means "the
> >> whole folio" - the only time it doesn't is when we hit the end of the vma). We
> >> will then return to the main loop and move forwards by the number of ptes that
> >> were serviced, including:
> >
> > I probably have failed to describe my question. I'd like to give a
> > concrete example:
> >
> > 1. process A forks B
> > 2. at the beginning, address ~ address + 64KB has pte_none PTEs
> > 3. we scan the 5th pte at address + 5 * 4KB, progress becomes 32, we
> >    break and release the PTLs
> > 4. another page fault in process A gets the PTL and sets
> >    address ~ address + 64KB to pte_cont
> > 5. we get the PTL again and arrive at the 5th pte
> > 6. we set wrprotect on the 5th, 6th, 7th ... 15th ptes; in this case, we
> >    have to unfold parent A, and child B also gets unfolded PTEs, unless
> >    our loop can go back to the 0th pte
> >
> > technically, A should be able to keep CONTPTE, but because of the
> > implementation of the code, it can't. That is the sadness. but it is
> > obviously not your fault.
> >
> > no worries. This is not happening very often. but I just want to make a note
> > here; maybe someday we can get back to address it.
>
> Ahh, I understand the situation now, sorry for being slow!
>
> I expect this to be a very rare situation anyway since (4) suggests process A
> has another thread, and forking is not encouraged for multithreaded programs. In
> fact the fork man page says:
>
>   After a fork() in a multithreaded program, the child can safely call only
>   async-signal-safe functions (see signal-safety(7)) until such time as it calls
>   execve(2).
>
> So in this case, we are about to completely repaint the child's address space
> with execve() anyway.
>
> So it's just the racing parent that loses the CONT_PTE bit. I expect this to be
> rare. I'm not sure there is much we can do to solve it though, because
> unlike with your solution, we have to cater for multiple sizes, so there is no
> obvious border until we get to PMD size, and I'm guessing that's going to be a
> problem for latency spikes.

right. I don't think this can be a big problem. the background is that we
have a way to constantly detect and report unexpected unfold events, so we
run hundreds of phones in the lab and collect data to find out whether we
have any potential problems. we record unexpected unfolds and their reasons
in a proc file, and we monitor those data to look for potential bugs. this
">= 32" break and unfold was found in this way, and we simply addressed it
by disallowing the break at an unaligned address.

This is never an issue which can stop your patchset. but I was still glad to
share our observations with you and would like to hear if you had any idea :-)

>
>
> >>
> >>         progress += 8 * ret;
> >>
> >> That might go above 32, so we will drop the lock. But we haven't done that in
> >> the middle of a large folio. So the contpte-ness should be preserved.
> >>
> >>>
> >>> Our code[1] resolves this by only breaking at an aligned address:
> >>>
> >>> if (progress >= 32) {
> >>>         progress = 0;
> >>> #ifdef CONFIG_CONT_PTE_HUGEPAGE
> >>>         /*
> >>>          * XXX: don't release the ptl at an unaligned address, as
> >>>          * cont_pte might form while the ptl is released; this
> >>>          * causes a double-map
> >>>          */
> >>>         if (!vma_is_chp_anonymous(src_vma) ||
> >>>             (vma_is_chp_anonymous(src_vma) && IS_ALIGNED(addr,
> >>>              HPAGE_CONT_PTE_SIZE)))
> >>> #endif
> >>>         if (need_resched() ||
> >>>             spin_needbreak(src_ptl) || spin_needbreak(dst_ptl))
> >>>                 break;
> >>> }
> >>>
> >>> [1] https://github.com/OnePlusOSS/android_kernel_oneplus_sm8550/blob/oneplus/sm8550_u_14.0.0_oneplus11/mm/memory.c#L1180

Thanks
Barry