From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20240311150058.1122862-1-ryan.roberts@arm.com> <20240311150058.1122862-7-ryan.roberts@arm.com> <7ba06704-2090-4eb2-9534-c4d467cc085a@arm.com>
In-Reply-To: <7ba06704-2090-4eb2-9534-c4d467cc085a@arm.com>
From: Lance Yang <ioworker0@gmail.com>
Date: Wed, 20 Mar 2024 22:35:08 +0800
Subject: Re: [PATCH v4 6/6] mm: madvise: Avoid split during MADV_PAGEOUT and MADV_COLD
To: Ryan Roberts
Cc: Barry Song <21cnbao@gmail.com>, Andrew Morton, David Hildenbrand, Matthew Wilcox, Huang Ying, Gao Xiang, Yu Zhao, Yang Shi, Michal Hocko, Kefeng Wang, Chris Li, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="UTF-8"
On Wed, Mar 20, 2024 at 9:49 PM Ryan Roberts wrote:
>
> Hi Lance, Barry,
>
> Sorry - I totally missed this when you originally sent it!
No worries at all :)

>
>
> On 13/03/2024 14:02, Lance Yang wrote:
> > On Wed, Mar 13, 2024 at 5:03 PM Ryan Roberts wrote:
> >>
> >> On 13/03/2024 07:19, Barry Song wrote:
> >>> On Tue, Mar 12, 2024 at 4:01 AM Ryan Roberts wrote:
> >>>>
> >>>> Rework madvise_cold_or_pageout_pte_range() to avoid splitting any large
> >>>> folio that is fully and contiguously mapped in the pageout/cold vm
> >>>> range. This change means that large folios will be maintained all the
> >>>> way to swap storage. This both improves performance during swap-out, by
> >>>> eliding the cost of splitting the folio, and sets us up nicely for
> >>>> maintaining the large folio when it is swapped back in (to be covered in
> >>>> a separate series).
> >>>>
> >>>> Folios that are not fully mapped in the target range are still split,
> >>>> but note that behavior is changed so that if the split fails for any
> >>>> reason (folio locked, shared, etc) we now leave it as is and move to the
> >>>> next pte in the range and continue work on the proceeding folios.
> >>>> Previously any failure of this sort would cause the entire operation to
> >>>> give up and no folios mapped at higher addresses were paged out or made
> >>>> cold. Given large folios are becoming more common, this old behavior
> >>>> would have likely led to wasted opportunities.
> >>>>
> >>>> While we are at it, change the code that clears young from the ptes to
> >>>> use ptep_test_and_clear_young(), which is more efficient than
> >>>> get_and_clear/modify/set, especially for contpte mappings on arm64,
> >>>> where the old approach would require unfolding/refolding and the new
> >>>> approach can be done in place.
> >>>>
> >>>> Signed-off-by: Ryan Roberts
> >>>
> >>> This looks so much better than our initial RFC.
> >>> Thank you for your excellent work!
> >>
> >> Thanks - it's a team effort - I had your PoC and David's previous batching work
> >> to use as a template.
> >>
> >>>
> >>>> ---
> >>>>  mm/madvise.c | 89 ++++++++++++++++++++++++++++++----------------------
> >>>>  1 file changed, 51 insertions(+), 38 deletions(-)
> >>>>
> >>>> diff --git a/mm/madvise.c b/mm/madvise.c
> >>>> index 547dcd1f7a39..56c7ba7bd558 100644
> >>>> --- a/mm/madvise.c
> >>>> +++ b/mm/madvise.c
> >>>> @@ -336,6 +336,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
> >>>>         LIST_HEAD(folio_list);
> >>>>         bool pageout_anon_only_filter;
> >>>>         unsigned int batch_count = 0;
> >>>> +       int nr;
> >>>>
> >>>>         if (fatal_signal_pending(current))
> >>>>                 return -EINTR;
> >>>> @@ -423,7 +424,8 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
> >>>>                 return 0;
> >>>>         flush_tlb_batched_pending(mm);
> >>>>         arch_enter_lazy_mmu_mode();
> >>>> -       for (; addr < end; pte++, addr += PAGE_SIZE) {
> >>>> +       for (; addr < end; pte += nr, addr += nr * PAGE_SIZE) {
> >>>> +               nr = 1;
> >>>>                 ptent = ptep_get(pte);
> >>>>
> >>>>                 if (++batch_count == SWAP_CLUSTER_MAX) {
> >>>> @@ -447,55 +449,66 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
> >>>>                         continue;
> >>>>
> >>>>                 /*
> >>>> -                * Creating a THP page is expensive so split it only if we
> >>>> -                * are sure it's worth. Split it if we are only owner.
> >>>> +                * If we encounter a large folio, only split it if it is not
> >>>> +                * fully mapped within the range we are operating on. Otherwise
> >>>> +                * leave it as is so that it can be swapped out whole. If we
> >>>> +                * fail to split a folio, leave it in place and advance to the
> >>>> +                * next pte in the range.
> >>>>                  */
> >>>>                 if (folio_test_large(folio)) {
> >>>> -                       int err;
> >>>> -
> >>>> -                       if (folio_estimated_sharers(folio) > 1)
> >>>> -                               break;
> >>>> -                       if (pageout_anon_only_filter && !folio_test_anon(folio))
> >>>> -                               break;
> >>>> -                       if (!folio_trylock(folio))
> >>>> -                               break;
> >>>> -                       folio_get(folio);
> >>>> -                       arch_leave_lazy_mmu_mode();
> >>>> -                       pte_unmap_unlock(start_pte, ptl);
> >>>> -                       start_pte = NULL;
> >>>> -                       err = split_folio(folio);
> >>>> -                       folio_unlock(folio);
> >>>> -                       folio_put(folio);
> >>>> -                       if (err)
> >>>> -                               break;
> >>>> -                       start_pte = pte =
> >>>> -                               pte_offset_map_lock(mm, pmd, addr, &ptl);
> >>>> -                       if (!start_pte)
> >>>> -                               break;
> >>>> -                       arch_enter_lazy_mmu_mode();
> >>>> -                       pte--;
> >>>> -                       addr -= PAGE_SIZE;
> >>>> -                       continue;
> >>>> +                       const fpb_t fpb_flags = FPB_IGNORE_DIRTY |
> >>>> +                                               FPB_IGNORE_SOFT_DIRTY;
> >>>> +                       int max_nr = (end - addr) / PAGE_SIZE;
> >>>> +
> >>>> +                       nr = folio_pte_batch(folio, addr, pte, ptent, max_nr,
> >>>> +                                            fpb_flags, NULL);
> >>>
> >>> I wonder if we have a quick way to avoid folio_pte_batch() if users
> >>> are doing madvise() on a portion of a large folio.
> >>
> >> Good idea. Something like this?:
> >>
> >>         if (pte_pfn(pte) == folio_pfn(folio))
> >>                 nr = folio_pte_batch(folio, addr, pte, ptent, max_nr,
> >>                                      fpb_flags, NULL);
> >>
> >> If we are not mapping the first page of the folio, then it can't be a full
> >> mapping, so no need to call folio_pte_batch(). Just split it.
> >
> > if (folio_test_large(folio)) {
> > [...]
> >                 nr = folio_pte_batch(folio, addr, pte, ptent, max_nr,
> >                                      fpb_flags, NULL);
> > +               if (folio_estimated_sharers(folio) > 1)
> > +                       continue;
> >
> > Could we use folio_estimated_sharers as an early exit point here?

> I'm not sure what this is saving where you have it? Did you mean to put it
> before folio_pte_batch()? Currently it is just saving a single conditional.

Apologies for the confusion. I made a diff to provide clarity.
diff --git a/mm/madvise.c b/mm/madvise.c
index 56c7ba7bd558..c3458fdea82a 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -462,12 +462,11 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
                        nr = folio_pte_batch(folio, addr, pte, ptent, max_nr,
                                             fpb_flags, NULL);
-                       // Could we use folio_estimated_sharers as an early exit point here?
+                       if (folio_estimated_sharers(folio) > 1)
+                               continue;
                        if (nr < folio_nr_pages(folio)) {
                                int err;
-                               if (folio_estimated_sharers(folio) > 1)
-                                       continue;
                                if (pageout_anon_only_filter && !folio_test_anon(folio))
                                        continue;
                                if (!folio_trylock(folio))

> But now that I think about it a bit more, I remember why I was originally
> unconditionally calling folio_pte_batch(). Given it's a large folio, if the split
> fails, we can move the cursor to the pte where the next folio begins so we don't
> have to iterate through one pte at a time, which would cause us to keep calling
> folio_estimated_sharers(), folio_test_anon(), etc on the same folio until we get
> to the next boundary.
>
> Of course the common case at this point will be for the split to succeed, but
> then we are going to iterate over every single PTE anyway - one way or another
> they are all fetched into cache. So I feel like it's neater not to add the
> conditionals for calling folio_pte_batch(), and just leave this as I have it here.
>
> >
> > if (nr < folio_nr_pages(folio)) {
> >         int err;
> >
> > -       if (folio_estimated_sharers(folio) > 1)
> > -               continue;
> > [...]
> >
> >>
> >>>
> >>>> +
> >>>> +                       if (nr < folio_nr_pages(folio)) {
> >>>> +                               int err;
> >>>> +
> >>>> +                               if (folio_estimated_sharers(folio) > 1)
> >>>> +                                       continue;
> >>>> +                               if (pageout_anon_only_filter && !folio_test_anon(folio))
> >>>> +                                       continue;
> >>>> +                               if (!folio_trylock(folio))
> >>>> +                                       continue;
> >>>> +                               folio_get(folio);
> >>>> +                               arch_leave_lazy_mmu_mode();
> >>>> +                               pte_unmap_unlock(start_pte, ptl);
> >>>> +                               start_pte = NULL;
> >>>> +                               err = split_folio(folio);
> >>>> +                               folio_unlock(folio);
> >>>> +                               folio_put(folio);
> >>>> +                               if (err)
> >>>> +                                       continue;
> >>>> +                               start_pte = pte =
> >>>> +                                       pte_offset_map_lock(mm, pmd, addr, &ptl);
> >>>> +                               if (!start_pte)
> >>>> +                                       break;
> >>>> +                               arch_enter_lazy_mmu_mode();
> >>>> +                               nr = 0;
> >>>> +                               continue;
> >>>> +                       }
> >>>>                 }
> >>>>
> >>>>                 /*
> >>>>                  * Do not interfere with other mappings of this folio and
> >>>> -                * non-LRU folio.
> >>>> +                * non-LRU folio. If we have a large folio at this point, we
> >>>> +                * know it is fully mapped so if its mapcount is the same as its
> >>>> +                * number of pages, it must be exclusive.
> >>>>                  */
> >>>> -               if (!folio_test_lru(folio) || folio_mapcount(folio) != 1)
> >>>> +               if (!folio_test_lru(folio) ||
> >>>> +                   folio_mapcount(folio) != folio_nr_pages(folio))
> >>>>                         continue;
> >>>
> >>> This looks so perfect and is exactly what I wanted to achieve.
> >>>
> >>>>
> >>>>                 if (pageout_anon_only_filter && !folio_test_anon(folio))
> >>>>                         continue;
> >>>>
> >>>> -               VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
> >>>> -
> >>>> -               if (!pageout && pte_young(ptent)) {
> >>>> -                       ptent = ptep_get_and_clear_full(mm, addr, pte,
> >>>> -                                                       tlb->fullmm);
> >>>> -                       ptent = pte_mkold(ptent);
> >>>> -                       set_pte_at(mm, addr, pte, ptent);
> >>>> -                       tlb_remove_tlb_entry(tlb, pte, addr);
> >>>> +               if (!pageout) {
> >>>> +                       for (; nr != 0; nr--, pte++, addr += PAGE_SIZE) {
> >>>> +                               if (ptep_test_and_clear_young(vma, addr, pte))
> >>>> +                                       tlb_remove_tlb_entry(tlb, pte, addr);
> >
> > IIRC, some architectures (e.g., PPC) don't update the TLB with set_pte_at and
> > tlb_remove_tlb_entry. So, didn't we consider remapping the PTE with old after
> > pte clearing?

> Sorry Lance, I don't understand this question, can you rephrase? Are you saying
> there is a good reason to do the original clear-mkold-set for some arches?

IIRC, some architectures (e.g., PPC) don't update the TLB with
ptep_test_and_clear_young() and tlb_remove_tlb_entry().

In my new patch[1], I use refresh_full_ptes() and
tlb_remove_tlb_entries() to batch-update the access and dirty bits.

[1] https://lore.kernel.org/linux-mm/20240316102952.39233-1-ioworker0@gmail.com

Thanks,
Lance

>
>
> > Thanks,
> > Lance
> >
> >
> >
> >>>> +                       }
> >>>
> >>> This looks so smart. if it is not pageout, we have increased pte
> >>> and addr here; so nr is 0 and we don't need to increase again in
> >>> for (; addr < end; pte += nr, addr += nr * PAGE_SIZE)
> >>>
> >>> otherwise, nr won't be 0. so we will increase addr and
> >>> pte by nr.
> >>
> >> Indeed. I'm hoping that Lance is able to follow a similar pattern for
> >> madvise_free_pte_range().
> >>
> >>
> >>>
> >>>
> >>>>                 }
> >>>>
> >>>>                 /*
> >>>> --
> >>>> 2.25.1
> >>>>
> >>>
> >>> Overall, LGTM,
> >>>
> >>> Reviewed-by: Barry Song
> >>
> >> Thanks!
> >>
> >>
>