From: Barry Song <21cnbao@gmail.com>
Date: Wed, 15 Jan 2025 18:09:06 +1300
Subject: Re: [PATCH v3 4/4] mm: Avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap
To: Lance Yang
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com
References: <20250115033808.40641-1-21cnbao@gmail.com> <20250115033808.40641-5-21cnbao@gmail.com>

On Wed, Jan 15, 2025 at 6:01 PM Lance Yang wrote:
>
> On Wed, Jan 15, 2025 at 11:38 AM Barry Song <21cnbao@gmail.com> wrote:
> >
> > From: Barry Song
> >
> > The try_to_unmap_one() function currently handles PMD-mapped THPs
> > inefficiently. It first splits the PMD into PTEs, copies the dirty
> > state from the PMD to the PTEs, iterates over the PTEs to locate
> > the dirty state, and then marks the THP as swap-backed. This process
> > involves unnecessary PMD splitting and redundant iteration. Instead,
> > this functionality can be efficiently managed in
> > __discard_anon_folio_pmd_locked(), avoiding the extra steps and
> > improving performance.
> >
> > The following microbenchmark redirties folios after invoking MADV_FREE,
> > then measures the time taken to perform memory reclamation (actually
> > set those folios swapbacked again) on the redirtied folios.
> >
> > #include <stdio.h>
> > #include <sys/mman.h>
> > #include <string.h>
> > #include <time.h>
> >
> > #define SIZE 128*1024*1024 // 128 MB
> >
> > int main(int argc, char *argv[])
> > {
> >         while(1) {
> >                 volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
> >                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> >
> >                 memset((void *)p, 1, SIZE);
> >                 madvise((void *)p, SIZE, MADV_FREE);
> >                 /* redirty after MADV_FREE */
> >                 memset((void *)p, 1, SIZE);
> >
> >                 clock_t start_time = clock();
> >                 madvise((void *)p, SIZE, MADV_PAGEOUT);
> >                 clock_t end_time = clock();
> >
> >                 double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC;
> >                 printf("Time taken by reclamation: %f seconds\n", elapsed_time);
> >
> >                 munmap((void *)p, SIZE);
> >         }
> >         return 0;
> > }
> >
> > Testing results are as below,
> > w/o patch:
> > ~ # ./a.out
> > Time taken by reclamation: 0.007300 seconds
> > Time taken by reclamation: 0.007226 seconds
> > Time taken by reclamation: 0.007295 seconds
> > Time taken by reclamation: 0.007731 seconds
> > Time taken by reclamation: 0.007134 seconds
> > Time taken by reclamation: 0.007285 seconds
> > Time taken by reclamation: 0.007720 seconds
> > Time taken by reclamation: 0.007128 seconds
> > Time taken by reclamation: 0.007710 seconds
> > Time taken by reclamation: 0.007712 seconds
> > Time taken by reclamation: 0.007236 seconds
> > Time taken by reclamation: 0.007690 seconds
> > Time taken by reclamation: 0.007174 seconds
> > Time taken by reclamation: 0.007670 seconds
> > Time taken by reclamation: 0.007169 seconds
> > Time taken by reclamation: 0.007305 seconds
> > Time taken by reclamation: 0.007432 seconds
> > Time taken by reclamation: 0.007158 seconds
> > Time taken by reclamation: 0.007133 seconds
> > …
> >
> > w/ patch
> >
> > ~ # ./a.out
> > Time taken by reclamation: 0.002124 seconds
> > Time taken by reclamation: 0.002116 seconds
> > Time taken by reclamation: 0.002150 seconds
> > Time taken by reclamation: 0.002261 seconds
> > Time taken by reclamation: 0.002137 seconds
> > Time taken by reclamation: 0.002173 seconds
> > Time taken by reclamation: 0.002063 seconds
> > Time taken by reclamation: 0.002088 seconds
> > Time taken by reclamation: 0.002169 seconds
> > Time taken by reclamation: 0.002124 seconds
> > Time taken by reclamation: 0.002111 seconds
> > Time taken by reclamation: 0.002224 seconds
> > Time taken by reclamation: 0.002297 seconds
> > Time taken by reclamation: 0.002260 seconds
> > Time taken by reclamation: 0.002246 seconds
> > Time taken by reclamation: 0.002272 seconds
> > Time taken by reclamation: 0.002277 seconds
> > Time taken by reclamation: 0.002462 seconds
> > …
> >
> > This patch significantly speeds up try_to_unmap_one() by allowing it
> > to skip redirtied THPs without splitting the PMD.
> >
> > Suggested-by: Baolin Wang
> > Suggested-by: Lance Yang
> > Signed-off-by: Barry Song
> > ---
> >  mm/huge_memory.c | 24 +++++++++++++++++-------
> >  mm/rmap.c        | 13 ++++++++++---
> >  2 files changed, 27 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 3d3ebdc002d5..47cc8c3f8f80 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -3070,8 +3070,12 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
> >         int ref_count, map_count;
> >         pmd_t orig_pmd = *pmdp;
> >
> > -       if (folio_test_dirty(folio) || pmd_dirty(orig_pmd))
> > +       if (pmd_dirty(orig_pmd))
> > +               folio_set_dirty(folio);
> > +       if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
> > +               folio_set_swapbacked(folio);
> >                 return false;
> > +       }
>
> If either the PMD or the folio is dirty, should we just return false right away,
> regardless of VM_DROPPABLE? There’s no need to proceed further in that
> case, IMHO ;)

I don't quite understand you, but we need to proceed to clear the PMD entry.
If VM_DROPPABLE is set, we still drop the folio even if it is dirty.

>
> Thanks,
> Lance
>
> >
> >         orig_pmd = pmdp_huge_clear_flush(vma, addr, pmdp);
> >
> > @@ -3098,8 +3102,15 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
> >          *
> >          * The only folio refs must be one from isolation plus the rmap(s).
> >          */
> > -       if (folio_test_dirty(folio) || pmd_dirty(orig_pmd) ||
> > -           ref_count != map_count + 1) {
> > +       if (pmd_dirty(orig_pmd))
> > +               folio_set_dirty(folio);
> > +       if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
> > +               folio_set_swapbacked(folio);
> > +               set_pmd_at(mm, addr, pmdp, orig_pmd);
> > +               return false;
> > +       }
> > +
> > +       if (ref_count != map_count + 1) {
> >                 set_pmd_at(mm, addr, pmdp, orig_pmd);
> >                 return false;
> >         }
> > @@ -3119,12 +3130,11 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
> >  {
> >         VM_WARN_ON_FOLIO(!folio_test_pmd_mappable(folio), folio);
> >         VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
> > +       VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
> > +       VM_WARN_ON_FOLIO(folio_test_swapbacked(folio), folio);
> >         VM_WARN_ON_ONCE(!IS_ALIGNED(addr, HPAGE_PMD_SIZE));
> >
> > -       if (folio_test_anon(folio) && !folio_test_swapbacked(folio))
> > -               return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
> > -
> > -       return false;
> > +       return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
> >  }
> >
> >  static void remap_page(struct folio *folio, unsigned long nr, int flags)
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index be1978d2712d..a859c399ec7c 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1724,9 +1724,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> >                 }
> >
> >                 if (!pvmw.pte) {
> > -                       if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
> > -                                                 folio))
> > -                               goto walk_done;
> > +                       if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) {
> > +                               if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd, folio))
> > +                                       goto walk_done;
> > +                               /*
> > +                                * unmap_huge_pmd_locked has either already marked
> > +                                * the folio as swap-backed or decided to retain it
> > +                                * due to GUP or speculative references.
> > +                                */
> > +                               goto walk_abort;
> > +                       }
> >
> >                         if (flags & TTU_SPLIT_HUGE_PMD) {
> >                                 /*
> > --
> > 2.39.3 (Apple Git-146)
> >
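
To make the VM_DROPPABLE point above concrete, here is a minimal user-space
sketch of the decision the patch makes. It is only a model under stated
assumptions: struct thp_model, model_discard_anon_folio_pmd(),
model_try_to_unmap_pmd() and enum walk_result are hypothetical names, not
kernel APIs, and the two dirty checks the patch performs (before and after
pmdp_huge_clear_flush()) are folded into a single check.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct thp_model {
	bool pmd_dirty;    /* stands in for pmd_dirty(orig_pmd) */
	bool folio_dirty;  /* stands in for folio_test_dirty(folio) */
	bool swapbacked;   /* stands in for folio_test_swapbacked(folio) */
	bool droppable;    /* stands in for vma->vm_flags & VM_DROPPABLE */
	int ref_count;
	int map_count;
};

/*
 * Returns true if the lazyfree folio may be discarded.
 * Assumption: the pre-flush and post-flush dirty checks of the patch
 * are collapsed into one, since the model has no concurrent writers.
 */
static bool model_discard_anon_folio_pmd(struct thp_model *t)
{
	/* Transfer the dirty bit from the PMD to the folio. */
	if (t->pmd_dirty)
		t->folio_dirty = true;

	/* Redirtied and not droppable: mark swap-backed and keep it. */
	if (t->folio_dirty && !t->droppable) {
		t->swapbacked = true;
		return false;
	}

	/* Extra reference (GUP or speculative): keep it, still lazyfree. */
	if (t->ref_count != t->map_count + 1)
		return false;

	/* Clean, or dirty but droppable: the folio can be dropped. */
	return true;
}

enum walk_result { WALK_DONE, WALK_ABORT, WALK_CONTINUE };

/* Models the new branch in try_to_unmap_one() for a PMD-mapped folio. */
static enum walk_result model_try_to_unmap_pmd(struct thp_model *t, bool anon)
{
	if (anon && !t->swapbacked)
		return model_discard_anon_folio_pmd(t) ? WALK_DONE : WALK_ABORT;

	/* Everything else continues to the existing PMD-split path. */
	return WALK_CONTINUE;
}
```

In this model a redirtied, non-droppable folio yields WALK_ABORT with
swapbacked set, while a dirty folio in a droppable VMA with no extra
references yields WALK_DONE, which matches the behaviour described in the
reply above.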