From: Lance Yang
Date: Wed, 15 Jan 2025 13:01:16 +0800
Subject: Re: [PATCH v3 4/4] mm: Avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com
References: <20250115033808.40641-1-21cnbao@gmail.com> <20250115033808.40641-5-21cnbao@gmail.com>
In-Reply-To: <20250115033808.40641-5-21cnbao@gmail.com>

On Wed, Jan 15, 2025 at 11:38 AM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song
>
> The try_to_unmap_one() function currently handles PMD-mapped THPs
> inefficiently.
> It first splits the PMD into PTEs, copies the dirty
> state from the PMD to the PTEs, iterates over the PTEs to locate
> the dirty state, and then marks the THP as swap-backed. This process
> involves unnecessary PMD splitting and redundant iteration. Instead,
> this functionality can be efficiently managed in
> __discard_anon_folio_pmd_locked(), avoiding the extra steps and
> improving performance.
>
> The following microbenchmark redirties folios after invoking MADV_FREE,
> then measures the time taken to perform memory reclamation (actually
> set those folios swapbacked again) on the redirtied folios.
>
> #include <stdio.h>
> #include <sys/mman.h>
> #include <string.h>
> #include <time.h>
>
> #define SIZE 128*1024*1024 // 128 MB
>
> int main(int argc, char *argv[])
> {
>         while(1) {
>                 volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
>                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>
>                 memset((void *)p, 1, SIZE);
>                 madvise((void *)p, SIZE, MADV_FREE);
>                 /* redirty after MADV_FREE */
>                 memset((void *)p, 1, SIZE);
>
>                 clock_t start_time = clock();
>                 madvise((void *)p, SIZE, MADV_PAGEOUT);
>                 clock_t end_time = clock();
>
>                 double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC;
>                 printf("Time taken by reclamation: %f seconds\n", elapsed_time);
>
>                 munmap((void *)p, SIZE);
>         }
>         return 0;
> }
>
> Testing results are as below,
> w/o patch:
> ~ # ./a.out
> Time taken by reclamation: 0.007300 seconds
> Time taken by reclamation: 0.007226 seconds
> Time taken by reclamation: 0.007295 seconds
> Time taken by reclamation: 0.007731 seconds
> Time taken by reclamation: 0.007134 seconds
> Time taken by reclamation: 0.007285 seconds
> Time taken by reclamation: 0.007720 seconds
> Time taken by reclamation: 0.007128 seconds
> Time taken by reclamation: 0.007710 seconds
> Time taken by reclamation: 0.007712 seconds
> Time taken by reclamation: 0.007236 seconds
> Time taken by reclamation: 0.007690 seconds
> Time taken by reclamation: 0.007174 seconds
> Time taken by reclamation: 0.007670 seconds
> Time taken by reclamation: 0.007169 seconds
> Time taken by reclamation: 0.007305 seconds
> Time taken by reclamation: 0.007432 seconds
> Time taken by reclamation: 0.007158 seconds
> Time taken by reclamation: 0.007133 seconds
> …
>
> w/ patch
>
> ~ # ./a.out
> Time taken by reclamation: 0.002124 seconds
> Time taken by reclamation: 0.002116 seconds
> Time taken by reclamation: 0.002150 seconds
> Time taken by reclamation: 0.002261 seconds
> Time taken by reclamation: 0.002137 seconds
> Time taken by reclamation: 0.002173 seconds
> Time taken by reclamation: 0.002063 seconds
> Time taken by reclamation: 0.002088 seconds
> Time taken by reclamation: 0.002169 seconds
> Time taken by reclamation: 0.002124 seconds
> Time taken by reclamation: 0.002111 seconds
> Time taken by reclamation: 0.002224 seconds
> Time taken by reclamation: 0.002297 seconds
> Time taken by reclamation: 0.002260 seconds
> Time taken by reclamation: 0.002246 seconds
> Time taken by reclamation: 0.002272 seconds
> Time taken by reclamation: 0.002277 seconds
> Time taken by reclamation: 0.002462 seconds
> …
>
> This patch significantly speeds up try_to_unmap_one() by allowing it
> to skip redirtied THPs without splitting the PMD.
>
> Suggested-by: Baolin Wang
> Suggested-by: Lance Yang
> Signed-off-by: Barry Song
> ---
>  mm/huge_memory.c | 24 +++++++++++++++++-------
>  mm/rmap.c        | 13 ++++++++++---
>  2 files changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 3d3ebdc002d5..47cc8c3f8f80 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3070,8 +3070,12 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
>         int ref_count, map_count;
>         pmd_t orig_pmd = *pmdp;
>
> -       if (folio_test_dirty(folio) || pmd_dirty(orig_pmd))
> +       if (pmd_dirty(orig_pmd))
> +               folio_set_dirty(folio);
> +       if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
> +               folio_set_swapbacked(folio);
>                 return false;
> +       }

If either the PMD or the folio is dirty, should we just return false right
away, regardless of VM_DROPPABLE? There's no need to proceed further in that
case, IMHO ;)

Thanks,
Lance

>
>         orig_pmd = pmdp_huge_clear_flush(vma, addr, pmdp);
>
> @@ -3098,8 +3102,15 @@ static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
>          *
>          * The only folio refs must be one from isolation plus the rmap(s).
>          */
> -       if (folio_test_dirty(folio) || pmd_dirty(orig_pmd) ||
> -           ref_count != map_count + 1) {
> +       if (pmd_dirty(orig_pmd))
> +               folio_set_dirty(folio);
> +       if (folio_test_dirty(folio) && !(vma->vm_flags & VM_DROPPABLE)) {
> +               folio_set_swapbacked(folio);
> +               set_pmd_at(mm, addr, pmdp, orig_pmd);
> +               return false;
> +       }
> +
> +       if (ref_count != map_count + 1) {
>                 set_pmd_at(mm, addr, pmdp, orig_pmd);
>                 return false;
>         }
> @@ -3119,12 +3130,11 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
>  {
>         VM_WARN_ON_FOLIO(!folio_test_pmd_mappable(folio), folio);
>         VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
> +       VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
> +       VM_WARN_ON_FOLIO(folio_test_swapbacked(folio), folio);
>         VM_WARN_ON_ONCE(!IS_ALIGNED(addr, HPAGE_PMD_SIZE));
>
> -       if (folio_test_anon(folio) && !folio_test_swapbacked(folio))
> -               return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
> -
> -       return false;
> +       return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
>  }
>
>  static void remap_page(struct folio *folio, unsigned long nr, int flags)
> diff --git a/mm/rmap.c b/mm/rmap.c
> index be1978d2712d..a859c399ec7c 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1724,9 +1724,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>                 }
>
>                 if (!pvmw.pte) {
> -                       if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
> -                                                 folio))
> -                               goto walk_done;
> +                       if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) {
> +                               if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd, folio))
> +                                       goto walk_done;
> +                               /*
> +                                * unmap_huge_pmd_locked has either already marked
> +                                * the folio as swap-backed or decided to retain it
> +                                * due to GUP or speculative references.
> +                                */
> +                               goto walk_abort;
> +                       }
>
>                         if (flags & TTU_SPLIT_HUGE_PMD) {
>                                 /*
> --
> 2.39.3 (Apple Git-146)
>