From mboxrd@z Thu Jan  1 00:00:00 1970
From: Barry Song <21cnbao@gmail.com>
Date: Wed, 25 Jun 2025 22:38:57 +1200
Subject: Re: [PATCH v4 3/4] mm: Support batched unmap for lazyfree large folios during reclamation
References: <2c19a6cf-0b42-477b-a672-ed8c1edd4267@redhat.com> <20250624162503.78957-1-ioworker0@gmail.com> <27d174e0-c209-4851-825a-0baeb56df86f@redhat.com>
In-Reply-To: <27d174e0-c209-4851-825a-0baeb56df86f@redhat.com>
To: David Hildenbrand
Cc: Lance Yang, akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, chrisl@kernel.org, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org,
ying.huang@intel.com, zhengtangquan@oppo.com

> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index fb63d9256f09..241d55a92a47 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1847,12 +1847,25 @@ void folio_remove_rmap_pud(struct folio *folio, struct page *page,
> >
> >  /* We support batch unmapping of PTEs for lazyfree large folios */
> >  static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
> > -					      struct folio *folio, pte_t *ptep)
> > +					      struct folio *folio, pte_t *ptep,
> > +					      struct vm_area_struct *vma)
> >  {
> >  	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
> > +	unsigned long next_pmd, vma_end, end_addr;
> >  	int max_nr = folio_nr_pages(folio);
> >  	pte_t pte = ptep_get(ptep);
> >
> > +	/*
> > +	 * Limit the batch scan within a single VMA and within a single
> > +	 * page table.
> > +	 */
> > +	vma_end = vma->vm_end;
> > +	next_pmd = ALIGN(addr + 1, PMD_SIZE);
> > +	end_addr = addr + (unsigned long)max_nr * PAGE_SIZE;
> > +
> > +	if (end_addr > min(next_pmd, vma_end))
> > +		return false;
>
> May I suggest that we clean all that up as we fix it?
>
> Maybe something like this:
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 3b74bb19c11dd..11fbddc6ad8d6 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1845,23 +1845,38 @@ void folio_remove_rmap_pud(struct folio *folio, struct page *page,
>  #endif
>  }
>
> -/* We support batch unmapping of PTEs for lazyfree large folios */
> -static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
> -		struct folio *folio, pte_t *ptep)
> +static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> +		struct page_vma_mapped_walk *pvmw, enum ttu_flags flags,
> +		pte_t pte)
>  {
>  	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
> -	int max_nr = folio_nr_pages(folio);
> -	pte_t pte = ptep_get(ptep);
> +	struct vm_area_struct *vma = pvmw->vma;
> +	unsigned long end_addr, addr = pvmw->address;
> +	unsigned int max_nr;
> +
> +	if (flags & TTU_HWPOISON)
> +		return 1;
> +	if (!folio_test_large(folio))
> +		return 1;
> +
> +	/* We may only batch within a single VMA and a single page table. */
> +	end_addr = min_t(unsigned long, ALIGN(addr + 1, PMD_SIZE), vma->vm_end);

Is this pmd_addr_end()?

> +	max_nr = (end_addr - addr) >> PAGE_SHIFT;
>
> +	/* We only support lazyfree batching for now ... */
>  	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
> -		return false;
> +		return 1;
>  	if (pte_unused(pte))
> -		return false;
> -	if (pte_pfn(pte) != folio_pfn(folio))
> -		return false;
> +		return 1;
> +	/* ... where we must be able to batch the whole folio. */
> +	if (pte_pfn(pte) != folio_pfn(folio) || max_nr != folio_nr_pages(folio))
> +		return 1;
> +	max_nr = folio_pte_batch(folio, addr, pvmw->pte, pte, max_nr, fpb_flags,
> +				 NULL, NULL, NULL);
>
> -	return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
> -			       NULL, NULL) == max_nr;
> +	if (max_nr != folio_nr_pages(folio))
> +		return 1;
> +	return max_nr;
>  }
>
>  /*
> @@ -2024,9 +2039,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>  			if (pte_dirty(pteval))
>  				folio_mark_dirty(folio);
>  		} else if (likely(pte_present(pteval))) {
> -			if (folio_test_large(folio) && !(flags & TTU_HWPOISON) &&
> -			    can_batch_unmap_folio_ptes(address, folio, pvmw.pte))
> -				nr_pages = folio_nr_pages(folio);
> +			nr_pages = folio_unmap_pte_batch(folio, &pvmw, flags, pteval);
>  			end_addr = address + nr_pages * PAGE_SIZE;
>  			flush_cache_range(vma, address, end_addr);
>
>
> Note that I don't quite understand why we have to batch the whole thing or fallback to
> individual pages. Why can't we perform other batches that span only some PTEs? What's special
> about 1 PTE vs. 2 PTEs vs. all PTEs?
>
> Can someone enlighten me why that is required?

It's probably not a strict requirement; I thought cases where the count is
greater than 1 but less than nr_pages might not provide much practical
benefit, except perhaps in very rare edge cases, since madv_free() already
calls split_folio().
In madvise_free_pte_range():

		if (folio_test_large(folio)) {
			bool any_young, any_dirty;

			nr = madvise_folio_pte_batch(addr, end, folio, pte,
						     ptent, &any_young, &any_dirty);
			if (nr < folio_nr_pages(folio)) {
				...
				err = split_folio(folio);
				...
			}
		}

Another reason is that when we extend this to non-lazyfree anonymous
folios [1], things get complicated: checking anon_exclusive and updating
folio_try_share_anon_rmap_pte with the number of PTEs becomes tricky if a
folio is partially exclusive and partially shared.

[1] https://lore.kernel.org/linux-mm/20250513084620.58231-1-21cnbao@gmail.com/

> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry