References: <20240425211136.486184-1-zi.yan@sent.com> <6C31DF81-94FB-4D09-A3B8-0CED2AD8EDDB@nvidia.com>
In-Reply-To: <6C31DF81-94FB-4D09-A3B8-0CED2AD8EDDB@nvidia.com>
From: Barry Song <21cnbao@gmail.com>
Date: Fri, 26 Apr 2024 10:23:54 +0800
Subject: Re: [PATCH v4] mm/rmap: do not add fully unmapped large folio to deferred split list
To: Zi Yan
Cc: Andrew Morton, linux-mm@kvack.org, "Matthew Wilcox (Oracle)", Yang Shi, Ryan Roberts, David Hildenbrand, Lance Yang, linux-kernel@vger.kernel.org

On Fri, Apr 26, 2024 at 9:55 AM Zi Yan wrote:
>
> On 25 Apr 2024, at 21:45, Barry Song wrote:
>
> > On Fri, Apr 26, 2024 at 5:11 AM Zi Yan wrote:
> >>
> >> From: Zi Yan
> >>
> >> In __folio_remove_rmap(), a large folio is added to deferred split list
> >> if any page in a folio loses its final mapping.
> >> But it is possible that
> >> the folio is fully unmapped and adding it to deferred split list is
> >> unnecessary.
> >>
> >> For PMD-mapped THPs, that was not really an issue, because removing the
> >> last PMD mapping in the absence of PTE mappings would not have added the
> >> folio to the deferred split queue.
> >>
> >> However, for PTE-mapped THPs, which are now more prominent due to mTHP,
> >> they are always added to the deferred split queue. One side effect
> >> is that the THP_DEFERRED_SPLIT_PAGE stat for a PTE-mapped folio can be
> >> unintentionally increased, making it look like there are many partially
> >> mapped folios -- although the whole folio is fully unmapped stepwise.
> >>
> >> Core-mm now tries batch-unmapping consecutive PTEs of PTE-mapped THPs
> >> where possible starting from commit b06dc281aa99 ("mm/rmap: introduce
> >> folio_remove_rmap_[pte|ptes|pmd]()"). When it happens, a whole PTE-mapped
> >> folio is unmapped in one go and can avoid being added to deferred split
> >> list, reducing the THP_DEFERRED_SPLIT_PAGE noise. But there will still be
> >> noise when we cannot batch-unmap a complete PTE-mapped folio in one go
> >> -- or where this type of batching is not implemented yet, e.g., migration.
> >>
> >> To avoid the unnecessary addition, folio->_nr_pages_mapped is checked
> >> to tell if the whole folio is unmapped. If the folio is already on
> >> deferred split list, it will be skipped, too.
> >>
> >> Note: commit 98046944a159 ("mm: huge_memory: add the missing
> >> folio_test_pmd_mappable() for THP split statistics") tried to exclude
> >> mTHP deferred split stats from THP_DEFERRED_SPLIT_PAGE, but it does not
> >> fix the above issue. A fully unmapped PTE-mapped order-9 THP was still
> >> added to deferred split list and counted as THP_DEFERRED_SPLIT_PAGE,
> >> since nr is 512 (non zero), level is RMAP_LEVEL_PTE, and inside
> >> deferred_split_folio() the order-9 folio is folio_test_pmd_mappable().
> >>
> >> Signed-off-by: Zi Yan
> >> Reviewed-by: Yang Shi
> >> ---
> >>  mm/rmap.c | 8 +++++---
> >>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index a7913a454028..220ad8a83589 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -1553,9 +1553,11 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
> >>                  * page of the folio is unmapped and at least one page
> >>                  * is still mapped.
> >>                  */
> >> -               if (folio_test_large(folio) && folio_test_anon(folio))
> >> -                       if (level == RMAP_LEVEL_PTE || nr < nr_pmdmapped)
> >> -                               deferred_split_folio(folio);
> >> +               if (folio_test_large(folio) && folio_test_anon(folio) &&
> >> +                   list_empty(&folio->_deferred_list) &&
> >> +                   ((level == RMAP_LEVEL_PTE && atomic_read(mapped)) ||
> >> +                    (level == RMAP_LEVEL_PMD && nr < nr_pmdmapped)))
> >> +                       deferred_split_folio(folio);
> >
> > Hi Zi Yan,
> > in case a mTHP is mapped by two processes (forked but not CoW yet), if we
> > unmap the whole folio by pte level in one process only, are we still adding this
> > folio into deferred list?
>
> No. Because the mTHP is still fully mapped by the other process. In terms of code,
> nr will be 0 in that case and this if condition is skipped. nr is only increased
> from 0 when one of the subpages in the mTHP has no mapping, namely page->_mapcount
> becomes negative and last is true in the case RMAP_LEVEL_PTE.

OK, I see. So "last" won't be true in that case?

case RMAP_LEVEL_PTE:
        do {
                last = atomic_add_negative(-1, &page->_mapcount);
                if (last && folio_test_large(folio)) {
                        last = atomic_dec_return_relaxed(mapped);
                        last = (last < ENTIRELY_MAPPED);
                }

                if (last)
                        nr++;
        } while (page++, --nr_pages > 0);
        break;

>
>
> --
> Best Regards,
> Yan, Zi
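
To sanity-check the two-process case discussed above, here is a minimal
user-space sketch of the RMAP_LEVEL_PTE accounting. It is not kernel code:
struct model_folio, the model_* helpers and MODEL_NR_PAGES are hypothetical
names made up for illustration, C11 atomics stand in for the kernel atomic_t
helpers, and the ENTIRELY_MAPPED value is assumed to match mm/internal.h. The
anon/large checks and the list_empty(&folio->_deferred_list) test from the
patch are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_PAGES  16        /* hypothetical mTHP size: 16 base pages */
#define ENTIRELY_MAPPED 0x800000  /* assumed to match mm/internal.h */

/* Hypothetical stand-in for the folio fields used by the PTE path. */
struct model_folio {
	atomic_int nr_pages_mapped;          /* models folio->_nr_pages_mapped */
	atomic_int mapcount[MODEL_NR_PAGES]; /* models page->_mapcount, -1 == unmapped */
};

static void model_folio_init(struct model_folio *folio)
{
	atomic_init(&folio->nr_pages_mapped, 0);
	for (int i = 0; i < MODEL_NR_PAGES; i++)
		atomic_init(&folio->mapcount[i], -1);
}

/* One process maps pages [start, start + nr_pages) by PTE. */
static void model_add_rmap_ptes(struct model_folio *folio, int start, int nr_pages)
{
	for (int i = start; i < start + nr_pages; i++) {
		/* Models atomic_inc_and_test(): true when the new value is 0. */
		bool first = atomic_fetch_add(&folio->mapcount[i], 1) + 1 == 0;

		if (first)
			atomic_fetch_add(&folio->nr_pages_mapped, 1);
	}
}

/* Models the RMAP_LEVEL_PTE loop quoted above; returns nr. */
static int model_remove_rmap_ptes(struct model_folio *folio, int start, int nr_pages)
{
	int nr = 0;

	for (int i = start; i < start + nr_pages; i++) {
		/* Models atomic_add_negative(-1, ...): true when the new value is < 0. */
		bool last = atomic_fetch_sub(&folio->mapcount[i], 1) - 1 < 0;

		if (last) {
			/* Models atomic_dec_return_relaxed(mapped). */
			int mapped = atomic_fetch_sub_explicit(&folio->nr_pages_mapped, 1,
							       memory_order_relaxed) - 1;
			last = mapped < ENTIRELY_MAPPED;
		}
		if (last)
			nr++;
	}
	return nr;
}

/* PTE-level part of the patched condition: nr != 0 and some page still mapped. */
static bool model_would_defer_split(struct model_folio *folio, int nr)
{
	return nr != 0 && atomic_load(&folio->nr_pages_mapped) != 0;
}
```

With two mappers, the first full unmap only takes each modelled _mapcount from
1 to 0, so "last" is never true, nr stays 0 and nothing is queued. The second
full unmap gives nr == MODEL_NR_PAGES with the mapped counter at 0, which the
patched condition also skips. Only a partial unmap by the sole remaining mapper
leaves both nr and the mapped counter non-zero.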