From: Barry Song
Date: Wed, 14 Aug 2024 22:44:27 +1200
Subject: Re: [PATCH v3 4/6] mm: Introduce a pageflag for partially mapped folios
To: Usama Arif
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com, ryan.roberts@arm.com, rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com
In-Reply-To: <20240813120328.1275952-5-usamaarif642@gmail.com>
References: <20240813120328.1275952-1-usamaarif642@gmail.com> <20240813120328.1275952-5-usamaarif642@gmail.com>

On Wed, Aug 14, 2024 at 12:03 AM Usama Arif wrote:
>
> Currently folio->_deferred_list is used to keep track of
> partially_mapped folios that are going to be split under memory
> pressure. In the next patch, all THPs that are faulted in and collapsed
> by khugepaged are also going to be tracked using _deferred_list.
>
> This patch introduces a pageflag to be able to distinguish between
> partially mapped folios and others in the deferred_list at split time in
> deferred_split_scan.
> It's needed as __folio_remove_rmap decrements
> _mapcount, _large_mapcount and _entire_mapcount, hence it won't be
> possible to distinguish between partially mapped folios and others in
> deferred_split_scan.
>
> Even though it introduces an extra flag to track if the folio is
> partially mapped, there is no functional change intended with this
> patch, and the flag is not useful in this patch itself; it will
> become useful in the next patch when _deferred_list has
> non-partially-mapped folios.
>
> Signed-off-by: Usama Arif
> ---
>  include/linux/huge_mm.h    |  4 ++--
>  include/linux/page-flags.h |  3 +++
>  mm/huge_memory.c           | 21 +++++++++++++--------
>  mm/hugetlb.c               |  1 +
>  mm/internal.h              |  4 +++-
>  mm/memcontrol.c            |  3 ++-
>  mm/migrate.c               |  3 ++-
>  mm/page_alloc.c            |  5 +++--
>  mm/rmap.c                  |  3 ++-
>  mm/vmscan.c                |  3 ++-
>  10 files changed, 33 insertions(+), 17 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 4c32058cacfe..969f11f360d2 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -321,7 +321,7 @@ static inline int split_huge_page(struct page *page)
>  {
>         return split_huge_page_to_list_to_order(page, NULL, 0);
>  }
> -void deferred_split_folio(struct folio *folio);
> +void deferred_split_folio(struct folio *folio, bool partially_mapped);
>
>  void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>                 unsigned long address, bool freeze, struct folio *folio);
> @@ -495,7 +495,7 @@ static inline int split_huge_page(struct page *page)
>  {
>         return 0;
>  }
> -static inline void deferred_split_folio(struct folio *folio) {}
> +static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
>  #define split_huge_pmd(__vma, __pmd, __address)        \
>         do { } while (0)
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index a0a29bd092f8..cecc1bad7910 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -182,6 +182,7 @@ enum pageflags {
>         /* At least one page in this folio has the hwpoison flag set */
>         PG_has_hwpoisoned = PG_active,
>         PG_large_rmappable = PG_workingset, /* anon or file-backed */
> +       PG_partially_mapped, /* was identified to be partially mapped */
>  };
>
>  #define PAGEFLAGS_MASK         ((1UL << NR_PAGEFLAGS) - 1)
> @@ -861,8 +862,10 @@ static inline void ClearPageCompound(struct page *page)
>         ClearPageHead(page);
>  }
>  FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
> +FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
>  #else
>  FOLIO_FLAG_FALSE(large_rmappable)
> +FOLIO_FLAG_FALSE(partially_mapped)
>  #endif
>
>  #define PG_head_mask                   ((1UL << PG_head))
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 6df0e9f4f56c..c024ab0f745c 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3397,6 +3397,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>                  * page_deferred_list.
>                  */
>                 list_del_init(&folio->_deferred_list);
> +               folio_clear_partially_mapped(folio);
>         }
>         spin_unlock(&ds_queue->split_queue_lock);
>         if (mapping) {
> @@ -3453,11 +3454,12 @@ void __folio_undo_large_rmappable(struct folio *folio)
>         if (!list_empty(&folio->_deferred_list)) {
>                 ds_queue->split_queue_len--;
>                 list_del_init(&folio->_deferred_list);
> +               folio_clear_partially_mapped(folio);
>         }
>         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>  }
>
> -void deferred_split_folio(struct folio *folio)
> +void deferred_split_folio(struct folio *folio, bool partially_mapped)
>  {
>         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
>  #ifdef CONFIG_MEMCG
> @@ -3485,14 +3487,17 @@ void deferred_split_folio(struct folio *folio)
>         if (folio_test_swapcache(folio))
>                 return;
>
> -       if (!list_empty(&folio->_deferred_list))
> -               return;
> -
>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> +       if (partially_mapped)
> +               folio_set_partially_mapped(folio);
> +       else
> +               folio_clear_partially_mapped(folio);

Hi Usama,

Do we need this? When can a partially_mapped folio on deferred_list
become non-partially-mapped and need a clear? I understand the
transition from entirely_mapped to partially_mapped is a one-way
process, i.e. partially_mapped folios can't go back to entirely_mapped?

I am trying to rebase my NR_SPLIT_DEFERRED counter on top of your
work, but this "clear" makes that job quite tricky, as I am not sure
whether this patch is going to clear the partially_mapped flag for
folios on deferred_list.

Otherwise, I can simply do the below whenever a folio is leaving the
deferred_list:

        ds_queue->split_queue_len--;
        list_del_init(&folio->_deferred_list);
        if (folio_test_clear_partially_mapped(folio))
                mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_SPLIT_DEFERRED, -1);

>         if (list_empty(&folio->_deferred_list)) {
> -               if (folio_test_pmd_mappable(folio))
> -                       count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> -               count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> +               if (partially_mapped) {
> +                       if (folio_test_pmd_mappable(folio))
> +                               count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> +                       count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> +               }
>                 list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
>                 ds_queue->split_queue_len++;
>  #ifdef CONFIG_MEMCG
> @@ -3541,6 +3546,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>                 } else {
>                         /* We lost race with folio_put() */
>                         list_del_init(&folio->_deferred_list);
> +                       folio_clear_partially_mapped(folio);
>                         ds_queue->split_queue_len--;
>                 }
>                 if (!--sc->nr_to_scan)
> @@ -3558,7 +3564,6 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>  next:
>                 folio_put(folio);
>         }
> -
>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>         list_splice_tail(&list, &ds_queue->split_queue);
>         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 1fdd9eab240c..2ae2d9a18e40 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1758,6 +1758,7 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
>                 free_gigantic_folio(folio, huge_page_order(h));
>         } else {
>                 INIT_LIST_HEAD(&folio->_deferred_list);
> +               folio_clear_partially_mapped(folio);
>                 folio_put(folio);
>         }
>  }
> diff --git a/mm/internal.h b/mm/internal.h
> index 52f7fc4e8ac3..d64546b8d377 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -662,8 +662,10 @@ static inline void prep_compound_head(struct page *page, unsigned int order)
>         atomic_set(&folio->_entire_mapcount, -1);
>         atomic_set(&folio->_nr_pages_mapped, 0);
>         atomic_set(&folio->_pincount, 0);
> -       if (order > 1)
> +       if (order > 1) {
>                 INIT_LIST_HEAD(&folio->_deferred_list);
> +               folio_clear_partially_mapped(folio);
> +       }
>  }
>
>  static inline void prep_compound_tail(struct page *head, int tail_idx)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e1ffd2950393..0fd95daecf9a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4669,7 +4669,8 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
>         VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
>         VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
>                         !folio_test_hugetlb(folio) &&
> -                       !list_empty(&folio->_deferred_list), folio);
> +                       !list_empty(&folio->_deferred_list) &&
> +                       folio_test_partially_mapped(folio), folio);
>
>         /*
>          * Nobody should be changing or seriously looking at
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 3288ac041d03..6e32098ac2dc 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1734,7 +1734,8 @@ static int migrate_pages_batch(struct list_head *from,
>                          * use _deferred_list.
>                          */
>                         if (nr_pages > 2 &&
> -                           !list_empty(&folio->_deferred_list)) {
> +                           !list_empty(&folio->_deferred_list) &&
> +                           folio_test_partially_mapped(folio)) {
>                                 if (!try_split_folio(folio, split_folios, mode)) {
>                                         nr_failed++;
>                                         stats->nr_thp_failed += is_thp;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 408ef3d25cf5..a145c550dd2a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -957,8 +957,9 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
>                 break;
>         case 2:
>                 /* the second tail page: deferred_list overlaps ->mapping */
> -               if (unlikely(!list_empty(&folio->_deferred_list))) {
> -                       bad_page(page, "on deferred list");
> +               if (unlikely(!list_empty(&folio->_deferred_list) &&
> +                            folio_test_partially_mapped(folio))) {
> +                       bad_page(page, "partially mapped folio on deferred list");
>                         goto out;
>                 }
>                 break;
> diff --git a/mm/rmap.c b/mm/rmap.c
> index a6b9cd0b2b18..9ad558c2bad0 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1579,7 +1579,8 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
>          */
>         if (partially_mapped && folio_test_anon(folio) &&
>             list_empty(&folio->_deferred_list))
> -               deferred_split_folio(folio);
> +               deferred_split_folio(folio, true);
> +
>         __folio_mod_stat(folio, -nr, -nr_pmdmapped);
>
>         /*
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 25e43bb3b574..25f4e8403f41 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1233,7 +1233,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>                                          * Split partially mapped folios right away.
>                                          * We can free the unmapped pages without IO.
>                                          */
> -                                       if (data_race(!list_empty(&folio->_deferred_list)) &&
> +                                       if (data_race(!list_empty(&folio->_deferred_list) &&
> +                                           folio_test_partially_mapped(folio)) &&
>                                             split_folio_to_list(folio, folio_list))
>                                                 goto activate_locked;
>                                 }
> --
> 2.43.5
>

Thanks
Barry