Message-ID: <88d411c5-6d66-4d41-ae86-e0f943e5fb91@gmail.com>
Date: Wed, 14 Aug 2024 13:36:58 +0100
Subject: Re: [PATCH v3 4/6] mm: Introduce a pageflag for partially mapped folios
From: Usama Arif <usamaarif642@gmail.com>
To: Barry Song
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev,
 yuzhao@google.com, david@redhat.com, ryan.roberts@arm.com, rppt@kernel.org,
 willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com
References: <20240813120328.1275952-1-usamaarif642@gmail.com>
 <20240813120328.1275952-5-usamaarif642@gmail.com>
 <59725862-f4fc-456c-bafb-cbd302777881@gmail.com>
Content-Type: text/plain; charset=UTF-8

On 14/08/2024 12:23, Barry Song wrote:
> On Wed, Aug 14, 2024 at 11:20 PM Usama Arif wrote:
>>
>> On 14/08/2024 12:10, Barry Song wrote:
>>> On Wed, Aug 14, 2024 at 12:03 AM Usama Arif wrote:
>>>>
>>>> Currently folio->_deferred_list is used to keep track of
>>>> partially_mapped folios that are going to be split under memory
>>>> pressure. In the next patch, all THPs that are faulted in and collapsed
>>>> by khugepaged are also going to be tracked using _deferred_list.
>>>>
>>>> This patch introduces a pageflag to be able to distinguish between
>>>> partially mapped folios and others in the deferred_list at split time
>>>> in deferred_split_scan. It's needed as __folio_remove_rmap decrements
>>>> _mapcount, _large_mapcount and _entire_mapcount, hence it won't be
>>>> possible to distinguish between partially mapped folios and others in
>>>> deferred_split_scan.
>>>>
>>>> Even though it introduces an extra flag to track whether the folio is
>>>> partially mapped, there is no functional change intended with this
>>>> patch, and the flag is not useful in this patch itself; it will become
>>>> useful in the next patch when _deferred_list has non-partially-mapped
>>>> folios.
>>>>
>>>> Signed-off-by: Usama Arif <usamaarif642@gmail.com>
>>>> ---
>>>>  include/linux/huge_mm.h    |  4 ++--
>>>>  include/linux/page-flags.h |  3 +++
>>>>  mm/huge_memory.c           | 21 +++++++++++++--------
>>>>  mm/hugetlb.c               |  1 +
>>>>  mm/internal.h              |  4 +++-
>>>>  mm/memcontrol.c            |  3 ++-
>>>>  mm/migrate.c               |  3 ++-
>>>>  mm/page_alloc.c            |  5 +++--
>>>>  mm/rmap.c                  |  3 ++-
>>>>  mm/vmscan.c                |  3 ++-
>>>>  10 files changed, 33 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>>> index 4c32058cacfe..969f11f360d2 100644
>>>> --- a/include/linux/huge_mm.h
>>>> +++ b/include/linux/huge_mm.h
>>>> @@ -321,7 +321,7 @@ static inline int split_huge_page(struct page *page)
>>>>  {
>>>>         return split_huge_page_to_list_to_order(page, NULL, 0);
>>>>  }
>>>> -void deferred_split_folio(struct folio *folio);
>>>> +void deferred_split_folio(struct folio *folio, bool partially_mapped);
>>>>
>>>>  void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>>>>                 unsigned long address, bool freeze, struct folio *folio);
>>>> @@ -495,7 +495,7 @@ static inline int split_huge_page(struct page *page)
>>>>  {
>>>>         return 0;
>>>>  }
>>>> -static inline void deferred_split_folio(struct folio *folio) {}
>>>> +static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
>>>>  #define split_huge_pmd(__vma, __pmd, __address)        \
>>>>         do { } while (0)
>>>>
>>>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>>>> index a0a29bd092f8..cecc1bad7910 100644
>>>> --- a/include/linux/page-flags.h
>>>> +++ b/include/linux/page-flags.h
>>>> @@ -182,6 +182,7 @@ enum pageflags {
>>>>         /* At least one page in this folio has the hwpoison flag set */
>>>>         PG_has_hwpoisoned = PG_active,
>>>>         PG_large_rmappable = PG_workingset, /* anon or file-backed */
>>>> +       PG_partially_mapped, /* was identified to be partially mapped */
>>>>  };
>>>>
>>>>  #define PAGEFLAGS_MASK ((1UL << NR_PAGEFLAGS) - 1)
>>>> @@ -861,8 +862,10 @@ static inline void ClearPageCompound(struct page *page)
>>>>         ClearPageHead(page);
>>>>  }
>>>>  FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
>>>> +FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
>>>>  #else
>>>>  FOLIO_FLAG_FALSE(large_rmappable)
>>>> +FOLIO_FLAG_FALSE(partially_mapped)
>>>>  #endif
>>>>
>>>>  #define PG_head_mask  ((1UL << PG_head))
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 6df0e9f4f56c..c024ab0f745c 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -3397,6 +3397,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>          * page_deferred_list.
>>>>          */
>>>>         list_del_init(&folio->_deferred_list);
>>>> +       folio_clear_partially_mapped(folio);
>>>>  }
>>>>  spin_unlock(&ds_queue->split_queue_lock);
>>>>  if (mapping) {
>>>> @@ -3453,11 +3454,12 @@ void __folio_undo_large_rmappable(struct folio *folio)
>>>>         if (!list_empty(&folio->_deferred_list)) {
>>>>                 ds_queue->split_queue_len--;
>>>>                 list_del_init(&folio->_deferred_list);
>>>> +               folio_clear_partially_mapped(folio);
>>>>         }
>>>>         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>>>>  }
>>>>
>>>> -void deferred_split_folio(struct folio *folio)
>>>> +void deferred_split_folio(struct folio *folio, bool partially_mapped)
>>>>  {
>>>>         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
>>>>  #ifdef CONFIG_MEMCG
>>>> @@ -3485,14 +3487,17 @@ void deferred_split_folio(struct folio *folio)
>>>>         if (folio_test_swapcache(folio))
>>>>                 return;
>>>>
>>>> -       if (!list_empty(&folio->_deferred_list))
>>>> -               return;
>>>> -
>>>>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>>>> +       if (partially_mapped)
>>>> +               folio_set_partially_mapped(folio);
>>>> +       else
>>>> +               folio_clear_partially_mapped(folio);
>>>>         if (list_empty(&folio->_deferred_list)) {
>>>> -               if (folio_test_pmd_mappable(folio))
>>>> -                       count_vm_event(THP_DEFERRED_SPLIT_PAGE);
>>>> -               count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
>>>> +               if (partially_mapped) {
>>>> +                       if (folio_test_pmd_mappable(folio))
>>>> +                               count_vm_event(THP_DEFERRED_SPLIT_PAGE);
>>>> +                       count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
>>>
>>> This code completely broke MTHP_STAT_SPLIT_DEFERRED for PMD_ORDER. It
>>> added the folio to the deferred_list as entirely_mapped
>>> (partially_mapped == false). However, when partially_mapped becomes
>>> true, there's no opportunity to add it again, as it is already on the
>>> list. Are you consistently seeing the counter for PMD_ORDER as 0?
>>>
>>
>> Ah I see it, this should fix it?
>>
>> -void deferred_split_folio(struct folio *folio)
>> +/* partially_mapped=false won't clear PG_partially_mapped folio flag */
>> +void deferred_split_folio(struct folio *folio, bool partially_mapped)
>>  {
>>         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
>>  #ifdef CONFIG_MEMCG
>> @@ -3485,14 +3488,14 @@ void deferred_split_folio(struct folio *folio)
>>         if (folio_test_swapcache(folio))
>>                 return;
>>
>> -       if (!list_empty(&folio->_deferred_list))
>> -               return;
>> -
>>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>> -       if (list_empty(&folio->_deferred_list)) {
>> +       if (partially_mapped) {
>> +               folio_set_partially_mapped(folio);
>>                 if (folio_test_pmd_mappable(folio))
>>                         count_vm_event(THP_DEFERRED_SPLIT_PAGE);
>>                 count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
>> +       }
>> +       if (list_empty(&folio->_deferred_list)) {
>>                 list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
>>                 ds_queue->split_queue_len++;
>>  #ifdef CONFIG_MEMCG
>>
>
> Not enough, as deferred_split_folio(true) won't be called if the folio is
> already on the deferred_list, because of this check in __folio_remove_rmap():
>
>         if (partially_mapped && folio_test_anon(folio) &&
>             list_empty(&folio->_deferred_list))
>                 deferred_split_folio(folio, true);
>
> so you will still see 0.
>

Ah yes, thanks.
So the below diff on top of the current v3 series should work for all cases:

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b4d72479330d..482e3ab60911 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3483,6 +3483,7 @@ void __folio_undo_large_rmappable(struct folio *folio)
        spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 }

+/* partially_mapped=false won't clear PG_partially_mapped folio flag */
 void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
        struct deferred_split *ds_queue = get_deferred_split_queue(folio);
@@ -3515,16 +3516,16 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
                return;

        spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
-       if (partially_mapped)
+       if (partially_mapped) {
                folio_set_partially_mapped(folio);
-       else
-               folio_clear_partially_mapped(folio);
+               if (folio_test_pmd_mappable(folio))
+                       count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+               count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+       } else {
+               /* partially mapped folios cannot become partially unmapped */
+               VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
+       }
        if (list_empty(&folio->_deferred_list)) {
-               if (partially_mapped) {
-                       if (folio_test_pmd_mappable(folio))
-                               count_vm_event(THP_DEFERRED_SPLIT_PAGE);
-                       count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
-               }
                list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
                ds_queue->split_queue_len++;
 #ifdef CONFIG_MEMCG
diff --git a/mm/rmap.c b/mm/rmap.c
index 9ad558c2bad0..4c330635aa4e 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1578,7 +1578,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
         * Check partially_mapped first to ensure it is a large folio.
         */
        if (partially_mapped && folio_test_anon(folio) &&
-           list_empty(&folio->_deferred_list))
+           !folio_test_partially_mapped(folio))
                deferred_split_folio(folio, true);

        __folio_mod_stat(folio, -nr, -nr_pmdmapped);
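
For reference, this is roughly what deferred_split_folio() ends up doing with
the diff above applied. It is only a condensed sketch of the intended flow:
the early-return checks (folio order, swapcache, etc.) and the CONFIG_MEMCG
split-queue bookkeeping are left out, and the context is approximate rather
than copied from the tree:

/*
 * Condensed sketch of deferred_split_folio() after the fix; early checks
 * and the memcg queue bookkeeping are omitted.
 */
void deferred_split_folio(struct folio *folio, bool partially_mapped)
{
        struct deferred_split *ds_queue = get_deferred_split_queue(folio);
        unsigned long flags;

        spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
        if (partially_mapped) {
                /*
                 * Mark the folio and count the transition every time it is
                 * reported as partially mapped, even if it was already queued
                 * earlier with partially_mapped == false (the fault/khugepaged
                 * path added by the next patch in the series).
                 */
                folio_set_partially_mapped(folio);
                if (folio_test_pmd_mappable(folio))
                        count_vm_event(THP_DEFERRED_SPLIT_PAGE);
                count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
        } else {
                /* a partially mapped folio must not be re-added as entirely mapped */
                VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
        }
        /* queue the folio only once; the flag and stats above do not depend on this */
        if (list_empty(&folio->_deferred_list)) {
                list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
                ds_queue->split_queue_len++;
        }
        spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
}

And on the rmap side, gating the call on !folio_test_partially_mapped(folio)
instead of list_empty(&folio->_deferred_list) means the partially-mapped
transition is still reported when the folio was already on the deferred list,
which is exactly the case that left the PMD_ORDER counter at 0.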