From: Usama Arif
Date: Wed, 14 Aug 2024 12:30:34 +0100
Subject: Re: [PATCH v3 4/6] mm: Introduce a pageflag for partially mapped folios
To: Barry Song
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev,
 yuzhao@google.com, david@redhat.com, ryan.roberts@arm.com, rppt@kernel.org,
 willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com
Message-ID: <4bf9b1ba-ee3a-4c9a-b8ab-d1cc27838130@gmail.com>
References: <20240813120328.1275952-1-usamaarif642@gmail.com>
 <20240813120328.1275952-5-usamaarif642@gmail.com>
 <925804e4-0b33-45eb-905d-e00f67192828@gmail.com>

On 14/08/2024 12:20, Barry Song wrote:
> On Wed, Aug 14, 2024 at 11:11 PM Usama Arif wrote:
>>
>> On 14/08/2024 11:44, Barry Song wrote:
>>> On Wed, Aug 14, 2024 at 12:03 AM Usama Arif wrote:
>>>>
>>>> Currently folio->_deferred_list is used to keep track of
>>>> partially_mapped folios that are going to be split under memory
>>>> pressure. In the next patch, all THPs that are faulted in and collapsed
>>>> by khugepaged are also going to be tracked using _deferred_list.
>>>>
>>>> This patch introduces a pageflag to be able to distinguish between
>>>> partially mapped folios and others in the deferred_list at split time
>>>> in deferred_split_scan. It's needed as __folio_remove_rmap decrements
>>>> _mapcount, _large_mapcount and _entire_mapcount, hence it won't be
>>>> possible to distinguish between partially mapped folios and others in
>>>> deferred_split_scan.
>>>>
>>>> Even though it introduces an extra flag to track whether the folio is
>>>> partially mapped, there is no functional change intended with this
>>>> patch and the flag is not useful in this patch itself; it will
>>>> become useful in the next patch when _deferred_list has non partially
>>>> mapped folios.
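
A quick aside on the flag helpers the diff below relies on, since they are
generated by a macro rather than written out: roughly speaking,
FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE) produces test/set/clear
accessors that operate on the flags word of the folio's second page. A
minimal sketch of that generated interface, as an approximation rather than
the literal macro expansion:

    /* sketch only: what FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE) provides */
    static inline bool folio_test_partially_mapped(struct folio *folio)
    {
            /* the flag is stored in the second page's flags word */
            return test_bit(PG_partially_mapped, folio_flags(folio, 1));
    }

    static inline void folio_set_partially_mapped(struct folio *folio)
    {
            set_bit(PG_partially_mapped, folio_flags(folio, 1));
    }

    static inline void folio_clear_partially_mapped(struct folio *folio)
    {
            clear_bit(PG_partially_mapped, folio_flags(folio, 1));
    }

In the #else branch visible in the page-flags.h hunk below,
FOLIO_FLAG_FALSE(partially_mapped) generates stubs instead: the test always
returns false and set/clear are no-ops.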
>>>>
>>>> Signed-off-by: Usama Arif
>>>> ---
>>>>  include/linux/huge_mm.h    |  4 ++--
>>>>  include/linux/page-flags.h |  3 +++
>>>>  mm/huge_memory.c           | 21 +++++++++++++--------
>>>>  mm/hugetlb.c               |  1 +
>>>>  mm/internal.h              |  4 +++-
>>>>  mm/memcontrol.c            |  3 ++-
>>>>  mm/migrate.c               |  3 ++-
>>>>  mm/page_alloc.c            |  5 +++--
>>>>  mm/rmap.c                  |  3 ++-
>>>>  mm/vmscan.c                |  3 ++-
>>>>  10 files changed, 33 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>>> index 4c32058cacfe..969f11f360d2 100644
>>>> --- a/include/linux/huge_mm.h
>>>> +++ b/include/linux/huge_mm.h
>>>> @@ -321,7 +321,7 @@ static inline int split_huge_page(struct page *page)
>>>>  {
>>>>         return split_huge_page_to_list_to_order(page, NULL, 0);
>>>>  }
>>>> -void deferred_split_folio(struct folio *folio);
>>>> +void deferred_split_folio(struct folio *folio, bool partially_mapped);
>>>>
>>>>  void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>>>>                 unsigned long address, bool freeze, struct folio *folio);
>>>> @@ -495,7 +495,7 @@ static inline int split_huge_page(struct page *page)
>>>>  {
>>>>         return 0;
>>>>  }
>>>> -static inline void deferred_split_folio(struct folio *folio) {}
>>>> +static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
>>>>  #define split_huge_pmd(__vma, __pmd, __address) \
>>>>         do { } while (0)
>>>>
>>>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>>>> index a0a29bd092f8..cecc1bad7910 100644
>>>> --- a/include/linux/page-flags.h
>>>> +++ b/include/linux/page-flags.h
>>>> @@ -182,6 +182,7 @@ enum pageflags {
>>>>         /* At least one page in this folio has the hwpoison flag set */
>>>>         PG_has_hwpoisoned = PG_active,
>>>>         PG_large_rmappable = PG_workingset, /* anon or file-backed */
>>>> +       PG_partially_mapped, /* was identified to be partially mapped */
>>>>  };
>>>>
>>>>  #define PAGEFLAGS_MASK ((1UL << NR_PAGEFLAGS) - 1)
>>>> @@ -861,8 +862,10 @@ static inline void ClearPageCompound(struct page *page)
>>>>         ClearPageHead(page);
>>>>  }
>>>>  FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
>>>> +FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
>>>>  #else
>>>>  FOLIO_FLAG_FALSE(large_rmappable)
>>>> +FOLIO_FLAG_FALSE(partially_mapped)
>>>>  #endif
>>>>
>>>>  #define PG_head_mask ((1UL << PG_head))
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 6df0e9f4f56c..c024ab0f745c 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -3397,6 +3397,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>                  * page_deferred_list.
>>>>                  */
>>>>                 list_del_init(&folio->_deferred_list);
>>>> +               folio_clear_partially_mapped(folio);
>>>>         }
>>>>         spin_unlock(&ds_queue->split_queue_lock);
>>>>         if (mapping) {
>>>> @@ -3453,11 +3454,12 @@ void __folio_undo_large_rmappable(struct folio *folio)
>>>>         if (!list_empty(&folio->_deferred_list)) {
>>>>                 ds_queue->split_queue_len--;
>>>>                 list_del_init(&folio->_deferred_list);
>>>> +               folio_clear_partially_mapped(folio);
>>>>         }
>>>>         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>>>>  }
>>>>
>>>> -void deferred_split_folio(struct folio *folio)
>>>> +void deferred_split_folio(struct folio *folio, bool partially_mapped)
>>>>  {
>>>>         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
>>>>  #ifdef CONFIG_MEMCG
>>>> @@ -3485,14 +3487,17 @@ void deferred_split_folio(struct folio *folio)
>>>>         if (folio_test_swapcache(folio))
>>>>                 return;
>>>>
>>>> -       if (!list_empty(&folio->_deferred_list))
>>>> -               return;
>>>> -
>>>>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>>>> +       if (partially_mapped)
>>>> +               folio_set_partially_mapped(folio);
>>>> +       else
>>>> +               folio_clear_partially_mapped(folio);
>>>
>>> Hi Usama,
>>>
>>> Do we need this? When can a partially_mapped folio on deferred_list become
>>> non-partially-mapped and need a clear? I understand transferring from
>>> entirely_mapped to partially_mapped is a one-way process? partially_mapped
>>> folios can't go back to entirely_mapped?
>>>
>> Hi Barry,
>>
>> The deferred_split_folio function is called in 3 places after this series:
>> at fault, at collapse and on partial mapping. Partial mapping can only
>> happen after fault/collapse, and we have FOLIO_FLAG_FALSE(partially_mapped),
>> i.e. the flag is initialized to false, so technically it's not needed. A
>> partially_mapped folio on the deferred list won't become non-partially
>> mapped.
>>
>> I just did it as a precaution in case someone ever changes the kernel and
>> calls deferred_split_folio with partially_mapped set to false after it had
>> been true. The function arguments of deferred_split_folio make it seem that
>> passing partially_mapped=false as an argument would clear it, which is why
>> I cleared it as well. I could change the patch to something like below if
>> it makes things better? i.e. add a comment at the top of the function:
>>
>
> To me, it seems a BUG to call with false after a folio has been
> partially_mapped. So I'd rather
> VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
>
> The below should also fix the MTHP_STAT_SPLIT_DEFERRED
> counter this patch is breaking?
>
> @@ -3515,16 +3522,18 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
>                 return;
>
>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> -       if (partially_mapped)
> -               folio_set_partially_mapped(folio);
> -       else
> -               folio_clear_partially_mapped(folio);
> -       if (list_empty(&folio->_deferred_list)) {
> -               if (partially_mapped) {
> +       if (partially_mapped) {
> +               if (!folio_test_set_partially_mapped(folio)) {
> +                       mod_mthp_stat(folio_order(folio),
> +                                     MTHP_STAT_NR_SPLIT_DEFERRED, 1);
>                         if (folio_test_pmd_mappable(folio))
>                                 count_vm_event(THP_DEFERRED_SPLIT_PAGE);
>                         count_mthp_stat(folio_order(folio),
>                                         MTHP_STAT_SPLIT_DEFERRED);
>                 }
> +       }
> +       VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
> +
> +       if (list_empty(&folio->_deferred_list)) {
>                 list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
>                 ds_queue->split_queue_len++;
>  #ifdef CONFIG_MEMCG
>

So I had sent the below without the VM_WARN_ON_FOLIO as a reply to the other
email; below is with VM_WARN:

-void deferred_split_folio(struct folio *folio)
+/* partially_mapped=false won't clear PG_partially_mapped folio flag */
+void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
        struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 #ifdef CONFIG_MEMCG
@@ -3485,14 +3488,17 @@ void deferred_split_folio(struct folio *folio)
        if (folio_test_swapcache(folio))
                return;

-       if (!list_empty(&folio->_deferred_list))
-               return;
-
        spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
-       if (list_empty(&folio->_deferred_list)) {
+       if (partially_mapped) {
+               folio_set_partially_mapped(folio);
                if (folio_test_pmd_mappable(folio))
                        count_vm_event(THP_DEFERRED_SPLIT_PAGE);
                count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+       } else {
+               /* partially mapped folios cannot become non-partially mapped */
+               VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
+       }
+       if (list_empty(&folio->_deferred_list)) {
                list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
                ds_queue->split_queue_len++;
 #ifdef CONFIG_MEMCG

Thanks
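
To tie this back to the commit message: the point of the flag is that the next
patch also queues fully mapped THPs on _deferred_list at fault/collapse time,
so the shrinker needs a way to split only the folios that actually became
partially mapped, and the mapcounts can't be used for that because
__folio_remove_rmap has already decremented them. A very rough sketch of that
idea inside deferred_split_scan(), using the helper names from this patch;
this is an illustration under those assumptions, not the actual code from the
series:

    struct folio *folio, *next;
    LIST_HEAD(list);    /* folios taken off the split queue, as in the existing shrinker */

    /* sketch: only split the folios that were actually partially unmapped */
    list_for_each_entry_safe(folio, next, &list, _deferred_list) {
            if (!folio_test_partially_mapped(folio))
                    continue;       /* queued at fault/collapse, still fully mapped */
            if (!folio_trylock(folio))
                    continue;
            /* a successful split also drops the folio from the deferred list */
            split_folio(folio);
            folio_unlock(folio);
    }

Under this sketch, fully mapped folios stay on the list untouched and can
still be split later if they do become partially mapped.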