From: Yang Shi
Date: Tue, 26 Sep 2023 15:07:18 -0700
Subject: Re: [RFC PATCH 2/2] mm/khugepaged: Remove compound_pagelist
To: "Vishal Moola (Oracle)"
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org
References: <20230922193639.10158-1-vishal.moola@gmail.com> <20230922193639.10158-3-vishal.moola@gmail.com>
In-Reply-To: <20230922193639.10158-3-vishal.moola@gmail.com>

On Fri, Sep 22, 2023 at 9:33 PM Vishal Moola (Oracle) wrote:
>
> Currently, khugepaged builds a compound_pagelist while scanning, which
> is used to properly account for compound pages. We can now account
> for a compound page as a singular folio instead, so remove this list.
>
> Large folios are guaranteed to have consecutive ptes and addresses, so
> once the first pte of a large folio is found skip over the rest.

The address space may map only part of a folio. In the extreme case, the
PMD-sized range may contain HPAGE_PMD_NR folios, with each PTE mapping a
single subpage from a different folio. So assuming that a PTE-mapped folio
is mapped consecutively may be wrong. Please refer to
collapse_compound_extreme() in tools/testing/selftests/mm/khugepaged.c.

>
> This helps convert khugepaged to use folios. It removes 3 compound_head
> calls in __collapse_huge_page_copy_succeeded(), and removes 980 bytes of
> kernel text.
>
> Signed-off-by: Vishal Moola (Oracle)
> ---
>  mm/khugepaged.c | 76 ++++++++++++-------------------------------------
>  1 file changed, 18 insertions(+), 58 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index f46a7a7c489f..b6c7d55a8231 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -498,10 +498,9 @@ static void release_pte_page(struct page *page)
>  	release_pte_folio(page_folio(page));
>  }
>
> -static void release_pte_pages(pte_t *pte, pte_t *_pte,
> -		struct list_head *compound_pagelist)
> +static void release_pte_folios(pte_t *pte, pte_t *_pte)
>  {
> -	struct folio *folio, *tmp;
> +	struct folio *folio;
>
>  	while (--_pte >= pte) {
>  		pte_t pteval = ptep_get(_pte);
> @@ -514,12 +513,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>  			continue;
>  		folio = pfn_folio(pfn);
>  		if (folio_test_large(folio))
> -			continue;
> -		release_pte_folio(folio);
> -	}
> -
> -	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> -		list_del(&folio->lru);
> +			_pte -= folio_nr_pages(folio) - 1;
>  		release_pte_folio(folio);
>  	}
>  }
> @@ -538,8 +532,7 @@ static bool is_refcount_suitable(struct page *page)
>  static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  					unsigned long address,
>  					pte_t *pte,
> -					struct collapse_control *cc,
> -					struct list_head *compound_pagelist)
> +					struct collapse_control *cc)
>  {
>  	struct folio *folio = NULL;
>  	pte_t *_pte;
> @@ -588,19 +581,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  			}
>  		}
>
> -		if (folio_test_large(folio)) {
> -			struct folio *f;
> -
> -			/*
> -			 * Check if we have dealt with the compound page
> -			 * already
> -			 */
> -			list_for_each_entry(f, compound_pagelist, lru) {
> -				if (folio == f)
> -					goto next;
> -			}
> -		}
> -
>  		/*
>  		 * We can do it before isolate_lru_page because the
>  		 * page can't be freed from under us. NOTE: PG_lock
> @@ -644,9 +624,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
>  		VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
>
> -		if (folio_test_large(folio))
> -			list_add_tail(&folio->lru, compound_pagelist);
> -next:
>  		/*
>  		 * If collapse was initiated by khugepaged, check that there is
>  		 * enough young pte to justify collapsing the page
> @@ -660,6 +637,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		if (pte_write(pteval))
>  			writable = true;
>
> +		if (folio_test_large(folio)) {
> +			_pte += folio_nr_pages(folio) - 1;
> +			address += folio_size(folio) - PAGE_SIZE;
> +		}
>  	}
>
>  	if (unlikely(!writable)) {
> @@ -673,7 +654,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		return result;
>  	}
>  out:
> -	release_pte_pages(pte, _pte, compound_pagelist);
> +	release_pte_folios(pte, _pte);
>  	trace_mm_collapse_huge_page_isolate(&folio->page, none_or_zero,
>  					    referenced, writable, result);
>  	return result;
> @@ -682,11 +663,9 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  static void __collapse_huge_page_copy_succeeded(pte_t *pte,
>  						struct vm_area_struct *vma,
>  						unsigned long address,
> -						spinlock_t *ptl,
> -						struct list_head *compound_pagelist)
> +						spinlock_t *ptl)
>  {
>  	struct page *src_page;
> -	struct page *tmp;
>  	pte_t *_pte;
>  	pte_t pteval;
>
> @@ -706,8 +685,7 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
>  			}
>  		} else {
>  			src_page = pte_page(pteval);
> -			if (!PageCompound(src_page))
> -				release_pte_page(src_page);
> +			release_pte_page(src_page);
>  			/*
>  			 * ptl mostly unnecessary, but preempt has to
>  			 * be disabled to update the per-cpu stats
> @@ -720,23 +698,12 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
>  			free_page_and_swap_cache(src_page);
>  		}
>  	}
> -
> -	list_for_each_entry_safe(src_page, tmp, compound_pagelist, lru) {
> -		list_del(&src_page->lru);
> -		mod_node_page_state(page_pgdat(src_page),
> -				    NR_ISOLATED_ANON + page_is_file_lru(src_page),
> -				    -compound_nr(src_page));
> -		unlock_page(src_page);
> -		free_swap_cache(src_page);
> -		putback_lru_page(src_page);
> -	}
>  }
>
>  static void __collapse_huge_page_copy_failed(pte_t *pte,
>  					     pmd_t *pmd,
>  					     pmd_t orig_pmd,
> -					     struct vm_area_struct *vma,
> -					     struct list_head *compound_pagelist)
> +					     struct vm_area_struct *vma)
>  {
>  	spinlock_t *pmd_ptl;
>
> @@ -753,7 +720,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte,
>  	 * Release both raw and compound pages isolated
>  	 * in __collapse_huge_page_isolate.
>  	 */
> -	release_pte_pages(pte, pte + HPAGE_PMD_NR, compound_pagelist);
> +	release_pte_folios(pte, pte + HPAGE_PMD_NR);
>  }
>
>  /*
> @@ -769,7 +736,6 @@ static void __collapse_huge_page_copy_failed(pte_t *pte,
>   * @vma: the original raw pages' virtual memory area
>   * @address: starting address to copy
>   * @ptl: lock on raw pages' PTEs
> - * @compound_pagelist: list that stores compound pages
>   */
>  static int __collapse_huge_page_copy(pte_t *pte,
>  				     struct page *page,
> @@ -777,8 +743,7 @@ static int __collapse_huge_page_copy(pte_t *pte,
>  				     pmd_t orig_pmd,
>  				     struct vm_area_struct *vma,
>  				     unsigned long address,
> -				     spinlock_t *ptl,
> -				     struct list_head *compound_pagelist)
> +				     spinlock_t *ptl)
>  {
>  	struct page *src_page;
>  	pte_t *_pte;
> @@ -804,11 +769,9 @@ static int __collapse_huge_page_copy(pte_t *pte,
>  	}
>
>  	if (likely(result == SCAN_SUCCEED))
> -		__collapse_huge_page_copy_succeeded(pte, vma, address, ptl,
> -						    compound_pagelist);
> +		__collapse_huge_page_copy_succeeded(pte, vma, address, ptl);
>  	else
> -		__collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma,
> -						 compound_pagelist);
> +		__collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma);
>
>  	return result;
>  }
> @@ -1081,7 +1044,6 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>  			      int referenced, int unmapped,
>  			      struct collapse_control *cc)
>  {
> -	LIST_HEAD(compound_pagelist);
>  	pmd_t *pmd, _pmd;
>  	pte_t *pte;
>  	pgtable_t pgtable;
> @@ -1168,8 +1130,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>
>  	pte = pte_offset_map_lock(mm, &_pmd, address, &pte_ptl);
>  	if (pte) {
> -		result = __collapse_huge_page_isolate(vma, address, pte, cc,
> -						      &compound_pagelist);
> +		result = __collapse_huge_page_isolate(vma, address, pte, cc);
>  		spin_unlock(pte_ptl);
>  	} else {
>  		result = SCAN_PMD_NULL;
> @@ -1198,8 +1159,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>  	anon_vma_unlock_write(vma->anon_vma);
>
>  	result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
> -					   vma, address, pte_ptl,
> -					   &compound_pagelist);
> +					   vma, address, pte_ptl);
>  	pte_unmap(pte);
>  	if (unlikely(result != SCAN_SUCCEED))
>  		goto out_up_write;
> --
> 2.40.1
>
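
To make the concern about partially mapped folios concrete, here is a toy
userspace sketch. It is not kernel code: struct toy_pte, fill_extreme(),
fill_contiguous() and count_wrongly_skipped() are hypothetical names made up
for illustration, and HPAGE_PMD_NR is hard-coded to 512 on the assumption of
4K pages with 2M PMDs. It only models the "jump ahead by folio_nr_pages - 1"
step from the isolate loop and counts how many skipped PTEs actually map a
different folio.

```c
#include <assert.h>

#define HPAGE_PMD_NR 512 /* assumption: 4K base pages, 2M PMD range */

/* Toy stand-in for a PTE: which folio it maps and how large that folio is. */
struct toy_pte {
	int folio_id;
	int folio_nr_pages;
};

/* Extreme layout: every PTE maps one subpage of a different large folio. */
static void fill_extreme(struct toy_pte *ptes, int folio_nr_pages)
{
	assert(folio_nr_pages >= 1);
	for (int i = 0; i < HPAGE_PMD_NR; i++) {
		ptes[i].folio_id = i;
		ptes[i].folio_nr_pages = folio_nr_pages;
	}
}

/* Layout the skip assumes: each large folio is mapped fully and in order. */
static void fill_contiguous(struct toy_pte *ptes, int folio_nr_pages)
{
	assert(folio_nr_pages >= 1);
	for (int i = 0; i < HPAGE_PMD_NR; i++) {
		ptes[i].folio_id = i / folio_nr_pages;
		ptes[i].folio_nr_pages = folio_nr_pages;
	}
}

/*
 * Walk the range, handle the folio found at the current PTE, then jump
 * ahead by folio_nr_pages - 1 entries. Returns the number of skipped PTEs
 * that map a different folio than the one just handled.
 */
static int count_wrongly_skipped(const struct toy_pte *ptes)
{
	int wrong = 0;

	for (int i = 0; i < HPAGE_PMD_NR; i++) {
		int nr = ptes[i].folio_nr_pages;

		/* Inspect the entries the skip would jump over. */
		for (int j = i + 1; j < i + nr && j < HPAGE_PMD_NR; j++)
			if (ptes[j].folio_id != ptes[i].folio_id)
				wrong++;
		if (nr > 1)
			i += nr - 1;
	}
	return wrong;
}
```

Filling an array of HPAGE_PMD_NR entries with fill_contiguous() and passing it
to count_wrongly_skipped() reports no wrongly skipped PTEs, while the
fill_extreme() layout with 512-page folios skips every PTE after the first,
each of which maps a folio that was never looked at.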