From: James Houghton <jthoughton@google.com>
Date: Tue, 7 Mar 2023 16:36:51 -0800
Subject: Re: [PATCH 1/2] mm: rmap: make hugetlb pages participate in _nr_pages_mapped
To: Mike Kravetz
Cc: Hugh Dickins, Muchun Song, Peter Xu, "Matthew Wilcox (Oracle)",
    Andrew Morton, "Kirill A. Shutemov", David Hildenbrand,
    David Rientjes, Axel Rasmussen, Jiaqi Yan,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20230307215420.GA59222@monkey>
References: <20230306230004.1387007-1-jthoughton@google.com>
    <20230306230004.1387007-2-jthoughton@google.com>
    <20230307215420.GA59222@monkey>

On Tue, Mar 7, 2023 at 1:54 PM Mike Kravetz wrote:
>
> On 03/06/23 23:00, James Houghton wrote:
> > For compound mappings (compound=true), _nr_pages_mapped will now be
> > incremented by COMPOUND_MAPPED when the first compound mapping is
> > created.
>
> This sentence makes it sound like incrementing by COMPOUND_MAPPED for
> compound pages is introduced by this patch. Rather, it is just for
> hugetlb (now always) compound mappings. Perhaps change that to read:
> For hugetlb mappings ...

Yes this is kind of confusing. I'll fix it like you suggest.

>
> > For small mappings, _nr_pages_mapped is incremented by 1 when the
> > particular small page is mapped for the first time. This is incompatible
> > with HPageVmemmapOptimize()ed folios, as most of the tail page structs
> > will be mapped read-only.
> >
> > Currently HugeTLB always passes compound=true, but in the future,
> > HugeTLB pages may be mapped with small mappings.
> >
> > To implement this change:
> >  1. Replace most of HugeTLB's calls to page_dup_file_rmap() with
> >     page_add_file_rmap(). The call in copy_hugetlb_page_range() is kept.
> >  2. Update page_add_file_rmap() and page_remove_rmap() to support
> >     HugeTLB folios.
> >  3. Update hugepage_add_anon_rmap() and hugepage_add_new_anon_rmap() to
> >     also increment _nr_pages_mapped properly.
> >
> > With these changes, folio_large_is_mapped() no longer needs to check
> > _entire_mapcount.
> >
> > HugeTLB doesn't use LRU or mlock, so page_add_file_rmap() and
> > page_remove_rmap() excludes those pieces. It is also important that
> > the folio_test_pmd_mappable() check is removed (or changed), as it's
> > possible to have a HugeTLB page whose order is not >= HPAGE_PMD_ORDER,
> > like arm64's CONT_PTE_SIZE HugeTLB pages.
> >
> > This patch limits HugeTLB pages to 16G in size. That limit can be
> > increased if COMPOUND_MAPPED is raised.
> >
> > Signed-off-by: James Houghton
> >
>
> Thanks!
>
> This is a step in the direction of having hugetlb use the same mapcount
> scheme as elsewhere. As you mention, with this in place future mapcount
> changes should mostly 'just work' for hugetlb.
>
> Because of this,
> Acked-by: Mike Kravetz

Thanks!

>
> I have a few nits below, and I'm sure others will chime in later.
>
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index ba901c416785..4a975429b91a 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1316,19 +1316,21 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
> >         int nr = 0, nr_pmdmapped = 0;
> >         bool first;
> >
> > -       VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
> > +       VM_BUG_ON_PAGE(compound && !PageTransHuge(page)
> > +                       && !folio_test_hugetlb(folio), page);
> >
> >         /* Is page being mapped by PTE? Is this its first map to be added? */
> >         if (likely(!compound)) {
> > +               if (unlikely(folio_test_hugetlb(folio)))
> > +                       VM_BUG_ON_PAGE(HPageVmemmapOptimized(&folio->page),
> > +                                      page);
> >                 first = atomic_inc_and_test(&page->_mapcount);
> >                 nr = first;
> >                 if (first && folio_test_large(folio)) {
> >                         nr = atomic_inc_return_relaxed(mapped);
> >                         nr = (nr < COMPOUND_MAPPED);
> >                 }
> > -       } else if (folio_test_pmd_mappable(folio)) {
> > -               /* That test is redundant: it's for safety or to optimize out */
>
> I 'think' removing this check is OK.  It would seem that the caller
> knows if the folio is mappable.  If we want a similar test, we might be
> able to use something like:
>
>       arch_hugetlb_valid_size(folio_size(folio))
>

Ack. I think leaving the check(s) removed is fine.
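(For anyone else reading along: the reason dropping folio_test_pmd_mappable()
matters for HugeTLB is easiest to see with a concrete example. The helper
below is the one from include/linux/huge_mm.h; the numbers in the comment
assume arm64 with 4 KiB base pages, so treat them as illustrative only.)

/* From include/linux/huge_mm.h: */
static inline bool folio_test_pmd_mappable(struct folio *folio)
{
        return folio_order(folio) >= HPAGE_PMD_ORDER;
}

/*
 * Illustrative numbers, assuming arm64 with 4 KiB base pages:
 *   - CONT_PTE_SIZE hugetlb folio: 64 KiB -> folio_order() == 4
 *   - HPAGE_PMD_ORDER:             9      (2 MiB PMD)
 *
 * folio_test_pmd_mappable() is false for that folio, so keeping the old
 * "else if (folio_test_pmd_mappable(folio))" would let a compound map of
 * it fall through both branches without touching _entire_mapcount or
 * _nr_pages_mapped.  Hence the plain "else" once hugetlb shares this path.
 */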
> > -
> > +       } else {
> >                 first = atomic_inc_and_test(&folio->_entire_mapcount);
> >                 if (first) {
> >                         nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
> > @@ -1345,6 +1347,9 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
> >                 }
> >         }
> >
> > +       if (folio_test_hugetlb(folio))
> > +               return;
>
> IMO, a comment saying hugetlb is special and does not participate in lru
> would be appropriate here.

Will do.

> > +
> >         if (nr_pmdmapped)
> >                 __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
> >                         NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
> > @@ -1373,24 +1378,18 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> >
> >         VM_BUG_ON_PAGE(compound && !PageHead(page), page);
> >
> > -       /* Hugetlb pages are not counted in NR_*MAPPED */
> > -       if (unlikely(folio_test_hugetlb(folio))) {
> > -               /* hugetlb pages are always mapped with pmds */
> > -               atomic_dec(&folio->_entire_mapcount);
> > -               return;
> > -       }
> > -
> >         /* Is page being unmapped by PTE? Is this its last map to be removed? */
> >         if (likely(!compound)) {
> > +               if (unlikely(folio_test_hugetlb(folio)))
> > +                       VM_BUG_ON_PAGE(HPageVmemmapOptimized(&folio->page),
> > +                                      page);
> >                 last = atomic_add_negative(-1, &page->_mapcount);
> >                 nr = last;
> >                 if (last && folio_test_large(folio)) {
> >                         nr = atomic_dec_return_relaxed(mapped);
> >                         nr = (nr < COMPOUND_MAPPED);
> >                 }
> > -       } else if (folio_test_pmd_mappable(folio)) {
> > -               /* That test is redundant: it's for safety or to optimize out */
> > -
> > +       } else {
> >                 last = atomic_add_negative(-1, &folio->_entire_mapcount);
> >                 if (last) {
> >                         nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
> > @@ -1407,6 +1406,9 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> >                 }
> >         }
> >
> > +       if (folio_test_hugetlb(folio))
> > +               return;
>
> Same as above in page_add_file_rmap.
>
> > +
> >         if (nr_pmdmapped) {
> >                 if (folio_test_anon(folio))
> >                         idx = NR_ANON_THPS;
> > @@ -2541,9 +2543,11 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
> >         first = atomic_inc_and_test(&folio->_entire_mapcount);
> >         VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page);
> >         VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page);
> > -       if (first)
> > +       if (first) {
> > +               atomic_add(COMPOUND_MAPPED, &folio->_nr_pages_mapped);
> >                 __page_set_anon_rmap(folio, page, vma, address,
> >                                      !!(flags & RMAP_EXCLUSIVE));
> > +       }
> >  }
> >
> >  void hugepage_add_new_anon_rmap(struct folio *folio,
> > @@ -2552,6 +2556,7 @@ void hugepage_add_new_anon_rmap(struct folio *folio,
> >         BUG_ON(address < vma->vm_start || address >= vma->vm_end);
> >         /* increment count (starts at -1) */
> >         atomic_set(&folio->_entire_mapcount, 0);
> > +       atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
> >         folio_clear_hugetlb_restore_reserve(folio);
> >         __page_set_anon_rmap(folio, &folio->page, vma, address, 1);
> >  }
>
> Should we look at perhaps modifying page_add_anon_rmap and
> folio_add_new_anon_rmap as well?

I think I can merge hugepage_add_anon_rmap with page_add_anon_rmap and
hugepage_add_new_anon_rmap with folio_add_new_anon_rmap. With them
merged, it's pretty easy to see what HugeTLB does differently from
generic mm, which is nice. :)
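One more note for completeness, since the counting scheme is easy to lose
in the diff: below is a simplified sketch of how _nr_pages_mapped is packed,
with the constants and helpers as I read them in mm/internal.h and
include/linux/mm.h around this series (trimmed for illustration, not the
exact upstream code). The folio_large_is_mapped() variant shows the
simplification the commit message mentions.

#define COMPOUND_MAPPED         0x800000
#define FOLIO_PAGES_MAPPED      (COMPOUND_MAPPED - 1)

/* How many subpages of this large folio are individually PTE-mapped? */
static inline int folio_nr_pages_mapped(struct folio *folio)
{
        return atomic_read(&folio->_nr_pages_mapped) & FOLIO_PAGES_MAPPED;
}

/*
 * Each "entire" (compound) mapping adds COMPOUND_MAPPED on top of the
 * PTE-mapped count.  With hugetlb now doing the same, "is this large
 * folio mapped at all?" becomes a single read; _entire_mapcount no
 * longer needs to be consulted.
 */
static inline bool folio_large_is_mapped(struct folio *folio)
{
        return atomic_read(&folio->_nr_pages_mapped) > 0;
}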