From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20220604004004.954674-1-zokeefe@google.com> <20220604004004.954674-8-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-8-zokeefe@google.com>
From: Yang Shi
Date: Mon, 6 Jun 2022 15:51:39 -0700
Subject: Re: [PATCH v6 07/15] mm/khugepaged: add flag to ignore khugepaged heuristics
To: "Zach O'Keefe"
Cc: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox, Michal Hocko,
 Pasha Tatashin, Peter Xu, Rongwei Wang, SeongJae Park, Song Liu,
 Vlastimil Babka, Zi Yan, Linux MM, Andrea Arcangeli, Andrew Morton,
 Arnd Bergmann, Axel Rasmussen,
 Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins, Ivan Kokshaysky,
 "James E.J. Bottomley", Jens Axboe, "Kirill A. Shutemov", Matt Turner,
 Max Filippov, Miaohe Lin, Minchan Kim, Patrick Xia, Pavel Begunkov,
 Thomas Bogendoerfer
Content-Type: text/plain; charset="UTF-8"

On Fri, Jun 3, 2022 at 5:40 PM Zach O'Keefe wrote:
>
> Add enforce_page_heuristics flag to struct collapse_control that allows
> context to ignore heuristics originally designed to guide khugepaged:
>
> 1) sysfs-controlled knobs khugepaged_max_ptes_[none|swap|shared]
> 2) requirement that some pages in region being collapsed be young or
>    referenced
>
> This flag is set in khugepaged collapse context to preserve existing
> khugepaged behavior.
>
> This flag will be used (unset) when introducing madvise collapse
> context since here, the user presumably has reason to believe the
> collapse will be beneficial and khugepaged heuristics shouldn't tell
> the user they are wrong.
>
> Signed-off-by: Zach O'Keefe

Reviewed-by: Yang Shi

> ---
>  mm/khugepaged.c | 55 +++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 03e0da0008f1..c3589b3e238d 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -87,6 +87,13 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
>  #define MAX_PTE_MAPPED_THP 8
>
>  struct collapse_control {
> +        /*
> +         * Heuristics:
> +         * - khugepaged_max_ptes_[none|swap|shared]
> +         * - require memory to be young / referenced
> +         */
> +        bool enforce_page_heuristics;
> +
>          /* Num pages scanned per node */
>          int node_load[MAX_NUMNODES];
>
> @@ -604,6 +611,7 @@ static bool is_refcount_suitable(struct page *page)
>  static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>                                          unsigned long address,
>                                          pte_t *pte,
> +                                        struct collapse_control *cc,
>                                          struct list_head *compound_pagelist)
>  {
>          struct page *page = NULL;
> @@ -617,7 +625,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>                  if (pte_none(pteval) || (pte_present(pteval) &&
>                                  is_zero_pfn(pte_pfn(pteval)))) {
>                          if (!userfaultfd_armed(vma) &&
> -                            ++none_or_zero <= khugepaged_max_ptes_none) {
> +                            (++none_or_zero <= khugepaged_max_ptes_none ||
> +                             !cc->enforce_page_heuristics)) {
>                                  continue;
>                          } else {
>                                  result = SCAN_EXCEED_NONE_PTE;
> @@ -637,8 +646,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>
>                  VM_BUG_ON_PAGE(!PageAnon(page), page);
>
> -                if (page_mapcount(page) > 1 &&
> -                    ++shared > khugepaged_max_ptes_shared) {
> +                if (cc->enforce_page_heuristics && page_mapcount(page) > 1 &&
> +                    ++shared > khugepaged_max_ptes_shared) {
>                          result = SCAN_EXCEED_SHARED_PTE;
>                          count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
>                          goto out;
> @@ -705,9 +714,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>                  list_add_tail(&page->lru, compound_pagelist);
>  next:
>                  /* There should be enough young pte to collapse the page */
> -                if (pte_young(pteval) ||
> -                    page_is_young(page) || PageReferenced(page) ||
> -                    mmu_notifier_test_young(vma->vm_mm, address))
> +                if (cc->enforce_page_heuristics &&
> +                    (pte_young(pteval) || page_is_young(page) ||
> +                     PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm,
> +                                                                     address)))
>                          referenced++;
>
>                  if (pte_write(pteval))
> @@ -716,7 +726,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>
>          if (unlikely(!writable)) {
>                  result = SCAN_PAGE_RO;
> -        } else if (unlikely(!referenced)) {
> +        } else if (unlikely(cc->enforce_page_heuristics && !referenced)) {
>                  result = SCAN_LACK_REFERENCED_PAGE;
>          } else {
>                  result = SCAN_SUCCEED;
> @@ -1096,7 +1106,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>          mmu_notifier_invalidate_range_end(&range);
>
>          spin_lock(pte_ptl);
> -        result = __collapse_huge_page_isolate(vma, address, pte,
> +        result = __collapse_huge_page_isolate(vma, address, pte, cc,
>                                                &compound_pagelist);
>          spin_unlock(pte_ptl);
>
> @@ -1185,7 +1195,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>               _pte++, _address += PAGE_SIZE) {
>                  pte_t pteval = *_pte;
>                  if (is_swap_pte(pteval)) {
> -                        if (++unmapped <= khugepaged_max_ptes_swap) {
> +                        if (++unmapped <= khugepaged_max_ptes_swap ||
> +                            !cc->enforce_page_heuristics) {
>                                  /*
>                                   * Always be strict with uffd-wp
>                                   * enabled swap entries. Please see
> @@ -1204,7 +1215,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>                          }
>                  if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>                          if (!userfaultfd_armed(vma) &&
> -                            ++none_or_zero <= khugepaged_max_ptes_none) {
> +                            (++none_or_zero <= khugepaged_max_ptes_none ||
> +                             !cc->enforce_page_heuristics)) {
>                                  continue;
>                          } else {
>                                  result = SCAN_EXCEED_NONE_PTE;
> @@ -1234,8 +1246,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>                          goto out_unmap;
>                  }
>
> -                if (page_mapcount(page) > 1 &&
> -                    ++shared > khugepaged_max_ptes_shared) {
> +                if (cc->enforce_page_heuristics &&
> +                    page_mapcount(page) > 1 &&
> +                    ++shared > khugepaged_max_ptes_shared) {
>                          result = SCAN_EXCEED_SHARED_PTE;
>                          count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
>                          goto out_unmap;
> @@ -1289,14 +1302,17 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>                          result = SCAN_PAGE_COUNT;
>                          goto out_unmap;
>                  }
> -                if (pte_young(pteval) ||
> -                    page_is_young(page) || PageReferenced(page) ||
> -                    mmu_notifier_test_young(vma->vm_mm, address))
> +                if (cc->enforce_page_heuristics &&
> +                    (pte_young(pteval) || page_is_young(page) ||
> +                     PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm,
> +                                                                     address)))
>                          referenced++;
>          }
>          if (!writable) {
>                  result = SCAN_PAGE_RO;
> -        } else if (!referenced || (unmapped && referenced < HPAGE_PMD_NR/2)) {
> +        } else if (cc->enforce_page_heuristics &&
> +                   (!referenced ||
> +                    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
>                  result = SCAN_LACK_REFERENCED_PAGE;
>          } else {
>                  result = SCAN_SUCCEED;
> @@ -1966,7 +1982,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file,
>                          continue;
>
>                  if (xa_is_value(page)) {
> -                        if (++swap > khugepaged_max_ptes_swap) {
> +                        if (cc->enforce_page_heuristics &&
> +                            ++swap > khugepaged_max_ptes_swap) {
>                                  result = SCAN_EXCEED_SWAP_PTE;
>                                  count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
>                                  break;
> @@ -2017,7 +2034,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file,
>          rcu_read_unlock();
>
>          if (result == SCAN_SUCCEED) {
> -                if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
> +                if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none &&
> +                    cc->enforce_page_heuristics) {
>                          result = SCAN_EXCEED_NONE_PTE;
>                          count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
>                  } else {
> @@ -2258,6 +2276,7 @@ static int khugepaged(void *none)
>  {
>          struct mm_slot *mm_slot;
>          struct collapse_control cc = {
> +                .enforce_page_heuristics = true,
>                  .last_target_node = NUMA_NO_NODE,
>                  /* .gfp set later */
>          };
> --
> 2.36.1.255.ge46751e96f-goog
>
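For anyone skimming the thread, here is a minimal sketch of how the two contexts
are expected to differ once the madvise collapse path lands later in the series.
Only enforce_page_heuristics and last_target_node come from the hunks quoted
above; the variable names and the madvise-side initializer are hypothetical
illustrations, not code from this patch, and assume the patched mm/khugepaged.c
definitions:

        /*
         * Illustrative only: khugepaged keeps the sysfs-tunable limits and the
         * young/referenced requirement, while a user-requested collapse context
         * would opt out of them by leaving the flag clear.
         */
        struct collapse_control khugepaged_cc = {
                .enforce_page_heuristics = true,        /* as in the khugepaged() hunk above */
                .last_target_node = NUMA_NO_NODE,
                /* .gfp set later, as in the quoted initializer */
        };

        struct collapse_control madvise_cc = {
                .enforce_page_heuristics = false,       /* hypothetical madvise collapse context */
                .last_target_node = NUMA_NO_NODE,
        };

Keeping the policy inside collapse_control, which is already threaded through
__collapse_huge_page_isolate(), khugepaged_scan_pmd() and khugepaged_scan_file(),
is what lets both contexts share the same scan and collapse paths.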