From: Yang Shi <shy828301@gmail.com>
Date: Mon, 11 Jul 2022 13:43:41 -0700
Subject: Re: [mm-unstable v7 06/18] mm/khugepaged: add flag to predicate khugepaged-only behavior
To: "Zach O'Keefe" <zokeefe@google.com>
Cc: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
    Michal Hocko, Pasha Tatashin, Peter Xu, Rongwei Wang, SeongJae Park,
    Song Liu, Vlastimil Babka, Zi Yan, Linux MM <linux-mm@kvack.org>,
    Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
    Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
    Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
    "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
    Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer
In-Reply-To: <20220706235936.2197195-7-zokeefe@google.com>
References: <20220706235936.2197195-1-zokeefe@google.com> <20220706235936.2197195-7-zokeefe@google.com>

On Wed, Jul 6, 2022 at 5:06 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> Add .is_khugepaged flag to struct collapse_control so
> khugepaged-specific behavior can be elided by MADV_COLLAPSE context.
>
> Start by protecting khugepaged-specific heuristics by this flag. In
> MADV_COLLAPSE, the user presumably has reason to believe the collapse
> will be beneficial and khugepaged heuristics shouldn't prevent the user
> from doing so:
>
> 1) sysfs-controlled knobs khugepaged_max_ptes_[none|swap|shared]
>
> 2) requirement that some pages in region being collapsed be young or
>    referenced
>
> Signed-off-by: Zach O'Keefe <zokeefe@google.com>
> ---
>
> v6 -> v7: There is no functional change here from v6, just a renaming of
>           flags to explicitly be predicated on khugepaged.

Reviewed-by: Yang Shi <shy828301@gmail.com>

Just a nit: some conditions check is_khugepaged first and some don't.
Why not make them consistent and check is_khugepaged first everywhere?
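
For example, in __collapse_huge_page_isolate() the none/zero check puts
the sysfs limit first:

	if (!userfaultfd_armed(vma) &&
	    (++none_or_zero <= khugepaged_max_ptes_none ||
	     !cc->is_khugepaged)) {

but the shared check a few lines later puts the flag first:

	if (cc->is_khugepaged && page_mapcount(page) > 1 &&
	    ++shared > khugepaged_max_ptes_shared) {

A flag-first version of the former should be logically equivalent,
something like the below (untested sketch, not from the patch; note it
would skip the ++none_or_zero increment in the MADV_COLLAPSE case, which
IIUC only affects what the tracepoint at the end of the function reports):

	if (!userfaultfd_armed(vma) &&
	    (!cc->is_khugepaged ||
	     ++none_or_zero <= khugepaged_max_ptes_none)) {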

> ---
>  mm/khugepaged.c | 62 ++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 43 insertions(+), 19 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 147f5828f052..d89056d8cbad 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -73,6 +73,8 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
>   * default collapse hugepages if there is at least one pte mapped like
>   * it would have happened if the vma was large enough during page
>   * fault.
> + *
> + * Note that these are only respected if collapse was initiated by khugepaged.
>   */
>  static unsigned int khugepaged_max_ptes_none __read_mostly;
>  static unsigned int khugepaged_max_ptes_swap __read_mostly;
> @@ -86,6 +88,8 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
>  #define MAX_PTE_MAPPED_THP 8
>
>  struct collapse_control {
> +	bool is_khugepaged;
> +
>  	/* Num pages scanned per node */
>  	int node_load[MAX_NUMNODES];
>
> @@ -554,6 +558,7 @@ static bool is_refcount_suitable(struct page *page)
>  static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  					unsigned long address,
>  					pte_t *pte,
> +					struct collapse_control *cc,
>  					struct list_head *compound_pagelist)
>  {
>  	struct page *page = NULL;
> @@ -567,7 +572,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		if (pte_none(pteval) || (pte_present(pteval) &&
>  				is_zero_pfn(pte_pfn(pteval)))) {
>  			if (!userfaultfd_armed(vma) &&
> -			    ++none_or_zero <= khugepaged_max_ptes_none) {
> +			    (++none_or_zero <= khugepaged_max_ptes_none ||
> +			     !cc->is_khugepaged)) {
>  				continue;
>  			} else {
>  				result = SCAN_EXCEED_NONE_PTE;
> @@ -587,8 +593,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>
>  		VM_BUG_ON_PAGE(!PageAnon(page), page);
>
> -		if (page_mapcount(page) > 1 &&
> -		    ++shared > khugepaged_max_ptes_shared) {
> +		if (cc->is_khugepaged && page_mapcount(page) > 1 &&
> +		    ++shared > khugepaged_max_ptes_shared) {
>  			result = SCAN_EXCEED_SHARED_PTE;
>  			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
>  			goto out;
> @@ -654,10 +660,14 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		if (PageCompound(page))
>  			list_add_tail(&page->lru, compound_pagelist);
>  next:
> -		/* There should be enough young pte to collapse the page */
> -		if (pte_young(pteval) ||
> -		    page_is_young(page) || PageReferenced(page) ||
> -		    mmu_notifier_test_young(vma->vm_mm, address))
> +		/*
> +		 * If collapse was initiated by khugepaged, check that there is
> +		 * enough young pte to justify collapsing the page
> +		 */
> +		if (cc->is_khugepaged &&
> +		    (pte_young(pteval) || page_is_young(page) ||
> +		     PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm,
> +								     address)))
>  			referenced++;
>
>  		if (pte_write(pteval))
> @@ -666,7 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>
>  	if (unlikely(!writable)) {
>  		result = SCAN_PAGE_RO;
> -	} else if (unlikely(!referenced)) {
> +	} else if (unlikely(cc->is_khugepaged && !referenced)) {
>  		result = SCAN_LACK_REFERENCED_PAGE;
>  	} else {
>  		result = SCAN_SUCCEED;
> @@ -745,6 +755,7 @@ static void khugepaged_alloc_sleep(void)
>
>
>  struct collapse_control khugepaged_collapse_control = {
> +	.is_khugepaged = true,
>  	.last_target_node = NUMA_NO_NODE,
>  };
>
> @@ -1023,7 +1034,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>  	mmu_notifier_invalidate_range_end(&range);
>
>  	spin_lock(pte_ptl);
> -	result = __collapse_huge_page_isolate(vma, address, pte,
> +	result = __collapse_huge_page_isolate(vma, address, pte, cc,
>  					      &compound_pagelist);
>  	spin_unlock(pte_ptl);
>
> @@ -1114,7 +1125,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>  	     _pte++, _address += PAGE_SIZE) {
>  		pte_t pteval = *_pte;
>  		if (is_swap_pte(pteval)) {
> -			if (++unmapped <= khugepaged_max_ptes_swap) {
> +			if (++unmapped <= khugepaged_max_ptes_swap ||
> +			    !cc->is_khugepaged) {
>  				/*
>  				 * Always be strict with uffd-wp
>  				 * enabled swap entries.  Please see
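
(To be concrete, this swap check is one of the limit-first cases:

	if (++unmapped <= khugepaged_max_ptes_swap ||
	    !cc->is_khugepaged) {

while the equivalent check in khugepaged_scan_file() further down is
flag-first:

	if (cc->is_khugepaged &&
	    ++swap > khugepaged_max_ptes_swap) {

Same nit as above.)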

> @@ -1133,7 +1145,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>  		}
>  		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>  			if (!userfaultfd_armed(vma) &&
> -			    ++none_or_zero <= khugepaged_max_ptes_none) {
> +			    (++none_or_zero <= khugepaged_max_ptes_none ||
> +			     !cc->is_khugepaged)) {
>  				continue;
>  			} else {
>  				result = SCAN_EXCEED_NONE_PTE;
> @@ -1163,8 +1176,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>  			goto out_unmap;
>  		}
>
> -		if (page_mapcount(page) > 1 &&
> -		    ++shared > khugepaged_max_ptes_shared) {
> +		if (cc->is_khugepaged &&
> +		    page_mapcount(page) > 1 &&
> +		    ++shared > khugepaged_max_ptes_shared) {
>  			result = SCAN_EXCEED_SHARED_PTE;
>  			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
>  			goto out_unmap;
> @@ -1218,14 +1232,22 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
>  			result = SCAN_PAGE_COUNT;
>  			goto out_unmap;
>  		}
> -		if (pte_young(pteval) ||
> -		    page_is_young(page) || PageReferenced(page) ||
> -		    mmu_notifier_test_young(vma->vm_mm, address))
> +
> +		/*
> +		 * If collapse was initiated by khugepaged, check that there is
> +		 * enough young pte to justify collapsing the page
> +		 */
> +		if (cc->is_khugepaged &&
> +		    (pte_young(pteval) || page_is_young(page) ||
> +		     PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm,
> +								     address)))
>  			referenced++;
>  	}
>  	if (!writable) {
>  		result = SCAN_PAGE_RO;
> -	} else if (!referenced || (unmapped && referenced < HPAGE_PMD_NR/2)) {
> +	} else if (cc->is_khugepaged &&
> +		   (!referenced ||
> +		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
>  		result = SCAN_LACK_REFERENCED_PAGE;
>  	} else {
>  		result = SCAN_SUCCEED;
> @@ -1894,7 +1916,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file,
>  			continue;
>
>  		if (xa_is_value(page)) {
> -			if (++swap > khugepaged_max_ptes_swap) {
> +			if (cc->is_khugepaged &&
> +			    ++swap > khugepaged_max_ptes_swap) {
>  				result = SCAN_EXCEED_SWAP_PTE;
>  				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
>  				break;
> @@ -1945,7 +1968,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file,
>  	rcu_read_unlock();
>
>  	if (result == SCAN_SUCCEED) {
> -		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
> +		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none &&
> +		    cc->is_khugepaged) {
>  			result = SCAN_EXCEED_NONE_PTE;
>  			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
>  		} else {
> --
> 2.37.0.rc0.161.g10f37bed90-goog
>
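
And just to make the intent concrete for other readers: with this flag in
place, the MADV_COLLAPSE side can presumably initialize its context
roughly like the sketch below. This is hypothetical and based only on
this patch (the field names are from the hunks above; the real
MADV_COLLAPSE initialization lives in a later patch in the series):

	/*
	 * Hypothetical MADV_COLLAPSE-side setup: .is_khugepaged = false
	 * makes every cc->is_khugepaged test above fall through, so only
	 * the hard correctness checks (uffd, refcounts, writability)
	 * remain.
	 */
	struct collapse_control cc = {
		.is_khugepaged = false,
		.last_target_node = NUMA_NO_NODE,
	};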