Date: Tue, 8 Mar 2022 13:34:07 -0800
In-Reply-To: <20220308213417.1407042-1-zokeefe@google.com>
Message-Id: <20220308213417.1407042-5-zokeefe@google.com>
Mime-Version: 1.0
References: <20220308213417.1407042-1-zokeefe@google.com>
X-Mailer: git-send-email 2.35.1.616.g0bdcbb4464-goog
Subject: [RFC PATCH 04/14] mm/khugepaged: separate khugepaged_scan_pmd() scan and collapse
From: "Zach O'Keefe"
To: Alex Shi, David Hildenbrand, David Rientjes, Michal Hocko, Pasha Tatashin,
	SeongJae Park, Song Liu, Vlastimil Babka, Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
	Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
	Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
	"Kirill A. Shutemov", Matthew Wilcox, Matt Turner, Max Filippov,
	Miaohe Lin, Minchan Kim, Patrick Xia, Pavel Begunkov, Peter Xu,
	Richard Henderson, Thomas Bogendoerfer, Yang Shi, "Zach O'Keefe"
Content-Type: text/plain; charset="UTF-8"

khugepaged_scan_pmd() currently does two things: (1) scans the pmd to see
if it is suitable for collapse, then (2) performs the collapse if the scan
succeeds. Separate out (1) so that it can be reused by itself later in the
series, and introduce a struct scan_pmd_result to gather data about the
scan.
Signed-off-by: Zach O'Keefe
---
 mm/khugepaged.c | 107 ++++++++++++++++++++++++++++++------------------
 1 file changed, 67 insertions(+), 40 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e3399a451662..b204bc1eefa7 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1244,27 +1244,34 @@ static void collapse_huge_page(struct mm_struct *mm,
 	return;
 }
 
-static int khugepaged_scan_pmd(struct mm_struct *mm,
-			       struct vm_area_struct *vma,
-			       unsigned long address,
-			       struct page **hpage,
-			       struct collapse_control *cc)
+struct scan_pmd_result {
+	int result;
+	bool writable;
+	int referenced;
+	int unmapped;
+	int none_or_zero;
+	struct page *head;
+};
+
+static void scan_pmd(struct mm_struct *mm,
+		     struct vm_area_struct *vma,
+		     unsigned long address,
+		     struct collapse_control *cc,
+		     struct scan_pmd_result *scan_result)
 {
 	pmd_t *pmd;
 	pte_t *pte, *_pte;
-	int ret = 0, result = 0, referenced = 0;
-	int none_or_zero = 0, shared = 0;
+	int shared = 0;
 	struct page *page = NULL;
 	unsigned long _address;
 	spinlock_t *ptl;
-	int node = NUMA_NO_NODE, unmapped = 0;
-	bool writable = false;
+	int node = NUMA_NO_NODE;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
 	pmd = mm_find_pmd(mm, address);
 	if (!pmd) {
-		result = SCAN_PMD_NULL;
+		scan_result->result = SCAN_PMD_NULL;
 		goto out;
 	}
 
@@ -1274,7 +1281,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	     _pte++, _address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
-			if (++unmapped <= khugepaged_max_ptes_swap ||
+			if (++scan_result->unmapped <=
+			    khugepaged_max_ptes_swap ||
 			    !cc->enforce_pte_scan_limits) {
 				/*
 				 * Always be strict with uffd-wp
@@ -1282,23 +1290,24 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 				 * comment below for pte_uffd_wp().
 				 */
 				if (pte_swp_uffd_wp(pteval)) {
-					result = SCAN_PTE_UFFD_WP;
+					scan_result->result = SCAN_PTE_UFFD_WP;
 					goto out_unmap;
 				}
 				continue;
 			} else {
-				result = SCAN_EXCEED_SWAP_PTE;
+				scan_result->result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
 				goto out_unmap;
 			}
 		}
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
 			if (!userfaultfd_armed(vma) &&
-			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			    (++scan_result->none_or_zero <=
+			     khugepaged_max_ptes_none ||
 			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
-				result = SCAN_EXCEED_NONE_PTE;
+				scan_result->result = SCAN_EXCEED_NONE_PTE;
 				count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 				goto out_unmap;
 			}
@@ -1313,22 +1322,22 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			 * userfault messages that falls outside of
 			 * the registered range. So, just be simple.
 			 */
-			result = SCAN_PTE_UFFD_WP;
+			scan_result->result = SCAN_PTE_UFFD_WP;
 			goto out_unmap;
 		}
 		if (pte_write(pteval))
-			writable = true;
+			scan_result->writable = true;
 
 		page = vm_normal_page(vma, _address, pteval);
 		if (unlikely(!page)) {
-			result = SCAN_PAGE_NULL;
+			scan_result->result = SCAN_PAGE_NULL;
 			goto out_unmap;
 		}
 
 		if (page_mapcount(page) > 1 &&
 				++shared > khugepaged_max_ptes_shared &&
 				cc->enforce_pte_scan_limits) {
-			result = SCAN_EXCEED_SHARED_PTE;
+			scan_result->result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out_unmap;
 		}
@@ -1338,25 +1347,25 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		/*
 		 * Record which node the original page is from and save this
 		 * information to cc->node_load[].
-		 * Khugepaged will allocate hugepage from the node has the max
+		 * Caller should allocate hugepage from the node that has the max
 		 * hit record.
 		 */
 		node = page_to_nid(page);
 		if (khugepaged_scan_abort(node, cc)) {
-			result = SCAN_SCAN_ABORT;
+			scan_result->result = SCAN_SCAN_ABORT;
 			goto out_unmap;
 		}
 		cc->node_load[node]++;
 		if (!PageLRU(page)) {
-			result = SCAN_PAGE_LRU;
+			scan_result->result = SCAN_PAGE_LRU;
 			goto out_unmap;
 		}
 		if (PageLocked(page)) {
-			result = SCAN_PAGE_LOCK;
+			scan_result->result = SCAN_PAGE_LOCK;
 			goto out_unmap;
 		}
 		if (!PageAnon(page)) {
-			result = SCAN_PAGE_ANON;
+			scan_result->result = SCAN_PAGE_ANON;
 			goto out_unmap;
 		}
 
@@ -1378,35 +1387,53 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		 * will be done again later the risk seems low.
 		 */
 		if (!is_refcount_suitable(page)) {
-			result = SCAN_PAGE_COUNT;
+			scan_result->result = SCAN_PAGE_COUNT;
 			goto out_unmap;
 		}
 		if (pte_young(pteval) || page_is_young(page) ||
 		    PageReferenced(page) ||
 		    mmu_notifier_test_young(vma->vm_mm, address))
-			referenced++;
+			scan_result->referenced++;
 	}
-	if (!writable) {
-		result = SCAN_PAGE_RO;
-	} else if (!referenced || (unmapped && referenced < HPAGE_PMD_NR/2)) {
-		result = SCAN_LACK_REFERENCED_PAGE;
+	if (!scan_result->writable) {
+		scan_result->result = SCAN_PAGE_RO;
+	} else if (!scan_result->referenced ||
+		   (scan_result->unmapped &&
+		    scan_result->referenced < HPAGE_PMD_NR / 2)) {
+		scan_result->result = SCAN_LACK_REFERENCED_PAGE;
 	} else {
-		result = SCAN_SUCCEED;
-		ret = 1;
+		scan_result->result = SCAN_SUCCEED;
 	}
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
-	if (ret) {
+out:
+	scan_result->head = page;
+}
+
+static int khugepaged_scan_pmd(struct mm_struct *mm,
+			       struct vm_area_struct *vma,
+			       unsigned long address,
+			       struct page **hpage,
+			       struct collapse_control *cc)
+{
+	int node;
+	struct scan_pmd_result scan_result = {};
+
+	scan_pmd(mm, vma, address, cc, &scan_result);
+	if (scan_result.result == SCAN_SUCCEED) {
 		node = khugepaged_find_target_node(cc);
 		/* collapse_huge_page will return with the mmap_lock released */
-		collapse_huge_page(mm, address, hpage, node,
-				   referenced, unmapped,
-				   cc->enforce_pte_scan_limits);
+		collapse_huge_page(mm, address, hpage, node,
+				   scan_result.referenced, scan_result.unmapped,
+				   cc->enforce_pte_scan_limits);
 	}
-out:
-	trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
-			none_or_zero, result, unmapped);
-	return ret;
+
+	trace_mm_khugepaged_scan_pmd(mm, scan_result.head, scan_result.writable,
+				     scan_result.referenced,
+				     scan_result.none_or_zero,
+				     scan_result.result, scan_result.unmapped);
+
+	return scan_result.result == SCAN_SUCCEED;
 }
 
 static void collect_mm_slot(struct mm_slot *mm_slot)
-- 
2.35.1.616.g0bdcbb4464-goog