From mboxrd@z Thu Jan  1 00:00:00 1970
From: Miaohe Lin <linmiaohe@huawei.com>
To: 
Cc: 
Subject: [PATCH v2 2/7] mm/khugepaged: stop swapping in page when VM_FAULT_RETRY occurs
Date: Sat, 25 Jun 2022 17:28:11 +0800
Message-ID: <20220625092816.4856-3-linmiaohe@huawei.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20220625092816.4856-1-linmiaohe@huawei.com>
References: <20220625092816.4856-1-linmiaohe@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain

When do_swap_page() returns VM_FAULT_RETRY, we do not retry the fault, so
the swap entry will remain in the page table and the collapse will fail
later anyway. Stop swapping in pages in this case to save cpu cycles.

As a further optimization, mmap_lock is now released inside
__collapse_huge_page_swapin() when it fails, which avoids dropping and
relocking mmap_lock in the caller. Also, "swapped_in++" is moved after the
error handling so the counter stays accurate.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/khugepaged.c | 32 ++++++++++++++------------------
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 8a103e0f8d2b..c6fc4eb8d77b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -940,8 +940,8 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
  * Bring missing pages in from swap, to complete THP collapse.
  * Only done if khugepaged_scan_pmd believes it is worthwhile.
  *
- * Called and returns without pte mapped or spinlocks held,
- * but with mmap_lock held to protect against vma changes.
+ * Called and returns without pte mapped or spinlocks held.
+ * Note that if false is returned, mmap_lock will be released.
  */

 static bool __collapse_huge_page_swapin(struct mm_struct *mm,
@@ -968,27 +968,24 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 			pte_unmap(vmf.pte);
 			continue;
 		}
-		swapped_in++;
 		ret = do_swap_page(&vmf);

-		/* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
+		/*
+		 * do_swap_page returns VM_FAULT_RETRY with released mmap_lock.
+		 * Note we treat VM_FAULT_RETRY as VM_FAULT_ERROR here because
+		 * we do not retry here and swap entry will remain in pagetable
+		 * resulting in later failure.
+		 */
 		if (ret & VM_FAULT_RETRY) {
-			mmap_read_lock(mm);
-			if (hugepage_vma_revalidate(mm, haddr, &vma)) {
-				/* vma is no longer available, don't continue to swapin */
-				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
-				return false;
-			}
-			/* check if the pmd is still valid */
-			if (mm_find_pmd(mm, haddr) != pmd) {
-				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
-				return false;
-			}
+			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
+			return false;
 		}
 		if (ret & VM_FAULT_ERROR) {
+			mmap_read_unlock(mm);
 			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 			return false;
 		}
+		swapped_in++;
 	}

 	/* Drain LRU add pagevec to remove extra pin on the swapped in pages */
@@ -1054,13 +1051,12 @@ static void collapse_huge_page(struct mm_struct *mm,
 	}

 	/*
-	 * __collapse_huge_page_swapin always returns with mmap_lock locked.
-	 * If it fails, we release mmap_lock and jump out_nolock.
+	 * __collapse_huge_page_swapin will return with mmap_lock released
+	 * when it fails. So we jump out_nolock directly in that case.
 	 * Continuing to collapse causes inconsistency.
 	 */
 	if (unmapped && !__collapse_huge_page_swapin(mm, vma, address,
 						     pmd, referenced)) {
-		mmap_read_unlock(mm);
 		goto out_nolock;
 	}

-- 
2.23.0
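
Note (illustration only, not part of the patch): the lock-ownership change
above (the helper now drops mmap_lock itself on failure, so the caller jumps
to out_nolock without unlocking) can be modelled with a small standalone
userspace sketch. This is not kernel code: fake_mm, try_swapin() and
collapse() are made-up names, and a pthread rwlock merely stands in for
mmap_lock.

/*
 * Userspace model of the locking contract after this patch: the swap-in
 * helper releases the read lock itself on failure, so the caller must not
 * unlock again and simply bails out.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_mm {
	pthread_rwlock_t mmap_lock;	/* stands in for mm->mmap_lock */
};

/*
 * Returns false after releasing the lock, mirroring the new behaviour of
 * __collapse_huge_page_swapin() when do_swap_page() fails or retries.
 */
static bool try_swapin(struct fake_mm *mm, bool simulate_failure)
{
	if (simulate_failure) {
		pthread_rwlock_unlock(&mm->mmap_lock);	/* helper drops the lock */
		return false;
	}
	return true;	/* success: the caller still holds the lock */
}

static void collapse(struct fake_mm *mm, bool simulate_failure)
{
	pthread_rwlock_rdlock(&mm->mmap_lock);

	if (!try_swapin(mm, simulate_failure)) {
		/* Like "goto out_nolock": no unlock here, the helper already did it. */
		printf("swap-in failed, lock already released\n");
		return;
	}

	/* ... collapse work would continue here with the lock held ... */
	pthread_rwlock_unlock(&mm->mmap_lock);
	printf("swap-in succeeded, collapse continues\n");
}

int main(void)
{
	struct fake_mm mm;

	pthread_rwlock_init(&mm.mmap_lock, NULL);
	collapse(&mm, true);	/* failure path: callee unlocks */
	collapse(&mm, false);	/* success path: caller unlocks */
	pthread_rwlock_destroy(&mm.mmap_lock);
	return 0;
}

Build with "cc -o sketch sketch.c -lpthread"; both calls terminate cleanly,
showing that the caller never double-unlocks on the failure path.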