From: Miaohe Lin <linmiaohe@huawei.com>
To: Yang Shi, Zach O'Keefe
CC: Andrew Morton, Andrea Arcangeli, Matthew Wilcox, Vlastimil Babka,
 David Howells, NeilBrown, Alistair Popple, David Hildenbrand,
 Suren Baghdasaryan, Peter Xu, Linux MM, Linux Kernel Mailing List
Subject: Re: [PATCH 2/7] mm/khugepaged: stop swapping in page when VM_FAULT_RETRY occurs
Date: Thu, 16 Jun 2022 14:08:23 +0800
Message-ID: <3ab39c38-eef5-502c-d290-d745aff7b0bd@huawei.com>
References: <20220611084731.55155-1-linmiaohe@huawei.com>
 <20220611084731.55155-3-linmiaohe@huawei.com>

On 2022/6/16 1:51, Yang Shi wrote:
> On Wed, Jun 15, 2022 at 8:14 AM Zach O'Keefe wrote:
>>
>> On 11 Jun 16:47, Miaohe Lin wrote:
>>> When do_swap_page returns VM_FAULT_RETRY, we do not retry here, and thus
>>> the swap entry will remain in the page table. This will result in a later
>>> failure. So stop swapping in pages in this case to save CPU cycles.
>>>
>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>> ---
>>>  mm/khugepaged.c | 19 ++++++++-----------
>>>  1 file changed, 8 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 73570dfffcec..a8adb2d1e9c6 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -1003,19 +1003,16 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>>>  		swapped_in++;
>>>  		ret = do_swap_page(&vmf);
>>>
>>> -		/* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
>>> +		/*
>>> +		 * do_swap_page returns VM_FAULT_RETRY with released mmap_lock.
>>> +		 * Note we treat VM_FAULT_RETRY as VM_FAULT_ERROR here because
>>> +		 * we do not retry here and swap entry will remain in pagetable
>>> +		 * resulting in later failure.
>>> +		 */
>>>  		if (ret & VM_FAULT_RETRY) {
>>>  			mmap_read_lock(mm);
>>> -			if (hugepage_vma_revalidate(mm, haddr, &vma)) {
>>> -				/* vma is no longer available, don't continue to swapin */
>>> -				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> -				return false;
>>> -			}
>>> -			/* check if the pmd is still valid */
>>> -			if (mm_find_pmd(mm, haddr) != pmd) {
>>> -				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> -				return false;
>>> -			}
>>> +			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> +			return false;
>>>  		}
>>>  		if (ret & VM_FAULT_ERROR) {
>>>  			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> --
>>> 2.23.0
>>>
>>>
>>
>> I've convinced myself this is correct, but I don't understand how we got here.
>> AFAICT, we've always continued to fault in pages and, as you mention, don't
>> retry ones that have failed with VM_FAULT_RETRY, so
>> __collapse_huge_page_isolate() should fail. I don't think (?) there is any
>> benefit to continuing to swap if we don't handle VM_FAULT_RETRY appropriately.
>>
>> So I think this change looks good from that perspective. I suppose the only
>> other question would be: should we handle the VM_FAULT_RETRY case? Maybe one
>> additional attempt, then fail? AFAIK, this mostly (?) happens when the page is
>> locked. Maybe it's not worth the extra complexity, though.
>
> It should be unnecessary for khugepaged IMHO since it will scan all
> the valid mm periodically, so it will come back eventually.

I tend to agree with Yang. Khugepaged will come back eventually, so it's not
worth the extra complexity. Thanks both!

>
>>
> .
>
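P.S. For anyone skimming the thread: the "later failure" the changelog refers
to happens in __collapse_huge_page_isolate(), which cannot collapse over a PTE
that still holds a swap entry. Paraphrased from memory (not a verbatim quote
of mm/khugepaged.c), the bail-out in question looks roughly like this:

	/*
	 * Rough paraphrase of the scan in __collapse_huge_page_isolate():
	 * a PTE still holding a swap entry is !pte_present(), so the scan
	 * aborts and the whole collapse attempt fails.
	 */
	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
	     _pte++, address += PAGE_SIZE) {
		pte_t pteval = *_pte;

		/* ... pte_none()/zero-pfn handling elided ... */
		if (!pte_present(pteval)) {
			result = SCAN_PTE_NON_PRESENT;
			goto out;
		}
		/* ... */
	}

So once one swap-in fails with VM_FAULT_RETRY, continuing to fault in the
remaining pages only burns cycles on a collapse that is already doomed.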
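And for the record, the "one additional attempt" Zach floated might look
something like the untested sketch below. The bounded retry and the reuse of
the revalidation steps this patch removes are my own guesses, not code from
this series; it also glosses over re-initializing vmf and the mmap_lock state
on the failure paths, which is exactly the extra complexity being weighed.

	/* Untested sketch only: one bounded retry after VM_FAULT_RETRY. */
	ret = do_swap_page(&vmf);
	if (ret & VM_FAULT_RETRY) {
		/* do_swap_page() dropped mmap_lock; retake it and recheck */
		mmap_read_lock(mm);
		if (!hugepage_vma_revalidate(mm, haddr, &vma) &&
		    mm_find_pmd(mm, haddr) == pmd) {
			/* a real version would have to rebuild vmf here */
			ret = do_swap_page(&vmf); /* one more try, then give up */
		}
	}
	if (ret & (VM_FAULT_RETRY | VM_FAULT_ERROR)) {
		trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
		return false;
	}

Given that khugepaged revisits every valid mm periodically anyway, simply
giving up, as this patch does, still looks like the better trade-off.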