From: "Aneesh Kumar K.V"
To: "zhangpeng (AS)", linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, Matthew Wilcox, lstoakes@gmail.com,
    hughd@google.com, david@redhat.com, fengwei.yin@intel.com,
    vbabka@suse.cz, peterz@infradead.org, mgorman@suse.de,
    mingo@redhat.com, riel@redhat.com, ying.huang@intel.com,
    hannes@cmpxchg.org, Nanyong Sun, Kefeng Wang
Subject: Re: [Question]: major faults are still triggered after mlockall when numa balancing
In-Reply-To: <9e62fd9a-bee0-52bf-50a7-498fa17434ee@huawei.com>
References: <9e62fd9a-bee0-52bf-50a7-498fa17434ee@huawei.com>
Date: Fri, 10 Nov 2023 10:34:23 +0530
Message-ID: <87h6lunqqw.fsf@linux.ibm.com>

"zhangpeng (AS)" writes:

> Hi everyone,
>
> There is a performance issue that has been bothering us recently.
> This problem can reproduce in the latest mainline version (Linux 6.6).
>
> We use mlockall(MCL_CURRENT | MCL_FUTURE) in the user mode process
> to avoid performance problems caused by major fault.
>
> There is a stage in numa fault which will set pte as 0 in do_numa_page() :
> ptep_modify_prot_start() will clear the vmf->pte, until
> ptep_modify_prot_commit() assign a value to the vmf->pte.
>
> For the data segment of the user-mode program, the global variable area
> is a private mapping. After the pagecache is loaded, the private
> anonymous page is generated after the COW is triggered. Mlockall can
> lock COW pages (anonymous pages), but the original file pages cannot
> be locked and may be reclaimed. If the global variable (private anon page)
> is accessed when vmf->pte is zero which is concurrently set by numa fault,
> a file page fault will be triggered.
>
> At this time, the original private file page may have been reclaimed.
> If the page cache is not available at this time, a major fault will be
> triggered and the file will be read, causing additional overhead.
>
> Our problem scenario is as follows:
>
> task 1                        task 2
> ------                        ------
> /* scan global variables */
> do_numa_page()
>   spin_lock(vmf->ptl)
>   ptep_modify_prot_start()
>   /* set vmf->pte as null */
>                               /* Access global variables */
>                               handle_pte_fault()
>                                 /* no pte lock */
>                                 do_pte_missing()
>                                   do_fault()
>                                     do_read_fault()
>   ptep_modify_prot_commit()
>   /* ptep update done */
>   pte_unmap_unlock(vmf->pte, vmf->ptl)
>                                       do_fault_around()
>                                       __do_fault()
>                                         filemap_fault()
>                                           /* page cache is not available
>                                           and a major fault is triggered */
>                                           do_sync_mmap_readahead()
>                                           /* page_not_uptodate and goto
>                                           out_retry. */
>
> Is there any way to avoid such a major fault?
>

This is also true w.r.t change_pte_range() in addition to do_numa_page()?

-aneesh
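
For anyone who wants to watch for the symptom from user space, here is a minimal sketch of the measurement side only. It assumes the major fault count reported by getrusage() (ru_majflt) is a good enough signal. All helper names (major_faults, lock_all_memory, dirty_globals, touch_globals, count_major_faults) are hypothetical and not taken from the original report. The sketch does not enable NUMA balancing or force page cache reclaim, so on its own it is not expected to trigger the race described above.

```c
/* Sketch only: count major faults taken while touching .data globals
 * after mlockall(). Helper names are hypothetical. */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

#define NGLOBALS 4096

/* Non-zero initialiser keeps the array in .data, which is a private
 * file mapping of the executable. */
static volatile long globals[NGLOBALS] = { 1 };

/* Major fault count of this process so far, or -1 on error. */
static long major_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_majflt;
}

/* Lock current and future mappings. Returns 0 on success, -1 on
 * failure (for example when RLIMIT_MEMLOCK is too low). */
static int lock_all_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Write every element so the pages are COWed into private anonymous
 * pages, as in the scenario from the report. */
static void dirty_globals(void)
{
    for (size_t i = 0; i < NGLOBALS; i++)
        globals[i] = 1;
}

/* Read every element once; the sum is returned so the loads are used. */
static long touch_globals(void)
{
    long sum = 0;

    for (size_t i = 0; i < NGLOBALS; i++)
        sum += globals[i];
    return sum;
}

/* Major faults taken during `rounds` read passes over the globals,
 * or -1 if the counter could not be read. */
static long count_major_faults(int rounds)
{
    long before = major_faults();
    long after;

    if (before < 0)
        return -1;
    for (int r = 0; r < rounds; r++)
        (void)touch_globals();
    after = major_faults();
    return after < 0 ? -1 : after - before;
}
```

A caller would run lock_all_memory() and dirty_globals() once, then call count_major_faults() repeatedly while NUMA balancing is active (kernel.numa_balancing enabled) and the page cache is under reclaim pressure. A non-zero result after mlockall() would be consistent with the behaviour reported above, though it does not by itself prove this particular race.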