From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Yin, Fengwei"
Date: Fri, 10 Nov 2023 11:50:32 +0800
Subject: Re: [Question]: major faults are still triggered after mlockall when numa balancing
To: Kefeng Wang, Yang Shi, "zhangpeng (AS)", Aneesh Kumar K.V
CC: Matthew Wilcox, Nanyong Sun
Message-ID: <98479379-0fff-409a-a60d-2233da114588@intel.com>
In-Reply-To: <56e1e123-f593-443a-be5b-754cbfb0e611@huawei.com>
References: <9e62fd9a-bee0-52bf-50a7-498fa17434ee@huawei.com> <648aa9dc-fc42-4f28-af9a-b24adfdcd43d@intel.com> <56e1e123-f593-443a-be5b-754cbfb0e611@huawei.com>

On 11/10/2023 11:39 AM, Kefeng Wang wrote:
>
>
> On 2023/11/10 9:57, Yin, Fengwei wrote:
>>
>>
>> On 11/10/2023 6:54 AM, Yang Shi wrote:
>>> On Thu, Nov 9, 2023 at 5:48 AM zhangpeng (AS) wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> There is a performance issue that has been bothering us recently.
>>>> This problem can reproduce in the latest mainline version (Linux 6.6).
>>>>
>>>> We use mlockall(MCL_CURRENT | MCL_FUTURE) in the user mode process
>>>> to avoid performance problems caused by major fault.
>>>>
>>>> There is a stage in numa fault which will set pte as 0 in do_numa_page() :
>>>> ptep_modify_prot_start() will clear the vmf->pte, until
>>>> ptep_modify_prot_commit() assign a value to the vmf->pte.
>>>>
>>>> For the data segment of the user-mode program, the global variable area
>>>> is a private mapping. After the pagecache is loaded, the private
>>>> anonymous page is generated after the COW is triggered. Mlockall can
>>>> lock COW pages (anonymous pages), but the original file pages cannot
>>>> be locked and may be reclaimed. If the global variable (private anon page)
>>>> is accessed when vmf->pte is zero which is concurrently set by numa fault,
>>>> a file page fault will be triggered.
>>>>
>>>> At this time, the original private file page may have been reclaimed.
>>>> If the page cache is not available at this time, a major fault will be
>>>> triggered and the file will be read, causing additional overhead.
>>>>
>>>> Our problem scenario is as follows:
>>>>
>>>> task 1                      task 2
>>>> ------                      ------
>>>> /* scan global variables */
>>>> do_numa_page()
>>>>     spin_lock(vmf->ptl)
>>>>     ptep_modify_prot_start()
>>>>     /* set vmf->pte as null */
>>>>                               /* Access global variables */
>>>>                               handle_pte_fault()
>>>>                                 /* no pte lock */
>>>>                                 do_pte_missing()
>>>>                                   do_fault()
>>>>                                     do_read_fault()
>>>>     ptep_modify_prot_commit()
>>>>     /* ptep update done */
>>>>     pte_unmap_unlock(vmf->pte, vmf->ptl)
>>>>                                       do_fault_around()
>>>>                                       __do_fault()
>>>>                                         filemap_fault()
>>>>                                           /* page cache is not available
>>>>                                           and a major fault is triggered */
>>>>                                           do_sync_mmap_readahead()
>>>>                                           /* page_not_uptodate and goto
>>>>                                           out_retry. */
>>>>
>>>> Is there any way to avoid such a major fault?
>>>
>>> IMHO I don't think it is a bug. The man page quoted by Willy says "All
>>> mapped pages are guaranteed to be resident in RAM when the call
>>> returns successfully", but the later COW already made the file page
>>> unmapped, right? The PTE pointed to the COW'ed anon page.
>>> Hypothetically if we kept the file page mlocked and unmapped,
>>> munlock() would have not munlocked the file page at all, it would be
>>> mlocked in memory forever.
>> But in this case, even the COW page is mlocked. There is small window
>> that PTE is set to null in do_numa_page().
>> data segment access (it's to
>> COW page which has nothing to do with original page cache) happens in
>> this small window will trigger filemap_fault() to fault in original
>> page cache.
>>
>> I had thought to do double check whether vmf->pte is NULL in do_read_fault().
>> But it's not reliable enough.
>>
>> Matthew's idea to use protnone to block both hardware accessing and
>> do_pte_missing() looks more promising to me.
>
> Actual, we could revert the following patch to avoid this issue,
> but this workaroud from ppc...
>
> commit cee216a696b2004017a5ecb583366093d90b1568
> Author: Aneesh Kumar K.V
> Date:   Fri Feb 24 14:59:13 2017 -0800
>
>     mm/autonuma: don't use set_pte_at when updating protnone ptes
>
>     Architectures like ppc64, use privilege access bit to mark pte non
>     accessible.  This implies that kernel can do a copy_to_user to an
>     address marked for numa fault.  This also implies that there can be a
>     parallel hardware update for the pte.  set_pte_at cannot be used in such
>     scenarios.  Hence switch the pte update to use ptep_get_and_clear and
>     set_pte_at combination.
Oh. This means the protnone doesn't work for PPC.
>
>>
>>
>> Regards
>> Yin, Fengwei
>>
>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Peng
>>>>
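
For anyone who wants to watch for the symptom from user space, below is a
minimal sketch. It is not the reporter's test program. The helper names
(major_faults, majflt_during_writes) and the global "counter" are
hypothetical, chosen only for illustration. It only shows how to call
mlockall(MCL_CURRENT | MCL_FUTURE) and count major faults through
getrusage(); hitting the race itself presumably also needs NUMA balancing
enabled and the backing page cache reclaimed, which this sketch does not
arrange.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical global in the data segment (a private file mapping until COW). */
static volatile unsigned long counter = 1;

/* Major faults taken by this process so far, or -1 if getrusage() fails. */
static long major_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_majflt;
}

/*
 * Write to the global `iterations` times and return the number of major
 * faults taken while doing so (-1 on error). If lock_first is non-zero,
 * mlockall(MCL_CURRENT | MCL_FUTURE) is attempted before the loop.
 */
static long majflt_during_writes(unsigned long iterations, int lock_first)
{
    long before, after;

    /* mlockall() can fail (e.g. RLIMIT_MEMLOCK); report it and carry on. */
    if (lock_first && mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");

    before = major_faults();
    for (unsigned long i = 0; i < iterations; i++)
        counter += i;   /* first write triggers COW, later ones hit the anon page */
    after = major_faults();

    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling majflt_during_writes(n, 1) from a long-running thread and logging
any non-zero result would show whether major faults still occur after
mlockall() on a given setup.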