From: "Huang, Ying"
To: Mel Gorman
Cc: Michal Hocko, Arjan Van De Ven, Andrew Morton, Vlastimil Babka,
    David Hildenbrand, Johannes Weiner, Dave Hansen, Pavel Tatashin,
    Matthew Wilcox
Subject: Re: [RFC 2/2] mm: alloc/free depth based PCP high auto-tuning
References: <20230710065325.290366-1-ying.huang@intel.com>
    <20230710065325.290366-3-ying.huang@intel.com>
    <20230712090526.thk2l7sbdcdsllfi@techsingularity.net>
    <871qhcdwa1.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <20230714140710.5xbesq6xguhcbyvi@techsingularity.net>
    <87pm4qdhk4.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <20230717135017.7ro76lsaninbazvf@techsingularity.net>
Date: Tue, 18 Jul 2023 08:55:16 +0800
In-Reply-To: <20230717135017.7ro76lsaninbazvf@techsingularity.net>
    (Mel Gorman's message of "Mon, 17 Jul 2023 14:50:17 +0100")
Message-ID: <87lefeca2z.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii

Mel Gorman writes:

> On Mon, Jul 17, 2023 at 05:16:11PM +0800, Huang, Ying wrote:
>> Mel Gorman writes:
>>
>> > Batch should have a much lower maximum than high because it's a deferred cost
>> > that gets assigned to an arbitrary task. The worst case is where a process
>> > that is a light user of the allocator incurs the full cost of a refill/drain.
>> >
>> > Again, intuitively this may be a PID Control problem for the "Mix" case
>> > to estimate the size of high required to minimise drains/allocs as each
>> > drain/alloc is potentially a lock contention.
>> > The catchall for corner
>> > cases would be to decay high from vmstat context based on pcp->expires. The
>> > decay would prevent "high" being pinned at an artificially high value
>> > without any zone lock contention for prolonged periods of time and also
>> > mitigate the worst case due to state being per-cpu. The downside is that "high"
>> > would also oscillate for a continuous steady allocation pattern, as the PID
>> > control might pick an ideal value suitable for a long period of time with
>> > the "decay" disrupting that ideal value.
>>
>> Maybe we can track the minimal value of pcp->count. If it has been small
>> enough recently, we can avoid decaying pcp->high, because the pages in the
>> PCP are being used for allocations rather than sitting idle.
>
> Implement as a separate patch. I suspect this type of heuristic will be
> very benchmark specific and the complexity may not be worth it in the
> general case.

OK.

>> Another question is as follows.
>>
>> For example, on CPU A, a large number of pages are freed, and we
>> maximize batch and high, so a large number of pages are put in the PCP.
>> Then, the possible situations are:
>>
>> a) a large number of pages are allocated on CPU A after some time
>> b) a large number of pages are allocated on another CPU B
>>
>> For a), we want the pages to be kept in the PCP of CPU A as long as possible.
>> For b), we want the pages to be kept in the PCP of CPU A for as short a
>> time as possible. I think that we need to balance between them. What is a
>> reasonable time to keep pages in the PCP without many allocations?
>>
>
> This would be a case where you're relying on vmstat to drain the PCP after
> a period of time as it is a corner case.

Yes. The remaining question is: how long should "a period of time" be?
If it's long, the pages in the PCP can be used for allocations after some
time. If it's short, the pages can be returned to the buddy allocator
sooner, so they can be used by other workloads if needed.

Anyway, I will run some experiments on that.
> You cannot reasonably detect the pattern on two separate per-cpu lists
> without either inspecting remote CPU state or maintaining global
> state. Either would incur cache miss penalties that probably cost more
> than the heuristic saves.

Yes. Totally agree.

Best Regards,
Huang, Ying