From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [RFC PATCH v2 3/5] padata: dispatch works on different nodes
Date: Fri, 8 Dec 2023 10:52:38 +0800
Message-Id: <20231208025240.4744-4-gang.li@linux.dev>
In-Reply-To: <20231208025240.4744-1-gang.li@linux.dev>
References: <20231208025240.4744-1-gang.li@linux.dev>

When a group of tasks that access different nodes is scheduled on the
same node, the tasks may encounter bandwidth bottlenecks and increased
access latency.

Introduce a numa_aware flag that allows the work items to be distributed
across different nodes, so that multi-node systems can be used to full
advantage.

Signed-off-by: Gang Li
---
 include/linux/padata.h | 2 ++
 kernel/padata.c        | 8 ++++++--
 mm/mm_init.c           | 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 495b16b6b4d72..f6c58c30ed96a 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -137,6 +137,7 @@ struct padata_shell {
  *             appropriate for one worker thread to do at once.
  * @max_threads: Max threads to use for the job, actual number may be less
  *               depending on task size and minimum chunk size.
+ * @numa_aware: Dispatch jobs to different nodes.
  */
 struct padata_mt_job {
 	void (*thread_fn)(unsigned long start, unsigned long end, void *arg);
@@ -146,6 +147,7 @@ struct padata_mt_job {
 	unsigned long		align;
 	unsigned long		min_chunk;
 	int			max_threads;
+	bool			numa_aware;
 };
 
 /**
diff --git a/kernel/padata.c b/kernel/padata.c
index 179fb1518070c..80f82c563e46a 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -485,7 +485,7 @@ void __init padata_do_multithreaded(struct padata_mt_job *job)
 	struct padata_work my_work, *pw;
 	struct padata_mt_job_state ps;
 	LIST_HEAD(works);
-	int nworks;
+	int nworks, nid = 0;
 
 	if (job->size == 0)
 		return;
@@ -517,7 +517,11 @@ void __init padata_do_multithreaded(struct padata_mt_job *job)
 	ps.chunk_size = roundup(ps.chunk_size, job->align);
 
 	list_for_each_entry(pw, &works, pw_list)
-		queue_work(system_unbound_wq, &pw->pw_work);
+		if (job->numa_aware)
+			queue_work_node((++nid % num_node_state(N_MEMORY)),
+					system_unbound_wq, &pw->pw_work);
+		else
+			queue_work(system_unbound_wq, &pw->pw_work);
 
 	/* Use the current thread, which saves starting a workqueue worker. */
 	padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 077bfe393b5e2..1226f0c81fcb3 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2234,6 +2234,7 @@ static int __init deferred_init_memmap(void *data)
 		.align       = PAGES_PER_SECTION,
 		.min_chunk   = PAGES_PER_SECTION,
 		.max_threads = max_threads,
+		.numa_aware  = false,
 	};
 
 	padata_do_multithreaded(&job);
-- 
2.30.2
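
For illustration only, the round-robin placement used in the
kernel/padata.c hunk can be modelled in userspace roughly as follows.
This is a sketch, not kernel code: pick_node() and
dispatch_round_robin() are hypothetical names, and the model assumes
that node ids are contiguous in the range 0..nr_nodes-1, which the
kernel does not guarantee for nodes with memory. It also assumes the
counter starts from a known value.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the node choice made per work item.
 * The counter is advanced first and then reduced modulo the node count,
 * mirroring "++nid % num_node_state(N_MEMORY)" from the patch.
 * Assumption: node ids are contiguous, 0..nr_nodes-1.
 */
static int pick_node(int *nid, int nr_nodes)
{
	assert(nid != NULL && nr_nodes > 0);
	*nid = (*nid + 1) % nr_nodes;	/* keep the counter bounded */
	return *nid;
}

/* Count how many of nr_works work items land on each node. */
static void dispatch_round_robin(int nr_works, int nr_nodes, int *per_node)
{
	int nid = 0;	/* the counter must start from a known value */

	for (int i = 0; i < nr_nodes; i++)
		per_node[i] = 0;
	for (int i = 0; i < nr_works; i++)
		per_node[pick_node(&nid, nr_nodes)]++;
}
```

With six work items and four nodes, this model places the items on
nodes 1, 2, 3, 0, 1, 2 in that order, so no node receives more than one
item more than any other.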