Date: Fri, 23 May 2025 16:22:05 -0700
From: Shakeel Butt
To: Chen Yu
Cc: peterz@infradead.org, akpm@linux-foundation.org, mkoutny@suse.com,
	mingo@redhat.com, tj@kernel.org, hannes@cmpxchg.org, corbet@lwn.net,
	mgorman@suse.de, mhocko@kernel.org, muchun.song@linux.dev,
	roman.gushchin@linux.dev, tim.c.chen@intel.com, aubrey.li@intel.com,
	libo.chen@oracle.com, kprateek.nayak@amd.com, vineethr@linux.ibm.com,
	venkat88@linux.ibm.com, ayushjai@amd.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, yu.chen.surf@foxmail.com, Ayush Jain
Subject: Re: [PATCH v5 1/2] sched/numa: fix task swap by skipping kernel threads

On Fri, May 23, 2025 at 08:51:01PM +0800, Chen Yu wrote:
> From: Libo Chen
>
> Task swapping is triggered when there are no idle CPUs in
> task A's preferred node. In this case, the NUMA load balancer
> chooses a task B on A's preferred node and swaps B with A. This
> helps improve NUMA locality without introducing load imbalance
> between nodes. In the current implementation, B's NUMA node
> preference is not mandatory. That is to say, a kernel thread
> might be incorrectly chosen as B. However, kernel threads and
> user-space threads that do not have an mm are not supposed to be
> covered by NUMA balancing, because NUMA balancing only considers
> user pages via VMAs.
>
> According to Peter's suggestion for fixing this issue, we use
> PF_KTHREAD to skip kernel threads. curr->mm is also checked
> because it is possible that user_mode_thread() might create a
> user thread without an mm. As per Prateek's analysis, after
> adding the PF_KTHREAD check, there is no need to further check
> the PF_IDLE flag:
> "
> - play_idle_precise() already ensures PF_KTHREAD is set before adding
>   PF_IDLE
>
> - cpu_startup_entry() is only called from the startup thread which
>   should be marked with PF_KTHREAD (based on my understanding looking at
>   commit cff9b2332ab7 ("kernel/sched: Modify initial boot task idle
>   setup"))
> "
>
> In summary, the check in task_numa_compare() now aligns with
> task_tick_numa().
>
> Suggested-by: Michal Koutny
> Tested-by: Ayush Jain
> Signed-off-by: Libo Chen
> Tested-by: Venkat Rao Bagalkote
> Signed-off-by: Chen Yu

Reviewed-by: Shakeel Butt
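
For readers following along, the check described in the commit message can be
modelled in plain userspace C roughly as below. This is only a sketch under
stated assumptions, not the patch itself: the sketch_* names, the struct
layout, and the flag values are hypothetical stand-ins for the kernel's
task_struct, PF_KTHREAD, and PF_IDLE, and the real logic lives in
task_numa_compare().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for PF_IDLE and PF_KTHREAD; the real flags are
 * defined by the kernel, and these values are for illustration only. */
#define SKETCH_PF_IDLE    0x00000002u
#define SKETCH_PF_KTHREAD 0x00200000u

/* Hypothetical placeholder for an address space (mm). */
struct sketch_mm { int unused; };

/* Hypothetical, heavily reduced model of a task. */
struct sketch_task {
	unsigned int flags;
	struct sketch_mm *mm;
};

/*
 * Returns true when a candidate task B should be skipped as a swap
 * partner: it is a kernel thread, or it has no mm. PF_IDLE is not
 * tested separately because, per the quoted analysis, idle tasks
 * already carry PF_KTHREAD.
 */
bool sketch_skip_numa_swap(const struct sketch_task *cur)
{
	return (cur->flags & SKETCH_PF_KTHREAD) || cur->mm == NULL;
}

/* Exercises the four cases discussed in the commit message. */
void sketch_demo(void)
{
	struct sketch_mm mm = { 0 };
	struct sketch_task user    = { 0, &mm };
	struct sketch_task kthread = { SKETCH_PF_KTHREAD, NULL };
	struct sketch_task idle    = { SKETCH_PF_KTHREAD | SKETCH_PF_IDLE, NULL };
	struct sketch_task no_mm   = { 0, NULL };

	assert(!sketch_skip_numa_swap(&user));   /* ordinary user task: eligible */
	assert(sketch_skip_numa_swap(&kthread)); /* kernel thread: skipped */
	assert(sketch_skip_numa_swap(&idle));    /* idle task: covered by PF_KTHREAD */
	assert(sketch_skip_numa_swap(&no_mm));   /* user thread without mm: skipped */
}
```

The mm test is kept separate from the flag test because, as the commit
message notes, user_mode_thread() might create a user thread that has no mm
and does not carry PF_KTHREAD.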