From: Hillf Danton <hdanton@sina.com>
To: Valentin Schneider
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lai Jiangshan, Peter Zijlstra, Frederic Weisbecker, Marcelo Tosatti
Subject: Re: [PATCH v4 4/4] workqueue: Unbind workers before sending them to exit()
Date: Wed, 5 Oct 2022 22:50:22 +0800
Message-Id: <20221005145022.1695-1-hdanton@sina.com>
References: <20221004150521.822266-1-vschneid@redhat.com> <20221005010832.1934-1-hdanton@sina.com>

On 05 Oct 2022 12:13:17 +0100 Valentin Schneider
>On 05/10/22 09:08, Hillf Danton wrote:
>> On 4 Oct 2022 16:05:21 +0100 Valentin Schneider
>>> It has been reported that isolated CPUs can suffer from interference due to
>>> per-CPU kworkers waking up just to die.
>>>
>>> A surge of workqueue activity during initial setup of a latency-sensitive
>>> application (refresh_vm_stats() being one of the culprits) can cause extra
>>> per-CPU kworkers to be spawned. Then, said latency-sensitive task can be
>>> running merrily on an isolated CPU only to be interrupted sometime later by
>>> a kworker marked for death (cf. IDLE_WORKER_TIMEOUT, 5 minutes after last
>>> kworker activity).
>>>
>> Is tick stopped on the isolated CPU? If tick can hit it then it can accept
>> more than exiting kworker.
>
>From what I've seen in the scenarios where that happens, yes. The
>pool->idle_timer gets queued from an isolated CPU and ends up on a
>housekeeping CPU (cf. get_target_base()).

Yes, you are right.

>With nohz_full on the cmdline, wq_unbound_cpumask already excludes isolated
>CPU, but that doesn't apply to per-CPU kworkers. Or did you mean some other
>mechanism?

Bound kworkers can be destroyed by the idle timer on a housekeeping CPU.

Diff is only for thoughts.

+++ b/kernel/workqueue.c
@@ -1985,6 +1985,7 @@ fail:
 static void destroy_worker(struct worker *worker)
 {
 	struct worker_pool *pool = worker->pool;
+	int cpu = smp_processor_id();
 
 	lockdep_assert_held(&pool->lock);
 
@@ -1999,6 +2000,12 @@ static void destroy_worker(struct worker
 
 	list_del_init(&worker->entry);
 	worker->flags |= WORKER_DIE;
+
+	if (!(pool->flags & POOL_DISASSOCIATED) && pool->cpu != cpu) {
+		/* send worker to die on a housekeeping cpu */
+		cpumask_clear(&worker->task->cpus_mask);
+		cpumask_set_cpu(cpu, &worker->task->cpus_mask);
+	}
 	wake_up_process(worker->task);
 }
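
To make the condition in the diff easier to poke at outside the kernel, here is a minimal userspace sketch of the same decision. It is not kernel code: struct toy_pool, struct toy_worker, TOY_POOL_DISASSOCIATED and toy_retarget_dying_worker() are hypothetical stand-ins, a 64-bit word models task->cpus_mask, and this_cpu plays the role of smp_processor_id() on the CPU where the idle timer fired. Locking and the actual wakeup are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value; only stands in for POOL_DISASSOCIATED. */
#define TOY_POOL_DISASSOCIATED (1u << 2)

/* Toy model of a per-CPU worker pool. */
struct toy_pool {
	int cpu;            /* CPU the pool is bound to */
	unsigned int flags; /* TOY_POOL_* flags */
};

/* Toy model of a worker; cpus_mask has one bit per CPU (0..63). */
struct toy_worker {
	struct toy_pool *pool;
	uint64_t cpus_mask;
};

/*
 * Same test as in the diff: if the pool is still associated with its CPU
 * and we are running on a different CPU, rewrite the dying worker's mask
 * so that it holds only the CPU we are running on.
 * Returns true if the mask was rewritten, false if it was left alone.
 */
static bool toy_retarget_dying_worker(struct toy_worker *worker, int this_cpu)
{
	struct toy_pool *pool = worker->pool;

	assert(this_cpu >= 0 && this_cpu < 64);

	if ((pool->flags & TOY_POOL_DISASSOCIATED) || pool->cpu == this_cpu)
		return false;

	/* Equivalent of cpumask_clear() followed by cpumask_set_cpu(). */
	worker->cpus_mask = UINT64_C(1) << this_cpu;
	return true;
}
```

With a pool bound to CPU 3 and the timer running on CPU 0, the worker's mask ends up holding only CPU 0. If the timer runs on CPU 3 itself, or the pool is already disassociated, the mask is left untouched.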