From: Hillf Danton
To: Kuyo Chang
Cc: peterz@infradead.org, mgorman@suse.de, Waiman Long, linux-kernel@vger.kernel.org, linux-mm@kvack.org, jing-ting.wu@mediatek.com
Subject: Re: BUG: list_add corruption while doing migrate_swap -> balance_push
Date: Wed, 7 Sep 2022 20:00:18 +0800
Message-Id: <20220907120018.2594-1-hdanton@sina.com>
In-Reply-To: <6dab6e564e43c952f63f83ef868da6ed829fc1a8.camel@mediatek.com>

On 6 Sep 2022 20:54:58 +0800 Kuyo Chang wrote
> Hi,
>
> [Syndrome]
> A list_add corruption error at kernel-5.15, the log shows.
> list_add corruption. prev->next should be next (ffffff81a6f08ba0), but
> was 0000000000000000. (prev=ffffff81a6f05930).
>
> The call trace is as below:
> ipanic_die
> notify_die
> die
> bug_handler
> brk_handler
> do_debug_exception
> el1_dbg
> el1h_64_sync_handler
> el1h_64_sync
> __list_add_valid
> cpu_stop_queue_work
> stop_one_cpu_nowait
> balance_push
> __schedule
> schedule
> do_sched_yield
> __arm64_sys_sched_yield
> invoke_syscall
> el0_svc_common
> do_el0_svc
> el0_svc
> el0t_64_sync_handler
> el0t_64_sync
>
> [Analysis]
> By memory dump and analyzing the stopper->works list, the error code
> flow is as follows:
>
> migrate_swap
> ->stop_two_cpus
> ->cpu_stop_queue_two_works
> ->__cpu_stop_queue_work (add work->list to stopper->works respectively)
> ->list_add_tail(&work->list, &stopper->works);
> ->wake_up_q(&wakeq);
> ->wait_for_completion(&done.completion);
> ->wait_for_common
> ->schedule_timeout
> ->schedule
>
> At this point, the cpu hotplug is triggered.
> It registers balance_callback by the flow below:
> cpu_down(cpuid)
> ->_cpu_down
> ->cpuhp_set_state()
> ->set_cpu_dying(cpuid, true)
> ->sched_cpu_deactivate
> ->balance_push_set(cpuid, true)
> ->rq->balance_callback = &balance_push_callback;
>
>
> Finally,
> ->__schedule
> ->__balance_callbacks
> ->do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
> ->balance_push
> ->stop_one_cpu_nowait
> *work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg,
> .caller = _RET_IP_, };
> At this point the list_head *next, *prev are initialized to NULL!!
> ->cpu_stop_queue_work
> ->__list_add_valid
>
> Do you have any suggestion for this issue?

See if making balance_push() non-reentrant removes the chance for
double list add in your case.

Hillf

--- linux-5.15/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8815,6 +8815,7 @@ static int __balance_push_cpu_stop(void
 		cpu = select_fallback_rq(rq->cpu, p);
 		rq = __migrate_task(rq, &rf, p, cpu);
 	}
+	this_cpu_ptr(&push_work)->queued = 0;
 
 	rq_unlock(rq, &rf);
 	raw_spin_unlock_irq(&p->pi_lock);
@@ -8838,6 +8839,8 @@ static void balance_push(struct rq *rq)
 
 	lockdep_assert_rq_held(rq);
 
+	if (WARN_ON_ONCE(this_cpu_ptr(&push_work)->queued != 0))
+		return;
 	/*
 	 * Ensure the thing is persistent until balance_push_set(.on = false);
 	 */
@@ -8877,6 +8880,7 @@ static void balance_push(struct rq *rq)
 		return;
 	}
 
+	this_cpu_ptr(&push_work)->queued = 1;
 	get_task_struct(push_task);
 	/*
 	 * Temporarily drop rq->lock such that we can wake-up the stop task.
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -27,6 +27,7 @@ struct cpu_stop_work {
 	unsigned long		caller;
 	void			*arg;
 	struct cpu_stop_done	*done;
+	unsigned		queued;
 };
 
 int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg);
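
For readers following the thread, the double-add scenario the patch guards against can be reproduced in a small userspace sketch. This is not kernel code: struct work, list_add_tail_checked(), queue_unguarded() and queue_guarded() are hypothetical stand-ins, and memset() stands in for the compound-literal assignment to *work_buf. It only assumes what the report describes, namely that a work buffer is reinitialised while it is still linked on the works list and is then added again.

```c
#include <stdbool.h>
#include <string.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for struct cpu_stop_work plus the proposed flag. */
struct work {
	struct list_head list;
	unsigned queued;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Same idea as __list_add_valid(): both neighbours must point at each other. */
static bool list_add_valid(struct list_head *prev, struct list_head *next)
{
	return next->prev == prev && prev->next == next;
}

/* Returns false where the kernel would report "list_add corruption". */
static bool list_add_tail_checked(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	if (!list_add_valid(prev, head))
		return false;
	head->prev = new;
	new->next = head;
	new->prev = prev;
	prev->next = new;
	return true;
}

/* Unguarded: wipe the buffer, then queue it, even if it is already queued. */
static bool queue_unguarded(struct work *w, struct list_head *works)
{
	memset(w, 0, sizeof(*w));	/* list.next and list.prev become NULL */
	return list_add_tail_checked(&w->list, works);
}

/* Guarded: leave a still-queued buffer alone, in the spirit of the patch. */
static bool queue_guarded(struct work *w, struct list_head *works)
{
	if (w->queued)
		return false;
	memset(w, 0, sizeof(*w));
	w->queued = 1;
	return list_add_tail_checked(&w->list, works);
}
```

Calling queue_unguarded() twice on the same buffer makes the second call fail the validity check, because the tail entry (the buffer itself) now has next == NULL, which matches the "prev->next should be next ... but was 0000000000000000" message in the report. With queue_guarded() the second call returns early and the list stays intact. Whether this is the exact sequence on the reporter's system is an assumption that the WARN_ON_ONCE in the patch is meant to confirm.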