From: Marco Elver <elver@google.com>
To: elver@google.com, Peter Zijlstra, Boqun Feng, Ingo Molnar
Cc: "David S. Miller", Luc Van Oostenryck, Chris Li, "Paul E. McKenney",
	Alexander Potapenko, Arnd Bergmann, Bart Van Assche, Christoph Hellwig,
	Dmitry Vyukov, Eric Dumazet, Frederic Weisbecker, Greg Kroah-Hartman,
	Herbert Xu, Ian Rogers, Jann Horn, Joel Fernandes, Johannes Berg,
	Jonathan Corbet, Josh Triplett, Justin Stitt, Kees Cook, Kentaro Takeda,
	Lukas Bulwahn, Mark Rutland, Mathieu Desnoyers, Miguel Ojeda,
	Nathan Chancellor, Neeraj Upadhyay, Nick Desaulniers, Steven Rostedt,
	Tetsuo Handa, Thomas Gleixner, Thomas Graf, Uladzislau Rezki, Waiman Long,
	kasan-dev@googlegroups.com, linux-crypto@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kbuild@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-security-module@vger.kernel.org, linux-sparse@vger.kernel.org,
	linux-wireless@vger.kernel.org, llvm@lists.linux.dev, rcu@vger.kernel.org,
	Ingo Molnar
Subject: [PATCH v4 35/35] sched: Enable context analysis for core.c and fair.c
Date: Thu, 20 Nov 2025 16:10:00 +0100
Message-ID: <20251120151033.3840508-36-elver@google.com>
In-Reply-To: <20251120151033.3840508-7-elver@google.com>
References: <20251120145835.3833031-2-elver@google.com>
	<20251120151033.3840508-7-elver@google.com>
X-Mailer: git-send-email 2.52.0.rc1.455.g30608eb744-goog
Content-Type: text/plain; charset="UTF-8"

This demonstrates a larger conversion to use Clang's context analysis.
The benefit is additional static checking of locking rules, along with
better documentation.

Notably, kernel/sched contains sufficiently complex synchronization
patterns, and application to core.c & fair.c demonstrates that the
latest Clang version has become powerful enough to start applying this
to more complex subsystems (with some modest annotations and changes).

Signed-off-by: Marco Elver <elver@google.com>
Cc: Peter Zijlstra
Cc: Ingo Molnar
---
v4:
 * Rename capability -> context analysis.

v3:
 * New patch.
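For reviewers new to the annotation style, here is a minimal stand-alone
sketch (hypothetical toy_lock/counter names, not taken from this patch)
of the checking Clang can do once such annotations are present. It
spells out the raw Clang thread-safety attributes, which the kernel's
__acquires()/__releases()/__must_hold() style wrappers roughly
correspond to; running clang -Wthread-safety -fsyntax-only toy.c is
enough to exercise it:

/*
 * toy.c -- stand-alone illustration only (hypothetical names, not kernel
 * code) of the Clang attributes that context analysis builds on.
 */

/* A lock type marked as a "capability" the analysis can track. */
struct __attribute__((capability("mutex"))) toy_lock {
	int taken;
};

static struct toy_lock big_lock;

/* Data that may only be accessed with big_lock held. */
static int counter __attribute__((guarded_by(&big_lock)));

/* Lock primitives, annotated in the spirit of __acquires(l) / __releases(l). */
extern void toy_lock_acquire(struct toy_lock *l) __attribute__((acquire_capability(l)));
extern void toy_lock_release(struct toy_lock *l) __attribute__((release_capability(l)));

/* Caller must already hold big_lock, in the spirit of __must_hold(&big_lock). */
static void counter_inc_locked(void) __attribute__((requires_capability(&big_lock)));

static void counter_inc_locked(void)
{
	counter++;		/* ok: the analysis knows big_lock is held here */
}

void counter_inc(void)
{
	toy_lock_acquire(&big_lock);
	counter_inc_locked();
	toy_lock_release(&big_lock);
	/* calling counter_inc_locked() here would be flagged at compile time */
}

The hunks below apply the kernel's wrapper macros in the same spirit to
the rq and pi_lock helpers, falling back to __context_unsafe() or an
overapproximation where the locking is too dynamic to express exactly.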
---
 include/linux/sched.h                    |   6 +-
 include/linux/sched/signal.h             |   4 +-
 include/linux/sched/task.h               |   5 +-
 include/linux/sched/wake_q.h             |   3 +
 kernel/sched/Makefile                    |   3 +
 kernel/sched/core.c                      |  89 ++++++++++++------
 kernel/sched/fair.c                      |   9 +-
 kernel/sched/sched.h                     | 110 +++++++++++++++--------
 scripts/context-analysis-suppression.txt |   1 +
 9 files changed, 162 insertions(+), 68 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index b469878de25c..9088cbda6915 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2108,9 +2108,9 @@ static inline int _cond_resched(void)
 		_cond_resched();				\
 })
 
-extern int __cond_resched_lock(spinlock_t *lock);
-extern int __cond_resched_rwlock_read(rwlock_t *lock);
-extern int __cond_resched_rwlock_write(rwlock_t *lock);
+extern int __cond_resched_lock(spinlock_t *lock) __must_hold(lock);
+extern int __cond_resched_rwlock_read(rwlock_t *lock) __must_hold_shared(lock);
+extern int __cond_resched_rwlock_write(rwlock_t *lock) __must_hold(lock);
 
 #define MIGHT_RESCHED_RCU_SHIFT		8
 #define MIGHT_RESCHED_PREEMPT_MASK	((1U << MIGHT_RESCHED_RCU_SHIFT) - 1)
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index a63f65aa5bdd..a22248aebcf9 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -738,10 +738,12 @@ static inline int thread_group_empty(struct task_struct *p)
 	(thread_group_leader(p) && !thread_group_empty(p))
 
 extern struct sighand_struct *lock_task_sighand(struct task_struct *task,
-							unsigned long *flags);
+							unsigned long *flags)
+	__acquires(&task->sighand->siglock);
 
 static inline void unlock_task_sighand(struct task_struct *task,
 						unsigned long *flags)
+	__releases(&task->sighand->siglock)
 {
 	spin_unlock_irqrestore(&task->sighand->siglock, *flags);
 }
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 525aa2a632b2..2acf9e1c4b0b 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -214,15 +214,18 @@ static inline struct vm_struct *task_stack_vm_area(const struct task_struct *t)
  * write_lock_irq(&tasklist_lock), neither inside nor outside.
  */
 static inline void task_lock(struct task_struct *p)
+	__acquires(&p->alloc_lock)
 {
 	spin_lock(&p->alloc_lock);
 }
 
 static inline void task_unlock(struct task_struct *p)
+	__releases(&p->alloc_lock)
 {
 	spin_unlock(&p->alloc_lock);
 }
 
-DEFINE_GUARD(task_lock, struct task_struct *, task_lock(_T), task_unlock(_T))
+DEFINE_LOCK_GUARD_1(task_lock, struct task_struct, task_lock(_T->lock), task_unlock(_T->lock))
+DECLARE_LOCK_GUARD_1_ATTRS(task_lock, __assumes_ctx_guard(_T->alloc_lock), /* */)
 
 #endif /* _LINUX_SCHED_TASK_H */
diff --git a/include/linux/sched/wake_q.h b/include/linux/sched/wake_q.h
index 0f28b4623ad4..765bbc3d54be 100644
--- a/include/linux/sched/wake_q.h
+++ b/include/linux/sched/wake_q.h
@@ -66,6 +66,7 @@ extern void wake_up_q(struct wake_q_head *head);
 
 /* Spin unlock helpers to unlock and call wake_up_q with preempt disabled */
 static inline
 void raw_spin_unlock_wake(raw_spinlock_t *lock, struct wake_q_head *wake_q)
+	__releases(lock)
 {
 	guard(preempt)();
 	raw_spin_unlock(lock);
@@ -77,6 +78,7 @@ void raw_spin_unlock_wake(raw_spinlock_t *lock, struct wake_q_head *wake_q)
 
 static inline
 void raw_spin_unlock_irq_wake(raw_spinlock_t *lock, struct wake_q_head *wake_q)
+	__releases(lock)
 {
 	guard(preempt)();
 	raw_spin_unlock_irq(lock);
@@ -89,6 +91,7 @@ void raw_spin_unlock_irq_wake(raw_spinlock_t *lock, struct wake_q_head *wake_q)
 static inline
 void raw_spin_unlock_irqrestore_wake(raw_spinlock_t *lock, unsigned long flags,
 				     struct wake_q_head *wake_q)
+	__releases(lock)
 {
 	guard(preempt)();
 	raw_spin_unlock_irqrestore(lock, flags);
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 8ae86371ddcd..b1f1a367034f 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -1,5 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0
+CONTEXT_ANALYSIS_core.o := y
+CONTEXT_ANALYSIS_fair.o := y
+
 
 # The compilers are complaining about unused variables inside an if(0) scope
 # block. This is daft, shut them up.
 ccflags-y += $(call cc-disable-warning, unused-but-set-variable)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f754a60de848..ebfd107d17af 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -395,6 +395,8 @@ static atomic_t sched_core_count;
 static struct cpumask sched_core_mask;
 
 static void sched_core_lock(int cpu, unsigned long *flags)
+	__context_unsafe(/* acquires multiple */)
+	__acquires(&runqueues.__lock) /* overapproximation */
 {
 	const struct cpumask *smt_mask = cpu_smt_mask(cpu);
 	int t, i = 0;
@@ -405,6 +407,8 @@ static void sched_core_lock(int cpu, unsigned long *flags)
 }
 
 static void sched_core_unlock(int cpu, unsigned long *flags)
+	__context_unsafe(/* releases multiple */)
+	__releases(&runqueues.__lock) /* overapproximation */
 {
 	const struct cpumask *smt_mask = cpu_smt_mask(cpu);
 	int t;
@@ -629,6 +633,7 @@ EXPORT_SYMBOL(__trace_set_current_state);
  */
 
 void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
+	__context_unsafe()
 {
 	raw_spinlock_t *lock;
 
@@ -654,6 +659,7 @@ void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
 }
 
 bool raw_spin_rq_trylock(struct rq *rq)
+	__context_unsafe()
 {
 	raw_spinlock_t *lock;
 	bool ret;
@@ -695,15 +701,16 @@ void double_rq_lock(struct rq *rq1, struct rq *rq2)
 	raw_spin_rq_lock(rq1);
 	if (__rq_lockp(rq1) != __rq_lockp(rq2))
 		raw_spin_rq_lock_nested(rq2, SINGLE_DEPTH_NESTING);
+	else
+		__acquire_ctx_guard(__rq_lockp(rq2)); /* fake acquire */
 
 	double_rq_clock_clear_update(rq1, rq2);
 }
 
 /*
- * __task_rq_lock - lock the rq @p resides on.
+ * ___task_rq_lock - lock the rq @p resides on.
  */
-struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-	__acquires(rq->lock)
+struct rq *___task_rq_lock(struct task_struct *p, struct rq_flags *rf)
 {
 	struct rq *rq;
 
@@ -726,9 +733,7 @@ struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
 /*
  * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
  */
-struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-	__acquires(p->pi_lock)
-	__acquires(rq->lock)
+struct rq *_task_rq_lock(struct task_struct *p, struct rq_flags *rf)
 {
 	struct rq *rq;
 
@@ -2463,6 +2468,7 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
  */
 static struct rq *move_queued_task(struct rq *rq, struct rq_flags *rf,
 				   struct task_struct *p, int new_cpu)
+	__must_hold(__rq_lockp(rq))
 {
 	lockdep_assert_rq_held(rq);
 
@@ -2509,6 +2515,7 @@ struct set_affinity_pending {
  */
 static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf,
 				 struct task_struct *p, int dest_cpu)
+	__must_hold(__rq_lockp(rq))
 {
 	/* Affinity changed (again). */
 	if (!is_cpu_allowed(p, dest_cpu))
@@ -2545,6 +2552,12 @@ static int migration_cpu_stop(void *data)
 	 */
 	flush_smp_call_function_queue();
 
+	/*
+	 * We may change the underlying rq, but the locks held will
+	 * appropriately be "transferred" when switching.
+	 */
+	context_unsafe_alias(rq);
+
 	raw_spin_lock(&p->pi_lock);
 	rq_lock(rq, &rf);
 
@@ -2654,6 +2667,8 @@ int push_cpu_stop(void *arg)
 	if (!lowest_rq)
 		goto out_unlock;
 
+	lockdep_assert_rq_held(lowest_rq);
+
 	// XXX validate p is still the highest prio task
 	if (task_rq(p) == rq) {
 		move_queued_task_locked(rq, lowest_rq, p);
@@ -2899,8 +2914,7 @@ void release_user_cpus_ptr(struct task_struct *p)
  */
 static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flags *rf,
 			    int dest_cpu, unsigned int flags)
-	__releases(rq->lock)
-	__releases(p->pi_lock)
+	__releases(__rq_lockp(rq), &p->pi_lock)
 {
 	struct set_affinity_pending my_pending = { }, *pending = NULL;
 	bool stop_pending, complete = false;
@@ -3055,8 +3069,7 @@ static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
 					 struct affinity_context *ctx,
 					 struct rq *rq,
 					 struct rq_flags *rf)
-	__releases(rq->lock)
-	__releases(p->pi_lock)
+	__releases(__rq_lockp(rq), &p->pi_lock)
 {
 	const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
 	const struct cpumask *cpu_valid_mask = cpu_active_mask;
@@ -4348,29 +4361,30 @@ static bool __task_needs_rq_lock(struct task_struct *p)
  */
 int task_call_func(struct task_struct *p, task_call_f func, void *arg)
 {
-	struct rq *rq = NULL;
 	struct rq_flags rf;
 	int ret;
 
 	raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
 
-	if (__task_needs_rq_lock(p))
-		rq = __task_rq_lock(p, &rf);
+	if (__task_needs_rq_lock(p)) {
+		struct rq *rq = __task_rq_lock(p, &rf);
 
-	/*
-	 * At this point the task is pinned; either:
-	 *  - blocked and we're holding off wakeups	 (pi->lock)
-	 *  - woken, and we're holding off enqueue	 (rq->lock)
-	 *  - queued, and we're holding off schedule	 (rq->lock)
-	 *  - running, and we're holding off de-schedule (rq->lock)
-	 *
-	 * The called function (@func) can use: task_curr(), p->on_rq and
-	 * p->__state to differentiate between these states.
-	 */
-	ret = func(p, arg);
+		/*
+		 * At this point the task is pinned; either:
+		 *  - blocked and we're holding off wakeups	 (pi->lock)
+		 *  - woken, and we're holding off enqueue	 (rq->lock)
+		 *  - queued, and we're holding off schedule	 (rq->lock)
+		 *  - running, and we're holding off de-schedule (rq->lock)
+		 *
+		 * The called function (@func) can use: task_curr(), p->on_rq and
+		 * p->__state to differentiate between these states.
+		 */
+		ret = func(p, arg);
 
-	if (rq)
 		rq_unlock(rq, &rf);
+	} else {
+		ret = func(p, arg);
+	}
 
 	raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
 	return ret;
@@ -5046,6 +5060,8 @@ void balance_callbacks(struct rq *rq, struct balance_callback *head)
 
 static inline void
 prepare_lock_switch(struct rq *rq, struct task_struct *next, struct rq_flags *rf)
+	__releases(__rq_lockp(rq))
+	__acquires(__rq_lockp(this_rq()))
 {
 	/*
 	 * Since the runqueue lock will be released by the next
@@ -5059,9 +5075,15 @@ prepare_lock_switch(struct rq *rq, struct task_struct *next, struct rq_flags *rf
 	/* this is a valid case when another task releases the spinlock */
 	rq_lockp(rq)->owner = next;
 #endif
+	/*
+	 * Model the rq reference switcheroo.
+	 */
+	__release(__rq_lockp(rq));
+	__acquire(__rq_lockp(this_rq()));
 }
 
 static inline void finish_lock_switch(struct rq *rq)
+	__releases(__rq_lockp(rq))
 {
 	/*
 	 * If we are tracking spinlock dependencies then we have to
@@ -5117,6 +5139,7 @@ static inline void kmap_local_sched_in(void)
 static inline void
 prepare_task_switch(struct rq *rq, struct task_struct *prev,
 		    struct task_struct *next)
+	__must_hold(__rq_lockp(rq))
 {
 	kcov_prepare_switch(prev);
 	sched_info_switch(rq, prev, next);
@@ -5148,7 +5171,7 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev,
  * because prev may have moved to another CPU.
  */
 static struct rq *finish_task_switch(struct task_struct *prev)
-	__releases(rq->lock)
+	__releases(__rq_lockp(this_rq()))
 {
 	struct rq *rq = this_rq();
 	struct mm_struct *mm = rq->prev_mm;
@@ -5236,7 +5259,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
  * @prev: the thread we just switched away from.
  */
 asmlinkage __visible void schedule_tail(struct task_struct *prev)
-	__releases(rq->lock)
+	__releases(__rq_lockp(this_rq()))
 {
 	/*
 	 * New tasks start with FORK_PREEMPT_COUNT, see there and
@@ -5268,6 +5291,7 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
 static __always_inline struct rq *
 context_switch(struct rq *rq, struct task_struct *prev,
 	       struct task_struct *next, struct rq_flags *rf)
+	__releases(__rq_lockp(rq))
 {
 	prepare_task_switch(rq, prev, next);
 
@@ -5948,6 +5972,7 @@ static void prev_balance(struct rq *rq, struct task_struct *prev,
  */
 static inline struct task_struct *
 __pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+	__must_hold(__rq_lockp(rq))
 {
 	const struct sched_class *class;
 	struct task_struct *p;
@@ -6040,6 +6065,7 @@ static void queue_core_balance(struct rq *rq);
 
 static struct task_struct *
 pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+	__must_hold(__rq_lockp(rq))
 {
 	struct task_struct *next, *p, *max = NULL;
 	const struct cpumask *smt_mask;
@@ -6339,6 +6365,7 @@ static bool steal_cookie_task(int cpu, struct sched_domain *sd)
 }
 
 static void sched_core_balance(struct rq *rq)
+	__must_hold(__rq_lockp(rq))
 {
 	struct sched_domain *sd;
 	int cpu = cpu_of(rq);
@@ -6484,6 +6511,7 @@ static inline void sched_core_cpu_dying(unsigned int cpu) {}
 
 static struct task_struct *
 pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+	__must_hold(__rq_lockp(rq))
 {
 	return __pick_next_task(rq, prev, rf);
 }
@@ -8141,6 +8169,12 @@ static int __balance_push_cpu_stop(void *arg)
 	struct rq_flags rf;
 	int cpu;
 
+	/*
+	 * We may change the underlying rq, but the locks held will
+	 * appropriately be "transferred" when switching.
+	 */
+	context_unsafe_alias(rq);
+
 	raw_spin_lock_irq(&p->pi_lock);
 	rq_lock(rq, &rf);
 
@@ -8168,6 +8202,7 @@ static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
  * effective when the hotplug motion is down.
  */
 static void balance_push(struct rq *rq)
+	__must_hold(__rq_lockp(rq))
 {
 	struct task_struct *push_task = rq->curr;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5b752324270b..b8cebd1f1f4d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2876,6 +2876,7 @@ static int preferred_group_nid(struct task_struct *p, int nid)
 }
 
 static void task_numa_placement(struct task_struct *p)
+	__context_unsafe(/* conditional locking */)
 {
 	int seq, nid, max_nid = NUMA_NO_NODE;
 	unsigned long max_faults = 0;
@@ -4803,7 +4804,8 @@ static inline unsigned long cfs_rq_load_avg(struct cfs_rq *cfs_rq)
 	return cfs_rq->avg.load_avg;
 }
 
-static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf);
+static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
+	__must_hold(__rq_lockp(this_rq));
 
 static inline unsigned long task_util(struct task_struct *p)
 {
@@ -6229,6 +6231,7 @@ static bool distribute_cfs_runtime(struct cfs_bandwidth *cfs_b)
  * used to track this state.
  */
 static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, unsigned long flags)
+	__must_hold(&cfs_b->lock)
 {
 	int throttled;
 
@@ -8714,6 +8717,7 @@ static void set_cpus_allowed_fair(struct task_struct *p, struct affinity_context
 
 static int
 balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+	__must_hold(__rq_lockp(rq))
 {
 	if (sched_fair_runnable(rq))
 		return 1;
@@ -8867,6 +8871,7 @@ static void set_next_task_fair(struct rq *rq, struct task_struct *p, bool first)
 
 struct task_struct *
 pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+	__must_hold(__rq_lockp(rq))
 {
 	struct sched_entity *se;
 	struct task_struct *p;
@@ -8953,6 +8958,7 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
 }
 
 static struct task_struct *__pick_next_task_fair(struct rq *rq, struct task_struct *prev)
+	__must_hold(__rq_lockp(rq))
 {
 	return pick_next_task_fair(rq, prev, NULL);
 }
@@ -12784,6 +12790,7 @@ static inline void nohz_newidle_balance(struct rq *this_rq) { }
  *   > 0 - success, new (fair) tasks present
  */
 static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
+	__must_hold(__rq_lockp(this_rq))
 {
 	unsigned long next_balance = jiffies + HZ;
 	int this_cpu = this_rq->cpu;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index adfb6e3409d7..2d4b9ba30d58 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1350,8 +1350,13 @@ static inline bool is_migration_disabled(struct task_struct *p)
 
 DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 
+static __always_inline struct rq *__this_rq(void)
+{
+	return this_cpu_ptr(&runqueues);
+}
+
 #define cpu_rq(cpu)		(&per_cpu(runqueues, (cpu)))
-#define this_rq()		this_cpu_ptr(&runqueues)
+#define this_rq()		__this_rq()
 #define task_rq(p)		cpu_rq(task_cpu(p))
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
 #define raw_rq()		raw_cpu_ptr(&runqueues)
@@ -1396,6 +1401,7 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
 }
 
 static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
+	__returns_ctx_guard(rq_lockp(rq)) /* alias them */
 {
 	if (rq->core_enabled)
 		return &rq->core->__lock;
@@ -1492,6 +1498,7 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
 }
 
 static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
+	__returns_ctx_guard(rq_lockp(rq)) /* alias them */
 {
 	return &rq->__lock;
 }
@@ -1534,32 +1541,42 @@ static inline bool rt_group_sched_enabled(void)
 #endif /* !CONFIG_RT_GROUP_SCHED */
 
 static inline void lockdep_assert_rq_held(struct rq *rq)
+	__assumes_ctx_guard(__rq_lockp(rq))
 {
 	lockdep_assert_held(__rq_lockp(rq));
 }
 
-extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
-extern bool raw_spin_rq_trylock(struct rq *rq);
-extern void raw_spin_rq_unlock(struct rq *rq);
+extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
+	__acquires(__rq_lockp(rq));
+
+extern bool raw_spin_rq_trylock(struct rq *rq)
+	__cond_acquires(true, __rq_lockp(rq));
+
+extern void raw_spin_rq_unlock(struct rq *rq)
+	__releases(__rq_lockp(rq));
 
 static inline void raw_spin_rq_lock(struct rq *rq)
+	__acquires(__rq_lockp(rq))
 {
 	raw_spin_rq_lock_nested(rq, 0);
 }
 
 static inline void raw_spin_rq_lock_irq(struct rq *rq)
+	__acquires(__rq_lockp(rq))
 {
 	local_irq_disable();
 	raw_spin_rq_lock(rq);
 }
 
 static inline void raw_spin_rq_unlock_irq(struct rq *rq)
+	__releases(__rq_lockp(rq))
 {
 	raw_spin_rq_unlock(rq);
 	local_irq_enable();
 }
 
 static inline unsigned long _raw_spin_rq_lock_irqsave(struct rq *rq)
+	__acquires(__rq_lockp(rq))
 {
 	unsigned long flags;
 
@@ -1570,6 +1587,7 @@ static inline unsigned long _raw_spin_rq_lock_irqsave(struct rq *rq)
 }
 
 static inline void raw_spin_rq_unlock_irqrestore(struct rq *rq, unsigned long flags)
+	__releases(__rq_lockp(rq))
 {
 	raw_spin_rq_unlock(rq);
 	local_irq_restore(flags);
@@ -1818,17 +1836,15 @@ static inline void rq_repin_lock(struct rq *rq, struct rq_flags *rf)
 	rq->clock_update_flags |= rf->clock_update_flags;
 }
 
-extern
-struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-	__acquires(rq->lock);
+#define __task_rq_lock(...) __acquire_ret(___task_rq_lock(__VA_ARGS__), __rq_lockp(__ret))
+extern struct rq *___task_rq_lock(struct task_struct *p, struct rq_flags *rf) __acquires_ret;
 
-extern
-struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-	__acquires(p->pi_lock)
-	__acquires(rq->lock);
+#define task_rq_lock(...) __acquire_ret(_task_rq_lock(__VA_ARGS__), __rq_lockp(__ret))
+extern struct rq *_task_rq_lock(struct task_struct *p, struct rq_flags *rf)
+	__acquires(&p->pi_lock) __acquires_ret;
 
 static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
-	__releases(rq->lock)
+	__releases(__rq_lockp(rq))
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock(rq);
@@ -1836,8 +1852,7 @@ static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
 
 static inline void
 task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
-	__releases(rq->lock)
-	__releases(p->pi_lock)
+	__releases(__rq_lockp(rq), &p->pi_lock)
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock(rq);
@@ -1848,44 +1863,45 @@ DEFINE_LOCK_GUARD_1(task_rq_lock, struct task_struct,
 		    _T->rq = task_rq_lock(_T->lock, &_T->rf),
 		    task_rq_unlock(_T->rq, _T->lock, &_T->rf),
 		    struct rq *rq; struct rq_flags rf)
+DECLARE_LOCK_GUARD_1_ATTRS(task_rq_lock, __assumes_ctx_guard(_T->pi_lock), /* */)
 
 static inline void rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
-	__acquires(rq->lock)
+	__acquires(__rq_lockp(rq))
 {
 	raw_spin_rq_lock_irqsave(rq, rf->flags);
 	rq_pin_lock(rq, rf);
 }
 
 static inline void rq_lock_irq(struct rq *rq, struct rq_flags *rf)
-	__acquires(rq->lock)
+	__acquires(__rq_lockp(rq))
 {
 	raw_spin_rq_lock_irq(rq);
 	rq_pin_lock(rq, rf);
 }
 
 static inline void rq_lock(struct rq *rq, struct rq_flags *rf)
-	__acquires(rq->lock)
+	__acquires(__rq_lockp(rq))
 {
 	raw_spin_rq_lock(rq);
 	rq_pin_lock(rq, rf);
 }
 
 static inline void rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
-	__releases(rq->lock)
+	__releases(__rq_lockp(rq))
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock_irqrestore(rq, rf->flags);
 }
 
 static inline void rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
-	__releases(rq->lock)
+	__releases(__rq_lockp(rq))
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock_irq(rq);
 }
 
 static inline void rq_unlock(struct rq *rq, struct rq_flags *rf)
-	__releases(rq->lock)
+	__releases(__rq_lockp(rq))
 {
 	rq_unpin_lock(rq, rf);
 	raw_spin_rq_unlock(rq);
@@ -1896,18 +1912,24 @@ DEFINE_LOCK_GUARD_1(rq_lock, struct rq,
 		    rq_unlock(_T->lock, &_T->rf),
 		    struct rq_flags rf)
 
+DECLARE_LOCK_GUARD_1_ATTRS(rq_lock, __assumes_ctx_guard(__rq_lockp(_T)), /* */);
+
 DEFINE_LOCK_GUARD_1(rq_lock_irq, struct rq,
 		    rq_lock_irq(_T->lock, &_T->rf),
 		    rq_unlock_irq(_T->lock, &_T->rf),
 		    struct rq_flags rf)
 
+DECLARE_LOCK_GUARD_1_ATTRS(rq_lock_irq, __assumes_ctx_guard(__rq_lockp(_T)), /* */);
+
 DEFINE_LOCK_GUARD_1(rq_lock_irqsave, struct rq,
 		    rq_lock_irqsave(_T->lock, &_T->rf),
 		    rq_unlock_irqrestore(_T->lock, &_T->rf),
 		    struct rq_flags rf)
 
-static inline struct rq *this_rq_lock_irq(struct rq_flags *rf)
-	__acquires(rq->lock)
+DECLARE_LOCK_GUARD_1_ATTRS(rq_lock_irqsave, __assumes_ctx_guard(__rq_lockp(_T)), /* */);
+
+#define this_rq_lock_irq(...) __acquire_ret(_this_rq_lock_irq(__VA_ARGS__), __rq_lockp(__ret))
+static inline struct rq *_this_rq_lock_irq(struct rq_flags *rf) __acquires_ret
 {
 	struct rq *rq;
 
@@ -2908,8 +2930,13 @@ static inline void double_rq_clock_clear_update(struct rq *rq1, struct rq *rq2)
 #define DEFINE_LOCK_GUARD_2(name, type, _lock, _unlock, ...)				\
 __DEFINE_UNLOCK_GUARD(name, type, _unlock, type *lock2; __VA_ARGS__)			\
 static inline class_##name##_t class_##name##_constructor(type *lock, type *lock2)	\
+	__no_context_analysis								\
 { class_##name##_t _t = { .lock = lock, .lock2 = lock2 }, *_T = &_t;			\
   _lock; return _t; }
+#define DECLARE_LOCK_GUARD_2_ATTRS(_name, _lock, _unlock)				\
+static inline class_##_name##_t class_##_name##_constructor(lock_##_name##_t *_T1,	\
+							     lock_##_name##_t *_T2) _lock; \
+static inline void class_##_name##_destructor(class_##_name##_t *_T) _unlock
 
 static inline bool rq_order_less(struct rq *rq1, struct rq *rq2)
 {
@@ -2937,7 +2964,8 @@ static inline bool rq_order_less(struct rq *rq1, struct rq *rq2)
 	return rq1->cpu < rq2->cpu;
 }
 
-extern void double_rq_lock(struct rq *rq1, struct rq *rq2);
+extern void double_rq_lock(struct rq *rq1, struct rq *rq2)
+	__acquires(__rq_lockp(rq1), __rq_lockp(rq2));
 
 #ifdef CONFIG_PREEMPTION
 
@@ -2950,9 +2978,8 @@ extern void double_rq_lock(struct rq *rq1, struct rq *rq2);
  * also adds more overhead and therefore may reduce throughput.
  */
 static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
-	__releases(this_rq->lock)
-	__acquires(busiest->lock)
-	__acquires(this_rq->lock)
+	__must_hold(__rq_lockp(this_rq))
+	__acquires(__rq_lockp(busiest))
 {
 	raw_spin_rq_unlock(this_rq);
 	double_rq_lock(this_rq, busiest);
@@ -2969,12 +2996,16 @@ static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
  * regardless of entry order into the function.
  */
 static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
-	__releases(this_rq->lock)
-	__acquires(busiest->lock)
-	__acquires(this_rq->lock)
+	__must_hold(__rq_lockp(this_rq))
+	__acquires(__rq_lockp(busiest))
 {
-	if (__rq_lockp(this_rq) == __rq_lockp(busiest) ||
-	    likely(raw_spin_rq_trylock(busiest))) {
+	if (__rq_lockp(this_rq) == __rq_lockp(busiest)) {
+		__acquire(__rq_lockp(busiest)); /* already held */
+		double_rq_clock_clear_update(this_rq, busiest);
+		return 0;
+	}
+
+	if (likely(raw_spin_rq_trylock(busiest))) {
 		double_rq_clock_clear_update(this_rq, busiest);
 		return 0;
 	}
@@ -2997,6 +3028,8 @@ static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
 * double_lock_balance - lock the busiest runqueue, this_rq is locked already.
  */
 static inline int double_lock_balance(struct rq *this_rq, struct rq *busiest)
+	__must_hold(__rq_lockp(this_rq))
+	__acquires(__rq_lockp(busiest))
 {
 	lockdep_assert_irqs_disabled();
 
@@ -3004,14 +3037,17 @@ static inline int double_lock_balance(struct rq *this_rq, struct rq *busiest)
 }
 
 static inline void double_unlock_balance(struct rq *this_rq, struct rq *busiest)
-	__releases(busiest->lock)
+	__releases(__rq_lockp(busiest))
 {
 	if (__rq_lockp(this_rq) != __rq_lockp(busiest))
 		raw_spin_rq_unlock(busiest);
+	else
+		__release(__rq_lockp(busiest)); /* fake release */
 	lock_set_subclass(&__rq_lockp(this_rq)->dep_map, 0, _RET_IP_);
 }
 
 static inline void double_lock(spinlock_t *l1, spinlock_t *l2)
+	__acquires(l1, l2)
 {
 	if (l1 > l2)
 		swap(l1, l2);
@@ -3021,6 +3057,7 @@ static inline void double_lock(spinlock_t *l1, spinlock_t *l2)
 }
 
 static inline void double_lock_irq(spinlock_t *l1, spinlock_t *l2)
+	__acquires(l1, l2)
 {
 	if (l1 > l2)
 		swap(l1, l2);
@@ -3030,6 +3067,7 @@ static inline void double_lock_irq(spinlock_t *l1, spinlock_t *l2)
 }
 
 static inline void double_raw_lock(raw_spinlock_t *l1, raw_spinlock_t *l2)
+	__acquires(l1, l2)
 {
 	if (l1 > l2)
 		swap(l1, l2);
@@ -3039,6 +3077,7 @@ static inline void double_raw_lock(raw_spinlock_t *l1, raw_spinlock_t *l2)
 }
 
 static inline void double_raw_unlock(raw_spinlock_t *l1, raw_spinlock_t *l2)
+	__releases(l1, l2)
 {
 	raw_spin_unlock(l1);
 	raw_spin_unlock(l2);
@@ -3048,6 +3087,8 @@ DEFINE_LOCK_GUARD_2(double_raw_spinlock, raw_spinlock_t,
 		    double_raw_lock(_T->lock, _T->lock2),
 		    double_raw_unlock(_T->lock, _T->lock2))
 
+DECLARE_LOCK_GUARD_2_ATTRS(double_raw_spinlock, __assumes_ctx_guard(_T1) __assumes_ctx_guard(_T2), /* */);
+
 /*
  * double_rq_unlock - safely unlock two runqueues
  *
@@ -3055,13 +3096,12 @@ DEFINE_LOCK_GUARD_2(double_raw_spinlock, raw_spinlock_t,
  * you need to do so manually after calling.
  */
 static inline void double_rq_unlock(struct rq *rq1, struct rq *rq2)
-	__releases(rq1->lock)
-	__releases(rq2->lock)
+	__releases(__rq_lockp(rq1), __rq_lockp(rq2))
 {
 	if (__rq_lockp(rq1) != __rq_lockp(rq2))
 		raw_spin_rq_unlock(rq2);
 	else
-		__release(rq2->lock);
+		__release(__rq_lockp(rq2)); /* fake release */
 	raw_spin_rq_unlock(rq1);
 }
 
diff --git a/scripts/context-analysis-suppression.txt b/scripts/context-analysis-suppression.txt
index df25c3d07a5b..fd8951d06706 100644
--- a/scripts/context-analysis-suppression.txt
+++ b/scripts/context-analysis-suppression.txt
@@ -26,6 +26,7 @@ src:*include/linux/refcount.h=emit
 src:*include/linux/rhashtable.h=emit
 src:*include/linux/rwlock*.h=emit
 src:*include/linux/rwsem.h=emit
+src:*include/linux/sched*=emit
 src:*include/linux/seqlock*.h=emit
 src:*include/linux/spinlock*.h=emit
 src:*include/linux/srcu*.h=emit
-- 
2.52.0.rc1.455.g30608eb744-goog