From: Nicholas Piggin <npiggin@gmail.com>
To: Andrew Morton
Cc: Nicholas Piggin <npiggin@gmail.com>, Andy Lutomirski, Linus Torvalds,
	linux-arch, linux-mm, linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v6 1/5] lazy tlb: introduce lazy tlb mm refcount helper functions
Date: Wed, 18 Jan 2023 18:00:07 +1000
Message-Id: <20230118080011.2258375-2-npiggin@gmail.com>
In-Reply-To: <20230118080011.2258375-1-npiggin@gmail.com>
References: <20230118080011.2258375-1-npiggin@gmail.com>

Add explicit _lazy_tlb annotated functions for lazy tlb mm refcounting.
This makes the lazy tlb mm references more obvious, and allows the
refcounting scheme to be modified in later changes.
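To make the new interface concrete before reading the diff, here is a
minimal userspace sketch of the refcounting pattern the helpers annotate.
This is a hedged mock, not kernel code: mm_struct, mmgrab() and mmdrop()
are simplified stand-ins, and only the wrapper shape mirrors this patch.

/*
 * Userspace mock of the lazy tlb refcounting discipline; mm_struct,
 * mmgrab() and mmdrop() are stand-ins for the kernel implementations.
 */
#include <assert.h>
#include <stdio.h>

struct mm_struct { int mm_count; };

static void mmgrab(struct mm_struct *mm) { mm->mm_count++; }

static void mmdrop(struct mm_struct *mm)
{
	if (--mm->mm_count == 0)
		printf("mm freed\n");
}

/*
 * The new helpers are one-to-one wrappers in this patch, so behaviour is
 * unchanged, but every lazy tlb reference now passes through one point
 * that later patches in the series can redefine.
 */
static void mmgrab_lazy_tlb(struct mm_struct *mm) { mmgrab(mm); }
static void mmdrop_lazy_tlb(struct mm_struct *mm) { mmdrop(mm); }

int main(void)
{
	struct mm_struct mm = { .mm_count = 1 };

	mmgrab_lazy_tlb(&mm);	/* kernel thread adopts mm as its lazy tlb mm */
	mmdrop_lazy_tlb(&mm);	/* kernel thread switches away again */

	assert(mm.mm_count == 1);
	return 0;
}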
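The next paragraph describes the only functional change, in
kthread_use_mm()/kthread_unuse_mm(). As a preview, here is a hedged
before/after sketch of that refcount handling; this is simplified
pseudocode with mocked declarations, not the kernel source (task locking,
IRQ disabling and the membarrier ordering are deliberately omitted,
switch_mm() stands in for switch_mm_irqs_off(), and the _before/_after
names are illustrative only).

struct mm_struct;
struct task_struct { struct mm_struct *mm, *active_mm; };

void mmgrab(struct mm_struct *mm);
void mmdrop(struct mm_struct *mm);
void mmdrop_lazy_tlb(struct mm_struct *mm);
void switch_mm(struct mm_struct *prev, struct mm_struct *next,
	       struct task_struct *tsk);

/*
 * Before: if active_mm == mm, the count is never touched; the lazy
 * reference silently becomes the used-mm reference.
 */
void kthread_use_mm_before(struct task_struct *tsk, struct mm_struct *mm)
{
	struct mm_struct *active_mm = tsk->active_mm;

	if (active_mm != mm) {
		mmgrab(mm);
		tsk->active_mm = mm;
	}
	tsk->mm = mm;
	switch_mm(active_mm, mm, tsk);
	if (active_mm != mm)
		mmdrop(active_mm);
}

/*
 * After: lazy and regular references may no longer be interchangeable,
 * so always take a regular reference on mm and always drop the previous
 * lazy reference.
 */
void kthread_use_mm_after(struct task_struct *tsk, struct mm_struct *mm)
{
	struct mm_struct *active_mm;

	mmgrab(mm);
	active_mm = tsk->active_mm;
	if (active_mm != mm)
		tsk->active_mm = mm;
	tsk->mm = mm;
	switch_mm(active_mm, mm, tsk);
	mmdrop_lazy_tlb(active_mm);
}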
The only functional change is in kthread_use_mm/kthread_unuse_mm, because
that code is clever with refcounting: if the kthread's lazy tlb mm
(active_mm) happens to be the same as the mm to be used, the code does
not touch the refcount, but instead transfers the lazy refcount to the
used-mm refcount. Once the lazy tlb mm refcount is no longer equivalent
to the regular refcount, this trick cannot be used; instead, mmgrab() a
regular reference on the mm to be used, and mmdrop_lazy_tlb() the
previous active_mm.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/arm/mach-rpc/ecard.c            |  2 +-
 arch/powerpc/kernel/smp.c            |  2 +-
 arch/powerpc/mm/book3s64/radix_tlb.c |  4 ++--
 fs/exec.c                            |  2 +-
 include/linux/sched/mm.h             | 16 ++++++++++++++++
 kernel/cpu.c                         |  2 +-
 kernel/exit.c                        |  2 +-
 kernel/kthread.c                     | 21 +++++++++++++--------
 kernel/sched/core.c                  | 15 ++++++++-------
 9 files changed, 44 insertions(+), 22 deletions(-)

diff --git a/arch/arm/mach-rpc/ecard.c b/arch/arm/mach-rpc/ecard.c
index 53813f9464a2..c30df1097c52 100644
--- a/arch/arm/mach-rpc/ecard.c
+++ b/arch/arm/mach-rpc/ecard.c
@@ -253,7 +253,7 @@ static int ecard_init_mm(void)
 	current->mm = mm;
 	current->active_mm = mm;
 	activate_mm(active_mm, mm);
-	mmdrop(active_mm);
+	mmdrop_lazy_tlb(active_mm);
 	ecard_init_pgtables(mm);
 	return 0;
 }
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6b90f10a6c81..7db6b3faea65 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1611,7 +1611,7 @@ void start_secondary(void *unused)
 	if (IS_ENABLED(CONFIG_PPC32))
 		setup_kup();
 
-	mmgrab(&init_mm);
+	mmgrab_lazy_tlb(&init_mm);
 	current->active_mm = &init_mm;
 
 	smp_store_cpu_info(cpu);
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 4e29b619578c..282359ab525b 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -794,10 +794,10 @@ void exit_lazy_flush_tlb(struct mm_struct *mm, bool always_flush)
 	if (current->active_mm == mm) {
 		WARN_ON_ONCE(current->mm != NULL);
 		/* Is a kernel thread and is using mm as the lazy tlb */
-		mmgrab(&init_mm);
+		mmgrab_lazy_tlb(&init_mm);
 		current->active_mm = &init_mm;
 		switch_mm_irqs_off(mm, &init_mm, current);
-		mmdrop(mm);
+		mmdrop_lazy_tlb(mm);
 	}
 
 	/*
diff --git a/fs/exec.c b/fs/exec.c
index ab913243a367..1a32a88db173 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1033,7 +1033,7 @@ static int exec_mmap(struct mm_struct *mm)
 		mmput(old_mm);
 		return 0;
 	}
-	mmdrop(active_mm);
+	mmdrop_lazy_tlb(active_mm);
 	return 0;
 }
 
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 2a243616f222..5376caf6fcf3 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -79,6 +79,22 @@ static inline void mmdrop_sched(struct mm_struct *mm)
 }
 #endif
 
+/* Helpers for lazy TLB mm refcounting */
+static inline void mmgrab_lazy_tlb(struct mm_struct *mm)
+{
+	mmgrab(mm);
+}
+
+static inline void mmdrop_lazy_tlb(struct mm_struct *mm)
+{
+	mmdrop(mm);
+}
+
+static inline void mmdrop_lazy_tlb_sched(struct mm_struct *mm)
+{
+	mmdrop_sched(mm);
+}
+
 /**
  * mmget() - Pin the address space associated with a &struct mm_struct.
  * @mm: The address space to pin.
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb5..189895288d9d 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -623,7 +623,7 @@ static int finish_cpu(unsigned int cpu)
 	 */
 	if (mm != &init_mm)
 		idle->active_mm = &init_mm;
-	mmdrop(mm);
+	mmdrop_lazy_tlb(mm);
 	return 0;
 }
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 15dc2ec80c46..1a4608d765e4 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -537,7 +537,7 @@ static void exit_mm(void)
 		return;
 	sync_mm_rss(mm);
 	mmap_read_lock(mm);
-	mmgrab(mm);
+	mmgrab_lazy_tlb(mm);
 	BUG_ON(mm != current->active_mm);
 	/* more a memory barrier than a real lock */
 	task_lock(current);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index f97fd01a2932..691b213e578f 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1410,14 +1410,19 @@ void kthread_use_mm(struct mm_struct *mm)
 	WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
 	WARN_ON_ONCE(tsk->mm);
 
+	/*
+	 * It's possible that tsk->active_mm == mm here, but we must
+	 * still mmgrab(mm) and mmdrop_lazy_tlb(active_mm), because lazy
+	 * mm may not have its own refcount (see mmgrab/drop_lazy_tlb()).
+	 */
+	mmgrab(mm);
+
 	task_lock(tsk);
 	/* Hold off tlb flush IPIs while switching mm's */
 	local_irq_disable();
 	active_mm = tsk->active_mm;
-	if (active_mm != mm) {
-		mmgrab(mm);
+	if (active_mm != mm)
 		tsk->active_mm = mm;
-	}
 	tsk->mm = mm;
 	membarrier_update_current_mm(mm);
 	switch_mm_irqs_off(active_mm, mm, tsk);
@@ -1434,12 +1439,9 @@ void kthread_use_mm(struct mm_struct *mm)
 	 * memory barrier after storing to tsk->mm, before accessing
 	 * user-space memory. A full memory barrier for membarrier
 	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
-	 * mmdrop(), or explicitly with smp_mb().
+	 * mmdrop_lazy_tlb().
 	 */
-	if (active_mm != mm)
-		mmdrop(active_mm);
-	else
-		smp_mb();
+	mmdrop_lazy_tlb(active_mm);
 }
 EXPORT_SYMBOL_GPL(kthread_use_mm);
 
@@ -1467,10 +1469,13 @@ void kthread_unuse_mm(struct mm_struct *mm)
 	local_irq_disable();
 	tsk->mm = NULL;
 	membarrier_update_current_mm(NULL);
+	mmgrab_lazy_tlb(mm);
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
 	local_irq_enable();
 	task_unlock(tsk);
+
+	mmdrop(mm);
 }
 EXPORT_SYMBOL_GPL(kthread_unuse_mm);
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b582b6ee5f..26aaa974ee6d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5140,13 +5140,14 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	 * rq->curr, before returning to userspace, so provide them here:
 	 *
 	 * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
-	 *   provided by mmdrop(),
+	 *   provided by mmdrop_lazy_tlb(),
 	 * - a sync_core for SYNC_CORE.
 	 */
 	if (mm) {
 		membarrier_mm_sync_core_before_usermode(mm);
-		mmdrop_sched(mm);
+		mmdrop_lazy_tlb_sched(mm);
 	}
+
 	if (unlikely(prev_state == TASK_DEAD)) {
 		if (prev->sched_class->task_dead)
 			prev->sched_class->task_dead(prev);
@@ -5203,9 +5204,9 @@ context_switch(struct rq *rq, struct task_struct *prev,
 
 	/*
 	 * kernel -> kernel   lazy + transfer active
-	 *   user -> kernel   lazy + mmgrab() active
+	 *   user -> kernel   lazy + mmgrab_lazy_tlb() active
 	 *
-	 * kernel ->   user   switch + mmdrop() active
+	 * kernel ->   user   switch + mmdrop_lazy_tlb() active
 	 *   user ->   user   switch
 	 */
 	if (!next->mm) {                                // to kernel
@@ -5213,7 +5214,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 
 		next->active_mm = prev->active_mm;
 		if (prev->mm)                           // from user
-			mmgrab(prev->active_mm);
+			mmgrab_lazy_tlb(prev->active_mm);
 		else
 			prev->active_mm = NULL;
 	} else {                                        // to user
@@ -5230,7 +5231,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 		lru_gen_use_mm(next->mm);
 
 		if (!prev->mm) {                        // from kernel
-			/* will mmdrop() in finish_task_switch(). */
+			/* will mmdrop_lazy_tlb() in finish_task_switch(). */
 			rq->prev_mm = prev->active_mm;
 			prev->active_mm = NULL;
 		}
@@ -9859,7 +9860,7 @@ void __init sched_init(void)
 	/*
 	 * The boot idle thread does lazy MMU switching as well:
 	 */
-	mmgrab(&init_mm);
+	mmgrab_lazy_tlb(&init_mm);
 	enter_lazy_tlb(&init_mm, current);
 
 	/*
-- 
2.37.2