From: David Hildenbrand <david@redhat.com>
To: Yair Podemsky <ypodemsk@redhat.com>,
linux@armlinux.org.uk, mpe@ellerman.id.au, npiggin@gmail.com,
christophe.leroy@csgroup.eu, hca@linux.ibm.com,
gor@linux.ibm.com, agordeev@linux.ibm.com,
borntraeger@linux.ibm.com, svens@linux.ibm.com,
davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
hpa@zytor.com, will@kernel.org, aneesh.kumar@linux.ibm.com,
akpm@linux-foundation.org, peterz@infradead.org, arnd@arndb.de,
keescook@chromium.org, paulmck@kernel.org, jpoimboe@kernel.org,
samitolvanen@google.com, frederic@kernel.org, ardb@kernel.org,
juerg.haefliger@canonical.com, rmk+kernel@armlinux.org.uk,
geert+renesas@glider.be, tony@atomide.com,
linus.walleij@linaro.org, sebastian.reichel@collabora.com,
nick.hawkins@hpe.com, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
sparclinux@vger.kernel.org, linux-arch@vger.kernel.org,
linux-mm@kvack.org, mtosatti@redhat.com, vschneid@redhat.com,
dhildenb@redhat.com
Cc: alougovs@redhat.com
Subject: Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode
Date: Tue, 4 Apr 2023 16:03:38 +0200 [thread overview]
Message-ID: <d21bfe1d-46e6-5547-cdcb-0d851bf0834a@redhat.com> (raw)
In-Reply-To: <20230404134224.137038-4-ypodemsk@redhat.com>
On 04.04.23 15:42, Yair Podemsky wrote:
> The tlb_remove_table_smp_sync IPI is used to ensure the outdated tlb page
> is not currently being accessed and can be cleared.
> This occurs once all CPUs have left the lockless gup code section.
> If they reenter the page table walk, the pointers will be to the new
> pages.
> Therefore the IPI is only needed for CPUs in kernel mode.
> By preventing the IPI from being sent to CPUs not in kernel mode,
> Latencies are reduced.
>
> Race conditions considerations:
> The context state check is vulnerable to race conditions between the
> moment the context state is read to when the IPI is sent (or not).
>
> Here are these scenarios.
> case 1:
> CPU-A CPU-B
>
> state == CONTEXT_KERNEL
> int state = atomic_read(&ct->state);
> Kernel-exit:
> state == CONTEXT_USER
> if (state & CT_STATE_MASK == CONTEXT_KERNEL)
>
> In this case, the IPI will be sent to CPU-B despite it is no longer in
> the kernel. The consequence of which would be an unnecessary IPI being
> handled by CPU-B, causing a reduction in latency.
> This would have been the case every time without this patch.
>
> case 2:
> CPU-A CPU-B
>
> modify pagetables
> tlb_flush (memory barrier)
> state == CONTEXT_USER
> int state = atomic_read(&ct->state);
> Kernel-enter:
> state == CONTEXT_KERNEL
> READ(pagetable values)
> if (state & CT_STATE_MASK == CONTEXT_USER)
>
> In this case, the IPI will not be sent to CPU-B despite it returning to
> the kernel and even reading the pagetable.
> However since this CPU-B has entered the pagetable after the
> modification it is reading the new, safe values.
>
> The only case when this IPI is truly necessary is when CPU-B has entered
> the lockless gup code section before the pagetable modifications and
> has yet to exit them, in which case it is still in the kernel.
>
> Signed-off-by: Yair Podemsky <ypodemsk@redhat.com>
> ---
> mm/mmu_gather.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
> index 5ea9be6fb87c..731d955e152d 100644
> --- a/mm/mmu_gather.c
> +++ b/mm/mmu_gather.c
> @@ -9,6 +9,7 @@
> #include <linux/smp.h>
> #include <linux/swap.h>
> #include <linux/rmap.h>
> +#include <linux/context_tracking_state.h>
>
> #include <asm/pgalloc.h>
> #include <asm/tlb.h>
> @@ -191,6 +192,20 @@ static void tlb_remove_table_smp_sync(void *arg)
> /* Simply deliver the interrupt */
> }
>
> +
> +#ifdef CONFIG_CONTEXT_TRACKING
> +static bool cpu_in_kernel(int cpu, void *info)
> +{
> + struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> + int state = atomic_read(&ct->state);
> + /* will return true only for cpus in kernel space */
> + return state & CT_STATE_MASK == CONTEXT_KERNEL;
> +}
> +#define CONTEXT_PREDICATE cpu_in_kernel
> +#else
> +#define CONTEXT_PREDICATE NULL
> +#endif /* CONFIG_CONTEXT_TRACKING */
> +
> #ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
> #define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
> #else
> @@ -206,8 +221,8 @@ void tlb_remove_table_sync_one(struct mm_struct *mm)
> * It is however sufficient for software page-table walkers that rely on
> * IRQ disabling.
> */
> - on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
> - NULL, true);
> + on_each_cpu_cond_mask(CONTEXT_PREDICATE, tlb_remove_table_smp_sync,
> + NULL, true, REMOVE_TABLE_IPI_MASK);
> }
>
> static void tlb_remove_table_rcu(struct rcu_head *head)
Maybe a bit cleaner by avoiding CONTEXT_PREDICATE, still not completely nice
(an empty dummy function "cpu_maybe_in_kernel" might be cleanest but would
be slightly slower for !CONFIG_CONTEXT_TRACKING):
#ifdef CONFIG_CONTEXT_TRACKING
static bool cpu_in_kernel(int cpu, void *info)
{
struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
int state = atomic_read(&ct->state);
/* will return true only for cpus in kernel space */
return state & CT_STATE_MASK == CONTEXT_KERNEL;
}
#endif /* CONFIG_CONTEXT_TRACKING */
...
#ifdef CONFIG_CONTEXT_TRACKING
on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
NULL, true);
#else /* CONFIG_CONTEXT_TRACKING */
on_each_cpu_cond_mask(cpu_in_kernel, tlb_remove_table_smp_sync,
NULL, true, REMOVE_TABLE_IPI_MASK);
#endif /* CONFIG_CONTEXT_TRACKING */
--
Thanks,
David / dhildenb
next prev parent reply other threads:[~2023-04-04 14:03 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-04 13:42 [PATCH 0/3] send tlb_remove_table_smp_sync IPI only to necessary CPUs Yair Podemsky
2023-04-04 13:42 ` [PATCH 1/3] arch: Introduce ARCH_HAS_CPUMASK_BITS Yair Podemsky
2023-04-04 13:47 ` David Hildenbrand
2023-04-04 13:42 ` [PATCH 2/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
2023-04-04 14:57 ` Peter Zijlstra
2023-04-04 13:42 ` [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode Yair Podemsky
2023-04-04 14:03 ` David Hildenbrand [this message]
2023-04-04 15:12 ` Peter Zijlstra
2023-04-04 16:00 ` Peter Zijlstra
2023-04-05 0:53 ` Hillf Danton
2023-04-05 10:43 ` Frederic Weisbecker
2023-04-05 11:10 ` Frederic Weisbecker
2023-04-05 11:41 ` Peter Zijlstra
2023-04-05 12:00 ` David Hildenbrand
2023-04-05 12:05 ` Frederic Weisbecker
2023-04-05 12:31 ` Frederic Weisbecker
2023-04-05 12:45 ` Valentin Schneider
2023-04-06 13:38 ` Peter Zijlstra
2023-04-06 14:11 ` Valentin Schneider
2023-04-06 14:39 ` Peter Zijlstra
2023-04-05 19:45 ` Marcelo Tosatti
2023-04-05 19:52 ` Peter Zijlstra
2023-04-06 12:38 ` Marcelo Tosatti
2023-04-06 13:29 ` Peter Zijlstra
2023-04-06 14:04 ` Peter Zijlstra
2023-04-06 14:42 ` David Hildenbrand
2023-04-06 15:06 ` Peter Zijlstra
2023-04-06 15:02 ` Peter Zijlstra
2023-04-06 15:51 ` David Hildenbrand
2023-04-06 18:27 ` Peter Zijlstra
2023-04-19 11:30 ` David Hildenbrand
2023-04-19 11:39 ` Marcelo Tosatti
2023-04-05 19:43 ` Marcelo Tosatti
2023-04-05 19:54 ` Peter Zijlstra
2023-04-06 12:49 ` Marcelo Tosatti
2023-04-06 13:32 ` Peter Zijlstra
2023-04-19 11:01 ` Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d21bfe1d-46e6-5547-cdcb-0d851bf0834a@redhat.com \
--to=david@redhat.com \
--cc=agordeev@linux.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=alougovs@redhat.com \
--cc=aneesh.kumar@linux.ibm.com \
--cc=ardb@kernel.org \
--cc=arnd@arndb.de \
--cc=borntraeger@linux.ibm.com \
--cc=bp@alien8.de \
--cc=christophe.leroy@csgroup.eu \
--cc=dave.hansen@linux.intel.com \
--cc=davem@davemloft.net \
--cc=dhildenb@redhat.com \
--cc=frederic@kernel.org \
--cc=geert+renesas@glider.be \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
--cc=hpa@zytor.com \
--cc=jpoimboe@kernel.org \
--cc=juerg.haefliger@canonical.com \
--cc=keescook@chromium.org \
--cc=linus.walleij@linaro.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-s390@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mingo@redhat.com \
--cc=mpe@ellerman.id.au \
--cc=mtosatti@redhat.com \
--cc=nick.hawkins@hpe.com \
--cc=npiggin@gmail.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=rmk+kernel@armlinux.org.uk \
--cc=samitolvanen@google.com \
--cc=sebastian.reichel@collabora.com \
--cc=sparclinux@vger.kernel.org \
--cc=svens@linux.ibm.com \
--cc=tglx@linutronix.de \
--cc=tony@atomide.com \
--cc=vschneid@redhat.com \
--cc=will@kernel.org \
--cc=x86@kernel.org \
--cc=ypodemsk@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox