linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Yair Podemsky <ypodemsk@redhat.com>,
	linux@armlinux.org.uk, mpe@ellerman.id.au, npiggin@gmail.com,
	christophe.leroy@csgroup.eu, hca@linux.ibm.com,
	gor@linux.ibm.com, agordeev@linux.ibm.com,
	borntraeger@linux.ibm.com, svens@linux.ibm.com,
	davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com, will@kernel.org, aneesh.kumar@linux.ibm.com,
	akpm@linux-foundation.org, peterz@infradead.org, arnd@arndb.de,
	keescook@chromium.org, paulmck@kernel.org, jpoimboe@kernel.org,
	samitolvanen@google.com, frederic@kernel.org, ardb@kernel.org,
	juerg.haefliger@canonical.com, rmk+kernel@armlinux.org.uk,
	geert+renesas@glider.be, tony@atomide.com,
	linus.walleij@linaro.org, sebastian.reichel@collabora.com,
	nick.hawkins@hpe.com, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org, mtosatti@redhat.com, vschneid@redhat.com,
	dhildenb@redhat.com
Cc: alougovs@redhat.com
Subject: Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode
Date: Tue, 4 Apr 2023 16:03:38 +0200	[thread overview]
Message-ID: <d21bfe1d-46e6-5547-cdcb-0d851bf0834a@redhat.com> (raw)
In-Reply-To: <20230404134224.137038-4-ypodemsk@redhat.com>

On 04.04.23 15:42, Yair Podemsky wrote:
> The tlb_remove_table_smp_sync IPI is used to ensure the outdated tlb page
> is not currently being accessed and can be cleared.
> This occurs once all CPUs have left the lockless gup code section.
> If they reenter the page table walk, the pointers will be to the new
> pages.
> Therefore the IPI is only needed for CPUs in kernel mode.
> By preventing the IPI from being sent to CPUs not in kernel mode,
> Latencies are reduced.
> 
> Race conditions considerations:
> The context state check is vulnerable to race conditions between the
> moment the context state is read to when the IPI is sent (or not).
> 
> Here are these scenarios.
> case 1:
> CPU-A                                             CPU-B
> 
>                                                    state == CONTEXT_KERNEL
> int state = atomic_read(&ct->state);
>                                                    Kernel-exit:
>                                                    state == CONTEXT_USER
> if (state & CT_STATE_MASK == CONTEXT_KERNEL)
> 
> In this case, the IPI will be sent to CPU-B despite it is no longer in
> the kernel. The consequence of which would be an unnecessary IPI being
> handled by CPU-B, causing a reduction in latency.
> This would have been the case every time without this patch.
> 
> case 2:
> CPU-A                                             CPU-B
> 
> modify pagetables
> tlb_flush (memory barrier)
>                                                    state == CONTEXT_USER
> int state = atomic_read(&ct->state);
>                                                    Kernel-enter:
>                                                    state == CONTEXT_KERNEL
>                                                    READ(pagetable values)
> if (state & CT_STATE_MASK == CONTEXT_USER)
> 
> In this case, the IPI will not be sent to CPU-B despite it returning to
> the kernel and even reading the pagetable.
> However since this CPU-B has entered the pagetable after the
> modification it is reading the new, safe values.
> 
> The only case when this IPI is truly necessary is when CPU-B has entered
> the lockless gup code section before the pagetable modifications and
> has yet to exit them, in which case it is still in the kernel.
> 
> Signed-off-by: Yair Podemsky <ypodemsk@redhat.com>
> ---
>   mm/mmu_gather.c | 19 +++++++++++++++++--
>   1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
> index 5ea9be6fb87c..731d955e152d 100644
> --- a/mm/mmu_gather.c
> +++ b/mm/mmu_gather.c
> @@ -9,6 +9,7 @@
>   #include <linux/smp.h>
>   #include <linux/swap.h>
>   #include <linux/rmap.h>
> +#include <linux/context_tracking_state.h>
>   
>   #include <asm/pgalloc.h>
>   #include <asm/tlb.h>
> @@ -191,6 +192,20 @@ static void tlb_remove_table_smp_sync(void *arg)
>   	/* Simply deliver the interrupt */
>   }
>   
> +
> +#ifdef CONFIG_CONTEXT_TRACKING
> +static bool cpu_in_kernel(int cpu, void *info)
> +{
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> +	int state = atomic_read(&ct->state);
> +	/* will return true only for cpus in kernel space */
> +	return state & CT_STATE_MASK == CONTEXT_KERNEL;
> +}
> +#define CONTEXT_PREDICATE cpu_in_kernel
> +#else
> +#define CONTEXT_PREDICATE NULL
> +#endif /* CONFIG_CONTEXT_TRACKING */
> +
>   #ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
>   #define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
>   #else
> @@ -206,8 +221,8 @@ void tlb_remove_table_sync_one(struct mm_struct *mm)
>   	 * It is however sufficient for software page-table walkers that rely on
>   	 * IRQ disabling.
>   	 */
> -	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
> -			NULL, true);
> +	on_each_cpu_cond_mask(CONTEXT_PREDICATE, tlb_remove_table_smp_sync,
> +			NULL, true, REMOVE_TABLE_IPI_MASK);
>   }
>   
>   static void tlb_remove_table_rcu(struct rcu_head *head)


Maybe a bit cleaner by avoiding CONTEXT_PREDICATE, still not completely nice
(an empty dummy function "cpu_maybe_in_kernel" might be cleanest but would
be slightly slower for !CONFIG_CONTEXT_TRACKING):

#ifdef CONFIG_CONTEXT_TRACKING
static bool cpu_in_kernel(int cpu, void *info)
{
	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
	int state = atomic_read(&ct->state);
	/* will return true only for cpus in kernel space */
	return state & CT_STATE_MASK == CONTEXT_KERNEL;
}
#endif /* CONFIG_CONTEXT_TRACKING */


...
#ifdef CONFIG_CONTEXT_TRACKING
	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
			 NULL, true);
#else /* CONFIG_CONTEXT_TRACKING */
	on_each_cpu_cond_mask(cpu_in_kernel, tlb_remove_table_smp_sync,
			      NULL, true, REMOVE_TABLE_IPI_MASK);
#endif /* CONFIG_CONTEXT_TRACKING */


-- 
Thanks,

David / dhildenb



  reply	other threads:[~2023-04-04 14:03 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-04 13:42 [PATCH 0/3] send tlb_remove_table_smp_sync IPI only to necessary CPUs Yair Podemsky
2023-04-04 13:42 ` [PATCH 1/3] arch: Introduce ARCH_HAS_CPUMASK_BITS Yair Podemsky
2023-04-04 13:47   ` David Hildenbrand
2023-04-04 13:42 ` [PATCH 2/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
2023-04-04 14:57   ` Peter Zijlstra
2023-04-04 13:42 ` [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode Yair Podemsky
2023-04-04 14:03   ` David Hildenbrand [this message]
2023-04-04 15:12   ` Peter Zijlstra
2023-04-04 16:00     ` Peter Zijlstra
2023-04-05  0:53       ` Hillf Danton
2023-04-05 10:43   ` Frederic Weisbecker
2023-04-05 11:10     ` Frederic Weisbecker
2023-04-05 11:41       ` Peter Zijlstra
2023-04-05 12:00         ` David Hildenbrand
2023-04-05 12:05         ` Frederic Weisbecker
2023-04-05 12:31           ` Frederic Weisbecker
2023-04-05 12:45           ` Valentin Schneider
2023-04-06 13:38             ` Peter Zijlstra
2023-04-06 14:11               ` Valentin Schneider
2023-04-06 14:39                 ` Peter Zijlstra
2023-04-05 19:45       ` Marcelo Tosatti
2023-04-05 19:52         ` Peter Zijlstra
2023-04-06 12:38           ` Marcelo Tosatti
2023-04-06 13:29             ` Peter Zijlstra
2023-04-06 14:04               ` Peter Zijlstra
2023-04-06 14:42                 ` David Hildenbrand
2023-04-06 15:06                   ` Peter Zijlstra
2023-04-06 15:02                 ` Peter Zijlstra
2023-04-06 15:51                   ` David Hildenbrand
2023-04-06 18:27                     ` Peter Zijlstra
2023-04-19 11:30                       ` David Hildenbrand
2023-04-19 11:39                         ` Marcelo Tosatti
2023-04-05 19:43     ` Marcelo Tosatti
2023-04-05 19:54       ` Peter Zijlstra
2023-04-06 12:49         ` Marcelo Tosatti
2023-04-06 13:32           ` Peter Zijlstra
2023-04-19 11:01             ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d21bfe1d-46e6-5547-cdcb-0d851bf0834a@redhat.com \
    --to=david@redhat.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alougovs@redhat.com \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=borntraeger@linux.ibm.com \
    --cc=bp@alien8.de \
    --cc=christophe.leroy@csgroup.eu \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=dhildenb@redhat.com \
    --cc=frederic@kernel.org \
    --cc=geert+renesas@glider.be \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=hpa@zytor.com \
    --cc=jpoimboe@kernel.org \
    --cc=juerg.haefliger@canonical.com \
    --cc=keescook@chromium.org \
    --cc=linus.walleij@linaro.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=mtosatti@redhat.com \
    --cc=nick.hawkins@hpe.com \
    --cc=npiggin@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rmk+kernel@armlinux.org.uk \
    --cc=samitolvanen@google.com \
    --cc=sebastian.reichel@collabora.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=svens@linux.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=tony@atomide.com \
    --cc=vschneid@redhat.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=ypodemsk@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox