From: Mark Rutland <mark.rutland@arm.com>
To: Tycho Andersen <tycho@docker.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel-hardening@lists.openwall.com,
	Marco Benatto <marco.antonio.780@gmail.com>,
	Juerg Haefliger <juerg.haefliger@canonical.com>
Subject: Re: [kernel-hardening] [PATCH v5 04/10] arm64: Add __flush_tlb_one()
Date: Thu, 24 Aug 2017 16:45:19 +0100
Message-ID: <20170824154518.GB29665@leverpostej>
In-Reply-To: <20170823171302.ubnv7qyrexhhpbs7@smitten>
On Wed, Aug 23, 2017 at 11:13:02AM -0600, Tycho Andersen wrote:
> On Wed, Aug 23, 2017 at 06:04:43PM +0100, Mark Rutland wrote:
> > On Wed, Aug 23, 2017 at 10:58:42AM -0600, Tycho Andersen wrote:
> > > Hi Mark,
> > >
> > > On Mon, Aug 14, 2017 at 05:50:47PM +0100, Mark Rutland wrote:
> > > > That said, is there any reason not to use flush_tlb_kernel_range()
> > > > directly?
> > >
> > > So it turns out that there is a difference between __flush_tlb_one() and
> > > flush_tlb_kernel_range() on x86: flush_tlb_kernel_range() flushes all the TLBs
> > > via on_each_cpu(), where as __flush_tlb_one() only flushes the local TLB (which
> > > I think is enough here).
> >
> > That sounds suspicious; I don't think that __flush_tlb_one() is
> > sufficient.
> >
> > If you only do local TLB maintenance, then the page is left accessible
> > to other CPUs via the (stale) kernel mappings. i.e. the page isn't
> > exclusively mapped by userspace.
>
> I thought so too, so I tried to test it with something like the patch
> below. But it correctly failed for me when using __flush_tlb_one(). I
> suppose I'm doing something wrong in the test, but I'm not sure what.

I suspect the issue is that you use a completion to synchronise the
mapping.

The reader thread will block in wait_for_completion() (i.e. it will go
into schedule() and something else will run), and I guess that on x86
the context switch this entails happens to invalidate the TLBs.

Instead, you could serialise the update with the reader doing:

	/* spin until address is published to us */
	addr = smp_cond_load_acquire(&arg->virt_addr, VAL != NULL);
	read_map(addr);

... and the writer doing:

	user_addr = do_map(...);
	...
	smp_store_release(&arg->virt_addr, user_to_kernel(user_addr));

There would still be a chance of a context-switch, but it wouldn't be
mandatory.
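
To show the shape of that handshake outside the kernel, here is a
userspace analogue. It is only a sketch: it assumes C11 atomics and
pthreads standing in for smp_cond_load_acquire() and
smp_store_release(), and struct pub_arg, pub_reader() and
publish_and_read() are names made up for the example, not part of the
patch.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct smp_arg from the test. */
struct pub_arg {
	_Atomic(unsigned long *) virt_addr;	/* NULL until published */
	unsigned long seen;			/* value the reader observed */
};

/* Reader: busy-wait, without any sleeping primitive, until published. */
static void *pub_reader(void *parg)
{
	struct pub_arg *arg = parg;
	unsigned long *addr;

	/* The acquire load pairs with the writer's release store. */
	while (!(addr = atomic_load_explicit(&arg->virt_addr,
					     memory_order_acquire)))
		;	/* spin until address is published to us */

	arg->seen = *addr;
	return NULL;
}

/* Writer: set up the data first, then publish its address with release. */
static unsigned long publish_and_read(unsigned long *data, unsigned long val)
{
	struct pub_arg arg = { .seen = 0 };
	pthread_t thread;

	atomic_init(&arg.virt_addr, NULL);
	if (pthread_create(&thread, NULL, pub_reader, &arg))
		return 0;	/* could not start the reader */

	*data = val;
	atomic_store_explicit(&arg.virt_addr, data, memory_order_release);

	pthread_join(thread, NULL);
	return arg.seen;
}
```

The reader never calls anything that sleeps, so the scheduler may still
preempt it but nothing forces a context switch between the publish and
the read.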

As an aside, it looks like DEBUG_PAGEALLOC on x86 has a similar
under-invalidation problem, judging by the comments in x86's
__kernel_map_pages(). It only invalidates the local TLBs, even though
it should do so on all CPUs.
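
To make the local-vs-all-CPUs distinction concrete, here is a toy model.
It is not kernel code and does not model real TLB behaviour; each CPU
just has one flag meaning "still caches the kernel mapping", and all the
toy_* names are invented for the illustration.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* One flag per CPU: does its (toy) TLB still cache the kernel mapping? */
struct toy_tlbs {
	bool cached[TOY_NR_CPUS];
};

/* Every CPU has touched the page, so every TLB caches the mapping. */
static void toy_touch_all(struct toy_tlbs *t)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		t->cached[cpu] = true;
}

/* Local-only invalidation: only the calling CPU drops its entry. */
static void toy_flush_local(struct toy_tlbs *t, int this_cpu)
{
	t->cached[this_cpu] = false;
}

/* Broadcast invalidation: every CPU drops its entry. */
static void toy_flush_all(struct toy_tlbs *t)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		t->cached[cpu] = false;
}

/* Count CPUs that could still reach the page through a stale entry. */
static int toy_stale_cpus(const struct toy_tlbs *t)
{
	int stale = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (t->cached[cpu])
			stale++;
	return stale;
}
```

After toy_touch_all() and toy_flush_local(&t, 0), toy_stale_cpus()
reports 3 in this model; after toy_flush_all() it reports 0.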
Thanks,
Mark.
>
> Tycho
>
>
> From 1d1b0a18d56cf1634072096231bfbaa96cb2aa16 Mon Sep 17 00:00:00 2001
> From: Tycho Andersen <tycho@docker.com>
> Date: Tue, 22 Aug 2017 18:07:12 -0600
> Subject: [PATCH] add XPFO_SMP test
>
> Signed-off-by: Tycho Andersen <tycho@docker.com>
> ---
>  drivers/misc/lkdtm.h      |   1 +
>  drivers/misc/lkdtm_core.c |   1 +
>  drivers/misc/lkdtm_xpfo.c | 139 ++++++++++++++++++++++++++++++++++++++++++----
>  3 files changed, 130 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/misc/lkdtm.h b/drivers/misc/lkdtm.h
> index fc53546113c1..34a6ee37f216 100644
> --- a/drivers/misc/lkdtm.h
> +++ b/drivers/misc/lkdtm.h
> @@ -67,5 +67,6 @@ void lkdtm_USERCOPY_KERNEL(void);
>  /* lkdtm_xpfo.c */
>  void lkdtm_XPFO_READ_USER(void);
>  void lkdtm_XPFO_READ_USER_HUGE(void);
> +void lkdtm_XPFO_SMP(void);
>
>  #endif
> diff --git a/drivers/misc/lkdtm_core.c b/drivers/misc/lkdtm_core.c
> index 164bc404f416..9544e329de4b 100644
> --- a/drivers/misc/lkdtm_core.c
> +++ b/drivers/misc/lkdtm_core.c
> @@ -237,6 +237,7 @@ struct crashtype crashtypes[] = {
>  	CRASHTYPE(USERCOPY_KERNEL),
>  	CRASHTYPE(XPFO_READ_USER),
>  	CRASHTYPE(XPFO_READ_USER_HUGE),
> +	CRASHTYPE(XPFO_SMP),
>  };
>
>
> diff --git a/drivers/misc/lkdtm_xpfo.c b/drivers/misc/lkdtm_xpfo.c
> index c72509128eb3..7600fdcae22f 100644
> --- a/drivers/misc/lkdtm_xpfo.c
> +++ b/drivers/misc/lkdtm_xpfo.c
> @@ -4,22 +4,27 @@
>
>  #include "lkdtm.h"
>
> +#include <linux/cpumask.h>
>  #include <linux/mman.h>
>  #include <linux/uaccess.h>
>  #include <linux/xpfo.h>
> +#include <linux/kthread.h>
>
> -void read_user_with_flags(unsigned long flags)
> +#include <linux/delay.h>
> +#include <linux/sched/task.h>
> +
> +#define XPFO_DATA 0xdeadbeef
> +
> +static unsigned long do_map(unsigned long flags)
>  {
> -	unsigned long user_addr, user_data = 0xdeadbeef;
> -	phys_addr_t phys_addr;
> -	void *virt_addr;
> +	unsigned long user_addr, user_data = XPFO_DATA;
>
>  	user_addr = vm_mmap(NULL, 0, PAGE_SIZE,
>  			    PROT_READ | PROT_WRITE | PROT_EXEC,
>  			    flags, 0);
>  	if (user_addr >= TASK_SIZE) {
>  		pr_warn("Failed to allocate user memory\n");
> -		return;
> +		return 0;
>  	}
>
>  	if (copy_to_user((void __user *)user_addr, &user_data,
> @@ -28,25 +33,61 @@ void read_user_with_flags(unsigned long flags)
>  		goto free_user;
>  	}
>
> +	return user_addr;
> +
> +free_user:
> +	vm_munmap(user_addr, PAGE_SIZE);
> +	return 0;
> +}
> +
> +static unsigned long *user_to_kernel(unsigned long user_addr)
> +{
> +	phys_addr_t phys_addr;
> +	void *virt_addr;
> +
>  	phys_addr = user_virt_to_phys(user_addr);
>  	if (!phys_addr) {
>  		pr_warn("Failed to get physical address of user memory\n");
> -		goto free_user;
> +		return 0;
>  	}
>
>  	virt_addr = phys_to_virt(phys_addr);
>  	if (phys_addr != virt_to_phys(virt_addr)) {
>  		pr_warn("Physical address of user memory seems incorrect\n");
> -		goto free_user;
> +		return 0;
>  	}
>
> +	return virt_addr;
> +}
> +
> +static void read_map(unsigned long *virt_addr)
> +{
>  	pr_info("Attempting bad read from kernel address %p\n", virt_addr);
> -	if (*(unsigned long *)virt_addr == user_data)
> -		pr_info("Huh? Bad read succeeded?!\n");
> +	if (*(unsigned long *)virt_addr == XPFO_DATA)
> +		pr_err("FAIL: Bad read succeeded?!\n");
>  	else
> -		pr_info("Huh? Bad read didn't fail but data is incorrect?!\n");
> +		pr_err("FAIL: Bad read didn't fail but data is incorrect?!\n");
> +}
> +
> +static void read_user_with_flags(unsigned long flags)
> +{
> +	unsigned long user_addr, *kernel;
> +
> +	user_addr = do_map(flags);
> +	if (!user_addr) {
> +		pr_err("FAIL: map failed\n");
> +		return;
> +	}
> +
> +	kernel = user_to_kernel(user_addr);
> +	if (!kernel) {
> +		pr_err("FAIL: user to kernel conversion failed\n");
> +		goto free_user;
> +	}
> +
> +	read_map(kernel);
>
> - free_user:
> +free_user:
>  	vm_munmap(user_addr, PAGE_SIZE);
>  }
>
> @@ -60,3 +101,79 @@ void lkdtm_XPFO_READ_USER_HUGE(void)
>  {
>  	read_user_with_flags(MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB);
>  }
> +
> +struct smp_arg {
> +	struct completion map_done;
> +	unsigned long *virt_addr;
> +	unsigned int cpu;
> +};
> +
> +static int smp_reader(void *parg)
> +{
> +	struct smp_arg *arg = parg;
> +
> +	if (arg->cpu != smp_processor_id()) {
> +		pr_err("FAIL: scheduled on wrong CPU?\n");
> +		return 0;
> +	}
> +
> +	wait_for_completion(&arg->map_done);
> +
> +	if (arg->virt_addr)
> +		read_map(arg->virt_addr);
> +
> +	return 0;
> +}
> +
> +/* The idea here is to read from the kernel's map on a different thread than
> + * did the mapping (and thus the TLB flushing), to make sure that the page
> + * faults on other cores too.
> + */
> +void lkdtm_XPFO_SMP(void)
> +{
> +	unsigned long user_addr;
> +	struct task_struct *thread;
> +	int ret;
> +	struct smp_arg arg;
> +
> +	init_completion(&arg.map_done);
> +
> +	if (num_online_cpus() < 2) {
> +		pr_err("not enough to do a multi cpu test\n");
> +		return;
> +	}
> +
> +	arg.cpu = (smp_processor_id() + 1) % num_online_cpus();
> +	thread = kthread_create(smp_reader, &arg, "lkdtm_xpfo_test");
> +	if (IS_ERR(thread)) {
> +		pr_err("couldn't create kthread? %ld\n", PTR_ERR(thread));
> +		return;
> +	}
> +
> +	kthread_bind(thread, arg.cpu);
> +	get_task_struct(thread);
> +	wake_up_process(thread);
> +
> +	user_addr = do_map(MAP_PRIVATE | MAP_ANONYMOUS);
> +	if (user_addr) {
> +		arg.virt_addr = user_to_kernel(user_addr);
> +		/* child thread checks for failure */
> +	}
> +
> +	complete(&arg.map_done);
> +
> +	/* there must be a better way to do this. */
> +	while (1) {
> +		if (thread->exit_state)
> +			break;
> +		msleep_interruptible(100);
> +	}
> +
> +	ret = kthread_stop(thread);
> +	if (ret != SIGKILL)
> +		pr_err("FAIL: thread wasn't killed: %d\n", ret);
> +	put_task_struct(thread);
> +
> +	if (user_addr)
> +		vm_munmap(user_addr, PAGE_SIZE);
> +}
> --
> 2.11.0
>