From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Carlos Llamas <cmllamas@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
    Catalin Marinas <catalin.marinas@arm.com>,
    Michal Hocko <mhocko@suse.com>,
    Christian Brauner <brauner@kernel.org>,
    linux-mm@kvack.org, bpf@vger.kernel.org,
    kernel-team@android.com, Liam Howlett <liam.howlett@oracle.com>,
    Suren Baghdasaryan <surenb@google.com>,
    stable@vger.kernel.org
Subject: Re: [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails
Date: Fri, 30 Sep 2022 15:34:10 -0700
Message-ID: <CAEf4BzZC=NAT9-SCWzBkAGhYusZHokhKBQrMNSDuTWfZnr_B6A@mail.gmail.com>
In-Reply-To: <20220930003844.1210987-1-cmllamas@google.com>
On Thu, Sep 29, 2022 at 5:51 PM Carlos Llamas <cmllamas@google.com> wrote:
>
> Commit c462ac288f2c ("mm: Introduce arch_validate_flags()") added a late
> check in mmap_region() to let architectures validate vm_flags. The check
> needs to happen after calling ->mmap() as the flags can potentially be
> modified during this callback.
>
> If the arch_validate_flags() check fails, we unmap and free the vma.
> However, the error path fails to undo the ->mmap() call that previously
> succeeded. Depending on the specific ->mmap() implementation, this
> translates to reference increments, memory allocations and other
> operations that will not be cleaned up.
>
> There are several places (mainly device drivers) where this is an issue.
> However, one specific example is bpf_map_mmap() which keeps count of the
> mappings in map->writecnt. The count is incremented on ->mmap() and then
> decremented on vm_ops->close(). When arch_validate_flags() fails this
> count is off since bpf_map_mmap_close() is never called.
>
> One can reproduce this issue on arm64 devices with MTE support. Here the
> vm_flags are checked to only allow VM_MTE if VM_MTE_ALLOWED has been set
> previously. From userspace it is then enough to pass the PROT_MTE flag to
> the mmap() syscall to trigger the arch_validate_flags() failure.
>
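
For readers following along, the rule described above can be modelled in
userspace roughly as follows. This is only a sketch, not the kernel source:
the DEMO_ names and bit values are made up for illustration, and it assumes
a system where MTE is supported.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration only; the real VM_MTE and
 * VM_MTE_ALLOWED flags are defined by the kernel. */
#define DEMO_VM_MTE		(1UL << 0)
#define DEMO_VM_MTE_ALLOWED	(1UL << 1)

/* Model of the rule: VM_MTE is only acceptable when VM_MTE_ALLOWED
 * has been set beforehand. */
static bool demo_validate_flags(unsigned long vm_flags)
{
	return !(vm_flags & DEMO_VM_MTE) || (vm_flags & DEMO_VM_MTE_ALLOWED);
}

/* Usage example: only "MTE requested but not allowed" is rejected. */
static void demo_validate_self_check(void)
{
	assert(demo_validate_flags(0));
	assert(!demo_validate_flags(DEMO_VM_MTE));
	assert(demo_validate_flags(DEMO_VM_MTE | DEMO_VM_MTE_ALLOWED));
}
```

In this model a mapping that requests the tag bit without the "allowed" bit
is the single rejected case, which is the case the reproducer provokes.
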
> The following program reproduces this issue:
> ---
> #include <stdio.h>
> #include <unistd.h>
> #include <linux/unistd.h>
> #include <linux/bpf.h>
> #include <sys/mman.h>
>
> int main(void)
> {
> 	union bpf_attr attr = {
> 		.map_type = BPF_MAP_TYPE_ARRAY,
> 		.key_size = sizeof(int),
> 		.value_size = sizeof(long long),
> 		.max_entries = 256,
> 		.map_flags = BPF_F_MMAPABLE,
> 	};
> 	int fd;
>
> 	fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
> 	mmap(NULL, 4096, PROT_WRITE | PROT_MTE, MAP_SHARED, fd, 0);
>
> 	return 0;
> }
> ---
>
> By manually adding some log statements to the vm_ops callbacks we can
> confirm that when passing PROT_MTE to mmap() the map->writecnt is off
> upon ->release():
>
> With PROT_MTE flag:
> root@debian:~# ./bpf-test
> [ 111.263874] bpf_map_write_active_inc: map=9 writecnt=1
> [ 111.288763] bpf_map_release: map=9 writecnt=1
>
> Without PROT_MTE flag:
> root@debian:~# ./bpf-test
> [ 157.816912] bpf_map_write_active_inc: map=10 writecnt=1
> [ 157.830442] bpf_map_write_active_dec: map=10 writecnt=0
> [ 157.832396] bpf_map_release: map=10 writecnt=0
>
> This patch fixes the above issue by calling vm_ops->close() when the
> arch_validate_flags() check fails. After this we can proceed to unmap
> and free the vma on the error path.
>
> Fixes: c462ac288f2c ("mm: Introduce arch_validate_flags()")
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Liam Howlett <liam.howlett@oracle.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: <stable@vger.kernel.org> # v5.10+
> Signed-off-by: Carlos Llamas <cmllamas@google.com>
> ---

Makes sense to me; open/close callbacks should be symmetrical. From the
BPF side of things:

Acked-by: Andrii Nakryiko <andrii@kernel.org>

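
To illustrate the symmetry point, here is a minimal userspace sketch of the
counting problem. It is not kernel code: demo_map, demo_mmap(), demo_close()
and demo_failed_mmap() are hypothetical stand-ins for the map, the ->mmap()
callback, vm_ops->close() and the failing path in mmap_region().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a map that counts its writable mappings. */
struct demo_map {
	int writecnt;
};

/* Stand-ins for the ->mmap() and vm_ops->close() callbacks. */
static void demo_mmap(struct demo_map *map)  { map->writecnt++; }
static void demo_close(struct demo_map *map) { map->writecnt--; }

/*
 * Model of the failing path: ->mmap() has already succeeded and the
 * flag validation is assumed to fail afterwards. With call_close set
 * to false the close callback is skipped, as described in the commit
 * message; with true it runs before the mapping is torn down.
 */
static int demo_failed_mmap(struct demo_map *map, bool call_close)
{
	demo_mmap(map);
	if (call_close)
		demo_close(map);
	return map->writecnt;
}

/* Usage example: the count only returns to zero when close is called. */
static void demo_count_self_check(void)
{
	struct demo_map unbalanced = { 0 }, balanced = { 0 };

	assert(demo_failed_mmap(&unbalanced, false) == 1);
	assert(demo_failed_mmap(&balanced, true) == 0);
}
```

The leftover count of 1 in the unbalanced case corresponds to the
writecnt=1 seen at release time in the PROT_MTE log above.
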
> mm/mmap.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9d780f415be3..36c08e2c78da 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1797,7 +1797,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	if (!arch_validate_flags(vma->vm_flags)) {
>  		error = -EINVAL;
>  		if (file)
> -			goto unmap_and_free_vma;
> +			goto close_and_free_vma;
>  		else
>  			goto free_vma;
>  	}
> @@ -1844,6 +1844,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>
>  	return addr;
>
> +close_and_free_vma:
> +	if (vma->vm_ops && vma->vm_ops->close)
> +		vma->vm_ops->close(vma);
>  unmap_and_free_vma:
>  	fput(vma->vm_file);
>  	vma->vm_file = NULL;
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>