* [RFC][PATCH] mm: Split locks in remap_file_pages()
@ 2024-10-18 14:47 Roberto Sassu
2024-10-18 15:05 ` Jann Horn
2024-10-18 15:42 ` Lorenzo Stoakes
0 siblings, 2 replies; 5+ messages in thread
From: Roberto Sassu @ 2024-10-18 14:47 UTC (permalink / raw)
To: akpm, Liam.Howlett, lorenzo.stoakes, vbabka, jannh
Cc: linux-mm, linux-kernel, ebpqwerty472123, paul, zohar,
dmitry.kasatkin, eric.snowberg, jmorris, serge, linux-integrity,
linux-security-module, bpf, linux-fsdevel, Kirill A. Shutemov,
stable, syzbot+91ae49e1c1a2634d20c0, Roberto Sassu
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
remap_file_pages()") fixed a security issue, it added an LSM check when
trying to remap file pages, so that LSMs have the opportunity to evaluate
such action like for other memory operations such as mmap() and mprotect().
However, that commit called security_mmap_file() inside the mmap_lock lock,
while the other calls do it before taking the lock, after commit
8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
This caused lock inversion issue with IMA which was taking the mmap_lock
and i_mutex lock in the opposite way when the remap_file_pages() system
call was called.
Solve the issue by splitting the critical region in remap_file_pages() in
two regions: the first takes a read lock of mmap_lock and retrieves the VMA
and the file associated, and calculate the 'prot' and 'flags' variable; the
second takes a write lock on mmap_lock, checks that the VMA flags and the
VMA file descriptor are the same as the ones obtained in the first critical
region (otherwise the system call fails), and calls do_mmap().
In between, after releasing the read lock and taking the write lock, call
security_mmap_file(), and solve the lock inversion issue.
Cc: stable@vger.kernel.org
Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> (Calculate prot and flags earlier)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
mm/mmap.c | 62 ++++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 45 insertions(+), 17 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 9c0fb43064b5..762944427e03 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
unsigned long populate = 0;
unsigned long ret = -EINVAL;
struct file *file;
+ vm_flags_t vm_flags;
pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
current->comm, current->pid);
@@ -1656,12 +1657,53 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
if (pgoff + (size >> PAGE_SHIFT) < pgoff)
return ret;
- if (mmap_write_lock_killable(mm))
+ if (mmap_read_lock_killable(mm))
+ return -EINTR;
+
+ vma = vma_lookup(mm, start);
+
+ if (!vma || !(vma->vm_flags & VM_SHARED)) {
+ mmap_read_unlock(mm);
+ return -EINVAL;
+ }
+
+ prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
+ prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
+ prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
+
+ flags &= MAP_NONBLOCK;
+ flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
+ if (vma->vm_flags & VM_LOCKED)
+ flags |= MAP_LOCKED;
+
+ /* Save vm_flags used to calculate prot and flags, and recheck later. */
+ vm_flags = vma->vm_flags;
+ file = get_file(vma->vm_file);
+
+ mmap_read_unlock(mm);
+
+ ret = security_mmap_file(file, prot, flags);
+ if (ret) {
+ fput(file);
+ return ret;
+ }
+
+ ret = -EINVAL;
+
+ if (mmap_write_lock_killable(mm)) {
+ fput(file);
return -EINTR;
+ }
vma = vma_lookup(mm, start);
- if (!vma || !(vma->vm_flags & VM_SHARED))
+ if (!vma)
+ goto out;
+
+ if (vma->vm_flags != vm_flags)
+ goto out;
+
+ if (vma->vm_file != file)
goto out;
if (start + size > vma->vm_end) {
@@ -1689,25 +1731,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
goto out;
}
- prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
- prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
- prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
-
- flags &= MAP_NONBLOCK;
- flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
- if (vma->vm_flags & VM_LOCKED)
- flags |= MAP_LOCKED;
-
- file = get_file(vma->vm_file);
- ret = security_mmap_file(vma->vm_file, prot, flags);
- if (ret)
- goto out_fput;
ret = do_mmap(vma->vm_file, start, size,
prot, flags, 0, pgoff, &populate, NULL);
-out_fput:
- fput(file);
out:
mmap_write_unlock(mm);
+ fput(file);
if (populate)
mm_populate(ret, populate);
if (!IS_ERR_VALUE(ret))
--
2.34.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC][PATCH] mm: Split locks in remap_file_pages()
2024-10-18 14:47 [RFC][PATCH] mm: Split locks in remap_file_pages() Roberto Sassu
@ 2024-10-18 15:05 ` Jann Horn
2024-10-18 15:42 ` Lorenzo Stoakes
1 sibling, 0 replies; 5+ messages in thread
From: Jann Horn @ 2024-10-18 15:05 UTC (permalink / raw)
To: Roberto Sassu
Cc: akpm, Liam.Howlett, lorenzo.stoakes, vbabka, linux-mm,
linux-kernel, ebpqwerty472123, paul, zohar, dmitry.kasatkin,
eric.snowberg, jmorris, serge, linux-integrity,
linux-security-module, bpf, linux-fsdevel, Kirill A. Shutemov,
stable, syzbot+91ae49e1c1a2634d20c0, Roberto Sassu
On Fri, Oct 18, 2024 at 4:48 PM Roberto Sassu
<roberto.sassu@huaweicloud.com> wrote:
> Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
> remap_file_pages()") fixed a security issue, it added an LSM check when
> trying to remap file pages, so that LSMs have the opportunity to evaluate
> such action like for other memory operations such as mmap() and mprotect().
>
> However, that commit called security_mmap_file() inside the mmap_lock lock,
> while the other calls do it before taking the lock, after commit
> 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
>
> This caused lock inversion issue with IMA which was taking the mmap_lock
> and i_mutex lock in the opposite way when the remap_file_pages() system
> call was called.
>
> Solve the issue by splitting the critical region in remap_file_pages() in
> two regions: the first takes a read lock of mmap_lock and retrieves the VMA
> and the file associated, and calculate the 'prot' and 'flags' variable; the
> second takes a write lock on mmap_lock, checks that the VMA flags and the
> VMA file descriptor are the same as the ones obtained in the first critical
> region (otherwise the system call fails), and calls do_mmap().
>
> In between, after releasing the read lock and taking the write lock, call
> security_mmap_file(), and solve the lock inversion issue.
>
> Cc: stable@vger.kernel.org
> Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
> Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/
> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> (Calculate prot and flags earlier)
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Jann Horn <jannh@google.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC][PATCH] mm: Split locks in remap_file_pages()
2024-10-18 14:47 [RFC][PATCH] mm: Split locks in remap_file_pages() Roberto Sassu
2024-10-18 15:05 ` Jann Horn
@ 2024-10-18 15:42 ` Lorenzo Stoakes
2024-10-18 15:45 ` Roberto Sassu
1 sibling, 1 reply; 5+ messages in thread
From: Lorenzo Stoakes @ 2024-10-18 15:42 UTC (permalink / raw)
To: Roberto Sassu
Cc: akpm, Liam.Howlett, vbabka, jannh, linux-mm, linux-kernel,
ebpqwerty472123, paul, zohar, dmitry.kasatkin, eric.snowberg,
jmorris, serge, linux-integrity, linux-security-module, bpf,
linux-fsdevel, Kirill A. Shutemov, stable,
syzbot+91ae49e1c1a2634d20c0, Roberto Sassu
On Fri, Oct 18, 2024 at 04:47:10PM +0200, Roberto Sassu wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>
> Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
> remap_file_pages()") fixed a security issue, it added an LSM check when
> trying to remap file pages, so that LSMs have the opportunity to evaluate
> such action like for other memory operations such as mmap() and mprotect().
>
> However, that commit called security_mmap_file() inside the mmap_lock lock,
> while the other calls do it before taking the lock, after commit
> 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
>
> This caused lock inversion issue with IMA which was taking the mmap_lock
> and i_mutex lock in the opposite way when the remap_file_pages() system
> call was called.
>
> Solve the issue by splitting the critical region in remap_file_pages() in
> two regions: the first takes a read lock of mmap_lock and retrieves the VMA
> and the file associated, and calculate the 'prot' and 'flags' variable; the
> second takes a write lock on mmap_lock, checks that the VMA flags and the
> VMA file descriptor are the same as the ones obtained in the first critical
> region (otherwise the system call fails), and calls do_mmap().
>
> In between, after releasing the read lock and taking the write lock, call
> security_mmap_file(), and solve the lock inversion issue.
Great description!
>
> Cc: stable@vger.kernel.org
> Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
> Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/
> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> (Calculate prot and flags earlier)
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Other than some nits below:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
I think you're definitely good to un-RFC here.
> ---
> mm/mmap.c | 62 ++++++++++++++++++++++++++++++++++++++++---------------
> 1 file changed, 45 insertions(+), 17 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9c0fb43064b5..762944427e03 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> unsigned long populate = 0;
> unsigned long ret = -EINVAL;
> struct file *file;
> + vm_flags_t vm_flags;
>
> pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
> current->comm, current->pid);
> @@ -1656,12 +1657,53 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> if (pgoff + (size >> PAGE_SHIFT) < pgoff)
> return ret;
>
> - if (mmap_write_lock_killable(mm))
> + if (mmap_read_lock_killable(mm))
> + return -EINTR;
I'm kinda verbose generally, but I'd love a comment like:
/*
* Look up VMA under read lock first so we can perform the security
* without holding locks (which can be problematic). We reacquire a
* write lock later and check nothing changed underneath us.
*/
> +
> + vma = vma_lookup(mm, start);
> +
> + if (!vma || !(vma->vm_flags & VM_SHARED)) {
> + mmap_read_unlock(mm);
> + return -EINVAL;
> + }
> +
> + prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> + prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> + prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> +
> + flags &= MAP_NONBLOCK;
> + flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> + if (vma->vm_flags & VM_LOCKED)
> + flags |= MAP_LOCKED;
> +
> + /* Save vm_flags used to calculate prot and flags, and recheck later. */
> + vm_flags = vma->vm_flags;
> + file = get_file(vma->vm_file);
> +
> + mmap_read_unlock(mm);
> +
Maybe worth adding a comment to explain why you're doing this without the
lock so somebody looking at this later can understand the dance?
> + ret = security_mmap_file(file, prot, flags);
> + if (ret) {
> + fput(file);
> + return ret;
> + }
> +
> + ret = -EINVAL;
> +
Again, being verbose, I'd put something here like:
/* OK security check passed, take write lock + let it rip */
> + if (mmap_write_lock_killable(mm)) {
> + fput(file);
> return -EINTR;
> + }
>
> vma = vma_lookup(mm, start);
>
> - if (!vma || !(vma->vm_flags & VM_SHARED))
> + if (!vma)
> + goto out;
> +
I'd also add something like:
/* Make sure things didn't change under us. */
> + if (vma->vm_flags != vm_flags)
> + goto out;
> +
And drop this newline to group them together (super nitty I know, sorry!)
> + if (vma->vm_file != file)
> goto out;
>
> if (start + size > vma->vm_end) {
> @@ -1689,25 +1731,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> goto out;
> }
>
> - prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> - prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> - prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> -
> - flags &= MAP_NONBLOCK;
> - flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> - if (vma->vm_flags & VM_LOCKED)
> - flags |= MAP_LOCKED;
> -
> - file = get_file(vma->vm_file);
> - ret = security_mmap_file(vma->vm_file, prot, flags);
> - if (ret)
> - goto out_fput;
> ret = do_mmap(vma->vm_file, start, size,
> prot, flags, 0, pgoff, &populate, NULL);
> -out_fput:
> - fput(file);
> out:
> mmap_write_unlock(mm);
> + fput(file);
> if (populate)
> mm_populate(ret, populate);
> if (!IS_ERR_VALUE(ret))
> --
> 2.34.1
>
These are just nits, this looks good to me!
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC][PATCH] mm: Split locks in remap_file_pages()
2024-10-18 15:42 ` Lorenzo Stoakes
@ 2024-10-18 15:45 ` Roberto Sassu
2024-10-18 15:50 ` Roberto Sassu
0 siblings, 1 reply; 5+ messages in thread
From: Roberto Sassu @ 2024-10-18 15:45 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: akpm, Liam.Howlett, vbabka, jannh, linux-mm, linux-kernel,
ebpqwerty472123, paul, zohar, dmitry.kasatkin, eric.snowberg,
jmorris, serge, linux-integrity, linux-security-module, bpf,
linux-fsdevel, Kirill A. Shutemov, stable,
syzbot+91ae49e1c1a2634d20c0, Roberto Sassu
On Fri, 2024-10-18 at 16:42 +0100, Lorenzo Stoakes wrote:
> On Fri, Oct 18, 2024 at 04:47:10PM +0200, Roberto Sassu wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> >
> > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
> > remap_file_pages()") fixed a security issue, it added an LSM check when
> > trying to remap file pages, so that LSMs have the opportunity to evaluate
> > such action like for other memory operations such as mmap() and mprotect().
> >
> > However, that commit called security_mmap_file() inside the mmap_lock lock,
> > while the other calls do it before taking the lock, after commit
> > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
> >
> > This caused lock inversion issue with IMA which was taking the mmap_lock
> > and i_mutex lock in the opposite way when the remap_file_pages() system
> > call was called.
> >
> > Solve the issue by splitting the critical region in remap_file_pages() in
> > two regions: the first takes a read lock of mmap_lock and retrieves the VMA
> > and the file associated, and calculate the 'prot' and 'flags' variable; the
> > second takes a write lock on mmap_lock, checks that the VMA flags and the
> > VMA file descriptor are the same as the ones obtained in the first critical
> > region (otherwise the system call fails), and calls do_mmap().
> >
> > In between, after releasing the read lock and taking the write lock, call
> > security_mmap_file(), and solve the lock inversion issue.
>
> Great description!
>
> >
> > Cc: stable@vger.kernel.org
> > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
> > Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/
> > Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> (Calculate prot and flags earlier)
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> Other than some nits below:
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> I think you're definitely good to un-RFC here.
Perfect, will do. Thank you!
Roberto
> > ---
> > mm/mmap.c | 62 ++++++++++++++++++++++++++++++++++++++++---------------
> > 1 file changed, 45 insertions(+), 17 deletions(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 9c0fb43064b5..762944427e03 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > unsigned long populate = 0;
> > unsigned long ret = -EINVAL;
> > struct file *file;
> > + vm_flags_t vm_flags;
> >
> > pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
> > current->comm, current->pid);
> > @@ -1656,12 +1657,53 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > if (pgoff + (size >> PAGE_SHIFT) < pgoff)
> > return ret;
> >
> > - if (mmap_write_lock_killable(mm))
> > + if (mmap_read_lock_killable(mm))
> > + return -EINTR;
>
> I'm kinda verbose generally, but I'd love a comment like:
>
> /*
> * Look up VMA under read lock first so we can perform the security
> * without holding locks (which can be problematic). We reacquire a
> * write lock later and check nothing changed underneath us.
> */
>
> > +
> > + vma = vma_lookup(mm, start);
> > +
> > + if (!vma || !(vma->vm_flags & VM_SHARED)) {
> > + mmap_read_unlock(mm);
> > + return -EINVAL;
> > + }
> > +
> > + prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> > + prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> > + prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> > +
> > + flags &= MAP_NONBLOCK;
> > + flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > + if (vma->vm_flags & VM_LOCKED)
> > + flags |= MAP_LOCKED;
> > +
> > + /* Save vm_flags used to calculate prot and flags, and recheck later. */
> > + vm_flags = vma->vm_flags;
> > + file = get_file(vma->vm_file);
> > +
> > + mmap_read_unlock(mm);
> > +
>
> Maybe worth adding a comment to explain why you're doing this without the
> lock so somebody looking at this later can understand the dance?
>
> > + ret = security_mmap_file(file, prot, flags);
> > + if (ret) {
> > + fput(file);
> > + return ret;
> > + }
> > +
> > + ret = -EINVAL;
> > +
>
> Again, being verbose, I'd put something here like:
>
> /* OK security check passed, take write lock + let it rip */
>
> > + if (mmap_write_lock_killable(mm)) {
> > + fput(file);
> > return -EINTR;
> > + }
> >
> > vma = vma_lookup(mm, start);
> >
> > - if (!vma || !(vma->vm_flags & VM_SHARED))
> > + if (!vma)
> > + goto out;
> > +
>
> I'd also add something like:
>
> /* Make sure things didn't change under us. */
>
> > + if (vma->vm_flags != vm_flags)
> > + goto out;
> > +
>
> And drop this newline to group them together (super nitty I know, sorry!)
>
> > + if (vma->vm_file != file)
> > goto out;
> >
> > if (start + size > vma->vm_end) {
> > @@ -1689,25 +1731,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > goto out;
> > }
> >
> > - prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> > - prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> > - prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> > -
> > - flags &= MAP_NONBLOCK;
> > - flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > - if (vma->vm_flags & VM_LOCKED)
> > - flags |= MAP_LOCKED;
> > -
> > - file = get_file(vma->vm_file);
> > - ret = security_mmap_file(vma->vm_file, prot, flags);
> > - if (ret)
> > - goto out_fput;
> > ret = do_mmap(vma->vm_file, start, size,
> > prot, flags, 0, pgoff, &populate, NULL);
> > -out_fput:
> > - fput(file);
> > out:
> > mmap_write_unlock(mm);
> > + fput(file);
> > if (populate)
> > mm_populate(ret, populate);
> > if (!IS_ERR_VALUE(ret))
> > --
> > 2.34.1
> >
>
> These are just nits, this looks good to me!
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC][PATCH] mm: Split locks in remap_file_pages()
2024-10-18 15:45 ` Roberto Sassu
@ 2024-10-18 15:50 ` Roberto Sassu
0 siblings, 0 replies; 5+ messages in thread
From: Roberto Sassu @ 2024-10-18 15:50 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: akpm, Liam.Howlett, vbabka, jannh, linux-mm, linux-kernel,
ebpqwerty472123, paul, zohar, dmitry.kasatkin, eric.snowberg,
jmorris, serge, linux-integrity, linux-security-module, bpf,
linux-fsdevel, Kirill A. Shutemov, stable,
syzbot+91ae49e1c1a2634d20c0, Roberto Sassu
On Fri, 2024-10-18 at 17:45 +0200, Roberto Sassu wrote:
> On Fri, 2024-10-18 at 16:42 +0100, Lorenzo Stoakes wrote:
> > On Fri, Oct 18, 2024 at 04:47:10PM +0200, Roberto Sassu wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > >
> > > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
> > > remap_file_pages()") fixed a security issue, it added an LSM check when
> > > trying to remap file pages, so that LSMs have the opportunity to evaluate
> > > such action like for other memory operations such as mmap() and mprotect().
> > >
> > > However, that commit called security_mmap_file() inside the mmap_lock lock,
> > > while the other calls do it before taking the lock, after commit
> > > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
> > >
> > > This caused lock inversion issue with IMA which was taking the mmap_lock
> > > and i_mutex lock in the opposite way when the remap_file_pages() system
> > > call was called.
> > >
> > > Solve the issue by splitting the critical region in remap_file_pages() in
> > > two regions: the first takes a read lock of mmap_lock and retrieves the VMA
> > > and the file associated, and calculate the 'prot' and 'flags' variable; the
> > > second takes a write lock on mmap_lock, checks that the VMA flags and the
> > > VMA file descriptor are the same as the ones obtained in the first critical
> > > region (otherwise the system call fails), and calls do_mmap().
> > >
> > > In between, after releasing the read lock and taking the write lock, call
> > > security_mmap_file(), and solve the lock inversion issue.
> >
> > Great description!
> >
> > >
> > > Cc: stable@vger.kernel.org
> > > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
> > > Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
> > > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/
> > > Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> (Calculate prot and flags earlier)
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >
> > Other than some nits below:
> >
> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> >
> > I think you're definitely good to un-RFC here.
>
> Perfect, will do. Thank you!
I'm just going to change a bit the commit title:
mm: Split critical region in remap_file_pages() and invoke LSMs in
between
Roberto
> Roberto
>
> > > ---
> > > mm/mmap.c | 62 ++++++++++++++++++++++++++++++++++++++++---------------
> > > 1 file changed, 45 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 9c0fb43064b5..762944427e03 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > > unsigned long populate = 0;
> > > unsigned long ret = -EINVAL;
> > > struct file *file;
> > > + vm_flags_t vm_flags;
> > >
> > > pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
> > > current->comm, current->pid);
> > > @@ -1656,12 +1657,53 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > > if (pgoff + (size >> PAGE_SHIFT) < pgoff)
> > > return ret;
> > >
> > > - if (mmap_write_lock_killable(mm))
> > > + if (mmap_read_lock_killable(mm))
> > > + return -EINTR;
> >
> > I'm kinda verbose generally, but I'd love a comment like:
> >
> > /*
> > * Look up VMA under read lock first so we can perform the security
> > * without holding locks (which can be problematic). We reacquire a
> > * write lock later and check nothing changed underneath us.
> > */
> >
> > > +
> > > + vma = vma_lookup(mm, start);
> > > +
> > > + if (!vma || !(vma->vm_flags & VM_SHARED)) {
> > > + mmap_read_unlock(mm);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> > > + prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> > > + prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> > > +
> > > + flags &= MAP_NONBLOCK;
> > > + flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > > + if (vma->vm_flags & VM_LOCKED)
> > > + flags |= MAP_LOCKED;
> > > +
> > > + /* Save vm_flags used to calculate prot and flags, and recheck later. */
> > > + vm_flags = vma->vm_flags;
> > > + file = get_file(vma->vm_file);
> > > +
> > > + mmap_read_unlock(mm);
> > > +
> >
> > Maybe worth adding a comment to explain why you're doing this without the
> > lock so somebody looking at this later can understand the dance?
> >
> > > + ret = security_mmap_file(file, prot, flags);
> > > + if (ret) {
> > > + fput(file);
> > > + return ret;
> > > + }
> > > +
> > > + ret = -EINVAL;
> > > +
> >
> > Again, being verbose, I'd put something here like:
> >
> > /* OK security check passed, take write lock + let it rip */
> >
> > > + if (mmap_write_lock_killable(mm)) {
> > > + fput(file);
> > > return -EINTR;
> > > + }
> > >
> > > vma = vma_lookup(mm, start);
> > >
> > > - if (!vma || !(vma->vm_flags & VM_SHARED))
> > > + if (!vma)
> > > + goto out;
> > > +
> >
> > I'd also add something like:
> >
> > /* Make sure things didn't change under us. */
> >
> > > + if (vma->vm_flags != vm_flags)
> > > + goto out;
> > > +
> >
> > And drop this newline to group them together (super nitty I know, sorry!)
> >
> > > + if (vma->vm_file != file)
> > > goto out;
> > >
> > > if (start + size > vma->vm_end) {
> > > @@ -1689,25 +1731,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > > goto out;
> > > }
> > >
> > > - prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> > > - prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> > > - prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> > > -
> > > - flags &= MAP_NONBLOCK;
> > > - flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > > - if (vma->vm_flags & VM_LOCKED)
> > > - flags |= MAP_LOCKED;
> > > -
> > > - file = get_file(vma->vm_file);
> > > - ret = security_mmap_file(vma->vm_file, prot, flags);
> > > - if (ret)
> > > - goto out_fput;
> > > ret = do_mmap(vma->vm_file, start, size,
> > > prot, flags, 0, pgoff, &populate, NULL);
> > > -out_fput:
> > > - fput(file);
> > > out:
> > > mmap_write_unlock(mm);
> > > + fput(file);
> > > if (populate)
> > > mm_populate(ret, populate);
> > > if (!IS_ERR_VALUE(ret))
> > > --
> > > 2.34.1
> > >
> >
> > These are just nits, this looks good to me!
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-10-18 15:50 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-18 14:47 [RFC][PATCH] mm: Split locks in remap_file_pages() Roberto Sassu
2024-10-18 15:05 ` Jann Horn
2024-10-18 15:42 ` Lorenzo Stoakes
2024-10-18 15:45 ` Roberto Sassu
2024-10-18 15:50 ` Roberto Sassu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox