linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ruihan Li <lrh2000@pku.edu.cn>
To: syzbot+fcf1a817ceb50935ce99@syzkaller.appspotmail.com
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, pasha.tatashin@soleen.com,
	gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
	syzkaller-bugs@googlegroups.com, Ruihan Li <lrh2000@pku.edu.cn>
Subject: usbdev_mmap causes type confusion in page_table_check
Date: Sun,  7 May 2023 21:58:44 +0800	[thread overview]
Message-ID: <20230507135844.1231056-1-lrh2000@pku.edu.cn> (raw)
In-Reply-To: <000000000000258e5e05fae79fc1@google.com>

Hi all,

Recently, syzbot reported [1] ("kernel BUG in page_table_check_clear"). After
some bisection, I found out that when usbdev_mmap calls remap_pfn_range on
kmalloc'ed memory, it causes type confusion between struct folio and slab in
page_table_check. As I will explain below, it seems that both usb-side and
mm-side need some fixes. So I have cc'd linux-usb and linux-mm here, as well
as their maintainers, to see whether there are any comments on the proposed
way to fix.

 [1] https://lore.kernel.org/all/000000000000258e5e05fae79fc1@google.com/

To handle mmap requests for /dev/bus/usb/XXX/YYY, usbdev_mmap first allocates
memory by calling usb_alloc_coherent and then maps it into the user space
using remap_pfn_range:

static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
{
	// ...
	mem = usb_alloc_coherent(ps->dev, size, GFP_USER | __GFP_NOWARN,
			&dma_handle);
	// ...
	if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
		if (remap_pfn_range(vma, vma->vm_start,
				    virt_to_phys(usbm->mem) >> PAGE_SHIFT,
				    size, vma->vm_page_prot) < 0) {
			// ...
		}
	}
	// ...
}

Note that in this case, we consider the DMA-unavailable scenario, which, to be
honest, is rare in practice. However, syzbot emulates USB devices using a
module named dummy_hcd, and since these devices are emulated, DMA is not
available at all.

Meanwhile, usb_alloc_coherent calls hcd_buffer_alloc, which uses kmalloc for
memory allocation:

void *hcd_buffer_alloc(
	struct usb_bus		*bus,
	size_t			size,
	gfp_t			mem_flags,
	dma_addr_t		*dma
)
{
	// ...
	if (!hcd_uses_dma(hcd)) {
		*dma = ~(dma_addr_t) 0;
		return kmalloc(size, mem_flags);
	}
	// ...
}

However, during the update of the page table to map the kmalloc'd page into
the user space, page_table_check_set attempts to determine whether the page is
anonymous using PageAnon:

static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
				 unsigned long pfn, unsigned long pgcnt,
				 bool rw)
{
	// ...
	anon = PageAnon(page);
	for (i = 0; i < pgcnt; i++) {
		// ...
		if (anon) {
			BUG_ON(atomic_read(&ptc->file_map_count));
			BUG_ON(atomic_inc_return(&ptc->anon_map_count) > 1 && rw);
		} else {
			BUG_ON(atomic_read(&ptc->anon_map_count));
			BUG_ON(atomic_inc_return(&ptc->file_map_count) < 0);
		}
		// ...
	}
	// ...
}

This call to PageAnon is invalid for slab pages because slab reuses the bits
in struct page/folio to store its internal states, and the anonymity bit only
exists in struct page/folio. As a result, the counters are incorrectly updated
and checked in page_table_check_set and page_table_check_clear, leading to the
bug being raised.

Intuitively, I don't think it's reasonable to call remap_pfn_range to map
kmalloc'd pages into the user space. In the past, kmalloc'd pages might have
had alignment issues when certain memory debugging options were enabled.
Although this has been fixed in commit 59bb47985c1d ("mm, sl[aou]b: guarantee
natural alignment for kmalloc(power-of-two)"), it has been shown that
performing such mapping is still buggy, as demonstrated by the type confusion
in page_table_check. Therefore, I propose adding a new function,
hcd_buffer_alloc_pages, to guarantee the page requirements (i.e., no
intermediate allocators, such as slab/slub). Below is a diff as a quick
example:

diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c
index fbb087b72..514bdc949 100644
--- a/drivers/usb/core/buffer.c
+++ b/drivers/usb/core/buffer.c
@@ -112,7 +112,7 @@ void hcd_buffer_destroy(struct usb_hcd *hcd)
  * better sharing and to leverage mm/slab.c intelligence.
  */
 
-void *hcd_buffer_alloc(
+void *hcd_buffer_alloc_pages(
 	struct usb_bus		*bus,
 	size_t			size,
 	gfp_t			mem_flags,
@@ -126,12 +126,13 @@ void *hcd_buffer_alloc(
 		return NULL;
 
 	if (hcd->localmem_pool)
-		return gen_pool_dma_alloc(hcd->localmem_pool, size, dma);
+		return gen_pool_dma_alloc_align(hcd->localmem_pool, size,
+						dma, PAGE_SIZE);
 
 	/* some USB hosts just use PIO */
 	if (!hcd_uses_dma(hcd)) {
 		*dma = ~(dma_addr_t) 0;
-		return kmalloc(size, mem_flags);
+		return (void *)__get_free_pages(mem_flags, get_order(size));
 	}
 
 	for (i = 0; i < HCD_BUFFER_POOLS; i++) {

(There appears to be another issue. In cases where hcd->localmem_pool is
non-null, I suspect that remap_pfn_range should not be used. This is a DMA
scenario, and the DMA handle is properly returned, so dma_mmap_coherent should
be used instead. Am I correct?)

This does not actually fix the type confusion bug in page_table_check_*. It
should be noted that by leveraging /dev/mem, users can map arbitrary physical
memory regions into the user space, which is also achieved through
remap_pfn_range. I'm not 100% certain whether a fix is necessary, as one may
argue that using page table checks (a kernel hardening technique, which means
security is important) and /dev/mem (clearly insecure and potentially
exploitable) together is completely unreasonable.

If a fix is deemed necessary, I think taking the flag VM_PFNMAP into
consideration is a reasonable solution, that said, in page_table_check_*,
 * when the VM_PFNMAP flag is set, all operations and checks on
   file_map_count and anon_map_count counters should be bypassed;
 * when the VM_PFNMAP flag is not set, an additionally check to ensure
   folio_test_slab evaluates to false should be performed.

The implementation should be straightforward. However, I noticed that the
page_table_check_* hooks are called in arch/* instead of mm/*, which not only
limits its supported architectures (currently x86_64, arm64, and risc-v) but
also makes it difficult to get the struct vm_area_struct as a parameter, since
the struct vm_area_struct is not passed from mm/* to arch/* when mapping or
unmapping pages. I have looked at d283d422c6c4 ("x86: mm: add x86_64 support
for page table check"), but I don't see a valid reason. Perhaps Pasha can
provide some explanation about this implementation choice?

Thanks,
Ruihan Li



  reply	other threads:[~2023-05-07 13:59 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-05  0:46 [syzbot] [mm?] kernel BUG in page_table_check_clear syzbot
2023-05-07 13:58 ` Ruihan Li [this message]
2023-05-08 21:27   ` usbdev_mmap causes type confusion in page_table_check Pasha Tatashin
2023-05-08 21:36     ` Matthew Wilcox
2023-05-08 21:48       ` Pasha Tatashin
2023-05-08 21:52         ` Matthew Wilcox
2023-05-08 21:55           ` Pasha Tatashin
2023-05-08 22:46             ` David Hildenbrand
2023-05-08 23:17               ` Pasha Tatashin
2023-05-08 23:21                 ` Pasha Tatashin
2023-05-08 23:37                   ` David Hildenbrand
2023-05-09  0:07                     ` Pasha Tatashin
2023-05-08 21:37     ` David Hildenbrand
2023-05-09 13:25   ` Christoph Hellwig
2023-05-09 14:01     ` Greg KH
2023-05-10 13:17       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230507135844.1231056-1-lrh2000@pku.edu.cn \
    --to=lrh2000@pku.edu.cn \
    --cc=akpm@linux-foundation.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=syzbot+fcf1a817ceb50935ce99@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox