Re: HARDENED_USERCOPY will BUG on multiple slub objects coalesced into an sk_buff fragment

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Anton Eidelman <anton@lightbitslabs.com>
To: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	Laura Abbott <labbott@redhat.com>, Linux-MM <linux-mm@kvack.org>,
	linux-hardened@lists.openwall.com
Subject: Re: HARDENED_USERCOPY will BUG on multiple slub objects coalesced into an sk_buff fragment
Date: Fri, 1 Jun 2018 16:34:01 -0700	[thread overview]
Message-ID: <CAKYffwpAAgD+a+0kebid43tpyS6L+8o=4hBbDvhfgaoV_gze1g@mail.gmail.com> (raw)
In-Reply-To: <CAGXu5jLvN5bmakZ3aDu4TRB9+_DYVaCX2LTLtKvsqgYpjMaNsA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5401 bytes --]

Hi all,

I do not have a way of reproducing this decent enough to recommend: I'll
keep digging.

The page belongs to a slub when the fragment is being constructed in
__skb_fill_page_desc(), see the instrumentation I used below.
When usercopy triggers, .coals shows values of 2/3 for 128/192 bytes
respectively.

The question is how the RX sk_buff ends up having data fragment in a
PageSlab page.
Some network drivers use netdev_alloc_frag() so pages indeed come
from page_frag allocator.
Others (mellanox, intel) just alloc_page() when filling their RX
descriptors.
In both cases the pages will be refcounted properly.

I suspect my kernel TCP traffic that uses kernel_sendpage() for bio pages
AND slub pages.

Thanks a lot!
Anton
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a098d95..7cd744c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -40,6 +40,7 @@
 #include <linux/in6.h>
 #include <linux/if_packet.h>
 #include <net/flow.h>
+#include <linux/slub_def.h>

 /* The interface for checksum offload between the stack and networking
drivers
  * is as follows...
@@ -316,7 +317,8 @@ struct skb_frag_struct {
        } page;
 #if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
        __u32 page_offset;
-       __u32 size;
+       __u16 size;
+       __u16 coals;
 #else
        __u16 page_offset;
        __u16 size;
@@ -1850,9 +1852,11 @@ static inline void __skb_fill_page_desc(struct
sk_buff *skb, int i,
         */
        frag->page.p              = page;
        frag->page_offset         = off;
+       frag->coals               = 0;
        skb_frag_size_set(frag, size);

        page = compound_head(page);
+       *WARN_ON(PageSlab(page) && (page->slab_cache->size < size)); //
does NOT trigger*
        if (page_is_pfmemalloc(page))
                skb->pfmemalloc = true;
 }
@@ -2849,10 +2853,14 @@ static inline bool skb_can_coalesce(struct sk_buff
*skb, int i,
                                    const struct page *page, int off)
 {
        if (i) {
-               const struct skb_frag_struct *frag =
&skb_shinfo(skb)->frags[i - 1];
+               struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i -
1];

-               return page == skb_frag_page(frag) &&
+               bool ret = page == skb_frag_page(frag) &&
                       off == frag->page_offset + skb_frag_size(frag);
+               if (unlikely(ret))
*+                       if (PageSlab(compound_head((struct page *)page)))*
*+                               frag->coals++;*
+               return ret;
        }
        return false;
 }


On Fri, Jun 1, 2018 at 2:55 PM, Kees Cook <keescook@chromium.org> wrote:

> On Fri, Jun 1, 2018 at 1:58 PM, Matthew Wilcox <willy@infradead.org>
> wrote:
> > On Fri, Jun 01, 2018 at 01:49:38PM -0700, Kees Cook wrote:
> >> On Fri, Jun 1, 2018 at 12:02 PM, Laura Abbott <labbott@redhat.com>
> wrote:
> >> > (cc-ing some interested people)
> >> >
> >> >
> >> >
> >> > On 05/31/2018 05:03 PM, Anton Eidelman wrote:
> >> >> Here's a rare issue I reproduce on 4.12.10 (centos config): full log
> >> >> sample below.
> >>
> >> Thanks for digging into this! Do you have any specific reproducer for
> >> this? If so, I'd love to try a bisection, as I'm surprised this has
> >> only now surfaced: hardened usercopy was introduced in 4.8 ...
> >>
> >> >> An innocent process (dhcpclient) is about to receive a datagram, but
> >> >> during skb_copy_datagram_iter() usercopy triggers a BUG in:
> >> >> usercopy.c:check_heap_object() -> slub.c:__check_heap_object(),
> because
> >> >> the sk_buff fragment being copied crosses the 64-byte slub object
> boundary.
> >> >>
> >> >> Example __check_heap_object() context:
> >> >>    n=128    << usually 128, sometimes 192.
> >> >>    object_size=64
> >> >>    s->size=64
> >> >>    page_address(page)=0xffff880233f7c000
> >> >>    ptr=0xffff880233f7c540
> >> >>
> >> >> My take on the root cause:
> >> >>    When adding data to an skb, new data is appended to the current
> >> >> fragment if the new chunk immediately follows the last one: by simply
> >> >> increasing the frag->size, skb_frag_size_add().
> >> >>    See include/linux/skbuff.h:skb_can_coalesce() callers.
> >>
> >> Oooh, sneaky:
> >>                 return page == skb_frag_page(frag) &&
> >>                        off == frag->page_offset + skb_frag_size(frag);
> >>
> >> Originally I was thinking that slab red-zoning would get triggered
> >> too, but I see the above is checking to see if these are precisely
> >> neighboring allocations, I think.
> >>
> >> But then ... how does freeing actually work? I'm really not sure how
> >> this seeming layering violation could be safe in other areas?
> >
> > I'm confused ... I thought skb frags came from the page_frag allocator,
> > not the slab allocator.  But then why would the slab hardening trigger?
>
> Well that would certainly make more sense (well, the sense about
> alloc/free). Having it overlap with a slab allocation, though, that's
> quite bad. Perhaps this is a very odd use-after-free case? I.e. freed
> page got allocated to slab, and when it got copied out, usercopy found
> it spanned a slub object?
>
> [ 655.602500] usercopy: kernel memory exposure attempt detected from
> ffff88022a31aa00 (kmalloc-64) (192 bytes)
>
> This wouldn't be the first time usercopy triggered due to a memory
> corruption...
>
> -Kees
>
> --
> Kees Cook
> Pixel Security
>

[-- Attachment #2: Type: text/html, Size: 11295 bytes --]

next prev parent reply	other threads:[~2018-06-01 23:34 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-01  0:03 Anton Eidelman
2018-06-01 19:02 ` Laura Abbott
2018-06-01 20:49   ` Kees Cook
2018-06-01 20:58     ` Matthew Wilcox
2018-06-01 21:55       ` Kees Cook
2018-06-01 23:34         ` Anton Eidelman [this message]
2018-06-05 15:27           ` Christopher Lameter
2018-06-05 20:45             ` Anton Eidelman
2018-06-05 21:43               ` Christopher Lameter
2018-06-05 14:51       ` Christopher Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKYffwpAAgD+a+0kebid43tpyS6L+8o=4hBbDvhfgaoV_gze1g@mail.gmail.com' \
    --to=anton@lightbitslabs.com \
    --cc=keescook@chromium.org \
    --cc=labbott@redhat.com \
    --cc=linux-hardened@lists.openwall.com \
    --cc=linux-mm@kvack.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox