[DESIGN] Hardening page allocator against type confusion


* [DESIGN] Hardening page allocator against type confusion
@ 2024-09-25 19:46 Matthew Wilcox
  2024-10-03 14:27 ` Vlastimil Babka
  0 siblings, 1 reply; 5+ messages in thread
From: Matthew Wilcox @ 2024-09-25 19:46 UTC (permalink / raw)
  To: linux-mm; +Cc: Kees Cook, Jann Horn

Kees and I had a fun discussion at Plumbers.

We're trying to harden against type confusion, where we think we have
a pointer to one thing, but it turns out to be a pointer to a different
thing.  There are various ways this can be harmful, which Kees laid out
when adding slab buckets; e.g., see https://lwn.net/Articles/978976/

Not all allocations come from slab, though.  If we free a slab object
and the slab it was in gets freed back to the page allocator, that page
can turn into almost anything else _quickly_.  The page allocator fronts
the buddy allocator with a stack of recently-freed pages (called PCP,
not to be confused with percpu memory), so if the attacker can arrange
for a page table allocation to come in soon after a slab free, the page
table is very likely to land in memory they still have access to.
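
To make the reuse pattern concrete, here is a minimal userspace sketch of
a single shared LIFO of freed pages.  It is a toy model, not the kernel's
PCP code: every name in it (shared_pcp, toy_page, SHARED_PCP_CAP, ...) is
made up for illustration, and real PCP lists are batched and keyed by
order and migratetype.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not kernel API. */
#define SHARED_PCP_CAP 12

enum toy_type { TOY_FREE, TOY_SLAB, TOY_PGTABLE };

struct toy_page {
    enum toy_type type;
};

/* One stack of recently freed pages, shared by every page type. */
struct shared_pcp {
    struct toy_page *stack[SHARED_PCP_CAP];
    size_t count;
};

/* Push a freed page.  Returns 0 on success, -1 if the stack is full
 * (a real allocator would spill to the buddy allocator instead). */
int shared_pcp_free(struct shared_pcp *pcp, struct toy_page *page)
{
    assert(pcp != NULL && page != NULL);
    if (pcp->count == SHARED_PCP_CAP)
        return -1;
    page->type = TOY_FREE;
    pcp->stack[pcp->count++] = page;
    return 0;
}

/* Pop the most recently freed page and give it a new type.
 * Returns NULL if the stack is empty. */
struct toy_page *shared_pcp_alloc(struct shared_pcp *pcp, enum toy_type type)
{
    struct toy_page *page;

    assert(pcp != NULL);
    if (pcp->count == 0)
        return NULL;
    page = pcp->stack[--pcp->count];
    page->type = type;
    return page;
}
```

In this model, a page freed as slab and followed by a TOY_PGTABLE
allocation on the same stack hands back that very page, which is the
quick reuse described above.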

My proposal is that we resolve this "type confusion" by having separate
PCP lists for different types of pages.  We'll need to have this for
memdescs anyway, so this is just shifting some of the work left.

We'd reduce the exploitability of type confusion by using a per-CPU,
per-type stack of recently used pages.  To turn a slab page into a page
table page, the attacker would have to cause a dozen slabs to be freed on
this CPU, pushing this one into the buddy allocator.  Then they'd have
to cause the allocating task to empty its stack of page table pages,
causing the attackable slab to be pulled from the buddy.  It's still
possible, but it's harder.
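
A rough userspace sketch of the per-type idea follows, assuming a stack
depth of 12 (the "dozen" above) and a single shared pool standing in for
the buddy allocator.  All names (typed_pcp, typed_free, typed_alloc, ...)
are hypothetical; this is not a proposed API, and the real PCP code
frees to the buddy allocator in batches rather than one page at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: all names here are hypothetical, not kernel API. */
#define NR_TYPES  2    /* slab and page table, for illustration */
#define PCP_DEPTH 12   /* assumed depth; real sizes are tunable */
#define BUDDY_CAP 64

enum ptype { TYPE_SLAB, TYPE_PGTABLE };

struct typed_pcp {
    int pages[NR_TYPES][PCP_DEPTH]; /* per-type LIFO of page ids */
    size_t count[NR_TYPES];
    int buddy[BUDDY_CAP];           /* shared pool standing in for buddy */
    size_t buddy_count;
};

/* Free a page onto the stack for its type.  If that stack is full,
 * the oldest entry is evicted to the shared pool first. */
void typed_free(struct typed_pcp *p, enum ptype t, int page)
{
    if (p->count[t] == PCP_DEPTH) {
        assert(p->buddy_count < BUDDY_CAP);
        p->buddy[p->buddy_count++] = p->pages[t][0];
        memmove(&p->pages[t][0], &p->pages[t][1],
                (PCP_DEPTH - 1) * sizeof(p->pages[t][0]));
        p->count[t]--;
    }
    p->pages[t][p->count[t]++] = page;
}

/* Allocate from the stack for this type only; fall back to the shared
 * pool when that stack is empty.  Returns -1 if nothing is available. */
int typed_alloc(struct typed_pcp *p, enum ptype t)
{
    if (p->count[t] > 0)
        return p->pages[t][--p->count[t]];
    if (p->buddy_count > 0)
        return p->buddy[--p->buddy_count];
    return -1;
}
```

In this model a freed slab page only becomes a page table page after
PCP_DEPTH further slab frees have pushed it into the shared pool and the
page table stack has been drained, which mirrors the two steps above.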

Harder enough?  I don't know, hence this email.  We can get into the
API design (and then the implementation design) if we have agreement
that this is the right approach to be taking.


* Re: [DESIGN] Hardening page allocator against type confusion
  2024-09-25 19:46 [DESIGN] Hardening page allocator against type confusion Matthew Wilcox
@ 2024-10-03 14:27 ` Vlastimil Babka
  2024-10-03 16:50   ` Matthew Wilcox
  0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2024-10-03 14:27 UTC (permalink / raw)
  To: Matthew Wilcox, linux-mm; +Cc: Kees Cook, Jann Horn

On 9/25/24 21:46, Matthew Wilcox wrote:
> Kees and I had a fun discussion at Plumbers.
> 
> We're trying to harden against type confusion, where we think we have
> a pointer to one thing, but it turns out to be a pointer to a different
> thing.  There's various ways this can be harmful, which Kees has laid out
> before when adding slab buckets.  eg see https://lwn.net/Articles/978976/
> 
> Not all allocations come from slab though.  If we free a slab object
> and the slab it was in gets freed back to the page allocator, it can
> turn into almost anything else _quickly_ as the page allocator fronts
> the buddy allocator with a stack of recently-freed pages (called PCP,
> not to be confused with percpu memory), so if the attacker can arrange
> for a page table allocation to come in soon after a slab free, it is
> very likely to be the memory they have access to.
> 
> My proposal is that we resolve this "type confusion" by having separate
> PCP lists for different types of pages.  We'll need to have this for
> memdescs anyway, so this is just shifting some of the work left.
> 
> We'd reduce the exploitability of type confusion by using a per-CPU,
> per-type stack of recently used pages.  To turn a slab page into a page
> table page, the attacker would have to cause a dozen slabs to be freed on
> this CPU, pushing this one into the buddy allocator.  Then they'd have
> to cause the allocating task to empty its stack of page table pages,
> causing the attackable slab to be pulled from the buddy.  It's still
> possible, but it's harder.
> 
> Harder enough?  I don't know, hence this email.  We can get into the
> API design (and then the implementation design) if we have agreement
> that this is the right approach to be taking.

Not a security expert but I doubt it's harder enough?

I thought the robust mitigation here was SLAB_VIRTUAL



* Re: [DESIGN] Hardening page allocator against type confusion
  2024-10-03 14:27 ` Vlastimil Babka
@ 2024-10-03 16:50   ` Matthew Wilcox
  2024-10-04 14:04     ` Vlastimil Babka
  2024-10-04 18:01     ` Kees Cook
  0 siblings, 2 replies; 5+ messages in thread
From: Matthew Wilcox @ 2024-10-03 16:50 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: linux-mm, Kees Cook, Jann Horn

On Thu, Oct 03, 2024 at 04:27:12PM +0200, Vlastimil Babka wrote:
> On 9/25/24 21:46, Matthew Wilcox wrote:
> > Kees and I had a fun discussion at Plumbers.
> > 
> > We're trying to harden against type confusion, where we think we have
> > a pointer to one thing, but it turns out to be a pointer to a different
> > thing.  There's various ways this can be harmful, which Kees has laid out
> > before when adding slab buckets.  eg see https://lwn.net/Articles/978976/
> > 
> > Not all allocations come from slab though.  If we free a slab object
> > and the slab it was in gets freed back to the page allocator, it can
> > turn into almost anything else _quickly_ as the page allocator fronts
> > the buddy allocator with a stack of recently-freed pages (called PCP,
> > not to be confused with percpu memory), so if the attacker can arrange
> > for a page table allocation to come in soon after a slab free, it is
> > very likely to be the memory they have access to.
> > 
> > My proposal is that we resolve this "type confusion" by having separate
> > PCP lists for different types of pages.  We'll need to have this for
> > memdescs anyway, so this is just shifting some of the work left.
> > 
> > We'd reduce the exploitability of type confusion by using a per-CPU,
> > per-type stack of recently used pages.  To turn a slab page into a page
> > table page, the attacker would have to cause a dozen slabs to be freed on
> > this CPU, pushing this one into the buddy allocator.  Then they'd have
> > to cause the allocating task to empty its stack of page table pages,
> > causing the attackable slab to be pulled from the buddy.  It's still
> > possible, but it's harder.
> > 
> > Harder enough?  I don't know, hence this email.  We can get into the
> > API design (and then the implementation design) if we have agreement
> > that this is the right approach to be taking.
> 
> Not a security expert but I doubt it's harder enough?
> 
> I thought the robust mitigation here was SLAB_VIRTUAL

Well, this is for allocations that _don't_ come from slab.  Like page
tables and page cache or anonymous memory.



* Re: [DESIGN] Hardening page allocator against type confusion
  2024-10-03 16:50   ` Matthew Wilcox
@ 2024-10-04 14:04     ` Vlastimil Babka
  2024-10-04 18:01     ` Kees Cook
  1 sibling, 0 replies; 5+ messages in thread
From: Vlastimil Babka @ 2024-10-04 14:04 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, Kees Cook, Jann Horn

On 10/3/24 18:50, Matthew Wilcox wrote:
> On Thu, Oct 03, 2024 at 04:27:12PM +0200, Vlastimil Babka wrote:
>> On 9/25/24 21:46, Matthew Wilcox wrote:
>> > Kees and I had a fun discussion at Plumbers.
>> > 
>> > We're trying to harden against type confusion, where we think we have
>> > a pointer to one thing, but it turns out to be a pointer to a different
>> > thing.  There's various ways this can be harmful, which Kees has laid out
>> > before when adding slab buckets.  eg see https://lwn.net/Articles/978976/
>> > 
>> > Not all allocations come from slab though.  If we free a slab object
>> > and the slab it was in gets freed back to the page allocator, it can
>> > turn into almost anything else _quickly_ as the page allocator fronts
>> > the buddy allocator with a stack of recently-freed pages (called PCP,
>> > not to be confused with percpu memory), so if the attacker can arrange
>> > for a page table allocation to come in soon after a slab free, it is
>> > very likely to be the memory they have access to.

The paragraph above seems very much about allocations that were originally
from slab. AFAIU that's the biggest concern for type confusion.

>> > My proposal is that we resolve this "type confusion" by having separate
>> > PCP lists for different types of pages.  We'll need to have this for
>> > memdescs anyway, so this is just shifting some of the work left.
>> > 
>> > We'd reduce the exploitability of type confusion by using a per-CPU,
>> > per-type stack of recently used pages.  To turn a slab page into a page
>> > table page, the attacker would have to cause a dozen slabs to be freed on
>> > this CPU, pushing this one into the buddy allocator.  Then they'd have
>> > to cause the allocating task to empty its stack of page table pages,
>> > causing the attackable slab to be pulled from the buddy.  It's still
>> > possible, but it's harder.
>> > 
>> > Harder enough?  I don't know, hence this email.  We can get into the
>> > API design (and then the implementation design) if we have agreement
>> > that this is the right approach to be taking.
>> 
>> Not a security expert but I doubt it's harder enough?
>> 
>> I thought the robust mitigation here was SLAB_VIRTUAL
> 
> Well, this is for allocations that _don't_ come from slab.  Like page
> tables and page cache or anonymous memory.

See above. I'm not sure how relevant the other kinds of page type
confusion are in practice. Maybe large kmalloc() (not technically slab
allocations) or arbitrary kernel alloc_pages() allocations reused as e.g.
page tables?



* Re: [DESIGN] Hardening page allocator against type confusion
  2024-10-03 16:50   ` Matthew Wilcox
  2024-10-04 14:04     ` Vlastimil Babka
@ 2024-10-04 18:01     ` Kees Cook
  1 sibling, 0 replies; 5+ messages in thread
From: Kees Cook @ 2024-10-04 18:01 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Vlastimil Babka, linux-mm, Jann Horn

On Thu, Oct 03, 2024 at 05:50:39PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 03, 2024 at 04:27:12PM +0200, Vlastimil Babka wrote:
> > On 9/25/24 21:46, Matthew Wilcox wrote:
> > > Kees and I had a fun discussion at Plumbers.
> > > 
> > > We're trying to harden against type confusion, where we think we have
> > > a pointer to one thing, but it turns out to be a pointer to a different
> > > thing.  There's various ways this can be harmful, which Kees has laid out
> > > before when adding slab buckets.  eg see https://lwn.net/Articles/978976/
> > > 
> > > Not all allocations come from slab though.  If we free a slab object
> > > and the slab it was in gets freed back to the page allocator, it can
> > > turn into almost anything else _quickly_ as the page allocator fronts
> > > the buddy allocator with a stack of recently-freed pages (called PCP,
> > > not to be confused with percpu memory), so if the attacker can arrange
> > > for a page table allocation to come in soon after a slab free, it is
> > > very likely to be the memory they have access to.
> > > 
> > > My proposal is that we resolve this "type confusion" by having separate
> > > PCP lists for different types of pages.  We'll need to have this for
> > > memdescs anyway, so this is just shifting some of the work left.
> > > 
> > > We'd reduce the exploitability of type confusion by using a per-CPU,
> > > per-type stack of recently used pages.  To turn a slab page into a page
> > > table page, the attacker would have to cause a dozen slabs to be freed on
> > > this CPU, pushing this one into the buddy allocator.  Then they'd have
> > > to cause the allocating task to empty its stack of page table pages,
> > > causing the attackable slab to be pulled from the buddy.  It's still
> > > possible, but it's harder.
> > > 
> > > Harder enough?  I don't know, hence this email.  We can get into the
> > > API design (and then the implementation design) if we have agreement
> > > that this is the right approach to be taking.
> > 
> > Not a security expert but I doubt it's harder enough?
> > 
> > I thought the robust mitigation here was SLAB_VIRTUAL
> 
> Well, this is for allocations that _don't_ come from slab.  Like page
> > tables and page cache or anonymous memory.

I'd really like to hear Jann's thoughts on this.

My instinct is that if it makes it harder to attack but provides some
better performance or reliability characteristics, it's very worth it.
:)

-- 
Kees Cook


