linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v5 4/4] x86: re-enable rng seeding via SetupData
       [not found]       ` <Y6Z+WpqN59ZjIKkk@zx2c4.com>
@ 2022-12-26 14:24         ` Jason A. Donenfeld
  2022-12-26 14:43           ` Jason A. Donenfeld
  0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2022-12-26 14:24 UTC (permalink / raw)
  To: Eric Biggers, x86, linux-mm
  Cc: pbonzini, qemu-devel, Laurent Vivier, Michael S . Tsirkin,
	Peter Maydell, Philippe Mathieu-Daudé,
	Richard Henderson, Ard Biesheuvel, Gerd Hoffmann, kasan-dev,
	Dmitry Vyukov

Hi,

I'm currently stumped at the moment, so adding linux-mm@ and x86@. Still
working on it though. Details of where I'm at are below the quote below.

On Sat, Dec 24, 2022 at 05:21:46AM +0100, Jason A. Donenfeld wrote:
> On Sat, Dec 24, 2022 at 04:09:08AM +0100, Jason A. Donenfeld wrote:
> > Hi Eric,
> > 
> > Replying to you from my telephone, and I'm traveling the next two days,
> > but I thought I should mention some preliminary results right away from
> > doing some termux compiles:
> > 
> > On Fri, Dec 23, 2022 at 04:14:00PM -0800, Eric Biggers wrote:
> > > Hi Jason,
> > > 
> > > On Wed, Sep 21, 2022 at 11:31:34AM +0200, Jason A. Donenfeld wrote:
> > > > This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but
> > > > for 7.2 rather than 7.1, now that modifying setup_data is safe to do.
> > > > 
> > > > Cc: Laurent Vivier <laurent@vivier.eu>
> > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > Cc: Peter Maydell <peter.maydell@linaro.org>
> > > > Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > > Cc: Richard Henderson <richard.henderson@linaro.org>
> > > > Cc: Ard Biesheuvel <ardb@kernel.org>
> > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > > ---
> > > >  hw/i386/microvm.c | 2 +-
> > > >  hw/i386/pc_piix.c | 3 ++-
> > > >  hw/i386/pc_q35.c  | 3 ++-
> > > >  3 files changed, 5 insertions(+), 3 deletions(-)
> > > > 
> > > 
> > > After upgrading to QEMU 7.2, Linux 6.1 no longer boots with some configs.  There
> > > is no output at all.  I bisected it to this commit, and I verified that the
> > > following change to QEMU's master branch makes the problem go away:
> > > 
> > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > index b48047f50c..42f5b07d2f 100644
> > > --- a/hw/i386/pc_piix.c
> > > +++ b/hw/i386/pc_piix.c
> > > @@ -441,6 +441,7 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > >      pc_i440fx_machine_options(m);
> > >      m->alias = "pc";
> > >      m->is_default = true;
> > > +    PC_MACHINE_CLASS(m)->legacy_no_rng_seed = true;
> > >  }
> > > 
> > > I've attached the kernel config I am seeing the problem on.
> > > 
> > > For some reason, the problem also goes away if I disable CONFIG_KASAN.
> > > 
> > > Any idea what is causing this?
> > 
> > - Commenting out the call to parse_setup_data() doesn't fix the issue.
> >   So there's no KASAN issue with the actual parser.
> > 
> > - Using KASAN_OUTLINE rather than INLINE does fix the issue!
> > 
> > That makes me suspect that it's file size related, and QEMU or the BIOS
> > is placing setup data at an overlapping offset by accident, or something
> > similar.
> 
> I removed the file systems from your config to bring the kernel size
> back down, and voila, it works, even with KASAN_INLINE. So perhaps I'm
> on the right track here...

QEMU sticks setup_data after the kernel image, the same as kexec-tools
and everything else. Apparently, when the kernel image is large, the
call to early_memremap(boot_params.hdr.setup_data, ...) returns a value
that points some place bogus, and the system crashes or does something
weird. I haven't yet determined what this limit is, but in my current
test kernel, a value of 0x0000000001327650 is enough to make it point to
rubbish.

Is this expected? What's going on here?

Jason


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 4/4] x86: re-enable rng seeding via SetupData
  2022-12-26 14:24         ` [PATCH v5 4/4] x86: re-enable rng seeding via SetupData Jason A. Donenfeld
@ 2022-12-26 14:43           ` Jason A. Donenfeld
  2022-12-26 16:57             ` Jason A. Donenfeld
  0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2022-12-26 14:43 UTC (permalink / raw)
  To: Eric Biggers, x86, linux-mm
  Cc: pbonzini, qemu-devel, Laurent Vivier, Michael S . Tsirkin,
	Peter Maydell, Philippe Mathieu-Daudé,
	Richard Henderson, Ard Biesheuvel, Gerd Hoffmann, kasan-dev,
	Dmitry Vyukov

On Mon, Dec 26, 2022 at 03:24:07PM +0100, Jason A. Donenfeld wrote:
> Hi,
> 
> I'm currently stumped at the moment, so adding linux-mm@ and x86@. Still
> working on it though. Details of where I'm at are below the quote below.
> 
> On Sat, Dec 24, 2022 at 05:21:46AM +0100, Jason A. Donenfeld wrote:
> > On Sat, Dec 24, 2022 at 04:09:08AM +0100, Jason A. Donenfeld wrote:
> > > Hi Eric,
> > > 
> > > Replying to you from my telephone, and I'm traveling the next two days,
> > > but I thought I should mention some preliminary results right away from
> > > doing some termux compiles:
> > > 
> > > On Fri, Dec 23, 2022 at 04:14:00PM -0800, Eric Biggers wrote:
> > > > Hi Jason,
> > > > 
> > > > On Wed, Sep 21, 2022 at 11:31:34AM +0200, Jason A. Donenfeld wrote:
> > > > > This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but
> > > > > for 7.2 rather than 7.1, now that modifying setup_data is safe to do.
> > > > > 
> > > > > Cc: Laurent Vivier <laurent@vivier.eu>
> > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > Cc: Peter Maydell <peter.maydell@linaro.org>
> > > > > Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > > > Cc: Richard Henderson <richard.henderson@linaro.org>
> > > > > Cc: Ard Biesheuvel <ardb@kernel.org>
> > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > > > ---
> > > > >  hw/i386/microvm.c | 2 +-
> > > > >  hw/i386/pc_piix.c | 3 ++-
> > > > >  hw/i386/pc_q35.c  | 3 ++-
> > > > >  3 files changed, 5 insertions(+), 3 deletions(-)
> > > > > 
> > > > 
> > > > After upgrading to QEMU 7.2, Linux 6.1 no longer boots with some configs.  There
> > > > is no output at all.  I bisected it to this commit, and I verified that the
> > > > following change to QEMU's master branch makes the problem go away:
> > > > 
> > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > index b48047f50c..42f5b07d2f 100644
> > > > --- a/hw/i386/pc_piix.c
> > > > +++ b/hw/i386/pc_piix.c
> > > > @@ -441,6 +441,7 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > > >      pc_i440fx_machine_options(m);
> > > >      m->alias = "pc";
> > > >      m->is_default = true;
> > > > +    PC_MACHINE_CLASS(m)->legacy_no_rng_seed = true;
> > > >  }
> > > > 
> > > > I've attached the kernel config I am seeing the problem on.
> > > > 
> > > > For some reason, the problem also goes away if I disable CONFIG_KASAN.
> > > > 
> > > > Any idea what is causing this?
> > > 
> > > - Commenting out the call to parse_setup_data() doesn't fix the issue.
> > >   So there's no KASAN issue with the actual parser.
> > > 
> > > - Using KASAN_OUTLINE rather than INLINE does fix the issue!
> > > 
> > > That makes me suspect that it's file size related, and QEMU or the BIOS
> > > is placing setup data at an overlapping offset by accident, or something
> > > similar.
> > 
> > I removed the file systems from your config to bring the kernel size
> > back down, and voila, it works, even with KASAN_INLINE. So perhaps I'm
> > on the right track here...
> 
> QEMU sticks setup_data after the kernel image, the same as kexec-tools
> and everything else. Apparently, when the kernel image is large, the
> call to early_memremap(boot_params.hdr.setup_data, ...) returns a value
> that points some place bogus, and the system crashes or does something
> weird. I haven't yet determined what this limit is, but in my current
> test kernel, a value of 0x0000000001327650 is enough to make it point to
> rubbish.
> 
> Is this expected? What's going on here?

Attaching gdb to QEMU and switching it to physical memory mode
(`maintenance packet Qqemu.PhyMemMode:1 `) indicates that it
early_memremap is actually working fine and something *else* is at this
address? That's kinda weird... Is KASAN populating physical addresses
immediately after the kernel image extremely early in boot? I'm seeing
the crash happen from early_reserve_memory()->
memblock_x86_reserve_range_setup_data(), which should be before
kasan_init() even runs. Is QEMU calculating kernel_size wrong, when it
goes to determine where to put the setup_data data? But that's the same
calculation as used everywhere else, so hmm...

Jason


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 4/4] x86: re-enable rng seeding via SetupData
  2022-12-26 14:43           ` Jason A. Donenfeld
@ 2022-12-26 16:57             ` Jason A. Donenfeld
  2022-12-27 13:36               ` Jason A. Donenfeld
  0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2022-12-26 16:57 UTC (permalink / raw)
  To: Eric Biggers, x86, linux-mm
  Cc: pbonzini, qemu-devel, Laurent Vivier, Michael S . Tsirkin,
	Peter Maydell, Philippe Mathieu-Daudé,
	Richard Henderson, Ard Biesheuvel, Gerd Hoffmann, kasan-dev,
	Dmitry Vyukov

On Mon, Dec 26, 2022 at 03:43:04PM +0100, Jason A. Donenfeld wrote:
> On Mon, Dec 26, 2022 at 03:24:07PM +0100, Jason A. Donenfeld wrote:
> > Hi,
> > 
> > I'm currently stumped at the moment, so adding linux-mm@ and x86@. Still
> > working on it though. Details of where I'm at are below the quote below.
> > 
> > On Sat, Dec 24, 2022 at 05:21:46AM +0100, Jason A. Donenfeld wrote:
> > > On Sat, Dec 24, 2022 at 04:09:08AM +0100, Jason A. Donenfeld wrote:
> > > > Hi Eric,
> > > > 
> > > > Replying to you from my telephone, and I'm traveling the next two days,
> > > > but I thought I should mention some preliminary results right away from
> > > > doing some termux compiles:
> > > > 
> > > > On Fri, Dec 23, 2022 at 04:14:00PM -0800, Eric Biggers wrote:
> > > > > Hi Jason,
> > > > > 
> > > > > On Wed, Sep 21, 2022 at 11:31:34AM +0200, Jason A. Donenfeld wrote:
> > > > > > This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but
> > > > > > for 7.2 rather than 7.1, now that modifying setup_data is safe to do.
> > > > > > 
> > > > > > Cc: Laurent Vivier <laurent@vivier.eu>
> > > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > Cc: Peter Maydell <peter.maydell@linaro.org>
> > > > > > Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > > > > Cc: Richard Henderson <richard.henderson@linaro.org>
> > > > > > Cc: Ard Biesheuvel <ardb@kernel.org>
> > > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > > > > ---
> > > > > >  hw/i386/microvm.c | 2 +-
> > > > > >  hw/i386/pc_piix.c | 3 ++-
> > > > > >  hw/i386/pc_q35.c  | 3 ++-
> > > > > >  3 files changed, 5 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > 
> > > > > After upgrading to QEMU 7.2, Linux 6.1 no longer boots with some configs.  There
> > > > > is no output at all.  I bisected it to this commit, and I verified that the
> > > > > following change to QEMU's master branch makes the problem go away:
> > > > > 
> > > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > > index b48047f50c..42f5b07d2f 100644
> > > > > --- a/hw/i386/pc_piix.c
> > > > > +++ b/hw/i386/pc_piix.c
> > > > > @@ -441,6 +441,7 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > > > >      pc_i440fx_machine_options(m);
> > > > >      m->alias = "pc";
> > > > >      m->is_default = true;
> > > > > +    PC_MACHINE_CLASS(m)->legacy_no_rng_seed = true;
> > > > >  }
> > > > > 
> > > > > I've attached the kernel config I am seeing the problem on.
> > > > > 
> > > > > For some reason, the problem also goes away if I disable CONFIG_KASAN.
> > > > > 
> > > > > Any idea what is causing this?
> > > > 
> > > > - Commenting out the call to parse_setup_data() doesn't fix the issue.
> > > >   So there's no KASAN issue with the actual parser.
> > > > 
> > > > - Using KASAN_OUTLINE rather than INLINE does fix the issue!
> > > > 
> > > > That makes me suspect that it's file size related, and QEMU or the BIOS
> > > > is placing setup data at an overlapping offset by accident, or something
> > > > similar.
> > > 
> > > I removed the file systems from your config to bring the kernel size
> > > back down, and voila, it works, even with KASAN_INLINE. So perhaps I'm
> > > on the right track here...
> > 
> > QEMU sticks setup_data after the kernel image, the same as kexec-tools
> > and everything else. Apparently, when the kernel image is large, the
> > call to early_memremap(boot_params.hdr.setup_data, ...) returns a value
> > that points some place bogus, and the system crashes or does something
> > weird. I haven't yet determined what this limit is, but in my current
> > test kernel, a value of 0x0000000001327650 is enough to make it point to
> > rubbish.
> > 
> > Is this expected? What's going on here?
> 
> Attaching gdb to QEMU and switching it to physical memory mode
> (`maintenance packet Qqemu.PhyMemMode:1 `) indicates that it
> early_memremap is actually working fine and something *else* is at this
> address? That's kinda weird... Is KASAN populating physical addresses
> immediately after the kernel image extremely early in boot? I'm seeing
> the crash happen from early_reserve_memory()->
> memblock_x86_reserve_range_setup_data(), which should be before
> kasan_init() even runs. Is QEMU calculating kernel_size wrong, when it
> goes to determine where to put the setup_data data? But that's the same
> calculation as used everywhere else, so hmm...
> 
> Jason

If bzImage is 15770544 bytes, it does not boot. If bzImage is 15641776
bytes, it does boot. So something is happening somewhat close to the
16MB mark?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 4/4] x86: re-enable rng seeding via SetupData
  2022-12-26 16:57             ` Jason A. Donenfeld
@ 2022-12-27 13:36               ` Jason A. Donenfeld
  2022-12-28  3:07                 ` Jason A. Donenfeld
  0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2022-12-27 13:36 UTC (permalink / raw)
  To: Eric Biggers, x86, linux-mm
  Cc: pbonzini, qemu-devel, Laurent Vivier, Michael S . Tsirkin,
	Peter Maydell, Philippe Mathieu-Daudé,
	Richard Henderson, Ard Biesheuvel, Gerd Hoffmann, kasan-dev,
	Dmitry Vyukov

On Mon, Dec 26, 2022 at 05:57:30PM +0100, Jason A. Donenfeld wrote:
> On Mon, Dec 26, 2022 at 03:43:04PM +0100, Jason A. Donenfeld wrote:
> > On Mon, Dec 26, 2022 at 03:24:07PM +0100, Jason A. Donenfeld wrote:
> > > Hi,
> > > 
> > > I'm currently stumped at the moment, so adding linux-mm@ and x86@. Still
> > > working on it though. Details of where I'm at are below the quote below.
> > > 
> > > On Sat, Dec 24, 2022 at 05:21:46AM +0100, Jason A. Donenfeld wrote:
> > > > On Sat, Dec 24, 2022 at 04:09:08AM +0100, Jason A. Donenfeld wrote:
> > > > > Hi Eric,
> > > > > 
> > > > > Replying to you from my telephone, and I'm traveling the next two days,
> > > > > but I thought I should mention some preliminary results right away from
> > > > > doing some termux compiles:
> > > > > 
> > > > > On Fri, Dec 23, 2022 at 04:14:00PM -0800, Eric Biggers wrote:
> > > > > > Hi Jason,
> > > > > > 
> > > > > > On Wed, Sep 21, 2022 at 11:31:34AM +0200, Jason A. Donenfeld wrote:
> > > > > > > This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but
> > > > > > > for 7.2 rather than 7.1, now that modifying setup_data is safe to do.
> > > > > > > 
> > > > > > > Cc: Laurent Vivier <laurent@vivier.eu>
> > > > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > > Cc: Peter Maydell <peter.maydell@linaro.org>
> > > > > > > Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > > > > > Cc: Richard Henderson <richard.henderson@linaro.org>
> > > > > > > Cc: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > > > > > ---
> > > > > > >  hw/i386/microvm.c | 2 +-
> > > > > > >  hw/i386/pc_piix.c | 3 ++-
> > > > > > >  hw/i386/pc_q35.c  | 3 ++-
> > > > > > >  3 files changed, 5 insertions(+), 3 deletions(-)
> > > > > > > 
> > > > > > 
> > > > > > After upgrading to QEMU 7.2, Linux 6.1 no longer boots with some configs.  There
> > > > > > is no output at all.  I bisected it to this commit, and I verified that the
> > > > > > following change to QEMU's master branch makes the problem go away:
> > > > > > 
> > > > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > > > index b48047f50c..42f5b07d2f 100644
> > > > > > --- a/hw/i386/pc_piix.c
> > > > > > +++ b/hw/i386/pc_piix.c
> > > > > > @@ -441,6 +441,7 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > > > > >      pc_i440fx_machine_options(m);
> > > > > >      m->alias = "pc";
> > > > > >      m->is_default = true;
> > > > > > +    PC_MACHINE_CLASS(m)->legacy_no_rng_seed = true;
> > > > > >  }
> > > > > > 
> > > > > > I've attached the kernel config I am seeing the problem on.
> > > > > > 
> > > > > > For some reason, the problem also goes away if I disable CONFIG_KASAN.
> > > > > > 
> > > > > > Any idea what is causing this?
> > > > > 
> > > > > - Commenting out the call to parse_setup_data() doesn't fix the issue.
> > > > >   So there's no KASAN issue with the actual parser.
> > > > > 
> > > > > - Using KASAN_OUTLINE rather than INLINE does fix the issue!
> > > > > 
> > > > > That makes me suspect that it's file size related, and QEMU or the BIOS
> > > > > is placing setup data at an overlapping offset by accident, or something
> > > > > similar.
> > > > 
> > > > I removed the file systems from your config to bring the kernel size
> > > > back down, and voila, it works, even with KASAN_INLINE. So perhaps I'm
> > > > on the right track here...
> > > 
> > > QEMU sticks setup_data after the kernel image, the same as kexec-tools
> > > and everything else. Apparently, when the kernel image is large, the
> > > call to early_memremap(boot_params.hdr.setup_data, ...) returns a value
> > > that points some place bogus, and the system crashes or does something
> > > weird. I haven't yet determined what this limit is, but in my current
> > > test kernel, a value of 0x0000000001327650 is enough to make it point to
> > > rubbish.
> > > 
> > > Is this expected? What's going on here?
> > 
> > Attaching gdb to QEMU and switching it to physical memory mode
> > (`maintenance packet Qqemu.PhyMemMode:1 `) indicates that it
> > early_memremap is actually working fine and something *else* is at this
> > address? That's kinda weird... Is KASAN populating physical addresses
> > immediately after the kernel image extremely early in boot? I'm seeing
> > the crash happen from early_reserve_memory()->
> > memblock_x86_reserve_range_setup_data(), which should be before
> > kasan_init() even runs. Is QEMU calculating kernel_size wrong, when it
> > goes to determine where to put the setup_data data? But that's the same
> > calculation as used everywhere else, so hmm...
> > 
> > Jason
> 
> If bzImage is 15770544 bytes, it does not boot. If bzImage is 15641776
> bytes, it does boot. So something is happening somewhat close to the
> 16MB mark?
> 

Okay, the issue is that it's being decompressed to an area that overlaps
the source. So for example in my test kernel:

input_addr: 0x3f112bf
output_addr: 0x1000000
output_len: 0x3a5d7d8

Since 0x3a5d7d8 + 0x1000000 > 0x3f112bf, eventually this corrupts the
setup_data at the end there.

Now digging into what can be done about it.

Jason


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 4/4] x86: re-enable rng seeding via SetupData
  2022-12-27 13:36               ` Jason A. Donenfeld
@ 2022-12-28  3:07                 ` Jason A. Donenfeld
  2022-12-28 14:39                   ` Jason A. Donenfeld
  0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2022-12-28  3:07 UTC (permalink / raw)
  To: Eric Biggers, x86, linux-mm
  Cc: pbonzini, qemu-devel, Laurent Vivier, Michael S . Tsirkin,
	Peter Maydell, Philippe Mathieu-Daudé,
	Richard Henderson, Ard Biesheuvel, Gerd Hoffmann, kasan-dev,
	Dmitry Vyukov

On Tue, Dec 27, 2022 at 02:36:54PM +0100, Jason A. Donenfeld wrote:
> On Mon, Dec 26, 2022 at 05:57:30PM +0100, Jason A. Donenfeld wrote:
> > On Mon, Dec 26, 2022 at 03:43:04PM +0100, Jason A. Donenfeld wrote:
> > > On Mon, Dec 26, 2022 at 03:24:07PM +0100, Jason A. Donenfeld wrote:
> > > > Hi,
> > > > 
> > > > I'm currently stumped at the moment, so adding linux-mm@ and x86@. Still
> > > > working on it though. Details of where I'm at are below the quote below.
> > > > 
> > > > On Sat, Dec 24, 2022 at 05:21:46AM +0100, Jason A. Donenfeld wrote:
> > > > > On Sat, Dec 24, 2022 at 04:09:08AM +0100, Jason A. Donenfeld wrote:
> > > > > > Hi Eric,
> > > > > > 
> > > > > > Replying to you from my telephone, and I'm traveling the next two days,
> > > > > > but I thought I should mention some preliminary results right away from
> > > > > > doing some termux compiles:
> > > > > > 
> > > > > > On Fri, Dec 23, 2022 at 04:14:00PM -0800, Eric Biggers wrote:
> > > > > > > Hi Jason,
> > > > > > > 
> > > > > > > On Wed, Sep 21, 2022 at 11:31:34AM +0200, Jason A. Donenfeld wrote:
> > > > > > > > This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but
> > > > > > > > for 7.2 rather than 7.1, now that modifying setup_data is safe to do.
> > > > > > > > 
> > > > > > > > Cc: Laurent Vivier <laurent@vivier.eu>
> > > > > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > > > Cc: Peter Maydell <peter.maydell@linaro.org>
> > > > > > > > Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > > > > > > Cc: Richard Henderson <richard.henderson@linaro.org>
> > > > > > > > Cc: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > > > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > > > > > > ---
> > > > > > > >  hw/i386/microvm.c | 2 +-
> > > > > > > >  hw/i386/pc_piix.c | 3 ++-
> > > > > > > >  hw/i386/pc_q35.c  | 3 ++-
> > > > > > > >  3 files changed, 5 insertions(+), 3 deletions(-)
> > > > > > > > 
> > > > > > > 
> > > > > > > After upgrading to QEMU 7.2, Linux 6.1 no longer boots with some configs.  There
> > > > > > > is no output at all.  I bisected it to this commit, and I verified that the
> > > > > > > following change to QEMU's master branch makes the problem go away:
> > > > > > > 
> > > > > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > > > > index b48047f50c..42f5b07d2f 100644
> > > > > > > --- a/hw/i386/pc_piix.c
> > > > > > > +++ b/hw/i386/pc_piix.c
> > > > > > > @@ -441,6 +441,7 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > > > > > >      pc_i440fx_machine_options(m);
> > > > > > >      m->alias = "pc";
> > > > > > >      m->is_default = true;
> > > > > > > +    PC_MACHINE_CLASS(m)->legacy_no_rng_seed = true;
> > > > > > >  }
> > > > > > > 
> > > > > > > I've attached the kernel config I am seeing the problem on.
> > > > > > > 
> > > > > > > For some reason, the problem also goes away if I disable CONFIG_KASAN.
> > > > > > > 
> > > > > > > Any idea what is causing this?
> > > > > > 
> > > > > > - Commenting out the call to parse_setup_data() doesn't fix the issue.
> > > > > >   So there's no KASAN issue with the actual parser.
> > > > > > 
> > > > > > - Using KASAN_OUTLINE rather than INLINE does fix the issue!
> > > > > > 
> > > > > > That makes me suspect that it's file size related, and QEMU or the BIOS
> > > > > > is placing setup data at an overlapping offset by accident, or something
> > > > > > similar.
> > > > > 
> > > > > I removed the file systems from your config to bring the kernel size
> > > > > back down, and voila, it works, even with KASAN_INLINE. So perhaps I'm
> > > > > on the right track here...
> > > > 
> > > > QEMU sticks setup_data after the kernel image, the same as kexec-tools
> > > > and everything else. Apparently, when the kernel image is large, the
> > > > call to early_memremap(boot_params.hdr.setup_data, ...) returns a value
> > > > that points some place bogus, and the system crashes or does something
> > > > weird. I haven't yet determined what this limit is, but in my current
> > > > test kernel, a value of 0x0000000001327650 is enough to make it point to
> > > > rubbish.
> > > > 
> > > > Is this expected? What's going on here?
> > > 
> > > Attaching gdb to QEMU and switching it to physical memory mode
> > > (`maintenance packet Qqemu.PhyMemMode:1 `) indicates that it
> > > early_memremap is actually working fine and something *else* is at this
> > > address? That's kinda weird... Is KASAN populating physical addresses
> > > immediately after the kernel image extremely early in boot? I'm seeing
> > > the crash happen from early_reserve_memory()->
> > > memblock_x86_reserve_range_setup_data(), which should be before
> > > kasan_init() even runs. Is QEMU calculating kernel_size wrong, when it
> > > goes to determine where to put the setup_data data? But that's the same
> > > calculation as used everywhere else, so hmm...
> > > 
> > > Jason
> > 
> > If bzImage is 15770544 bytes, it does not boot. If bzImage is 15641776
> > bytes, it does boot. So something is happening somewhat close to the
> > 16MB mark?
> > 
> 
> Okay, the issue is that it's being decompressed to an area that overlaps
> the source. So for example in my test kernel:
> 
> input_addr: 0x3f112bf
> output_addr: 0x1000000
> output_len: 0x3a5d7d8
> 
> Since 0x3a5d7d8 + 0x1000000 > 0x3f112bf, eventually this corrupts the
> setup_data at the end there.
> 
> Now digging into what can be done about it.

Not quite. input_addr doesn't matter, since setup_data still points to
the old mapping.

So the actual issue is:

compressed_size: 	0xf028d4
decompressed_size:      0x3a5d7d8
setup_data:      	0x100000 + compressed_size
output_addr:    	0x1000000 (this is LOAD_PHYSICAL_ADDR)

Since `output_addr + decompressed_size > setup_data && output_addr <
setup_data`, then it means the decompressor will write over setup_data.

Note that this is also a problem for SETUP_DTB, so it's a longstanding
bug.

I'm experimenting now with appending lots of zeros between the kernel
image and setup_data, so that the decompressor doesn't overwrite
setup_data, but so far it's not working.

Another option would be to have the build system warn when this is going
to happen, and suggest that the user increase the value of
CONFIG_PHYSICAL_START. This might be the best option...

Jason


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 4/4] x86: re-enable rng seeding via SetupData
  2022-12-28  3:07                 ` Jason A. Donenfeld
@ 2022-12-28 14:39                   ` Jason A. Donenfeld
  0 siblings, 0 replies; 6+ messages in thread
From: Jason A. Donenfeld @ 2022-12-28 14:39 UTC (permalink / raw)
  To: Eric Biggers, x86, linux-mm
  Cc: pbonzini, qemu-devel, Laurent Vivier, Michael S . Tsirkin,
	Peter Maydell, Philippe Mathieu-Daudé,
	Richard Henderson, Ard Biesheuvel, Gerd Hoffmann, kasan-dev,
	Dmitry Vyukov

On Wed, Dec 28, 2022 at 04:07:13AM +0100, Jason A. Donenfeld wrote:
> On Tue, Dec 27, 2022 at 02:36:54PM +0100, Jason A. Donenfeld wrote:
> > On Mon, Dec 26, 2022 at 05:57:30PM +0100, Jason A. Donenfeld wrote:
> > > On Mon, Dec 26, 2022 at 03:43:04PM +0100, Jason A. Donenfeld wrote:
> > > > On Mon, Dec 26, 2022 at 03:24:07PM +0100, Jason A. Donenfeld wrote:
> > > > > Hi,
> > > > > 
> > > > > I'm currently stumped at the moment, so adding linux-mm@ and x86@. Still
> > > > > working on it though. Details of where I'm at are below the quote below.
> > > > > 
> > > > > On Sat, Dec 24, 2022 at 05:21:46AM +0100, Jason A. Donenfeld wrote:
> > > > > > On Sat, Dec 24, 2022 at 04:09:08AM +0100, Jason A. Donenfeld wrote:
> > > > > > > Hi Eric,
> > > > > > > 
> > > > > > > Replying to you from my telephone, and I'm traveling the next two days,
> > > > > > > but I thought I should mention some preliminary results right away from
> > > > > > > doing some termux compiles:
> > > > > > > 
> > > > > > > On Fri, Dec 23, 2022 at 04:14:00PM -0800, Eric Biggers wrote:
> > > > > > > > Hi Jason,
> > > > > > > > 
> > > > > > > > On Wed, Sep 21, 2022 at 11:31:34AM +0200, Jason A. Donenfeld wrote:
> > > > > > > > > This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but
> > > > > > > > > for 7.2 rather than 7.1, now that modifying setup_data is safe to do.
> > > > > > > > > 
> > > > > > > > > Cc: Laurent Vivier <laurent@vivier.eu>
> > > > > > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > > > > Cc: Peter Maydell <peter.maydell@linaro.org>
> > > > > > > > > Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > > > > > > > > Cc: Richard Henderson <richard.henderson@linaro.org>
> > > > > > > > > Cc: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > > > > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > > > > > > > ---
> > > > > > > > >  hw/i386/microvm.c | 2 +-
> > > > > > > > >  hw/i386/pc_piix.c | 3 ++-
> > > > > > > > >  hw/i386/pc_q35.c  | 3 ++-
> > > > > > > > >  3 files changed, 5 insertions(+), 3 deletions(-)
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > After upgrading to QEMU 7.2, Linux 6.1 no longer boots with some configs.  There
> > > > > > > > is no output at all.  I bisected it to this commit, and I verified that the
> > > > > > > > following change to QEMU's master branch makes the problem go away:
> > > > > > > > 
> > > > > > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > > > > > index b48047f50c..42f5b07d2f 100644
> > > > > > > > --- a/hw/i386/pc_piix.c
> > > > > > > > +++ b/hw/i386/pc_piix.c
> > > > > > > > @@ -441,6 +441,7 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > > > > > > >      pc_i440fx_machine_options(m);
> > > > > > > >      m->alias = "pc";
> > > > > > > >      m->is_default = true;
> > > > > > > > +    PC_MACHINE_CLASS(m)->legacy_no_rng_seed = true;
> > > > > > > >  }
> > > > > > > > 
> > > > > > > > I've attached the kernel config I am seeing the problem on.
> > > > > > > > 
> > > > > > > > For some reason, the problem also goes away if I disable CONFIG_KASAN.
> > > > > > > > 
> > > > > > > > Any idea what is causing this?
> > > > > > > 
> > > > > > > - Commenting out the call to parse_setup_data() doesn't fix the issue.
> > > > > > >   So there's no KASAN issue with the actual parser.
> > > > > > > 
> > > > > > > - Using KASAN_OUTLINE rather than INLINE does fix the issue!
> > > > > > > 
> > > > > > > That makes me suspect that it's file size related, and QEMU or the BIOS
> > > > > > > is placing setup data at an overlapping offset by accident, or something
> > > > > > > similar.
> > > > > > 
> > > > > > I removed the file systems from your config to bring the kernel size
> > > > > > back down, and voila, it works, even with KASAN_INLINE. So perhaps I'm
> > > > > > on the right track here...
> > > > > 
> > > > > QEMU sticks setup_data after the kernel image, the same as kexec-tools
> > > > > and everything else. Apparently, when the kernel image is large, the
> > > > > call to early_memremap(boot_params.hdr.setup_data, ...) returns a value
> > > > > that points some place bogus, and the system crashes or does something
> > > > > weird. I haven't yet determined what this limit is, but in my current
> > > > > test kernel, a value of 0x0000000001327650 is enough to make it point to
> > > > > rubbish.
> > > > > 
> > > > > Is this expected? What's going on here?
> > > > 
> > > > Attaching gdb to QEMU and switching it to physical memory mode
> > > > (`maintenance packet Qqemu.PhyMemMode:1 `) indicates that it
> > > > early_memremap is actually working fine and something *else* is at this
> > > > address? That's kinda weird... Is KASAN populating physical addresses
> > > > immediately after the kernel image extremely early in boot? I'm seeing
> > > > the crash happen from early_reserve_memory()->
> > > > memblock_x86_reserve_range_setup_data(), which should be before
> > > > kasan_init() even runs. Is QEMU calculating kernel_size wrong, when it
> > > > goes to determine where to put the setup_data data? But that's the same
> > > > calculation as used everywhere else, so hmm...
> > > > 
> > > > Jason
> > > 
> > > If bzImage is 15770544 bytes, it does not boot. If bzImage is 15641776
> > > bytes, it does boot. So something is happening somewhat close to the
> > > 16MB mark?
> > > 
> > 
> > Okay, the issue is that it's being decompressed to an area that overlaps
> > the source. So for example in my test kernel:
> > 
> > input_addr: 0x3f112bf
> > output_addr: 0x1000000
> > output_len: 0x3a5d7d8
> > 
> > Since 0x3a5d7d8 + 0x1000000 > 0x3f112bf, eventually this corrupts the
> > setup_data at the end there.
> > 
> > Now digging into what can be done about it.
> 
> Not quite. input_addr doesn't matter, since setup_data still points to
> the old mapping.
> 
> So the actual issue is:
> 
> compressed_size: 	0xf028d4
> decompressed_size:      0x3a5d7d8
> setup_data:      	0x100000 + compressed_size
> output_addr:    	0x1000000 (this is LOAD_PHYSICAL_ADDR)
> 
> Since `output_addr + decompressed_size > setup_data && output_addr <
> setup_data`, then it means the decompressor will write over setup_data.
> 
> Note that this is also a problem for SETUP_DTB, so it's a longstanding
> bug.
> 
> I'm experimenting now with appending lots of zeros between the kernel
> image and setup_data, so that the decompressor doesn't overwrite
> setup_data, but so far it's not working.
> 
> Another option would be to have the build system warn when this is going
> to happen, and suggest that the user increase the value of
> CONFIG_PHYSICAL_START. This might be the best option...

I posted a patch:
https://lore.kernel.org/qemu-devel/20221228143831.396245-1-Jason@zx2c4.com/

We can move discussion on the topic over to that thread now.

Jason


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-12-28 14:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20220921093134.2936487-1-Jason@zx2c4.com>
     [not found] ` <20220921093134.2936487-4-Jason@zx2c4.com>
     [not found]   ` <Y6ZESPx4ettBLuMt@sol.localdomain>
     [not found]     ` <Y6ZtVGtFpUNQP+KU@zx2c4.com>
     [not found]       ` <Y6Z+WpqN59ZjIKkk@zx2c4.com>
2022-12-26 14:24         ` [PATCH v5 4/4] x86: re-enable rng seeding via SetupData Jason A. Donenfeld
2022-12-26 14:43           ` Jason A. Donenfeld
2022-12-26 16:57             ` Jason A. Donenfeld
2022-12-27 13:36               ` Jason A. Donenfeld
2022-12-28  3:07                 ` Jason A. Donenfeld
2022-12-28 14:39                   ` Jason A. Donenfeld

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox