From: Cong Wang <xiyou.wangcong@gmail.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: linux-kernel@vger.kernel.org, pasha.tatashin@soleen.com,
	 Cong Wang <cwang@multikernel.io>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Baoquan He <bhe@redhat.com>, Alexander Graf <graf@amazon.com>,
	Mike Rapoport <rppt@kernel.org>,
	 Changyuan Lyu <changyuanl@google.com>,
	kexec@lists.infradead.org, linux-mm@kvack.org,
	 multikernel@lists.linux.dev
Subject: Re: [RFC Patch 0/7] kernel: Introduce multikernel architecture support
Date: Wed, 24 Sep 2025 10:18:22 -0700
Message-ID: <CAM_iQpWe50B4hfVTd-xTemByvoM_bg7eA9y8qwZjBsYcAbY9fA@mail.gmail.com>
In-Reply-To: <20250923170545.GA509965@fedora>

On Tue, Sep 23, 2025 at 10:05 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Mon, Sep 22, 2025 at 03:41:18PM -0700, Cong Wang wrote:
> > On Mon, Sep 22, 2025 at 7:28 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > >
> > > On Sat, Sep 20, 2025 at 02:40:18PM -0700, Cong Wang wrote:
> > > > On Fri, Sep 19, 2025 at 2:27 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > > >
> > > > > On Thu, Sep 18, 2025 at 03:25:59PM -0700, Cong Wang wrote:
> > > > > > This patch series introduces multikernel architecture support, enabling
> > > > > > multiple independent kernel instances to coexist and communicate on a
> > > > > > single physical machine. Each kernel instance can run on dedicated CPU
> > > > > > cores while sharing the underlying hardware resources.
> > > > > >
> > > > > > The multikernel architecture provides several key benefits:
> > > > > > - Improved fault isolation between different workloads
> > > > > > - Enhanced security through kernel-level separation
> > > > >
> > > > > What level of isolation does this patch series provide? What stops
> > > > > kernel A from accessing kernel B's memory pages, sending interrupts to
> > > > > its CPUs, etc?
> > > >
> > > > It is kernel-enforced isolation, therefore, the trust model here is still
> > > > based on kernel. Hence, a malicious kernel would be able to disrupt,
> > > > as you described. With memory encryption and IPI filtering, I think
> > > > that is solvable.
> > >
> > > I think solving this is key to the architecture, at least if fault
> > > isolation and security are goals. A cooperative architecture where
> > > nothing prevents kernels from interfering with each other simply doesn't
> > > offer fault isolation or security.
> >
> > Kernel and kernel modules can be signed today, kexec also supports
> > kernel signing via kexec_file_load(). It mitigates at least untrusted
> > kernels, although kernels can still be exploited via 0-day.
>
> Kernel signing also doesn't protect against bugs in one kernel
> interfering with another kernel.

This is also true, which is why memory encryption and authentication
could help. Hardware vendors can catch up with software, which
is how virtualization evolved (e.g. vDPA didn't exist when KVM was
invented).

>
> > >
> > > On CPU architectures that offer additional privilege modes it may be
> > > possible to run a supervisor on every CPU to restrict access to
> > > resources in the spawned kernel. Kernels would need to be modified to
> > > call into the supervisor instead of accessing certain resources
> > > directly.
> > >
> > > IOMMU and interrupt remapping control would need to be performed by the
> > > supervisor to prevent spawned kernels from affecting each other.
> >
> > That's right, security vs performance. A lot of times we have to balance
> > between these two. This is why Kata Container today runs a container
> > inside a VM.
> >
> > This largely depends on what users could compromise, there is no single
> > right answer here.
> >
> > For example, in a fully-controlled private cloud, security exploits are
> > probably not even a concern. Sacrificing performance for a non-concern
> > is not reasonable.
> >
> > >
> > > This seems to be the price of fault isolation and security. It ends up
> > > looking similar to a hypervisor, but maybe it wouldn't need to use
> > > virtualization extensions, depending on the capabilities of the CPU
> > > architecture.
> >
> > Two more points:
> >
> > 1) Security lockdown. Security lockdown transforms multikernel from
> > "0-day means total compromise" to "0-day means single workload
> > compromise with rapid recovery." This is still a significant improvement
> > over containers where a single kernel 0-day compromises everything
> > simultaneously.
>
> I don't follow. My understanding is that multikernel currently does not
> prevent spawned kernels from affecting each other, so a kernel 0-day in
> multikernel still compromises everything?

Linux kernel lockdown does reduce the blast radius of a 0-day exploit,
but it doesn’t eliminate it. I hope this is clearer.

>
> >
> > 2) Rapid kernel updates: A more practical way to eliminate 0-day
> > exploits is to update kernel more frequently, today the major blocker
> > is the downtime required by kernel reboot, which is what multikernel
> > aims to resolve.
>
> If kernel upgrades are the main use case for multikernel, then I guess
> isolation is not necessary. Two kernels would only run side-by-side for
> a limited period of time and they would have access to the same
> workloads.

Zero-downtime upgrade is probably the last thing we could achieve
with multikernel, as true zero-downtime requires significant effort
on kernel-to-kernel coordination, so we would essentially need to
establish a protocol (via KHO, I hope) here.

On the other hand, isolation is relatively easy and more useful.
I understand you don't like kernel isolation; however, we need to
recognize the success of containers today, regardless of whether we
like it or not.

By the way, although it is just a theory, I hope multikernel does not
prevent users from using virtualization inside it, just as a VM does
not prevent running containers inside it. The choice should always be
on the users' side, not ours.

I hope this helps.

Regards,
Cong Wang

