Re: Bug: broken /proc/kcore in 6.13

From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Alexandre Ferrieux <alexandre.ferrieux@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	linux-trace-users@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Mike Rapoport <rppt@kernel.org>
Subject: Re: Bug: broken /proc/kcore in 6.13
Date: Fri, 17 Jan 2025 18:13:25 +0000
Message-ID: <ecb22f03-ca7f-4212-9f02-cceafb9cfb7f@lucifer.local>
In-Reply-To: <9f3ba47b-e0ce-4d7a-b60f-e5b8d96b0f2b@lucifer.local>

+cc Mike

OK so nothing to worry about here - the feature that causes this problem
has been completely disabled. The change that disables it may not be in
Linus's tree yet, but it will be in the 6.13 release [0].

I think the check for vread_iter() returning 0 can wait for 6.14: once the
area of memory has been identified, a zero return should never happen. We do
still want to pick up on it, with a WARN_ON_ONCE(), so that anything like
this is caught right away.
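
To illustrate the kind of check meant here, below is a minimal userspace
sketch, not kernel code. It assumes a caller that keeps retrying until all
bytes are copied, which is one way a reader that returns 0 could make the
caller spin. Every name in it (read_fn, read_ok, read_faulting,
warn_on_once, read_all) is hypothetical, and the -1 return merely stands in
for an error code such as -EFAULT.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a vread_iter()-style reader: returns the
 * number of bytes copied, or 0 if the copy faulted and was fixed up. */
typedef size_t (*read_fn)(char *dst, const char *src, size_t len);

/* A well-behaved reader that copies at most 4 bytes per call. */
size_t read_ok(char *dst, const char *src, size_t len)
{
	size_t n = len < 4 ? len : 4;

	for (size_t i = 0; i < n; i++)
		dst[i] = src[i];
	return n;
}

/* A reader that never makes progress, modelling the broken case. */
size_t read_faulting(char *dst, const char *src, size_t len)
{
	(void)dst;
	(void)src;
	(void)len;
	return 0;
}

/* Userspace imitation of WARN_ON_ONCE(): prints on the first hit only,
 * and always returns the condition so it can be used in an if(). */
bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Copy len bytes, retrying partial reads. Without the zero check, a
 * reader that returns 0 would keep this loop spinning forever.
 * Returns the number of bytes copied, or -1 if no progress was made. */
long read_all(read_fn fn, char *dst, const char *src, size_t len)
{
	size_t done = 0;

	while (done < len) {
		size_t n = fn(dst + done, src + done, len - done);

		if (warn_on_once(n == 0, "reader made no progress"))
			return -1;
		done += n;
	}
	return (long)done;
}
```

With read_ok the loop finishes after a few partial reads; with
read_faulting it warns once and errors out instead of hanging.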

Thanks so much for the repro, though I observed the 'core /proc/kcore'
command freezing up before any 'disass' in my qemu setup, interestingly!

[0]: https://lore.kernel.org/all/20250113112934.GA8385@noisy.programming.kicks-ass.net/

On Fri, Jan 17, 2025 at 04:31:54PM +0000, Lorenzo Stoakes wrote:
> On Fri, Jan 17, 2025 at 04:28:32PM +0100, Alexandre Ferrieux wrote:
> >
> >
> > On 17/01/2025 16:19, Alexandre Ferrieux wrote:
> > > On 17/01/2025 15:44, Lorenzo Stoakes wrote:
> > >>> Alexandre Ferrieux <alexandre.ferrieux@gmail.com> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> Somewhere in the 6.13 branch (not bisected yet, sorry), it stopped being
> > >>>> possible to disassemble the running kernel from gdb through /proc/kcore.
> > >> Thanks for the report! Much appreciated.
> > >>
> > >> I may try to bisect here also unless you're close to finding the commit that
> > >> broke this?
> > >
> > > I'm currently homing in on copy_page_to_iter_nofault(), will report shortly :)
> >
> > Hmm, actually, that baby ain't cooperative:
> >
> >   [Fri Jan 17 15:23:05 2025] trace_kprobe: Could not probe notrace function
> >   copy_page_to_iter_nofault
> >
> > ... if I cannot insert kprobes to sniff around, I'm a bit stuck :}
> > So I think you'll reach the goal faster than me !
> >
> > PS: For your bisection: the last working kernel I know of is Debian's 6.12 final:
> >
> >   ii  linux-image-6.12.9-amd64         6.12.9-1                         amd64
> >     Linux 6.12 for 64-bit PCs (signed)
> >
> >
>
> Cheers much appreciated, have been able to repro and am bisecting now! Will
> update with results when done.

OK I bisected this to commit 5185e7f9f3bd ("x86/module: enable ROX caches for
module text on 64 bit").

It seems that vmalloc logic is used to handle module memory too in
vread_iter(), and somehow the execmem stuff is breaking this.

So this would explain why this worked previously, and it was in fact ok to
assume vread_iter() should never return 0 (though I believe we should now
definitely check this and error out if so).

I have tracked it down to (forgive me Alexandre, I realise I'm duplicating
some of your analysis, I'm just doing things from the kernel side here :>)

read_kcore_iter()
-> vread_iter()
-> aligned_vread_iter() (returns 0, indicating error on copy)
  -> gets page via vmalloc_to_page()
-> copy_page_to_iter_nofault()
-> copy_to_user_iter_nofault()
-> copy_to_user_nofault()
-> __copy_to_user_inatomic()
-> raw_copy_to_user()
-> copy_user_generic() [page fault]

In discussion with Mike, he pointed me at execmem_cache_populate() marking
the region as not being direct-map valid via execmem_set_direct_map_valid().

However it seems the problem is that the above logic results in the
following calls:

copy_page_to_iter_nofault() -> kmap_local_page() -> page_to_virt()

These _assume_ the page is mapped in the direct map afaict. It's not, so we
get a page fault, which is fixed up and results in the 0 return value and
hence the whole problem.

So I think this code would have to be modified to be aware of such
non-direct map memory for this to work.
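
As a rough illustration of what "aware of non-direct map memory" could mean,
here is a small userspace model, not kernel code. It assumes each page has a
direct-map alias that may have been invalidated plus a second,
vmalloc-style alias that is still valid. The struct and both functions are
hypothetical names invented for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

/* Hypothetical model of one page of memory and the state of its two
 * possible kernel mappings. */
struct model_page {
	char data[MODEL_PAGE_SIZE];
	bool direct_map_valid;    /* cleared for the ROX-style page here */
	bool vmalloc_alias_valid; /* the mapping the reader started from */
};

/* Models a copy that always goes through the direct-map address.
 * An invalid direct map is modelled as a fixed-up fault: 0 bytes. */
size_t copy_via_direct_map(char *dst, const struct model_page *pg,
			   size_t len)
{
	if (len > MODEL_PAGE_SIZE)
		len = MODEL_PAGE_SIZE;
	if (!pg->direct_map_valid)
		return 0;
	memcpy(dst, pg->data, len);
	return len;
}

/* Models a copy that falls back to the other alias when the direct
 * map is invalid, and only fails if neither mapping is usable. */
size_t copy_mapping_aware(char *dst, const struct model_page *pg,
			  size_t len)
{
	if (len > MODEL_PAGE_SIZE)
		len = MODEL_PAGE_SIZE;
	if (!pg->direct_map_valid && !pg->vmalloc_alias_valid)
		return 0;
	memcpy(dst, pg->data, len);
	return len;
}
```

For a page with direct_map_valid cleared and vmalloc_alias_valid set, the
first function returns 0 (the symptom described above), while the second
copies the data.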

In any case it's moot as this feature is now disabled. But hopefully the
analysis helps Mike in the next spin of his ROX series!
