linux-mm.kvack.org archive mirror
From: Alexander Graf <agraf@suse.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	linux-mm@kvack.org,
	"linux-kernel@vger.kernel.org List"
	<linux-kernel@vger.kernel.org>
Subject: Re: Oops in VMA code
Date: Thu, 16 Jun 2011 09:06:55 +0200
Message-ID: <4DDCD104-305E-48B1-8155-BD17380632F2@suse.de>
In-Reply-To: <BANLkTimB5gEZ2S=b9EiiWR-_u+o+wEPyjw@mail.gmail.com>


On 16.06.2011, at 08:54, Linus Torvalds wrote:

> On Wed, Jun 15, 2011 at 11:20 PM, Alexander Graf <agraf@suse.de> wrote:
>> 
>> On 16.06.2011, at 07:59, Linus Torvalds wrote:
>>> 
>>> r26 has the value 0xc00090026236bbb0, and that "90" byte in the middle
>>> there looks bogus. It's not a valid pointer any more, but if that "9"
>>> had been a zero, it would have been.
>> 
>> Please see my reply to Ben here.
> 
> Your reply to Ben seems to say that 0xc00000026236bbb0 wouldn't have
> been a valid address, because you don't have that much memory.
> 
> But that's clearly not true. All the other registers have valid
> pointers in them, and the stack pointer (r1) is c000000262987cd0, for
> example. And that stack is clearly valid - if the kernel stack pointer
> was corrupted, you'd never have gotten as far as reporting the oops.
> 
> So you may have only 8GB of RAM in that machine, but if so, there's
> some empty unmapped physical space. Because clearly your RAM is _not_
> limited to being mapped to below 0xc000000200000000.

Ah, yes. The PowerMacs have this nice memory hole, so RAM is actually mapped non-linearly:

Top of RAM: 0x280000000, Total RAM: 0x200000000

So you're right. The address does look valid.
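
A quick way to sanity-check that arithmetic (a rough sketch, not kernel code): it assumes the ppc64 kernel linear mapping starts at 0xc000000000000000, and the names PAGE_OFFSET, phys_of and below_top_of_ram are made up for illustration. Only the two RAM figures and the register values come from this thread.

```python
# Assumption: ppc64 kernel linear mapping base; not stated in this thread.
PAGE_OFFSET = 0xC000000000000000

# Figures from the boot log quoted above.
TOP_OF_RAM = 0x280000000
TOTAL_RAM = 0x200000000


def phys_of(vaddr):
    """Physical address behind a linear-mapped kernel virtual address."""
    return vaddr - PAGE_OFFSET


def below_top_of_ram(vaddr):
    """True if the address lands below the top of RAM.

    This ignores where exactly the hole sits, so it is only a
    plausibility check, not proof that the page is backed by RAM.
    """
    return 0 <= phys_of(vaddr) < TOP_OF_RAM


# The hole is whatever separates "top" from "total".
hole_size = TOP_OF_RAM - TOTAL_RAM
print(hex(hole_size))                      # 0x80000000, i.e. 2GB

# r26 with the "9" replaced by zero, and the stack pointer r1.
print(hex(phys_of(0xC00000026236BBB0)))    # 0x26236bbb0
print(below_top_of_ram(0xC00000026236BBB0))  # True
print(below_top_of_ram(0xC000000262987CD0))  # True

# r26 as found in the oops.
print(below_top_of_ram(0xC00090026236BBB0))  # False
```

Both the repaired r26 and r1 sit above the 8GB mark but below the top of RAM, which is exactly what the 2GB hole allows for.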

> To recap: I'm pretty sure the memory corruption is just the "90" byte.
> The rest of the pointer looks too much like a pointer to be otherwise.
> Whether that's due to a two-bit error (unlikely) or a wild byte write
> (or 16-bit write with zeroes) is hard to say. USUALLY when we have
> wild pointer errors, the corruption is more than just a few bits, but
> it could have been something that sets a few bits in software, and
> just sets them using a stale pointer.

That could very well be - the unaligned location is very odd indeed. So some ORing function sounds likely.
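
To make the "few bits at an odd offset" observation concrete, here is a small sketch that XORs the observed r26 against the value it presumably should have held. The "expected" value is only the guess from this thread (the "9" replaced by zero), and corruption_diff is a hypothetical helper, not an existing tool.

```python
def corruption_diff(observed, expected, width_bytes=8):
    """Compare a corrupted word against its presumed original.

    Returns (xor mask, flipped bit numbers counted from the LSB,
    differing byte offsets in big-endian memory order).
    """
    diff = observed ^ expected
    bits = [i for i in range(width_bytes * 8) if (diff >> i) & 1]
    # ppc64 on these machines is big-endian, so byte 0 is the most
    # significant byte of the register value.
    obs = observed.to_bytes(width_bytes, "big")
    exp = expected.to_bytes(width_bytes, "big")
    offsets = [i for i in range(width_bytes) if obs[i] != exp[i]]
    return diff, bits, offsets


diff, bits, offsets = corruption_diff(0xC00090026236BBB0,
                                      0xC00000026236BBB0)
print(hex(diff))   # 0x900000000000
print(bits)        # [44, 47]
print(offsets)     # [2]
```

Only two bits differ, both within the single byte at offset 2 of the 8-byte pointer, which fits a stray OR (or a one-byte store) through a stale pointer better than a wholesale overwrite.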

>> Yup, so let's keep this documented for now. Actually, the more I think about it the more it looks like simple random memory corruption by someone else in the kernel - and that's basically impossible to track and will give completely different bugs next time around :(.
> 
> We've had several bugs found by the pattern of the corruption, so I
> wouldn't say "impossible to track". Even if the next time ends up
> being a completely different oops (because the corruption happened in
> a totally different kind of data structure), it might be possible that
> there's that same "90" byte pattern, for example.
> 
> But it needs more than one bug report to see what the pattern is.
> Usually it takes a _lot_ more..

Yeah, let's wait for that moment then :). For now everything's pure speculation.


Alex

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
