From: pacman@kosh.dhis.org
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>,
	Mel Gorman <mel@csn.ul.ie>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
Date: Tue, 19 Oct 2010 22:23:45 -0500 (GMT+5)
Message-ID: <20101020032345.5240.qmail@kosh.dhis.org>
In-Reply-To: <1287522168.2198.5.camel@pasglop>

Benjamin Herrenschmidt writes:
> 
> On Tue, 2010-10-19 at 22:47 +0200, Segher Boessenkool wrote:
> > 
> > It looks like it is the frame counter in an USB OHCI HCCA.
> > 16-bit, 1kHz update, offset x'80 in a page.
> > 
> > So either the kernel forgot to call quiesce on it, or the firmware
> > doesn't implement that, or the firmware messed up some other way.
> 
> I vote for the FW being on crack. Wouldn't be the first time with
> Pegasos.
> 
> It's an OHCI or an UHCI in there ?

There's one of each... UHCI on the motherboard, OHCI on a card in a PCI
expansion slot. They shipped the ODW with the extra controller on an
expansion card since the on-board UHCI doesn't do USB2.0.

And that OHCI controller does appear to be the culprit. The 2 affected
addresses tick at 1000Hz until ohci-hcd is modprobe'd, then they stop.
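
For anyone wanting to repeat that kind of check, here is a minimal sketch (not
the code actually used here) of testing whether the 16-bit word at offset 0x80
of a page behaves like a 1 kHz frame counter. It assumes two snapshots of the
same page taken a known number of milliseconds apart; the function names, the
snapshot buffers and the tolerance parameter are all hypothetical. OHCI data
structures are little-endian, so the value is assembled byte by byte rather
than read as a native word.

```c
#include <stdint.h>

/* Offset of the 16-bit frame counter within the page, per the thread. */
#define HCCA_FRAME_NUMBER_OFFSET 0x80

/* Read a little-endian 16-bit value regardless of host byte order. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: return 1 if the word at offset 0x80 advanced by
 * roughly one count per millisecond between the two snapshots, else 0.
 * All arithmetic is modulo 65536 so a counter wrap is handled.
 */
static int looks_like_frame_counter(const uint8_t *before,
                                    const uint8_t *after,
                                    unsigned elapsed_ms,
                                    unsigned tolerance_ms)
{
    uint16_t a = read_le16(before + HCCA_FRAME_NUMBER_OFFSET);
    uint16_t b = read_le16(after + HCCA_FRAME_NUMBER_OFFSET);
    uint16_t delta = (uint16_t)(b - a);
    uint16_t expected = (uint16_t)elapsed_ms;
    uint16_t err = (uint16_t)(delta - expected);

    /* err is a modular distance; accept a small error in either direction. */
    return err <= tolerance_ms || (uint16_t)(0 - err) <= tolerance_ms;
}
```

A counter that stops, as it does once ohci-hcd is loaded, gives a delta of 0
and fails the check for any elapsed time larger than the tolerance.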

I think the mm people can consider this closed. 6dda9d55 didn't do anything
but expose a problem that has been here all along. I'll drop them from the Cc
list in any further messages.

> 
> Can you try in prom_init.c changing the prom_close_stdin() function to
> also close "stdout" ? 
> 
>          if (prom_getprop(_prom->chosen, "stdin", &val, sizeof(val)) > 0)
>                  call_prom("close", 1, 0, val);
> +        if (prom_getprop(_prom->chosen, "stdout", &val, sizeof(val)) > 0)
> +               call_prom("close", 1, 0, val);
> 
> See if that makes a difference ?

Huge difference. With no stdout to print to, the kernel seems to freeze up.
Or at least it loses the console. The last message it prints is "Device tree
struct 0x00933000 -> 0x00957000" then there's just nothing. I waited a while
for the console to come on but it didn't.

I applied the diff fragment above inside prom_close_stdin, but there are some
prom_printf calls after prom_close_stdin. Calling prom_printf after closing
stdout sounds like it could be bad. If I moved the close down below all the
prom_printf calls, it would come after the "quiesce" call. Would that be
acceptable (or even interesting as an experiment)? Does a close need a quiesce
after it?
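
To make the ordering question concrete, here is a toy model of the two
sequences. It is only a sketch: every name in it (toy_console, toy_printf and
so on) is made up for illustration, none of it is the real prom_init.c code,
and it models nothing beyond "a print after the close does not reach the
console". It does not try to explain the freeze.

```c
/* Toy stand-in for the firmware console; NOT the real prom_init.c code. */
struct toy_console {
    int stdout_open;  /* 1 while the "stdout" handle is still open */
    int printed;      /* messages that reached the console */
    int lost;         /* messages attempted after stdout was closed */
};

/* Stand-in for prom_printf: only counts whether the message got out. */
static void toy_printf(struct toy_console *c)
{
    if (c->stdout_open)
        c->printed++;
    else
        c->lost++;
}

/* Stand-in for call_prom("close", ...) on the stdout handle. */
static void toy_close_stdout(struct toy_console *c)
{
    c->stdout_open = 0;
}

/* Stand-in for the "quiesce" call; the toy version does nothing. */
static void toy_quiesce(struct toy_console *c)
{
    (void)c;
}

/* Order as tested: stdout closed early, progress messages printed later. */
static void order_as_tested(struct toy_console *c)
{
    toy_close_stdout(c);
    toy_printf(c);
    toy_quiesce(c);
    toy_printf(c);
}

/* Order being asked about: every print first, close after the quiesce. */
static void order_proposed(struct toy_console *c)
{
    toy_printf(c);
    toy_quiesce(c);
    toy_printf(c);
    toy_close_stdout(c);
}
```

In the toy, the first order loses both later messages and the second loses
none, at the cost of the close happening after the quiesce.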

-- 
Alan Curry
