* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
[not found] ` <4AB29F4A.3030102@intel.com>
@ 2009-09-17 22:27 ` Rafael J. Wysocki
2009-09-22 23:35 ` Karol Lewandowski
0 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2009-09-17 22:27 UTC (permalink / raw)
To: david.graham
Cc: Karol Lewandowski, e1000-devel, linux-kernel, linux-mm, Andrew Morton
On Thursday 17 September 2009, Graham, David wrote:
> Rafael J. Wysocki wrote:
> > On Tuesday 15 September 2009, Karol Lewandowski wrote:
> >> Hello,
> >>
> >> I'm getting following oops sometimes during resume on my Thinkpad T21
> >> (where "sometimes" means about 10/1 good/bad ratio):
> >>
> >> ifconfig: page allocation failure. order:5, mode:0x8020
> >
> > Well, this only tells you that an attempt to make order 5 allocation failed,
> > which is not unusual at all.
> >
> > Allocations of this order are quite likely to fail if memory is fragmented,
> > the probability of which rises with the number of suspend-resume cycles already
> > carried out.
> >
> > I guess the driver releases its DMA buffer during suspend and attempts to
> > allocate it back on resume, which is not really smart (if that really is the
> > case).
> >
> Yes, we free a 70KB block (0x80 by 0x230 bytes) on suspend and
> reallocate on resume, and so that's an Order 5 request. It looks
> symmetric, and hasn't changed for years. I don't think we are leaking
> memory, which points back to that the memory is too fragmented to
> satisfy the request.
>
> I also concur that Rafael's commit 6905b1f1 shouldn't change the logic
> in the driver for systems with e100 (like yours Karol) that could
> already sleep, and I don't see anything else in the driver that looks to
> be relevant. I'm expecting that your test result without commit 6905b1f1
> will still show the problem.
>
> So I wonder if this new issue may be triggered by some other change in
> the memory subsystem ?

I think so.  There have been reports about order 2 allocations failing for
2.6.31, so it looks like newer kernels are more likely to expose such problems.

Adding linux-mm to the CC list.

Thanks,
Rafael
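
The arithmetic quoted above can be checked by hand: 0x80 by 0x230 bytes
is 128 * 560 = 71680 bytes, i.e. 70KB. Assuming 4KB pages (the thread does
not state the page size, but that is the usual value on 32-bit x86), this is
17.5 pages. The page allocator hands out power-of-two blocks, so the request
rounds up to 32 pages, which is order 5. A minimal sketch of that rounding
follows; order_for_bytes is a hypothetical helper written for this note, not
a kernel or driver function.

```shell
# Smallest order such that (page_size << order) >= size.
# Illustration only; page_size defaults to 4096, which is an assumption.
order_for_bytes() {
    size=$1
    page_size=${2:-4096}
    order=0
    while [ $((page_size << order)) -lt "$size" ]; do
        order=$((order + 1))
    done
    echo "$order"
}

size=$((0x80 * 0x230))          # block size quoted in the thread
echo "bytes: $size"
echo "order: $(order_for_bytes "$size")"
```

With the default page size this reports 71680 bytes and order 5, which
matches the "order:5" in the failure message.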
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-17 22:27 ` [BUG 2.6.30+] e100 sometimes causes oops during resume Rafael J. Wysocki
@ 2009-09-22 23:35 ` Karol Lewandowski
2009-09-22 23:51 ` Rafael J. Wysocki
2009-09-29 13:58 ` Mel Gorman
0 siblings, 2 replies; 9+ messages in thread
From: Karol Lewandowski @ 2009-09-22 23:35 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: david.graham, Karol Lewandowski, e1000-devel, linux-kernel,
linux-mm, Andrew Morton
On Fri, Sep 18, 2009 at 12:27:37AM +0200, Rafael J. Wysocki wrote:
> On Thursday 17 September 2009, Graham, David wrote:
> > Rafael J. Wysocki wrote:
> > > I guess the driver releases its DMA buffer during suspend and attempts to
> > > allocate it back on resume, which is not really smart (if that really is the
> > > case).
> > Yes, we free a 70KB block (0x80 by 0x230 bytes) on suspend and
> > reallocate on resume, and so that's an Order 5 request. It looks
> > symmetric, and hasn't changed for years. I don't think we are leaking
> > memory, which points back to that the memory is too fragmented to
> > satisfy the request.
> >
> > I also concur that Rafael's commit 6905b1f1 shouldn't change the logic
> > in the driver for systems with e100 (like yours Karol) that could
> > already sleep, and I don't see anything else in the driver that looks to
> > be relevant. I'm expecting that your test result without commit 6905b1f1
> > will still show the problem.
> >
> > So I wonder if this new issue may be triggered by some other change in
> > the memory subsystem ?
> I think so. There have been reports about order 2 allocations failing for
> 2.6.31, so it looks like newer kernels are more likely to expose such problems.
>
> Adding linux-mm to the CC list.

I've hit this bug twice since my last email.  Is there anything I
could do?

Maybe I should revert the following commits (chosen somewhat randomly)?

1. 49255c619fbd482d704289b5eb2795f8e3b7ff2e

2. dd5d241ea955006122d76af88af87de73fec25b4 - alters the changes made by
   the commit above

Any ideas?

Thanks.
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-22 23:35 ` Karol Lewandowski
@ 2009-09-22 23:51 ` Rafael J. Wysocki
2009-09-23 14:22 ` Karol Lewandowski
2009-09-29 13:58 ` Mel Gorman
1 sibling, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2009-09-22 23:51 UTC (permalink / raw)
To: Karol Lewandowski
Cc: david.graham, e1000-devel, linux-kernel, linux-mm, Andrew Morton
On Wednesday 23 September 2009, Karol Lewandowski wrote:
> On Fri, Sep 18, 2009 at 12:27:37AM +0200, Rafael J. Wysocki wrote:
> > On Thursday 17 September 2009, Graham, David wrote:
> > > Rafael J. Wysocki wrote:
> > > > I guess the driver releases its DMA buffer during suspend and attempts to
> > > > allocate it back on resume, which is not really smart (if that really is the
> > > > case).
>
> > > Yes, we free a 70KB block (0x80 by 0x230 bytes) on suspend and
> > > reallocate on resume, and so that's an Order 5 request. It looks
> > > symmetric, and hasn't changed for years. I don't think we are leaking
> > > memory, which points back to that the memory is too fragmented to
> > > satisfy the request.
> > >
> > > I also concur that Rafael's commit 6905b1f1 shouldn't change the logic
> > > in the driver for systems with e100 (like yours Karol) that could
> > > already sleep, and I don't see anything else in the driver that looks to
> > > be relevant. I'm expecting that your test result without commit 6905b1f1
> > > will still show the problem.
> > >
> > > So I wonder if this new issue may be triggered by some other change in
> > > the memory subsystem ?
>
> > I think so. There have been reports about order 2 allocations failing for
> > 2.6.31, so it looks like newer kernels are more likely to expose such problems.
> >
> > Adding linux-mm to the CC list.
>
> I've hit this bug 2 times since my last email. Is there anything I
> could do?
>
> Maybe I should revert following commits (chosen somewhat randomly)?
>
> 1. 49255c619fbd482d704289b5eb2795f8e3b7ff2e
>
> 2. dd5d241ea955006122d76af88af87de73fec25b4 - alters changes made by
> commit above
>
> Any ideas?
You can try that IMO.
Best,
Rafael
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-22 23:51 ` Rafael J. Wysocki
@ 2009-09-23 14:22 ` Karol Lewandowski
2009-09-23 21:45 ` Rafael J. Wysocki
0 siblings, 1 reply; 9+ messages in thread
From: Karol Lewandowski @ 2009-09-23 14:22 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Karol Lewandowski, david.graham, e1000-devel, linux-kernel,
linux-mm, Andrew Morton
On Wed, Sep 23, 2009 at 01:51:36AM +0200, Rafael J. Wysocki wrote:
> On Wednesday 23 September 2009, Karol Lewandowski wrote:
> > On Fri, Sep 18, 2009 at 12:27:37AM +0200, Rafael J. Wysocki wrote:
> > > Adding linux-mm to the CC list.
> >
> > I've hit this bug 2 times since my last email. Is there anything I
> > could do?
> >
> > Maybe I should revert following commits (chosen somewhat randomly)?
> >
> > 1. 49255c619fbd482d704289b5eb2795f8e3b7ff2e
> >
> > 2. dd5d241ea955006122d76af88af87de73fec25b4 - alters changes made by
> > commit above
> >
> > Any ideas?
>
> You can try that IMO.

Reverting the commits above made the situation worse.  Hints?  Obvious
solutions? ;-)

Thanks.
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-23 14:22 ` Karol Lewandowski
@ 2009-09-23 21:45 ` Rafael J. Wysocki
0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2009-09-23 21:45 UTC (permalink / raw)
To: Karol Lewandowski
Cc: david.graham, e1000-devel, linux-kernel, linux-mm, Andrew Morton
On Wednesday 23 September 2009, Karol Lewandowski wrote:
> On Wed, Sep 23, 2009 at 01:51:36AM +0200, Rafael J. Wysocki wrote:
> > On Wednesday 23 September 2009, Karol Lewandowski wrote:
> > > On Fri, Sep 18, 2009 at 12:27:37AM +0200, Rafael J. Wysocki wrote:
> > > > Adding linux-mm to the CC list.
> > >
> > > I've hit this bug 2 times since my last email. Is there anything I
> > > could do?
> > >
> > > Maybe I should revert following commits (chosen somewhat randomly)?
> > >
> > > 1. 49255c619fbd482d704289b5eb2795f8e3b7ff2e
> > >
> > > 2. dd5d241ea955006122d76af88af87de73fec25b4 - alters changes made by
> > > commit above
> > >
> > > Any ideas?
> >
> > You can try that IMO.
>
> Reverting commits above made situation worse. Hints? Obvious
> solutions? ;-)
Not really, at least not from me. :-(
Best,
Rafael
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-22 23:35 ` Karol Lewandowski
2009-09-22 23:51 ` Rafael J. Wysocki
@ 2009-09-29 13:58 ` Mel Gorman
2009-09-30 15:37 ` Karol Lewandowski
1 sibling, 1 reply; 9+ messages in thread
From: Mel Gorman @ 2009-09-29 13:58 UTC (permalink / raw)
To: Karol Lewandowski
Cc: Rafael J. Wysocki, david.graham, e1000-devel, linux-kernel,
linux-mm, Andrew Morton
On Wed, Sep 23, 2009 at 01:35:31AM +0200, Karol Lewandowski wrote:
> On Fri, Sep 18, 2009 at 12:27:37AM +0200, Rafael J. Wysocki wrote:
> > On Thursday 17 September 2009, Graham, David wrote:
> > > Rafael J. Wysocki wrote:
> > > > I guess the driver releases its DMA buffer during suspend and attempts to
> > > > allocate it back on resume, which is not really smart (if that really is the
> > > > case).
>
> > > Yes, we free a 70KB block (0x80 by 0x230 bytes) on suspend and
> > > reallocate on resume, and so that's an Order 5 request. It looks
> > > symmetric, and hasn't changed for years. I don't think we are leaking
> > > memory, which points back to that the memory is too fragmented to
> > > satisfy the request.
> > >
> > > I also concur that Rafael's commit 6905b1f1 shouldn't change the logic
> > > in the driver for systems with e100 (like yours Karol) that could
> > > already sleep, and I don't see anything else in the driver that looks to
> > > be relevant. I'm expecting that your test result without commit 6905b1f1
> > > will still show the problem.
> > >
> > > So I wonder if this new issue may be triggered by some other change in
> > > the memory subsystem ?
>
> > I think so. There have been reports about order 2 allocations failing for
> > 2.6.31, so it looks like newer kernels are more likely to expose such problems.
> >
> > Adding linux-mm to the CC list.
>
> I've hit this bug 2 times since my last email. Is there anything I
> could do?
>
> Maybe I should revert following commits (chosen somewhat randomly)?
>
> 1. 49255c619fbd482d704289b5eb2795f8e3b7ff2e
>
> 2. dd5d241ea955006122d76af88af87de73fec25b4 - alters changes made by
> commit above
>
> Any ideas?
>
Those commits should only make a difference on small-memory machines.
The exact value of "small" varies, but on 32-bit x86 without PAE it would
be 20MB of RAM. The fact that reverting the two patches makes any difference
at all is a surprise and likely a coincidence.

If you have a reliable reproduction case, would it be possible to bisect
between the points

d239171e4f6efd58d7e423853056b1b6a74f1446..b70d94ee438b3fd9c15c7691d7a932a135c18101

to see if the problem is in there anywhere?

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
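
A bisection over the range suggested above could be started roughly as
follows. This is a sketch under assumptions: it treats the older end of the
range as good and the newer end as bad, which the thread does not confirm,
so both ends should be tested first. The run helper is hypothetical and
only prints the command unless RUN=1 is set inside a kernel git tree.

```shell
# Endpoints from the range given in the message (older..newer).
# Assumption: older end is good, newer end is bad.
GOOD=d239171e4f6efd58d7e423853056b1b6a74f1446
BAD=b70d94ee438b3fd9c15c7691d7a932a135c18101

# Print the command by default; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

run git bisect start "$BAD" "$GOOD"

# For each commit git checks out: build, install, boot, try to reproduce,
# then mark the result by hand:
#   git bisect good    # the allocation succeeded
#   git bisect bad     # the allocation failure appeared
# When git names the first bad commit:
#   git bisect log     # record of the session, useful to post to the list
#   git bisect reset   # return to the original branch
```

Because the failure is intermittent, a commit should probably be marked
good only after several clean attempts; a single wrong "good" sends the
bisection down the wrong half of the range.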
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-29 13:58 ` Mel Gorman
@ 2009-09-30 15:37 ` Karol Lewandowski
2009-09-30 15:55 ` Mel Gorman
0 siblings, 1 reply; 9+ messages in thread
From: Karol Lewandowski @ 2009-09-30 15:37 UTC (permalink / raw)
To: Mel Gorman
Cc: Karol Lewandowski, Rafael J. Wysocki, david.graham, e1000-devel,
linux-kernel, linux-mm, Andrew Morton
On Tue, Sep 29, 2009 at 02:58:11PM +0100, Mel Gorman wrote:
> On Wed, Sep 23, 2009 at 01:35:31AM +0200, Karol Lewandowski wrote:
> > Maybe I should revert following commits (chosen somewhat randomly)?
> >
> > 1. 49255c619fbd482d704289b5eb2795f8e3b7ff2e
> >
> > 2. dd5d241ea955006122d76af88af87de73fec25b4 - alters changes made by
> > commit above
> >
> > Any ideas?
> >
>
> Those commits should only make a difference on small-memory machines.
> The exact value of "small" varies but on 32 bit x86 without PAE, it would
> be 20MB of RAM. The fact reverting the two patches makes any difference at
> all is a surprise and likely a co-incidence.
>
> If you have a reliable reproduction case, would it be possible to bisect
> between the points
> d239171e4f6efd58d7e423853056b1b6a74f1446..b70d94ee438b3fd9c15c7691d7a932a135c18101
> to see if the problem is in there anywhere?

I started with bc75d33f0 (one commit before d239171e4 in Linus'
tree), but with that kernel my system fails to resume.

Whatever I do (change fb/Xorg drivers, disable X, etc.), I always end
up with an unusable display and what looks like a hard-locked
system (I haven't tested network connectivity from another box, but
the console is surely dead).

Thanks.
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-30 15:37 ` Karol Lewandowski
@ 2009-09-30 15:55 ` Mel Gorman
2009-09-30 18:48 ` Karol Lewandowski
0 siblings, 1 reply; 9+ messages in thread
From: Mel Gorman @ 2009-09-30 15:55 UTC (permalink / raw)
To: Karol Lewandowski
Cc: Rafael J. Wysocki, david.graham, e1000-devel, linux-kernel,
linux-mm, Andrew Morton
On Wed, Sep 30, 2009 at 05:37:30PM +0200, Karol Lewandowski wrote:
> On Tue, Sep 29, 2009 at 02:58:11PM +0100, Mel Gorman wrote:
> > On Wed, Sep 23, 2009 at 01:35:31AM +0200, Karol Lewandowski wrote:
> > > Maybe I should revert following commits (chosen somewhat randomly)?
> > >
> > > 1. 49255c619fbd482d704289b5eb2795f8e3b7ff2e
> > >
> > > 2. dd5d241ea955006122d76af88af87de73fec25b4 - alters changes made by
> > > commit above
> > >
> > > Any ideas?
> > >
> >
> > Those commits should only make a difference on small-memory machines.
> > The exact value of "small" varies but on 32 bit x86 without PAE, it would
> > be 20MB of RAM. The fact reverting the two patches makes any difference at
> > all is a surprise and likely a co-incidence.
> >
> > If you have a reliable reproduction case, would it be possible to bisect
> > between the points
> > d239171e4f6efd58d7e423853056b1b6a74f1446..b70d94ee438b3fd9c15c7691d7a932a135c18101
> > to see if the problem is in there anywhere?
>
> I've started with bc75d33f0 (one commit before d239171e4 in Linus'
> tree) but then my system fails to resume.
>

Does the bug require a suspend/resume or would something like

    rmmod e100
    updatedb
    modprobe e100

reproduce the problem?

> Whatever I do (change fb/Xorg drivers, disable X, etc.) I always end
> up with unusable display and something that looks like hard-locked
> system (I haven't tested network connectivity from another box, but
> console is surely dead).
>
> Thanks.
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [BUG 2.6.30+] e100 sometimes causes oops during resume
2009-09-30 15:55 ` Mel Gorman
@ 2009-09-30 18:48 ` Karol Lewandowski
0 siblings, 0 replies; 9+ messages in thread
From: Karol Lewandowski @ 2009-09-30 18:48 UTC (permalink / raw)
To: Mel Gorman
Cc: Karol Lewandowski, Rafael J. Wysocki, david.graham, e1000-devel,
linux-kernel, linux-mm, Andrew Morton
On Wed, Sep 30, 2009 at 04:55:43PM +0100, Mel Gorman wrote:
> On Wed, Sep 30, 2009 at 05:37:30PM +0200, Karol Lewandowski wrote:
> > I've started with bc75d33f0 (one commit before d239171e4 in Linus'
> > tree) but then my system fails to resume.
> >
>
> Does the bug require a suspend/resume or would something like
>
> rmmod e100
> updatedb
> modprobe e100
>
> reproduce the problem?
Yes, it does reproduce the problem. Thanks a lot for that.
I'll try to bisect it as my free time permits (which may take a while,
unfortunately).
Thanks.
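
Since the rmmod/updatedb/modprobe sequence reproduces the failure, each
bisection step can be tested without a suspend/resume cycle. The sketch
below wraps one cycle and counts failure reports in a kernel log. The
helper names (run, repro_cycle, count_failures) are hypothetical, and the
commands are only printed unless RUN=1 is set; actually running them needs
root on a machine with the e100 hardware. Depending on the setup, the
interface may also have to be brought up after modprobe before the
allocation is attempted; that step is not shown because the interface name
is not given in the thread.

```shell
# Print the command by default; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# One cycle of the sequence from the thread.
repro_cycle() {
    run rmmod e100       # unload the driver
    run updatedb         # walk the filesystem to put pressure on memory
    run modprobe e100    # reload the driver, which must allocate again
}

# Count failure reports in a kernel log read from stdin.
# grep -c exits non-zero on zero matches, hence the "|| true".
count_failures() {
    grep -c 'page allocation failure' || true
}

repro_cycle
# With RUN=1, compare the count before and after a cycle:
#   dmesg | count_failures
```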
[not found] <20090915120538.GA26806@bizet.domek.prywatny>
[not found] ` <200909170118.53965.rjw@sisk.pl>
[not found] ` <4AB29F4A.3030102@intel.com>
2009-09-17 22:27 ` [BUG 2.6.30+] e100 sometimes causes oops during resume Rafael J. Wysocki
2009-09-22 23:35 ` Karol Lewandowski
2009-09-22 23:51 ` Rafael J. Wysocki
2009-09-23 14:22 ` Karol Lewandowski
2009-09-23 21:45 ` Rafael J. Wysocki
2009-09-29 13:58 ` Mel Gorman
2009-09-30 15:37 ` Karol Lewandowski
2009-09-30 15:55 ` Mel Gorman
2009-09-30 18:48 ` Karol Lewandowski