From: Linus Torvalds <torvalds@linux-foundation.org>
To: Dave Jones <davej@redhat.com>, Hillf Danton <dhillf@gmail.com>,
Linux-MM <linux-mm@kvack.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Hugh Dickins <hughd@google.com>
Subject: Re: unused swap offset / bad page map.
Date: Mon, 26 Aug 2013 13:15:59 -0700
Message-ID: <CA+55aFw_bhMOP73owFHRFHZDAYEdWgF9j-502Aq9tZe3tEfmwg@mail.gmail.com>
In-Reply-To: <20130826190757.GB27768@redhat.com>
On Mon, Aug 26, 2013 at 12:08 PM, Dave Jones <davej@redhat.com> wrote:
>
> [ 4588.541886] swap_free: Unused swap offset entry 00002d15
> [ 4588.541952] BUG: Bad page map in process trinity-kid12 pte:005a2a80 pmd:22c01f067
>
> I can reproduce this pretty quickly by driving the system into swapping using
> a few instances of 'trinity -C64' (this creates 64 threads)
>
> I'm not sure how far back this bug goes, so I'll try some older kernels
> and see if I can bisect it, because we don't seem to be getting closer
> to figuring out what's actually happening..
Bisecting would indeed be good. But I get the feeling that you'll need
to go back a *long* time, because the swap_map[] code hasn't changed
in ages.
I'm adding Hugh Dickins to the cc just in case he hasn't seen this on
linux-mm, because the swap_map[] code is complex as hell, and Hugh did
touch some of it last. The whole swap_map[] thing is complicated by:
- it's a single byte per swap entry
- it's not even a *structured* byte, but a single counter that has
  several "fields" packed in by hand
- it has a count in the low 6 bits, with a magic "bad" value (which
  is also a magic "continuation" value if one of the high bits is set)
- it has two magic bits: HAS_CACHE and CONTINUED
- it has a _third_ magic value (SWAP_MAP_SHMEM) which is "CONTINUED+BAD"
- we increment this nasty pseudo-counter wildly hackily, and have
  magic special case checks for the odd cases
and if we get any of the special cases wrong, we'll
increment/decrement it wrong, and we're screwed.
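
To make the byte layout described above concrete, here is a minimal user-space sketch that decodes one swap_map[] byte. The constant values are as recalled from include/linux/swap.h of that era and should be treated as an assumption to check against your tree; struct swap_byte_view and decode_swap_byte() are hypothetical names invented for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Constants as recalled from include/linux/swap.h of that era (assumption). */
#define SWAP_MAP_MAX    0x3e  /* largest count that fits in the first byte */
#define SWAP_MAP_BAD    0x3f  /* magic "bad" value in the low 6 bits */
#define SWAP_HAS_CACHE  0x40  /* magic bit: page is in the swap cache */
#define COUNT_CONTINUED 0x80  /* magic bit: the count continues elsewhere */
#define SWAP_MAP_SHMEM  0xbf  /* the third magic value */

/* Compile-time check that the third magic value is "CONTINUED+BAD". */
static_assert(SWAP_MAP_SHMEM == (COUNT_CONTINUED | SWAP_MAP_BAD),
              "SWAP_MAP_SHMEM should equal COUNT_CONTINUED | SWAP_MAP_BAD");

/* Hypothetical decoded view of one byte; not a kernel structure. */
struct swap_byte_view {
    unsigned count;   /* only meaningful when neither bad nor shmem */
    bool has_cache;
    bool continued;
    bool bad;
    bool shmem;
};

static struct swap_byte_view decode_swap_byte(unsigned char ent)
{
    /* Strip the cache bit first; the magic values are compared without it. */
    unsigned char c = (unsigned char)(ent & ~SWAP_HAS_CACHE);
    struct swap_byte_view v = {0};

    v.has_cache = (ent & SWAP_HAS_CACHE) != 0;
    if (c == SWAP_MAP_SHMEM) {
        v.shmem = true;            /* must be tested before CONTINUED */
    } else if (c == SWAP_MAP_BAD) {
        v.bad = true;
    } else {
        v.continued = (c & COUNT_CONTINUED) != 0;
        v.count = c & 0x3f;        /* low 6 bits */
    }
    return v;
}
```

The order of the tests matters: a decoder that checks the CONTINUED bit before comparing against SWAP_MAP_SHMEM would misread a shmem entry as a continued count, which is the kind of special-case slip described above.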
The *locking* looks pretty simple, though. It's a simple spinlock. We
do some optimistic tests outside the spinlock, but the actual
allocation and modification seem to all be inside the lock and
re-check any optimistic values afaik.
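
The "optimistic test outside the lock, re-check inside" pattern can be sketched in user space as follows. This is not the kernel code: the spinlock is approximated with a C11 atomic_flag, and NSLOTS, slot_map, and claim_slot() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 8  /* hypothetical table size */

static atomic_flag map_lock = ATOMIC_FLAG_INIT;  /* stand-in for the spinlock */
static _Atomic unsigned char slot_map[NSLOTS];   /* 0 = free, nonzero = in use */

static void map_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&map_lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void map_lock_release(void)
{
    atomic_flag_clear_explicit(&map_lock, memory_order_release);
}

/* Try to claim slot i; returns true only if this caller took it. */
static bool claim_slot(size_t i)
{
    bool claimed = false;

    assert(i < NSLOTS);

    /* Optimistic test outside the lock: cheap, but possibly stale. */
    if (atomic_load_explicit(&slot_map[i], memory_order_relaxed) != 0)
        return false;

    map_lock_acquire();
    /* Re-check under the lock before modifying anything. */
    if (atomic_load_explicit(&slot_map[i], memory_order_relaxed) == 0) {
        atomic_store_explicit(&slot_map[i], 1, memory_order_relaxed);
        claimed = true;
    }
    map_lock_release();
    return claimed;
}
```

The unlocked read only ever causes an early bail-out; every modification happens after the value has been re-read with the lock held, so a stale optimistic read cannot corrupt the map.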
So I'm inclined to think that we are more likely to have
something wrong in the messy magical special cases. I'm wondering if
we should get rid of the continuation crap, for example, and expand
the "one byte per swap page" to two bytes instead.
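
As a rough illustration of what "two bytes instead" could buy, here is a sketch of a 16-bit entry with explicit fields and no continuation. The layout is purely an assumption made for this example (the message proposes only the size, not a format), and every SWAP2_* name and swap2_*() helper is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16-bit layout: two flag bits on top, 14-bit count below. */
#define SWAP2_HAS_CACHE  0x8000u
#define SWAP2_BAD        0x4000u
#define SWAP2_COUNT_MASK 0x3fffu

/* The three fields should cover all 16 bits without overlapping. */
static_assert((SWAP2_HAS_CACHE | SWAP2_BAD | SWAP2_COUNT_MASK) == 0xffffu,
              "fields must cover exactly 16 bits");
static_assert((SWAP2_HAS_CACHE & SWAP2_BAD) == 0 &&
              ((SWAP2_HAS_CACHE | SWAP2_BAD) & SWAP2_COUNT_MASK) == 0,
              "fields must not overlap");

static unsigned swap2_count(uint16_t ent)
{
    return ent & SWAP2_COUNT_MASK;
}

/* Take one more reference; fails on a bad entry or a saturated count. */
static bool swap2_dup(uint16_t *ent)
{
    if ((*ent & SWAP2_BAD) || swap2_count(*ent) == SWAP2_COUNT_MASK)
        return false;
    (*ent)++;  /* cannot carry into the flag bits: count < mask here */
    return true;
}

/* Drop one reference; fails if the count is already zero. */
static bool swap2_free(uint16_t *ent)
{
    if (swap2_count(*ent) == 0)
        return false;  /* analogous to the "Unused swap offset" report */
    (*ent)--;
    return true;
}
```

With 14 bits the count reaches 16383 before saturating, compared with 62 (0x3e) in the one-byte format, at the cost of doubling the memory used per swap page. Whether that range would be enough to drop continuation entirely is not something this sketch establishes.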
Hugh, I think you know this code best, because you added the last
special case (that SWAP_MAP_SHMEM value). Comments?
Linus