From: Hugh Dickins <hugh@veritas.com>
To: Rik van Riel <riel@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, kernel-testers@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: 2.6.26-rc5-mm2 (swap_state.c:77)
Date: Thu, 12 Jun 2008 22:15:54 +0100 (BST)
Message-ID: <Pine.LNX.4.64.0806122131330.10415@blonde.site>
In-Reply-To: <20080612152905.6cb294ae@cuia.bos.redhat.com>

On Thu, 12 Jun 2008, Rik van Riel wrote:
> On Thu, 12 Jun 2008 09:58:38 +1000
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> > > Does loopback over tmpfs use a different allocation path?
> >
> > I'm sorry, hmm I didn't look closely enough and forgot that
> > write_begin/write_end requires the callee to allocate the page
> > as well, and that Hugh had nicely unified most of that.
> >
> > So maybe it's not that. It's pretty easy to hit I found with
> > ext2 mounted over loopback on a tmpfs file.

The loop-on-tmpfs write side is okay nowadays, but the read side
still has to use shmem_readpage, with the page passed in from splice.

> Turns out the loopback driver uses splice, which moves
> the pages from one place to another. This is why you
> were seeing the problem with loopback, but not with
> just a really big file on tmpfs.
>
> I'm trying to make sense of all the splice code now
> and will send fix as soon as I know how to fix this
> problem in a nice way.

There's no need to make sense of all the splice code: it's just
that it does add_to_page_cache_lru on a page not marked as
SwapBacked, whereas shmem and swap_state consistency rely on the page
having been marked as SwapBacked. Normally, yes, shmem_getpage
is the one that allocates the page, but in this case it's already
been done outside, awkward (and long predates loop's use of splice).
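
To make the ordering problem concrete, here is a toy userspace model in C.
It is not kernel code: the function names page_is_file_cache and
add_to_page_cache_lru mirror the kernel's, but the struct layout and the
function bodies are simplified assumptions, and shmem_allocates and
splice_allocates are made-up names for the two paths described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code: types and bodies are simplified
 * assumptions chosen only to illustrate the ordering problem. */
enum lru_kind { LRU_NONE, LRU_ANON, LRU_FILE };

struct page {
	bool swap_backed;	/* models the SwapBacked page flag */
	enum lru_kind lru;	/* which LRU the page was launched towards */
};

/* Models page_is_file_cache(): anything not swap backed counts as file. */
static bool page_is_file_cache(const struct page *page)
{
	return !page->swap_backed;
}

/* Models the unpatched add_to_page_cache_lru(): the LRU is chosen from
 * whatever the flag says at insertion time. */
static void add_to_page_cache_lru(struct page *page)
{
	page->lru = page_is_file_cache(page) ? LRU_FILE : LRU_ANON;
}

/* Hypothetical path 1: shmem allocates the page and marks it first. */
static enum lru_kind shmem_allocates(void)
{
	struct page page = { .swap_backed = true, .lru = LRU_NONE };

	add_to_page_cache_lru(&page);
	return page.lru;
}

/* Hypothetical path 2: splice allocates and inserts the page before
 * shmem ever sees it, so the flag is still clear. */
static enum lru_kind splice_allocates(void)
{
	struct page page = { .swap_backed = false, .lru = LRU_NONE };

	add_to_page_cache_lru(&page);
	return page.lru;
}
```

In this model shmem_allocates() returns LRU_ANON while splice_allocates()
returns LRU_FILE, which is the kind of mismatch that shmem and swap_state
then trip over.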

It's remarkably hard to correct the LRU of a page once it's been
launched towards one. Is it still on this cpu's pagevec? Have we
been preempted, so that it's now on another cpu's pagevec? If it's
reached the LRU, has vmscan whisked it off for a moment, even though
it's PageLocked? Until now the LRUs have been self-correcting,
but these patches move away from that.

I don't know how to fix this problem in a nice way. For the moment,
to proceed with testing, I'm using the hack below. But perhaps that
screws things up for the other !mapping_cap_account_dirty filesystems,
e.g. ramfs: I just haven't tried them yet, nor shall I in the next
couple of days.

It could be turned into a proper bdi check of its own, instead of
parasiting off cap_account_dirty. But I'm not yet convinced by any
of the PageSwapBacked stuff, so for now I prefer a quick hack
to a grand scheme.
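
For illustration only, a dedicated bdi check might look roughly like the
userspace sketch below. BDI_CAP_NO_ACCT_DIRTY mirrors an existing kernel
flag name, but BDI_CAP_SWAP_BACKED_SKETCH, mapping_cap_swap_backed_sketch,
the struct layouts and the three example mappings are all assumptions
invented for this sketch, not something proposed in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model. The second flag and every *_sketch name are
 * hypothetical; the structs are cut down to the one field needed. */
#define BDI_CAP_NO_ACCT_DIRTY		0x1u
#define BDI_CAP_SWAP_BACKED_SKETCH	0x2u

struct backing_dev_info { unsigned int capabilities; };
struct address_space { const struct backing_dev_info *bdi; };

static bool mapping_cap_account_dirty(const struct address_space *mapping)
{
	return !(mapping->bdi->capabilities & BDI_CAP_NO_ACCT_DIRTY);
}

/* Hypothetical dedicated check: asks the question directly. */
static bool mapping_cap_swap_backed_sketch(const struct address_space *mapping)
{
	return (mapping->bdi->capabilities & BDI_CAP_SWAP_BACKED_SKETCH) != 0;
}

/* Assumed capability sets for three kinds of filesystem. */
static const struct backing_dev_info tmpfs_bdi = {
	.capabilities = BDI_CAP_NO_ACCT_DIRTY | BDI_CAP_SWAP_BACKED_SKETCH,
};
static const struct backing_dev_info ramfs_bdi = {
	.capabilities = BDI_CAP_NO_ACCT_DIRTY,
};
static const struct backing_dev_info disk_bdi = { .capabilities = 0 };

static const struct address_space tmpfs_mapping = { .bdi = &tmpfs_bdi };
static const struct address_space ramfs_mapping = { .bdi = &ramfs_bdi };
static const struct address_space disk_mapping = { .bdi = &disk_bdi };

/* The quick hack: piggybacks on the dirty-accounting capability. */
static bool hack_marks_swap_backed(const struct address_space *mapping)
{
	return !mapping_cap_account_dirty(mapping);
}

/* The sketch: uses the dedicated capability bit instead. */
static bool sketch_marks_swap_backed(const struct address_space *mapping)
{
	return mapping_cap_swap_backed_sketch(mapping);
}
```

Under these assumptions the hack marks both the tmpfs and ramfs mappings,
whereas the dedicated bit marks only tmpfs, which is the ramfs worry
mentioned above.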

It's not clear to me why tmpfs file pages should be counted as anon
pages rather than file pages; though it is clear that switching their
LRU midstream, when swizzled to swap, can have implementation problems.

I don't really get why SwapBacked is the important consideration:
I can see that you may want different balancing for pages mapped
into userspace from pages just cached in kernel; but SwapBacked?

Am I right to think that the memcontrol stuff is now all broken,
because memcontrol.c hasn't yet been converted to the additional LRUs?
Certainly I'm now hanging when trying to run in a restricted memcg.

An unrelated fix for a compiler warning and silly /proc/meminfo numbers is
below too, that one raises fewer questions!
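
As a rough illustration of why indexing the LRU snapshot with NR_MLOCK
gives nonsense, here is a toy userspace model of two separate index
spaces. The enum values, the vm_stat array, snapshot_lru, mlocked_pages
and set_demo_stats are assumptions made up for this sketch; only the
identifiers that also appear in the patch are taken from it.

```c
#include <assert.h>

/* Toy model: LRU lists and zone statistics are numbered independently.
 * The layout below is invented for illustration. */
enum lru_list {
	LRU_INACTIVE_ANON, LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE, LRU_ACTIVE_FILE,
	LRU_UNEVICTABLE, NR_LRU_LISTS
};

enum zone_stat_item {
	NR_FREE_PAGES,
	NR_LRU_BASE,				/* first per-LRU counter */
	NR_MLOCK = NR_LRU_BASE + NR_LRU_LISTS,	/* comes after the LRU counters */
	NR_VM_ZONE_STAT_ITEMS
};

static unsigned long vm_stat[NR_VM_ZONE_STAT_ITEMS];

/* Models global_page_state(): lookup by zone statistic index. */
static unsigned long global_page_state(enum zone_stat_item item)
{
	return vm_stat[item];
}

/* Copies only the per-LRU counters, so pages[] is indexed by enum
 * lru_list and has just NR_LRU_LISTS entries. */
static void snapshot_lru(unsigned long pages[NR_LRU_LISTS])
{
	for (int lru = 0; lru < NR_LRU_LISTS; lru++)
		pages[lru] = global_page_state(NR_LRU_BASE + lru);
}

/* The corrected lookup: NR_MLOCK is a zone statistic, so it has to go
 * through global_page_state() rather than the LRU snapshot. */
static unsigned long mlocked_pages(void)
{
	return global_page_state(NR_MLOCK);
}

/* Fills in made-up numbers so the two lookups can be told apart. */
static void set_demo_stats(void)
{
	vm_stat[NR_LRU_BASE + LRU_UNEVICTABLE] = 3;
	vm_stat[NR_MLOCK] = 7;
}
```

In this model NR_MLOCK is not a valid index into pages[] at all, so
pages[NR_MLOCK] would read past the end of the snapshot, while
mlocked_pages() returns the intended counter.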
Hugh

--- 2.6.26-rc5-mm3/mm/filemap.c 2008-06-12 11:03:35.000000000 +0100
+++ linux/mm/filemap.c 2008-06-12 21:28:43.000000000 +0100
@@ -496,6 +496,8 @@ int add_to_page_cache_lru(struct page *p
 {
 	int ret = add_to_page_cache(page, mapping, offset, gfp_mask);
 	if (ret == 0) {
+		if (!mapping_cap_account_dirty(mapping))
+			SetPageSwapBacked(page);
 		if (page_is_file_cache(page))
 			lru_cache_add_file(page);
 		else
--- 2.6.26-rc5-mm3/fs/proc/proc_misc.c 2008-06-12 11:03:28.000000000 +0100
+++ linux/fs/proc/proc_misc.c 2008-06-12 16:58:34.000000000 +0100
@@ -216,7 +216,7 @@ static int meminfo_read_proc(char *page,
 		K(pages[LRU_INACTIVE_FILE]),
 #ifdef CONFIG_UNEVICTABLE_LRU
 		K(pages[LRU_UNEVICTABLE]),
-		K(pages[NR_MLOCK]),
+		K(global_page_state(NR_MLOCK)),
 #endif
 #ifdef CONFIG_HIGHMEM
 		K(i.totalhigh),