From: Miquel van Smoorenburg <miquels@cistron.nl>
To: linux-mm@kvack.org
Subject: pages not marked as accessed on non-page boundaries
Date: Sun, 5 Dec 2004 15:13:43 +0100
Message-ID: <20041205141342.GA29174@cistron.nl>
(Not sure if this should go to linux-kernel or linux-mm,
I'll try the latter first)
In the current kernel (I used 2.6.9), pages read into memory
through read() are only marked as accessed if the read
started at offset 0 of the page.
When a database randomly accesses small amounts of data
in an index file, most of those pages will not be marked
as accessed and will be evicted too soon.
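To make the effect concrete, here is a small userspace model of the
current rule. It is only a sketch, not kernel code: MODEL_PAGE_SIZE and
the function names are made up for illustration, 4 KiB pages are assumed,
and each read() is assumed to be small enough to stay inside one page.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel uses PAGE_CACHE_SIZE here. */
#define MODEL_PAGE_SIZE 4096UL

/* Model of the 2.6.9 test: the page is marked accessed only when the
 * read starts at offset 0 within the page. */
static int old_rule_marks(unsigned long file_pos)
{
    unsigned long offset = file_pos % MODEL_PAGE_SIZE;

    return offset == 0;
}

/* Count how many of the given read() start positions would lead to a
 * mark_page_accessed() call under the old rule. */
size_t count_marked_old(const unsigned long *positions, size_t n)
{
    size_t marked = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (old_rule_marks(positions[i]))
            marked++;
    return marked;
}
```

For reads starting at file positions 100, 4096, 5000, 8704 and 123456,
only the one at 4096 counts; the other four pages are never marked.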
I noticed this when I was writing a patch for something else:
posix_fadvise(LINUX_FADV_STICKY) support, with which you can
ask the kernel to try to keep the pages of a file in core
a bit more aggressively than normal. I'll probably post that later.
Would it be a good thing to fix this? Patch is below.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
PATCH: mark_page_accessed() for read()s on non-page boundaries
When reading a (partial) page from disk using read(), the kernel only
marks the page as "accessed" if the read started at a page boundary.
This means that files that are accessed randomly at non-page boundaries
(usually database style files) will not be cached properly.
The patch below uses the readahead state instead. If a page is read(),
it is marked as "accessed" if the previous read() was for a different
page, whatever the offset in the page.
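The rule described above can be modelled in userspace roughly as follows.
This is a simplified sketch, not the patch itself: struct ra_model and the
function names are hypothetical, 4 KiB pages are assumed, each read() is
assumed to stay inside one page, and the cases where the patch resets
prev_page (no readahead state yet, page not cached, page not up to date)
are left out.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL            /* assumption: 4 KiB pages */
#define NO_PREV_PAGE ((unsigned long)-1)  /* same sentinel the patch uses */

/* Hypothetical stand-in for the per-file readahead state; only the
 * field the patch consults is modelled. */
struct ra_model {
    unsigned long prev_page;
};

void ra_model_init(struct ra_model *ra)
{
    ra->prev_page = NO_PREV_PAGE;
}

/* Returns 1 if this read would mark the page accessed under the new
 * rule: the previous read was for a different page. */
int new_rule_marks(struct ra_model *ra, unsigned long file_pos)
{
    unsigned long index = file_pos / MODEL_PAGE_SIZE;
    int mark = (ra->prev_page != index);

    /* Assumed here: the readahead code records the page just read. */
    ra->prev_page = index;
    return mark;
}

/* Count the marks for a sequence of read() start positions. */
size_t count_marked_new(const unsigned long *positions, size_t n)
{
    struct ra_model ra;
    size_t marked = 0;
    size_t i;

    ra_model_init(&ra);
    for (i = 0; i < n; i++)
        marked += (size_t)new_rule_marks(&ra, positions[i]);
    return marked;
}
```

For reads starting at positions 100, 200, 300, 5000, 5100 and 100 the
model marks three times: the first read of page 0, the first read of
page 1, and the return to page 0. The repeated sub-page reads in between
do not count again.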
Signed-off-by: Miquel van Smoorenburg <miquels@cistron.nl>
diff --exclude-from=exclude -ruN linux-2.6.9-rc4-tw.ORIG/mm/filemap.c linux-2.6.9-rc4-tw/mm/filemap.c
--- linux-2.6.9-rc4.ORIG/mm/filemap.c 2004-10-23 22:21:18.000000000 +0200
+++ linux-2.6.9-rc4/mm/filemap.c 2004-10-25 12:58:26.000000000 +0200
@@ -718,6 +718,7 @@
{
struct inode *inode = mapping->host;
unsigned long index, end_index, offset;
+ unsigned long prev_page;
loff_t isize;
struct page *cached_page;
int error;
@@ -748,6 +749,8 @@
}
nr = nr - offset;
+ prev_page = ra.next_size ? ra.prev_page : (unsigned long)-1;
+
cond_resched();
page_cache_readahead(mapping, &ra, filp, index);
@@ -755,10 +758,13 @@
page = find_get_page(mapping, index);
if (unlikely(page == NULL)) {
handle_ra_miss(mapping, &ra, index);
+ prev_page = (unsigned long)-1;
goto no_cached_page;
}
- if (!PageUptodate(page))
+ if (!PageUptodate(page)) {
+ prev_page = (unsigned long)-1;
goto page_not_up_to_date;
+ }
page_ok:
/* If users can be writing to this page using arbitrary
@@ -769,9 +775,10 @@
flush_dcache_page(page);
/*
- * Mark the page accessed if we read the beginning.
+ * Mark the page accessed only if this was the initial
+ * read, not for subsequential sub-page sized reads.
*/
- if (!offset)
+ if (prev_page != index)
mark_page_accessed(page);
/*