From: Adam Litke <agl@us.ibm.com>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Hugetlb: Shared memory race
Date: Tue, 10 Jan 2006 13:22:31 -0600
Message-ID: <1136920951.23288.5.camel@localhost.localdomain>
I have discovered a race caused by the interaction of demand faulting
with the hugetlb overcommit accounting patch. Attached is a workaround
for the problem. Can anyone suggest a better approach to solving the
race I'll describe below? If not, would the attached workaround be
acceptable?
The race occurs when multiple threads shmat a hugetlb area and begin
faulting in its pages. During a hugetlb fault, hugetlb_no_page checks
for the page in the page cache. If not found, it allocates (and zeroes)
a new page and tries to add it to the page cache. If this fails, the
huge page is freed and we retry the page cache lookup (assuming someone
else beat us to the add_to_page_cache call).
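The lookup/allocate/insert/retry sequence described above can be sketched in
userspace C. This is only an illustration of the control flow, not the kernel
code: the pool, the array standing in for the page cache, and every name
below (fault_sketch, page_alloc, cache_add, race_hook, racer_wins) are
hypothetical. The hook marks the point where the real code zeroes the page,
so a competing insert can be simulated there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_PAGES 4	/* hypothetical pool size */
#define FILE_PAGES 4	/* hypothetical number of offsets in the segment */

struct huge_page { int id; };

static struct huge_page pool[POOL_PAGES];
static struct huge_page *free_list[POOL_PAGES];
static int free_count;				/* stand-in for the free page counter */
static struct huge_page *cache[FILE_PAGES];	/* stand-in for the page cache */

/* Optional hook run inside the "zeroing" window to simulate another thread. */
static void (*race_hook)(size_t idx);

static void pool_init(void)
{
	for (int i = 0; i < POOL_PAGES; i++) {
		pool[i].id = i;
		free_list[i] = &pool[i];
	}
	for (int i = 0; i < FILE_PAGES; i++)
		cache[i] = NULL;
	free_count = POOL_PAGES;
	race_hook = NULL;
}

/* Take a page from the pool; the free counter drops immediately. */
static struct huge_page *page_alloc(void)
{
	return free_count ? free_list[--free_count] : NULL;
}

/* Return a page to the pool. */
static void page_free(struct huge_page *page)
{
	assert(free_count < POOL_PAGES);
	free_list[free_count++] = page;
}

static struct huge_page *cache_find(size_t idx)
{
	return cache[idx];
}

/* Insert fails if some other page already occupies this offset. */
static bool cache_add(size_t idx, struct huge_page *page)
{
	if (cache[idx])
		return false;
	cache[idx] = page;
	return true;
}

/* Simulated competitor: allocates its own page and inserts it first. */
static void racer_wins(size_t idx)
{
	struct huge_page *page = page_alloc();

	if (page && !cache_add(idx, page))
		page_free(page);
}

/* Fault path: look up, else allocate and insert, else free and retry. */
static struct huge_page *fault_sketch(size_t idx)
{
	for (;;) {
		struct huge_page *page = cache_find(idx);

		if (page)
			return page;
		page = page_alloc();
		if (!page)
			return NULL;
		if (race_hook)		/* window where the page would be zeroed */
			race_hook(idx);
		if (cache_add(idx, page))
			return page;
		page_free(page);	/* lost the race: give the page back, retry */
	}
}
```

With race_hook set to racer_wins, a call to fault_sketch() briefly holds two
pages for one offset, then frees its own copy and returns the competitor's
page from the retried lookup.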
The above works fine, but due to the large window (while zeroing the
huge page) it is possible that many threads could be "borrowing" pages
only to return them later. This causes free_hugetlb_pages to be lower
than the logical number of free pages and some threads trying to shmat
can falsely fail the accounting check.
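To make the arithmetic concrete, here is a small userspace C sketch of the
effect on the counter. It is a model under stated assumptions, not the kernel
accounting code: struct pool_sketch, its in_flight field, and the borrow_*
and attach_check helpers are all hypothetical, and the check is assumed to
compare the request against the raw free counter only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the pool counters during concurrent faults. */
struct pool_sketch {
	long free_pages;	/* raw counter, decremented at allocation time */
	long in_flight;		/* pages allocated but not yet inserted or returned */
};

/* A faulting thread takes a page and starts zeroing it. */
static void borrow_begin(struct pool_sketch *p)
{
	assert(p->free_pages > 0);
	p->free_pages--;
	p->in_flight++;
}

/* The thread that wins the page cache insert keeps its page. */
static void borrow_win(struct pool_sketch *p)
{
	assert(p->in_flight > 0);
	p->in_flight--;
}

/* A thread that loses the insert returns its page to the pool. */
static void borrow_lose(struct pool_sketch *p)
{
	assert(p->in_flight > 0);
	p->in_flight--;
	p->free_pages++;
}

/* Assumed form of the attach-time check: request versus raw counter. */
static bool attach_check(const struct pool_sketch *p, long needed)
{
	return needed <= p->free_pages;
}
```

For example, with 8 free pages and 4 threads racing on the same offset, the
raw counter reads 4 during the zeroing window even though only one of those
pages will stay in use. A request for 6 pages fails attach_check() at that
moment, yet passes once the three losing threads have returned their pages
and the counter is back to 7.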
The workaround disables the accounting check that happens at shmat time.
The same check is already done at shmget time (which is the normal
semantics anyway).
Signed-off-by: Adam Litke <agl@us.ibm.com>
inode.c | 10 ++++++++++
1 files changed, 10 insertions(+)
diff -upN reference/fs/hugetlbfs/inode.c current/fs/hugetlbfs/inode.c
--- reference/fs/hugetlbfs/inode.c
+++ current/fs/hugetlbfs/inode.c
@@ -74,6 +74,14 @@ huge_pages_needed(struct address_space *
 	pgoff_t next = vma->vm_pgoff;
 	pgoff_t endpg = next + ((end - start) >> PAGE_SHIFT);
 
+	/*
+	 * Accounting for shared memory segments is done at shmget time
+	 * so we can skip the check now to avoid a race where hugetlb_no_page
+	 * is zeroing hugetlb pages not yet in the page cache.
+	 */
+	if (vma->vm_file->f_dentry->d_inode->i_blocks != 0)
+		return 0;
+
 	pagevec_init(&pvec, 0);
 	while (next < endpg) {
 		if (!pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE))
@@ -832,6 +840,8 @@ struct file *hugetlb_zero_setup(size_t s
 	d_instantiate(dentry, inode);
 	inode->i_size = size;
+	/* Mark this file is used for shared memory */
+	inode->i_blocks = 1;
 	inode->i_nlink = 0;
 	file->f_vfsmnt = mntget(hugetlbfs_vfsmount);
 	file->f_dentry = dentry;
--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center