* [RFC:PATCH 001/012] Make iommu_map_sg deal with less-than-page-aligned data
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
@ 2007-05-24 12:11 ` Dave Kleikamp
2007-05-24 12:11 ` [RFC:PATCH 002/012] Allow file systems to specify whether to store file tails Dave Kleikamp
` (11 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:11 UTC (permalink / raw)
To: linux-mm
Make iommu_map_sg deal with less-than-page-aligned data
The current code assumes that page_address() is page-aligned (or at
least IOMMU_PAGE-aligned), so it takes the sub-page offset from s->offset
alone.
Using vaddr, which already includes page_address(), is more accurate, and
saves a pointer dereference as well.
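To illustrate the difference, here is a minimal userspace sketch (not kernel code). It assumes a 4K IOMMU page and that vaddr is the buffer's page address plus the scatterlist offset. The SK_-prefixed constants and both helper functions are hypothetical stand-ins, not names from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's IOMMU page constants (4K assumed). */
#define SK_IOMMU_PAGE_SHIFT 12
#define SK_IOMMU_PAGE_SIZE  ((uint64_t)1 << SK_IOMMU_PAGE_SHIFT)
#define SK_IOMMU_PAGE_MASK  (~(SK_IOMMU_PAGE_SIZE - 1))

/* Old behaviour: only the scatterlist offset supplies the low bits. */
static uint64_t sk_dma_addr_from_offset(uint64_t entry, uint64_t sg_offset)
{
	return (entry << SK_IOMMU_PAGE_SHIFT) |
	       (sg_offset & ~SK_IOMMU_PAGE_MASK);
}

/* New behaviour: the low bits come from the full virtual address. */
static uint64_t sk_dma_addr_from_vaddr(uint64_t entry, uint64_t page_addr,
				       uint64_t sg_offset)
{
	uint64_t vaddr = page_addr + sg_offset;

	return (entry << SK_IOMMU_PAGE_SHIFT) |
	       (vaddr & ~SK_IOMMU_PAGE_MASK);
}
```

When page_addr is IOMMU-page-aligned the two helpers agree. They differ only when the buffer itself starts part-way into an IOMMU page, which is the case a file tail can create.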
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
arch/powerpc/kernel/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -Nurp linux000/arch/powerpc/kernel/iommu.c linux001/arch/powerpc/kernel/iommu.c
--- linux000/arch/powerpc/kernel/iommu.c 2007-05-21 15:14:49.000000000 -0500
+++ linux001/arch/powerpc/kernel/iommu.c 2007-05-23 22:53:11.000000000 -0500
@@ -325,7 +325,7 @@ int iommu_map_sg(struct iommu_table *tbl
/* Convert entry to a dma_addr_t */
entry += tbl->it_offset;
dma_addr = entry << IOMMU_PAGE_SHIFT;
- dma_addr |= (s->offset & ~IOMMU_PAGE_MASK);
+ dma_addr |= (vaddr & ~IOMMU_PAGE_MASK);
DBG(" - %lu pages, entry: %lx, dma_addr: %lx\n",
npages, entry, dma_addr);
--
David Kleikamp
IBM Linux Technology Center
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* [RFC:PATCH 002/012] Allow file systems to specify whether to store file tails
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
2007-05-24 12:11 ` [RFC:PATCH 001/012] Make iommu_map_sg deal with less-than-page-aligned data Dave Kleikamp
@ 2007-05-24 12:11 ` Dave Kleikamp
2007-05-24 12:11 ` [RFC:PATCH 003/012] Add tail to address space and define PG_pagetail page flag Dave Kleikamp
` (10 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:11 UTC (permalink / raw)
To: linux-mm
Allow file systems to specify whether to enable page cache tails
This allows us to test and enable each file system independently. It also
gives the file system the flexibility to have a mount flag enable or disable
page cache tails.
Initially, I am only testing on ext4 & jfs, so as not to damage my root file
system (ext3).
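As a rough illustration of the per-superblock switch, the following userspace sketch models the flag test and a possible mount-option toggle. The flag value is the one this patch defines; the struct and function names are hypothetical, and the toggle is only an assumption about how a future mount option might behave.

```c
#include <assert.h>

/* Value taken from the patch; the struct and helper names are hypothetical. */
#define SK_MS_FILE_TAIL (1UL << 22)

struct sk_super_block {
	unsigned long s_flags;
};

struct sk_inode {
	struct sk_super_block *i_sb;
};

/* Set or clear the flag, as a future mount option might do. */
static void sk_set_file_tail(struct sk_super_block *sb, int enable)
{
	if (enable)
		sb->s_flags |= SK_MS_FILE_TAIL;
	else
		sb->s_flags &= ~SK_MS_FILE_TAIL;
}

/* Mirrors the intent of IS_FILE_TAIL_CAPABLE(): test the superblock flag. */
static int sk_is_file_tail_capable(const struct sk_inode *inode)
{
	return (inode->i_sb->s_flags & SK_MS_FILE_TAIL) != 0;
}
```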
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
fs/ext4/super.c | 4 ++++
fs/jfs/super.c | 3 +++
include/linux/fs.h | 2 ++
3 files changed, 9 insertions(+)
diff -Nurp linux001/fs/ext4/super.c linux002/fs/ext4/super.c
--- linux001/fs/ext4/super.c 2007-05-21 15:15:33.000000000 -0500
+++ linux002/fs/ext4/super.c 2007-05-23 22:53:11.000000000 -0500
@@ -1548,6 +1548,10 @@ static int ext4_fill_super (struct super
sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
((sbi->s_mount_opt & EXT4_MOUNT_POSIX_ACL) ? MS_POSIXACL : 0);
+#ifdef CONFIG_VM_FILE_TAILS
+ /* ToDo: Make this a mount option */
+ sb->s_flags |= MS_FILE_TAIL;
+#endif
if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV &&
(EXT4_HAS_COMPAT_FEATURE(sb, ~0U) ||
diff -Nurp linux001/fs/jfs/super.c linux002/fs/jfs/super.c
--- linux001/fs/jfs/super.c 2007-05-21 15:15:34.000000000 -0500
+++ linux002/fs/jfs/super.c 2007-05-23 22:53:11.000000000 -0500
@@ -439,6 +439,9 @@ static int jfs_fill_super(struct super_b
#ifdef CONFIG_JFS_POSIX_ACL
sb->s_flags |= MS_POSIXACL;
#endif
+#ifdef CONFIG_VM_FILE_TAILS
+ sb->s_flags |= MS_FILE_TAIL;
+#endif
if (newLVSize) {
printk(KERN_ERR "resize option for remount only\n");
diff -Nurp linux001/include/linux/fs.h linux002/include/linux/fs.h
--- linux001/include/linux/fs.h 2007-05-21 15:15:43.000000000 -0500
+++ linux002/include/linux/fs.h 2007-05-23 22:53:11.000000000 -0500
@@ -123,6 +123,7 @@ extern int dir_notify_enable;
#define MS_SLAVE (1<<19) /* change to slave */
#define MS_SHARED (1<<20) /* change to shared */
#define MS_RELATIME (1<<21) /* Update atime relative to mtime/ctime. */
+#define MS_FILE_TAIL (1<<22) /* Store file tail efficiently in page cache */
#define MS_ACTIVE (1<<30)
#define MS_NOUSER (1<<31)
@@ -182,6 +183,7 @@ extern int dir_notify_enable;
#define IS_NOCMTIME(inode) ((inode)->i_flags & S_NOCMTIME)
#define IS_SWAPFILE(inode) ((inode)->i_flags & S_SWAPFILE)
#define IS_PRIVATE(inode) ((inode)->i_flags & S_PRIVATE)
+#define IS_FILE_TAIL_CAPABLE(inode) __IS_FLG(inode, MS_FILE_TAIL)
/* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */
--
David Kleikamp
IBM Linux Technology Center
* [RFC:PATCH 003/012] Add tail to address space and define PG_pagetail page flag
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
2007-05-24 12:11 ` [RFC:PATCH 001/012] Make iommu_map_sg deal with less-than-page-aligned data Dave Kleikamp
2007-05-24 12:11 ` [RFC:PATCH 002/012] Allow file systems to specify whether to store file tails Dave Kleikamp
@ 2007-05-24 12:11 ` Dave Kleikamp
2007-05-24 12:11 ` [RFC:PATCH 004/012] Replace PAGE_CACHE_SIZE with page_data_size() Dave Kleikamp
` (9 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:11 UTC (permalink / raw)
To: linux-mm
Add tail to address space and define PG_filetail page flag
The tail pointer in struct address_space needs to be block-aligned so that
i/o can be performed directly to/from the buffer. The allocated buffer may
not be aligned properly, so the pointer is stored in tail_buf in order to
be freed properly.
Note: Changing from slab to slub should ensure that the allocated buffer
will be properly aligned, so only one pointer will be needed.
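The two-pointer arrangement can be sketched in userspace as follows. This is only an illustration under stated assumptions: malloc() stands in for the kernel allocator, block_size is assumed to be a power of two, and the sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct sk_tail {
	void *tail;      /* block-aligned pointer used for I/O */
	void *tail_buf;  /* pointer returned by the allocator, used for freeing */
};

/* block_size must be a power of two. Returns 0 on success, -1 on failure. */
static int sk_tail_alloc(struct sk_tail *t, size_t size, size_t block_size)
{
	uintptr_t addr;

	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

	/* Over-allocate so an aligned region of 'size' bytes always fits. */
	t->tail_buf = malloc(size + block_size - 1);
	if (!t->tail_buf) {
		t->tail = NULL;
		return -1;
	}

	/* Round the start address up to the next block boundary. */
	addr = ((uintptr_t)t->tail_buf + block_size - 1) &
	       ~((uintptr_t)block_size - 1);
	t->tail = (void *)addr;
	return 0;
}

/* Free through the original pointer, never through the aligned one. */
static void sk_tail_free(struct sk_tail *t)
{
	free(t->tail_buf);
	t->tail = t->tail_buf = NULL;
}
```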
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
arch/powerpc/Kconfig | 9 +++++++++
include/linux/fs.h | 4 ++++
include/linux/page-flags.h | 9 +++++++++
3 files changed, 22 insertions(+)
diff -Nurp linux002/arch/powerpc/Kconfig linux003/arch/powerpc/Kconfig
--- linux002/arch/powerpc/Kconfig 2007-05-21 15:14:48.000000000 -0500
+++ linux003/arch/powerpc/Kconfig 2007-05-23 22:53:11.000000000 -0500
@@ -552,6 +552,15 @@ config PPC_64K_PAGES
while on hardware with such support, it will be used to map
normal application pages.
+config VM_FILE_TAILS
+ bool "Store file tails in slab cache"
+ depends on PPC_64K_PAGES
+ help
+ If the data at the end of a file, or the entire file, is small,
+ the kernel will attempt to store that data in the slab cache,
+ rather than allocate an entire page in the page cache.
+ If unsure, say N here.
+
config SCHED_SMT
bool "SMT (Hyperthreading) scheduler support"
depends on PPC64 && SMP
diff -Nurp linux002/include/linux/fs.h linux003/include/linux/fs.h
--- linux002/include/linux/fs.h 2007-05-23 22:53:11.000000000 -0500
+++ linux003/include/linux/fs.h 2007-05-23 22:53:11.000000000 -0500
@@ -452,6 +452,10 @@ struct address_space {
spinlock_t private_lock; /* for use by the address_space */
struct list_head private_list; /* ditto */
struct address_space *assoc_mapping; /* ditto */
+#ifdef CONFIG_VM_FILE_TAILS
+ void *tail; /* block-aligned, slab-packed file tail */
+ void *tail_buf; /* unaligned buffer holding tail */
+#endif
} __attribute__((aligned(sizeof(long))));
/*
* On most architectures that alignment is already the case; but
diff -Nurp linux002/include/linux/page-flags.h linux003/include/linux/page-flags.h
--- linux002/include/linux/page-flags.h 2007-05-21 15:15:44.000000000 -0500
+++ linux003/include/linux/page-flags.h 2007-05-23 22:53:11.000000000 -0500
@@ -101,6 +101,7 @@
* 64 bit | FIELDS | ?????? FLAGS |
* 63 32 0
*/
+#define PG_filetail 30 /* Pseudo-page representing tail */
#define PG_uncached 31 /* Page has been mapped as uncached */
#endif
@@ -270,6 +271,14 @@ static inline void __ClearPageTail(struc
#define SetPageUncached(page) set_bit(PG_uncached, &(page)->flags)
#define ClearPageUncached(page) clear_bit(PG_uncached, &(page)->flags)
+#ifdef CONFIG_VM_FILE_TAILS
+#define PageFileTail(page) test_bit(PG_filetail, &(page)->flags)
+#else
+#define PageFileTail(page) (0)
+#endif
+#define SetPageFileTail(page) set_bit(PG_filetail, &(page)->flags)
+#define ClearPageFileTail(page) clear_bit(PG_filetail, &(page)->flags)
+
struct page; /* forward declaration */
extern void cancel_dirty_page(struct page *page, unsigned int account_size);
--
David Kleikamp
IBM Linux Technology Center
* [RFC:PATCH 004/012] Replace PAGE_CACHE_SIZE with page_data_size()
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (2 preceding siblings ...)
2007-05-24 12:11 ` [RFC:PATCH 003/012] Add tail to address space and define PG_pagetail page flag Dave Kleikamp
@ 2007-05-24 12:11 ` Dave Kleikamp
2007-05-24 12:11 ` [RFC:PATCH 005/012] Base file tail function Dave Kleikamp
` (8 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:11 UTC (permalink / raw)
To: linux-mm
Replace PAGE_CACHE_SIZE with page_data_size()
Code that zeroes an entire page needs to be aware that tail pages may not
be PAGE_CACHE_SIZE bytes long.
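The size arithmetic can be worked through with a small userspace sketch. It assumes the tail size is the file size modulo the page size, rounded up to the file system block size; the sk_ function names and the explicit page_shift/blkbits parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the page that holds the end of the file. */
static uint64_t sk_file_tail_index(uint64_t i_size, unsigned page_shift)
{
	return i_size >> page_shift;
}

/* Bytes in the last partial page, rounded up to the block size. */
static uint64_t sk_file_tail_buf_size(uint64_t i_size, unsigned page_shift,
				      unsigned blkbits)
{
	uint64_t page_size = (uint64_t)1 << page_shift;
	uint64_t block_size = (uint64_t)1 << blkbits;
	uint64_t tail_bytes = i_size & (page_size - 1);

	return (tail_bytes + block_size - 1) & ~(block_size - 1);
}

/* A tail "page" holds only the rounded tail; any other page is full-sized. */
static uint64_t sk_page_data_size(int is_file_tail, uint64_t i_size,
				  unsigned page_shift, unsigned blkbits)
{
	if (is_file_tail)
		return sk_file_tail_buf_size(i_size, page_shift, blkbits);
	return (uint64_t)1 << page_shift;
}
```

For a 70000-byte file with 64K pages and 4K blocks, the tail index is 1 and the tail holds 4464 bytes, which rounds up to 8192. Zeroing PAGE_CACHE_SIZE bytes at that address would overrun the buffer.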
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
fs/buffer.c | 28 ++++++++++++++++------------
fs/mpage.c | 4 ++--
fs/reiserfs/inode.c | 3 ++-
include/linux/pagemap.h | 28 ++++++++++++++++++++++++++++
mm/truncate.c | 2 +-
5 files changed, 49 insertions(+), 16 deletions(-)
diff -Nurp linux003/fs/buffer.c linux004/fs/buffer.c
--- linux003/fs/buffer.c 2007-05-21 15:15:32.000000000 -0500
+++ linux004/fs/buffer.c 2007-05-23 22:53:11.000000000 -0500
@@ -875,7 +875,7 @@ struct buffer_head *alloc_page_buffers(s
try_again:
head = NULL;
- offset = PAGE_SIZE;
+ offset = page_data_size(page);
while ((offset -= size) >= 0) {
bh = alloc_buffer_head(GFP_NOFS);
if (!bh)
@@ -1411,7 +1411,7 @@ void set_bh_page(struct buffer_head *bh,
struct page *page, unsigned long offset)
{
bh->b_page = page;
- BUG_ON(offset >= PAGE_SIZE);
+ BUG_ON(offset >= page_data_size(page));
if (PageHighMem(page))
/*
* This catches illegal uses and preserves the offset:
@@ -1752,8 +1752,10 @@ static int __block_prepare_write(struct
struct buffer_head *bh, *head, *wait[2], **wait_bh=wait;
BUG_ON(!PageLocked(page));
- BUG_ON(from > PAGE_CACHE_SIZE);
+ BUG_ON(from > page_data_size(page));
BUG_ON(to > PAGE_CACHE_SIZE);
+ if (to > page_data_size(page))
+ to = page_data_size(page);
BUG_ON(from > to);
blocksize = 1 << inode->i_blkbits;
@@ -2098,12 +2100,14 @@ int cont_prepare_write(struct page *page
(*bytes)++;
}
status = __block_prepare_write(inode, new_page, zerofrom,
- PAGE_CACHE_SIZE, get_block);
+ page_data_size(new_page),
+ get_block);
if (status)
goto out_unmap;
- zero_user_page(page, zerofrom, PAGE_CACHE_SIZE - zerofrom,
- KM_USER0);
- generic_commit_write(NULL, new_page, zerofrom, PAGE_CACHE_SIZE);
+ zero_user_page(page, zerofrom,
+ page_data_size(new_page) - zerofrom, KM_USER0);
+ generic_commit_write(NULL, new_page, zerofrom,
+ page_data_size(new_page));
unlock_page(new_page);
page_cache_release(new_page);
}
@@ -2234,7 +2238,7 @@ int nobh_prepare_write(struct page *page
* page is fully mapped-to-disk.
*/
for (block_start = 0, block_in_page = 0;
- block_start < PAGE_CACHE_SIZE;
+ block_start < page_data_size(page);
block_in_page++, block_start += blocksize) {
unsigned block_end = block_start + blocksize;
int create;
@@ -2328,7 +2332,7 @@ failed:
* Error recovery is pretty slack. Clear the page and mark it dirty
* so we'll later zero out any blocks which _were_ allocated.
*/
- zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+ zero_user_page(page, 0, page_data_size(page), KM_USER0);
SetPageUptodate(page);
set_page_dirty(page);
return ret;
@@ -2397,7 +2401,7 @@ int nobh_writepage(struct page *page, ge
* the page size, the remaining memory is zeroed when mapped, and
* writes to that region are not written out to the file."
*/
- zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, KM_USER0);
+ zero_user_page(page, offset, page_data_size(page) - offset, KM_USER0);
out:
ret = mpage_writepage(page, get_block, wbc);
if (ret == -EAGAIN)
@@ -2431,7 +2435,7 @@ int nobh_truncate_page(struct address_sp
to = (offset + blocksize) & ~(blocksize - 1);
ret = a_ops->prepare_write(NULL, page, offset, to);
if (ret == 0) {
- zero_user_page(page, offset, PAGE_CACHE_SIZE - offset,
+ zero_user_page(page, offset, page_data_size(page) - offset,
KM_USER0);
/*
* It would be more correct to call aops->commit_write()
@@ -2557,7 +2561,7 @@ int block_write_full_page(struct page *p
* the page size, the remaining memory is zeroed when mapped, and
* writes to that region are not written out to the file."
*/
- zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, KM_USER0);
+ zero_user_page(page, offset, page_data_size(page) - offset, KM_USER0);
return __block_write_full_page(inode, page, get_block, wbc);
}
diff -Nurp linux003/fs/mpage.c linux004/fs/mpage.c
--- linux003/fs/mpage.c 2007-05-21 15:15:35.000000000 -0500
+++ linux004/fs/mpage.c 2007-05-23 22:53:11.000000000 -0500
@@ -285,7 +285,7 @@ do_mpage_readpage(struct bio *bio, struc
if (first_hole != blocks_per_page) {
zero_user_page(page, first_hole << blkbits,
- PAGE_CACHE_SIZE - (first_hole << blkbits),
+ page_data_size(page) - (first_hole << blkbits),
KM_USER0);
if (first_hole == 0) {
SetPageUptodate(page);
@@ -585,7 +585,7 @@ page_is_mapped:
if (page->index > end_index || !offset)
goto confused;
- zero_user_page(page, offset, PAGE_CACHE_SIZE - offset,
+ zero_user_page(page, offset, page_data_size(page) - offset,
KM_USER0);
}
diff -Nurp linux003/fs/reiserfs/inode.c linux004/fs/reiserfs/inode.c
--- linux003/fs/reiserfs/inode.c 2007-05-21 15:15:36.000000000 -0500
+++ linux004/fs/reiserfs/inode.c 2007-05-23 22:53:11.000000000 -0500
@@ -2373,7 +2373,8 @@ static int reiserfs_write_full_page(stru
unlock_page(page);
return 0;
}
- zero_user_page(page, last_offset, PAGE_CACHE_SIZE - last_offset, KM_USER0);
+ zero_user_page(page, last_offset,
+ page_data_size(page) - last_offset, KM_USER0);
}
bh = head;
block = page->index << (PAGE_CACHE_SHIFT - s->s_blocksize_bits);
diff -Nurp linux003/include/linux/pagemap.h linux004/include/linux/pagemap.h
--- linux003/include/linux/pagemap.h 2007-05-21 15:15:44.000000000 -0500
+++ linux004/include/linux/pagemap.h 2007-05-23 22:53:11.000000000 -0500
@@ -58,6 +58,34 @@ static inline void mapping_set_gfp_mask(
#define PAGE_CACHE_MASK PAGE_MASK
#define PAGE_CACHE_ALIGN(addr) (((addr)+PAGE_CACHE_SIZE-1)&PAGE_CACHE_MASK)
+#ifdef CONFIG_VM_FILE_TAILS
+static inline pgoff_t file_tail_index(struct address_space *mapping)
+{
+ return (pgoff_t) (i_size_read(mapping->host) >> PAGE_CACHE_SHIFT);
+}
+
+/*
+ * Round up to file system block size so that we can read
+ * directly into the buffer
+ */
+static inline int file_tail_buf_size(struct address_space *mapping)
+{
+ int block_size = 1 << mapping->host->i_blkbits;
+ int tail_bytes = i_size_read(mapping->host) & (PAGE_CACHE_SIZE - 1);
+ return ALIGN(tail_bytes, block_size);
+}
+
+static inline int page_data_size(struct page *page)
+{
+ if (PageFileTail(page))
+ return file_tail_buf_size(page->mapping);
+ else
+ return PAGE_CACHE_SIZE;
+}
+#else
+#define page_data_size(page) PAGE_CACHE_SIZE
+#endif
+
#define page_cache_get(page) get_page(page)
#define page_cache_release(page) put_page(page)
void release_pages(struct page **pages, int nr, int cold);
diff -Nurp linux003/mm/truncate.c linux004/mm/truncate.c
--- linux003/mm/truncate.c 2007-05-21 15:15:48.000000000 -0500
+++ linux004/mm/truncate.c 2007-05-23 22:53:11.000000000 -0500
@@ -47,7 +47,7 @@ void do_invalidatepage(struct page *page
static inline void truncate_partial_page(struct page *page, unsigned partial)
{
- zero_user_page(page, partial, PAGE_CACHE_SIZE - partial, KM_USER0);
+ zero_user_page(page, partial, page_data_size(page) - partial, KM_USER0);
if (PagePrivate(page))
do_invalidatepage(page, partial);
}
--
David Kleikamp
IBM Linux Technology Center
* [RFC:PATCH 005/012] Base file tail function
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (3 preceding siblings ...)
2007-05-24 12:11 ` [RFC:PATCH 004/012] Replace PAGE_CACHE_SIZE with page_data_size() Dave Kleikamp
@ 2007-05-24 12:11 ` Dave Kleikamp
2007-05-24 12:12 ` [RFC:PATCH 006/012] Modify lowmem_page_address() & page_to_phys() to special case tail page Dave Kleikamp
` (7 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:11 UTC (permalink / raw)
To: linux-mm
Base file tail function
This adds the code to allocate and free a file tail, and to unpack the tail
into a normal page.
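The data-copy step of unpacking can be sketched in userspace as below. It leaves out all locking, radix-tree, and LRU handling, and shows only the copy-then-zero behaviour; sk_unpack_tail and its bounds check are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy tail_length bytes of tail data into a full page buffer and
 * zero the remainder. Returns 0 on success, -1 if the tail does not fit.
 */
static int sk_unpack_tail(unsigned char *full_page, size_t page_size,
			  const unsigned char *tail, size_t tail_length)
{
	if (tail_length > page_size)
		return -1;

	memcpy(full_page, tail, tail_length);
	memset(full_page + tail_length, 0, page_size - tail_length);
	return 0;
}
```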
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
include/linux/file_tail.h | 67 ++++++++++
mm/Makefile | 1
mm/file_tail.c | 293 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 361 insertions(+)
diff -Nurp linux004/include/linux/file_tail.h linux005/include/linux/file_tail.h
--- linux004/include/linux/file_tail.h 1969-12-31 18:00:00.000000000 -0600
+++ linux005/include/linux/file_tail.h 2007-05-23 22:53:11.000000000 -0500
@@ -0,0 +1,67 @@
+#ifndef FILE_TAIL_H
+#define FILE_TAIL_H
+
+#include <linux/fs.h>
+#include <linux/pagemap.h>
+
+/*
+ * VM File Tails are used to compactly store the data at the end of the
+ * file in a small SLAB-allocated buffer when the base page size is large.
+ */
+
+#ifdef CONFIG_VM_FILE_TAILS
+
+extern struct page *page_cache_alloc_tail(struct address_space *);
+extern void page_cache_free_tail(struct page *);
+extern void __page_cache_free_tail_buffer(struct page *);
+
+static inline void page_cache_free_tail_buffer(struct page *page)
+{
+ if (PageFileTail(page))
+ __page_cache_free_tail_buffer(page);
+}
+
+/*
+ * Caller must hold write_lock_irq(&mapping->tree_lock)
+ */
+extern int __unpack_file_tail(struct address_space *);
+
+static inline int unpack_file_tail(struct address_space *mapping)
+{
+ int rc;
+ write_lock_irq(&mapping->tree_lock);
+ rc = __unpack_file_tail(mapping);
+ write_unlock_irq(&mapping->tree_lock);
+ return rc;
+}
+
+static inline void preallocate_page_cache_tail(struct address_space *mapping,
+ unsigned long end_index)
+{
+ struct inode *inode = mapping->host;
+ struct page *page;
+
+ if (mapping->tail)
+ return;
+ if (!IS_FILE_TAIL_CAPABLE(inode))
+ return;
+ if (file_tail_index(mapping) != end_index)
+ return;
+ if (file_tail_buf_size(mapping) > PAGE_CACHE_SIZE / 2)
+ return;
+
+ page = page_cache_alloc_tail(mapping);
+ if (page)
+ page_cache_release(page);
+}
+
+#else /* !CONFIG_VM_FILE_TAILS */
+
+#define unpack_file_tail(mapping) 0
+#define page_cache_free_tail(page) do {} while (0)
+#define page_cache_free_tail_buffer(page) do {} while (0)
+#define preallocate_page_cache_tail(page, end_index) do {} while (0)
+
+#endif /* CONFIG_VM_FILE_TAILS */
+
+#endif /* FILE_TAIL_H */
diff -Nurp linux004/mm/Makefile linux005/mm/Makefile
--- linux004/mm/Makefile 2007-05-21 15:15:48.000000000 -0500
+++ linux005/mm/Makefile 2007-05-23 22:53:11.000000000 -0500
@@ -31,4 +31,5 @@ obj-$(CONFIG_FS_XIP) += filemap_xip.o
obj-$(CONFIG_MIGRATION) += migrate.o
obj-$(CONFIG_SMP) += allocpercpu.o
obj-$(CONFIG_QUICKLIST) += quicklist.o
+obj-$(CONFIG_VM_FILE_TAILS) += file_tail.o
diff -Nurp linux004/mm/file_tail.c linux005/mm/file_tail.c
--- linux004/mm/file_tail.c 1969-12-31 18:00:00.000000000 -0600
+++ linux005/mm/file_tail.c 2007-05-23 22:53:11.000000000 -0500
@@ -0,0 +1,293 @@
+/*
+ * linux/mm/file_tail.c
+ *
+ * Copyright (C) International Business Machines Corp., 2006-2007
+ * Author: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
+ */
+
+/*
+ * VM File Tails are used to compactly store the data at the end of the
+ * file in a small SLAB-allocated buffer when the base page size is large.
+ */
+
+#include <linux/file_tail.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/pagemap.h>
+#include <linux/slab.h>
+#include <linux/buffer_head.h>
+#include <linux/swap.h>
+#include <linux/mm_inline.h>
+#include "internal.h"
+
+static struct kmem_cache *tail_page_cachep;
+
+/*
+ * Maybe this could become more generic, but for now, I need it here
+ */
+static void lru_cache_delete(struct page *page)
+{
+ if (PageLRU(page)) {
+ unsigned long flags;
+ struct zone *zone = page_zone(page);
+
+ spin_lock_irqsave(&zone->lru_lock, flags);
+ BUG_ON(!PageLRU(page));
+ __ClearPageLRU(page);
+ del_page_from_lru(zone, page);
+ spin_unlock_irqrestore(&zone->lru_lock, flags);
+ }
+}
+
+/*
+ * Unpack short_page into full_page.
+ * short_page is locked and has no buffers bound to it.
+ * full_page is newly allocated.
+ */
+static int unpack_tail(struct address_space *mapping, pgoff_t index,
+ struct page *short_page, struct page *full_page)
+{
+ int error;
+ char *kaddr;
+ char *tail;
+ char *tail_buf;
+ int tail_length;
+
+ /* This is the equivalent of remove_from_page_cache and
+ * add_to_page_cache_lru, without dropping tree_lock
+ */
+ error = radix_tree_preload(mapping_gfp_mask(mapping));
+ if (unlikely(error))
+ return error;
+
+ write_lock_irq(&mapping->tree_lock);
+ radix_tree_delete(&mapping->page_tree, index);
+ short_page->mapping = NULL;
+ tail = mapping->tail;
+ tail_buf = mapping->tail_buf;
+ mapping->tail = mapping->tail_buf = NULL;
+
+ error = radix_tree_insert(&mapping->page_tree, index, full_page);
+ if (unlikely(error)) {
+ printk(KERN_ERR "unpack_tail: radix_tree_insert failed!\n");
+ kfree(tail_buf);
+ unlock_page(short_page);
+ page_cache_release(short_page);
+ return error;
+ }
+ page_cache_get(full_page);
+ SetPageLocked(full_page);
+ full_page->mapping = mapping;
+ full_page->index = index;
+
+ write_unlock_irq(&mapping->tree_lock);
+ radix_tree_preload_end();
+ page_cache_release(short_page); /* page cache ref */
+
+ /* Copy data from tail to full page */
+ if (PageUptodate(short_page)) {
+ kaddr = kmap_atomic(full_page, KM_USER0);
+ tail_length = file_tail_buf_size(mapping);
+ memcpy(kaddr, tail, tail_length);
+ memset(kaddr+tail_length, 0, PAGE_CACHE_SIZE - tail_length);
+ kunmap_atomic(kaddr, KM_USER0);
+ SetPageUptodate(full_page);
+ }
+ kfree(tail_buf);
+
+ /* finalize full_page */
+ if (PageUptodate(short_page) && PageDirty(short_page)) {
+ SetPageDirty(full_page);
+ write_lock_irq(&mapping->tree_lock);
+ radix_tree_tag_set(&mapping->page_tree, index,
+ PAGECACHE_TAG_DIRTY);
+ write_unlock_irq(&mapping->tree_lock);
+ }
+ lru_cache_add(full_page);
+ unlock_page(full_page);
+ page_cache_release(full_page);
+
+ /* release short_page */
+ unlock_page(short_page);
+ page_cache_release(short_page);
+
+ return 0;
+}
+
+/*
+ * Caller must hold write lock on mapping->tree_lock
+ */
+int __unpack_file_tail(struct address_space *mapping)
+{
+ pgoff_t index;
+ struct page *full_page = NULL;
+ int rc = 0;
+ struct page *short_page;
+
+ while (mapping->tail) {
+ write_unlock_irq(&mapping->tree_lock);
+ index = file_tail_index(mapping);
+
+ /* Allocate full page */
+ if (!full_page)
+ full_page = page_cache_alloc(mapping);
+ if (!full_page) {
+ rc = -ENOMEM;
+ write_lock_irq(&mapping->tree_lock);
+ break;
+ }
+
+ /* Get & lock short page */
+ short_page = find_lock_page(mapping, index);
+ if (!short_page || !PageFileTail(short_page)) {
+ if (short_page) {
+ unlock_page(short_page);
+ page_cache_release(short_page);
+ }
+ write_lock_irq(&mapping->tree_lock);
+ continue;
+ }
+ wait_on_page_writeback(short_page);
+ lru_cache_delete(short_page);
+ /* We have the tail page locked, so this shouldn't go away */
+ BUG_ON(!mapping->tail);
+
+ if (page_has_buffers(short_page) &&
+ !try_to_release_page(short_page,
+ mapping_gfp_mask(mapping))) {
+ /* How hard do we need to try? */
+ sync_blockdev(mapping->host->i_sb->s_bdev);
+ if (page_has_buffers(short_page) &&
+ !try_to_release_page(short_page,
+ mapping_gfp_mask(mapping))) {
+ printk(KERN_ERR "__unpack_file_tail: "
+ "can't release page\n");
+ page_cache_release(short_page);
+ rc = -EIO; /* What's a good return code? */
+ write_lock_irq(&mapping->tree_lock);
+ break;
+ }
+ }
+
+ rc = unpack_tail(mapping, index, short_page, full_page);
+ if (rc) {
+ write_lock_irq(&mapping->tree_lock);
+ break;
+ }
+ full_page = NULL;
+
+ /*
+ * Unlikely, but loop again in case a tail was added back
+ * meanwhile. We need to return with tree_lock held.
+ */
+ write_lock_irq(&mapping->tree_lock);
+
+ }
+ if (full_page)
+ page_cache_release(full_page);
+ return rc;
+}
+
+static void init_once(void *ptr, struct kmem_cache *cachep, unsigned long flags)
+{
+ struct page *page = (struct page *)ptr;
+
+ memset(page, 0, sizeof(struct page));
+ reset_page_mapcount(page);
+ INIT_LIST_HEAD(&page->lru);
+ SetPageFileTail(page);
+}
+
+static __init int file_tail_init(void)
+{
+ tail_page_cachep = kmem_cache_create("tail_page_cache",
+ sizeof(struct page), 0, 0,
+ init_once, NULL);
+ if (tail_page_cachep == NULL) {
+ printk (KERN_ERR "Failed to create tail_page_cache\n");
+ return -ENOMEM;
+ }
+ return 0;
+}
+__initcall(file_tail_init);
+
+struct page *page_cache_alloc_tail(struct address_space *mapping)
+{
+ int block_size = 1 << mapping->host->i_blkbits;
+ int error;
+ pgoff_t index;
+ struct page *page;
+ int size;
+ void *tail;
+ void *tail_buf;
+
+ size = file_tail_buf_size(mapping);
+ index = file_tail_index(mapping);
+
+ page = find_get_page(mapping, index);
+ if (page)
+ return page;
+
+ page = kmem_cache_alloc(tail_page_cachep, GFP_KERNEL);
+ if (!page)
+ return NULL;
+
+ /*
+ * For allocations of at least 1/8 of a page, kmalloc returns well-aligned
+ * buffers. For smaller allocations, we need to align the buffer ourselves.
+ */
+ if (size < PAGE_SIZE >> 3) {
+ tail_buf = kmalloc(size + block_size - 1, GFP_KERNEL);
+ tail = (void *)ALIGN((size_t)tail_buf, block_size);
+ } else
+ tail_buf = tail = kmalloc(size, GFP_KERNEL);
+
+ if (!tail) {
+ kmem_cache_free(tail_page_cachep, page);
+ return NULL;
+ }
+ /* Just to make sure */
+ BUG_ON((size_t)tail & (block_size - 1));
+
+ set_page_count(page, 1);
+ page->flags = 0;
+ SetPageFileTail(page);
+
+ error = add_to_page_cache_lru(page, mapping, index,
+ mapping_gfp_mask(mapping));
+ if (error) {
+ kfree(tail_buf);
+ kmem_cache_free(tail_page_cachep, page);
+ return NULL;
+ }
+ write_lock_irq(&mapping->tree_lock);
+ /*
+ * Make sure the file size didn't change
+ */
+ if (mapping->tail || (index != file_tail_index(mapping)) ||
+ (size != file_tail_buf_size(mapping))) {
+ write_unlock_irq(&mapping->tree_lock);
+ __put_page(page);
+ page_cache_release(page);
+ kfree(tail_buf);
+ return NULL;
+ }
+ mapping->tail = tail;
+ mapping->tail_buf = tail_buf;
+ write_unlock_irq(&mapping->tree_lock);
+ unlock_page(page);
+
+ return page;
+}
+
+void page_cache_free_tail(struct page *page)
+{
+ kmem_cache_free(tail_page_cachep, page);
+}
+
+void __page_cache_free_tail_buffer(struct page *page)
+{
+ struct address_space *mapping = page->mapping;
+ kfree(mapping->tail_buf);
+ mapping->tail_buf = mapping->tail = NULL;
+}
--
David Kleikamp
IBM Linux Technology Center
* [RFC:PATCH 006/012] Modify lowmem_page_address() & page_to_phys() to special case tail page
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (4 preceding siblings ...)
2007-05-24 12:11 ` [RFC:PATCH 005/012] Base file tail function Dave Kleikamp
@ 2007-05-24 12:12 ` Dave Kleikamp
2007-05-24 12:12 ` [RFC:PATCH 007/012] Avoid page_to_pfn() on " Dave Kleikamp
` (6 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:12 UTC (permalink / raw)
To: linux-mm
Modify lowmem_page_address() & page_to_phys() to special case tail page
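The special case amounts to the following, shown here as a simplified userspace model. It ignores the virtual/physical translation that __va() and __pa() perform, assumes 64K pages, and uses hypothetical sk_ structures in place of struct page and struct address_space.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT  16                   /* 64K pages assumed */
#define SK_PG_FILETAIL ((uint64_t)1 << 30)  /* bit number used by this series */

struct sk_mapping {
	uint64_t tail_addr;  /* address of the tail buffer */
};

struct sk_page {
	uint64_t flags;
	struct sk_mapping *mapping;
	uint64_t pfn;
};

/* A tail pseudo-page has no pfn of its own; its data lives in the tail. */
static uint64_t sk_page_address(const struct sk_page *page)
{
	if (page->flags & SK_PG_FILETAIL)
		return page->mapping->tail_addr;
	return page->pfn << SK_PAGE_SHIFT;
}
```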
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
include/asm-powerpc/io.h | 8 +++++++-
include/linux/mm.h | 2 ++
2 files changed, 9 insertions(+), 1 deletion(-)
diff -Nurp linux005/include/asm-powerpc/io.h linux006/include/asm-powerpc/io.h
--- linux005/include/asm-powerpc/io.h 2007-05-21 15:15:41.000000000 -0500
+++ linux006/include/asm-powerpc/io.h 2007-05-23 22:53:11.000000000 -0500
@@ -19,6 +19,7 @@ extern int check_legacy_ioport(unsigned
#define PNPBIOS_BASE 0xf000
#include <linux/compiler.h>
+#include <linux/mm.h>
#include <asm/page.h>
#include <asm/byteorder.h>
#include <asm/synch.h>
@@ -702,7 +703,12 @@ static inline void * phys_to_virt(unsign
/*
* Change "struct page" to physical address.
*/
-#define page_to_phys(page) (page_to_pfn(page) << PAGE_SHIFT)
+static inline unsigned long page_to_phys(struct page *page)
+{
+ if (unlikely(PageFileTail(page)))
+ return __pa(page->mapping->tail);
+ return page_to_pfn(page) << PAGE_SHIFT;
+}
/* We do NOT want virtual merging, it would put too much pressure on
* our iommu allocator. Instead, we want drivers to be smart enough
diff -Nurp linux005/include/linux/mm.h linux006/include/linux/mm.h
--- linux005/include/linux/mm.h 2007-05-21 15:15:44.000000000 -0500
+++ linux006/include/linux/mm.h 2007-05-23 22:53:11.000000000 -0500
@@ -557,6 +557,8 @@ static inline void set_page_links(struct
static __always_inline void *lowmem_page_address(struct page *page)
{
+ if (unlikely(PageFileTail(page)))
+ return page->mapping->tail;
return __va(page_to_pfn(page) << PAGE_SHIFT);
}
--
David Kleikamp
IBM Linux Technology Center
* [RFC:PATCH 007/012] Avoid page_to_pfn() on tail page
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (5 preceding siblings ...)
2007-05-24 12:12 ` [RFC:PATCH 006/012] Modify lowmem_page_address() & page_to_phys() to special case tail page Dave Kleikamp
@ 2007-05-24 12:12 ` Dave Kleikamp
2007-05-24 12:12 ` [RFC:PATCH 008/012] bh_offset needs to take page_address into consideration Dave Kleikamp
` (5 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:12 UTC (permalink / raw)
To: linux-mm
Avoid page_to_pfn() on tail page
On ppc64, we don't need bounce buffers to do I/O. This will need more work
for other architectures.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
block/ll_rw_blk.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff -Nurp linux006/block/ll_rw_blk.c linux007/block/ll_rw_blk.c
--- linux006/block/ll_rw_blk.c 2007-05-21 15:14:57.000000000 -0500
+++ linux007/block/ll_rw_blk.c 2007-05-23 22:53:12.000000000 -0500
@@ -1221,7 +1221,8 @@ void blk_recount_segments(request_queue_
* considered part of another segment, since that might
* change with the bounce page.
*/
- high = page_to_pfn(bv->bv_page) > q->bounce_pfn;
+ high = (!PageFileTail(bv->bv_page) &&
+ page_to_pfn(bv->bv_page) > q->bounce_pfn);
if (high || highprv)
goto new_hw_segment;
if (cluster) {
--
David Kleikamp
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 14+ messages in thread* [RFC:PATCH 008/012] bh_offset needs to take page_address into consideration
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (6 preceding siblings ...)
2007-05-24 12:12 ` [RFC:PATCH 007/012] Avoid page_to_pfn() on " Dave Kleikamp
@ 2007-05-24 12:12 ` Dave Kleikamp
2007-05-24 12:12 ` [RFC:PATCH 009/012] Wrap i_size_write Dave Kleikamp
` (4 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:12 UTC (permalink / raw)
To: linux-mm
bh_offset needs to take page_address into consideration
ToDo: Check how well gcc optimizes bh_offset when CONFIG_VM_FILE_TAILS is
not defined. Some optimization may be needed, but we want to avoid
unnecessary ifdefs.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
include/linux/buffer_head.h | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff -Nurp linux007/include/linux/buffer_head.h linux008/include/linux/buffer_head.h
--- linux007/include/linux/buffer_head.h 2007-05-21 15:15:43.000000000 -0500
+++ linux008/include/linux/buffer_head.h 2007-05-23 22:53:12.000000000 -0500
@@ -129,7 +129,17 @@ BUFFER_FNS(Ordered, ordered)
BUFFER_FNS(Eopnotsupp, eopnotsupp)
BUFFER_FNS(Unwritten, unwritten)
-#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~PAGE_MASK)
+/*
+ * If CONFIG_VM_FILE_TAILS is defined, page_address(bh->b_page) may not be
+ * aligned to PAGE_SIZE, so bh_offset must take that into account.
+ */
+static inline unsigned long bh_offset(struct buffer_head *bh)
+{
+ return ((unsigned long)bh->b_data -
+ (unsigned long)page_address(bh->b_page)) &
+ ~PAGE_MASK;
+}
+
#define touch_buffer(bh) mark_page_accessed(bh->b_page)
/* If we *know* page->private refers to buffer_heads */
--
David Kleikamp
IBM Linux Technology Center
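
A minimal userspace sketch of why masking the low bits of b_data stops
working once page_address() can return an unaligned tail buffer. It assumes
a 4096-byte page; DEMO_PAGE_SIZE, offset_from_low_bits() and
offset_from_base() are illustrative names and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Old macro's logic: the offset is the low bits of the data pointer. */
unsigned long offset_from_low_bits(uintptr_t data)
{
    return (unsigned long)(data & ~DEMO_PAGE_MASK);
}

/*
 * Patched logic: the offset is measured from wherever the page's data
 * actually starts, which need not be page aligned for a tail buffer.
 */
unsigned long offset_from_base(uintptr_t data, uintptr_t base)
{
    assert(data >= base); /* data must lie inside the buffer */
    return (unsigned long)((data - base) & ~DEMO_PAGE_MASK);
}
```

For a buffer starting at 0x10200 with data at 0x10600, the first function
reports 0x600 and the second reports 0x400. When the base is page aligned
the two agree, which is the case the old macro was written for.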
* [RFC:PATCH 009/012] Wrap i_size_write
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (7 preceding siblings ...)
2007-05-24 12:12 ` [RFC:PATCH 008/012] bh_offset needs to take page_address into consideration Dave Kleikamp
@ 2007-05-24 12:12 ` Dave Kleikamp
2007-05-24 12:12 ` [RFC:PATCH 010/012] unpack tail page to avoid memory mapping Dave Kleikamp
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:12 UTC (permalink / raw)
To: linux-mm
Wrap i_size_write
If CONFIG_VM_FILE_TAILS is set, i_size_write is defined in mm/file_tail.c.
The wrapper takes mapping->tree_lock and unpacks the tail when the file grows,
which adds considerable overhead to i_size_write; i_size_read is unaffected.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
include/linux/fs.h | 5 +++++
mm/file_tail.c | 12 ++++++++++++
2 files changed, 17 insertions(+)
diff -Nurp linux008/include/linux/fs.h linux009/include/linux/fs.h
--- linux008/include/linux/fs.h 2007-05-23 22:53:11.000000000 -0500
+++ linux009/include/linux/fs.h 2007-05-23 22:53:12.000000000 -0500
@@ -661,7 +661,12 @@ static inline loff_t i_size_read(const s
* (normally i_mutex), otherwise on 32bit/SMP an update of i_size_seqcount
* can be lost, resulting in subsequent i_size_read() calls spinning forever.
*/
+#ifdef CONFIG_VM_FILE_TAILS
+extern void i_size_write(struct inode *, loff_t); /* defined in file_tail.c */
+static inline void _i_size_write(struct inode *inode, loff_t i_size)
+#else
static inline void i_size_write(struct inode *inode, loff_t i_size)
+#endif
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
write_seqcount_begin(&inode->i_size_seqcount);
diff -Nurp linux008/mm/file_tail.c linux009/mm/file_tail.c
--- linux008/mm/file_tail.c 2007-05-23 22:53:11.000000000 -0500
+++ linux009/mm/file_tail.c 2007-05-23 22:53:12.000000000 -0500
@@ -188,6 +188,18 @@ int __unpack_file_tail(struct address_sp
return rc;
}
+void i_size_write(struct inode *inode, loff_t i_size)
+{
+ struct address_space *mapping = inode->i_mapping;
+
+ write_lock_irq(&mapping->tree_lock);
+ if (mapping->tail && (i_size > i_size_read(inode)))
+ __unpack_file_tail(mapping);
+ _i_size_write(inode, i_size);
+ write_unlock_irq(&mapping->tree_lock);
+}
+EXPORT_SYMBOL(i_size_write);
+
static void init_once(void *ptr, struct kmem_cache *cachep, unsigned long flags)
{
struct page *page = (struct page *)ptr;
--
David Kleikamp
IBM Linux Technology Center
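
A minimal userspace sketch of the control flow in the wrapper above: the
tail is unpacked only when a tail exists and the size is growing. Locking is
omitted here (the patch holds mapping->tree_lock around the whole sequence),
and struct demo_mapping, demo_unpack_tail() and demo_size_write() are
hypothetical stand-ins, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

struct demo_mapping {
    void *tail;        /* non-NULL while the file end is held in a tail */
    long long size;    /* stands in for inode->i_size */
    int unpack_calls;  /* counts unpack operations, for illustration */
};

/* Stand-in for __unpack_file_tail(): the tail buffer goes away. */
void demo_unpack_tail(struct demo_mapping *m)
{
    assert(m->tail != NULL);
    m->tail = NULL;
    m->unpack_calls++;
}

/* Unpack first if the file is about to grow, then record the new size. */
void demo_size_write(struct demo_mapping *m, long long new_size)
{
    if (m->tail && new_size > m->size)
        demo_unpack_tail(m);
    m->size = new_size;
}
```

Shrinking or rewriting the same size leaves the tail in place, so only
writes that extend the file pay for the unpack.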
* [RFC:PATCH 010/012] unpack tail page to avoid memory mapping
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (8 preceding siblings ...)
2007-05-24 12:12 ` [RFC:PATCH 009/012] Wrap i_size_write Dave Kleikamp
@ 2007-05-24 12:12 ` Dave Kleikamp
2007-05-24 12:12 ` [RFC:PATCH 011/012] Make sure tail page is freed correctly Dave Kleikamp
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:12 UTC (permalink / raw)
To: linux-mm
Unpack the tail page in the fault path so that a tail page is never memory mapped
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
mm/memory.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff -Nurp linux009/mm/memory.c linux010/mm/memory.c
--- linux009/mm/memory.c 2007-05-21 15:15:48.000000000 -0500
+++ linux010/mm/memory.c 2007-05-23 22:53:12.000000000 -0500
@@ -50,6 +50,7 @@
#include <linux/delayacct.h>
#include <linux/init.h>
#include <linux/writeback.h>
+#include <linux/file_tail.h>
#include <asm/pgalloc.h>
#include <asm/uaccess.h>
@@ -2324,6 +2325,15 @@ retry:
else if (unlikely(new_page == NOPAGE_REFAULT))
return VM_FAULT_MINOR;
+ if (PageFileTail(new_page)) {
+ /* Can new_page->mapping be different from mapping? */
+ struct address_space *mapping2 = new_page->mapping;
+ page_cache_release(new_page);
+ if (unpack_file_tail(mapping2))
+ return VM_FAULT_OOM; /* Can we do better? */
+ goto retry;
+ }
+
/*
* Should we do an early C-O-W break?
*/
--
David Kleikamp
IBM Linux Technology Center
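
A minimal userspace sketch of the release, unpack, retry pattern used in the
hunk above. All types and functions here (struct demo_page, struct demo_file,
demo_lookup(), demo_unpack(), demo_fault()) are hypothetical models, and the
sketch assumes that unpacking makes the next lookup return an ordinary page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_page {
    bool is_tail;
    int refcount;
};

struct demo_file {
    struct demo_page tail_page; /* returned while the tail is packed */
    struct demo_page full_page; /* returned after unpacking */
    bool packed;
};

/* Look up the last page and take a reference on it. */
struct demo_page *demo_lookup(struct demo_file *f)
{
    struct demo_page *p = f->packed ? &f->tail_page : &f->full_page;
    p->refcount++;
    return p;
}

/* Stand-in for unpack_file_tail(); returns 0 on success. */
int demo_unpack(struct demo_file *f)
{
    f->packed = false;
    return 0;
}

/* Never hand a tail page to the caller: drop it, unpack, look up again. */
struct demo_page *demo_fault(struct demo_file *f)
{
    for (;;) {
        struct demo_page *p = demo_lookup(f);
        if (!p->is_tail)
            return p;
        assert(p->refcount > 0);
        p->refcount--;       /* release before unpacking */
        if (demo_unpack(f))
            return NULL;     /* models the VM_FAULT_OOM return */
    }
}
```

The reference on the tail page is dropped before unpacking, mirroring the
page_cache_release() call that precedes unpack_file_tail() in the patch.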
* [RFC:PATCH 011/012] Make sure tail page is freed correctly
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (9 preceding siblings ...)
2007-05-24 12:12 ` [RFC:PATCH 010/012] unpack tail page to avoid memory mapping Dave Kleikamp
@ 2007-05-24 12:12 ` Dave Kleikamp
2007-05-24 12:12 ` [RFC:PATCH 012/012] Add tail hooks into file_map.c Dave Kleikamp
2007-05-24 12:45 ` [RFC:PATCH 000/012] VM File Tails Dave Kleikamp
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:12 UTC (permalink / raw)
To: linux-mm
Make sure tail page is freed correctly
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
mm/page_alloc.c | 6 ++++++
1 file changed, 6 insertions(+)
diff -Nurp linux010/mm/page_alloc.c linux011/mm/page_alloc.c
--- linux010/mm/page_alloc.c 2007-05-21 15:15:48.000000000 -0500
+++ linux011/mm/page_alloc.c 2007-05-23 22:53:12.000000000 -0500
@@ -41,6 +41,7 @@
#include <linux/pfn.h>
#include <linux/backing-dev.h>
#include <linux/fault-inject.h>
+#include <linux/file_tail.h>
#include <asm/tlbflush.h>
#include <asm/div64.h>
@@ -796,6 +797,11 @@ static void fastcall free_hot_cold_page(
struct per_cpu_pages *pcp;
unsigned long flags;
+ if (unlikely(PageFileTail(page))) {
+ page_cache_free_tail(page);
+ return;
+ }
+
if (PageAnon(page))
page->mapping = NULL;
if (free_pages_check(page))
--
David Kleikamp
IBM Linux Technology Center
* [RFC:PATCH 012/012] Add tail hooks into file_map.c
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (10 preceding siblings ...)
2007-05-24 12:12 ` [RFC:PATCH 011/012] Make sure tail page is freed correctly Dave Kleikamp
@ 2007-05-24 12:12 ` Dave Kleikamp
2007-05-24 12:45 ` [RFC:PATCH 000/012] VM File Tails Dave Kleikamp
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:12 UTC (permalink / raw)
To: linux-mm
Add tail hooks into mm/filemap.c
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
mm/filemap.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff -Nurp linux011/mm/filemap.c linux012/mm/filemap.c
--- linux011/mm/filemap.c 2007-05-21 15:15:48.000000000 -0500
+++ linux012/mm/filemap.c 2007-05-23 22:53:12.000000000 -0500
@@ -30,6 +30,7 @@
#include <linux/security.h>
#include <linux/syscalls.h>
#include <linux/cpuset.h>
+#include <linux/file_tail.h>
#include "filemap.h"
#include "internal.h"
@@ -116,6 +117,13 @@ void __remove_from_page_cache(struct pag
{
struct address_space *mapping = page->mapping;
+ /*
+ * mapping->tail is kept in sync with the tail page's existence
+ * in the radix tree, so we need to clear it here while holding
+ * the tree_lock
+ */
+ page_cache_free_tail_buffer(page);
+
radix_tree_delete(&mapping->page_tree, page->index);
page->mapping = NULL;
mapping->nrpages--;
@@ -890,6 +898,13 @@ void do_generic_mapping_read(struct addr
goto out;
end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
+
+ /*
+ * If the last page in the request is a candidate for a tail page,
+ * allocate it before we call page_cache_readahead()
+ */
+ preallocate_page_cache_tail(mapping, end_index);
+
for (;;) {
struct page *page;
unsigned long nr, ret;
@@ -2146,6 +2161,17 @@ generic_file_buffered_write(struct kiocb
goto zero_length_segment;
}
+ if (PageFileTail(page) &&
+ ((pos + bytes) > i_size_read(inode))) {
+ /* Can't unpack the tail while holding the tail page */
+ unlock_page(page);
+ page_cache_release(page);
+ status = (long)unpack_file_tail(mapping);
+ if (status)
+ break;
+ continue;
+ }
+
status = a_ops->prepare_write(file, page, offset, offset+bytes);
if (unlikely(status)) {
loff_t isize = i_size_read(inode);
--
David Kleikamp
IBM Linux Technology Center
* Re: [RFC:PATCH 000/012] VM File Tails
2007-05-24 12:11 [RFC:PATCH 000/012] VM Page Tails Dave Kleikamp
` (11 preceding siblings ...)
2007-05-24 12:12 ` [RFC:PATCH 012/012] Add tail hooks into file_map.c Dave Kleikamp
@ 2007-05-24 12:45 ` Dave Kleikamp
12 siblings, 0 replies; 14+ messages in thread
From: Dave Kleikamp @ 2007-05-24 12:45 UTC (permalink / raw)
To: linux-mm
On Thu, 2007-05-24 at 08:11 -0400, Dave Kleikamp wrote:
> I wanted to get some feedback on this as it is, before it undergoes some
> major re-writing. These patches are against linux-2.6.22-rc2.
>
> These patches implement what I'm calling "VM File Tails"
I mistyped the original subject. The patches are also available here:
ftp://kernel.org/pub/linux/kernel/people/shaggy/vm_file_tails/
Thanks,
Shaggy
--
David Kleikamp
IBM Linux Technology Center