From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH 2.6.17-rc1-mm1 3/6] Migrate-on-fault - migrate misplaced page
Date: Fri, 07 Apr 2006 16:23:43 -0400
Message-ID: <1144441424.5198.42.camel@localhost.localdomain>
In-Reply-To: <1144441108.5198.36.camel@localhost.localdomain>
Migrate-on-fault prototype 3/6 V0.2 - migrate misplaced page
V0.2 - reworked against 2.6.17-rc1-mm1 with Christoph's migration
code reorg.
This patch adds a new function migrate_misplaced_page() to mm/migrate.c
[where most of the other page migration functions live] to migrate a
misplaced page to a specified destination node. This function will be
called from the fault path. Because we already know the destination
node for the migration, we allocate pages directly rather than rerunning
the policy node computation in alloc_page_vma().
migrate_misplaced_page() will need to put a single page [the old or
new page] back to the lru, so this patch also splits out a
"putback_lru_page()" function from move_to_lru(). This avoids having
to insert the page on a dummy list just to have move_to_lru() delete
it from the list.
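The split described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: struct node and struct fake_page are hypothetical stand-ins for struct list_head and struct page, and the on_lru flag stands in for the lru_cache_add*() calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for struct list_head and struct page. */
struct node {
	struct node *prev, *next;
};

struct fake_page {
	struct node lru;	/* linkage on a private isolation list */
	int refcount;		/* includes the reference taken at isolation */
	int on_lru;		/* 1 once the page is back on an LRU list */
};

/* Unlink a node from whatever circular list it is on. */
static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
}

/* Return one isolated page to the LRU and drop the isolation reference. */
static void putback_lru_page(struct fake_page *page)
{
	assert(page->refcount > 0);
	page->on_lru = 1;
	page->refcount--;
}

/* List-based variant: unlink from the private list, then put back. */
static void move_to_lru(struct fake_page *page)
{
	node_del(&page->lru);
	putback_lru_page(page);
}
```

In this model a caller that holds a single isolated page, never placed on any list, calls putback_lru_page() directly, while list-based callers keep using move_to_lru().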
The patch also updates the address space migratepage operations to
skip the attempt to unmap the page, if the operation is being called
in the fault path to migrate a misplaced page. To accomplish this, I
added an additional boolean [int] argument "faulting" to the migratepage
op functions. This argument also adjusts the number of expected page
references because we have an extra count when called in the fault
path.
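As a rough illustration of the reference accounting, the following user-space sketch mirrors the "2 + !!faulting" and "3 + !!faulting" expressions used by migrate_page() and buffer_migrate_page() in the diff. The helper name expected_refs() and its has_bufs parameter are hypothetical and do not appear in the patch.

```c
#include <assert.h>

/*
 * User-space sketch, not kernel code: models how the migratepage ops
 * compute the number of page references they expect to see.
 */
static int expected_refs(int has_bufs, int faulting)
{
	/* page cache + current caller */
	int nr_refs = 2;

	if (has_bufs)
		nr_refs++;	/* buffer heads hold one more reference */

	/* any non-zero "faulting" counts as exactly one extra reference */
	nr_refs += !!faulting;

	assert(nr_refs >= 2 && nr_refs <= 4);
	return nr_refs;
}
```

Under these assumptions, expected_refs(0, 1) returns 3: the page cache reference, the current caller, and the extra fault path reference. The "!!" normalization means a caller may pass any non-zero value for faulting.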
The migratepage operations now use the migrate_page_try_to_unmap()
and migrate_page_replace_in_mapping() functions separated out in a
previous patch.
I believe that we can now delete migrate_page_remove_references(),
but I haven't done so yet.
Finally, the patch adds the static inline function
check_migrate_misplaced_page() to mempolicy.h to check whether a
page has no mappings [no pte references] and is "misplaced"--i.e.
on a node different from what the policy for (vma, address) dictates.
In this case, the page will be migrated to the "correct" node, if
possible. If migration fails for any reason, we just use the
original page.
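The decision flow just described can be sketched in user space as follows. This is a simplified model under stated assumptions: struct fake_page, check_misplaced() and the policy_nid parameter are hypothetical, the policy lookup done by mpol_misplaced() is reduced to a node id comparison, and a failed migration is modeled as a NULL destination page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; field names are illustrative only. */
struct fake_page {
	int nid;		/* node the page currently lives on */
	int mapcount;		/* number of pte mappings */
	int writeback;		/* non-zero while under writeback */
};

/*
 * Returns the page the fault handler should continue with.
 * newpage is a destination page on the policy node, or NULL when
 * no such page could be obtained (modeling a failed migration).
 */
static struct fake_page *check_misplaced(struct fake_page *page,
					 int policy_nid,
					 struct fake_page *newpage)
{
	assert(page != NULL);

	/* Only unmapped pages that are not under writeback are candidates. */
	if (page->mapcount || page->writeback)
		return page;

	/* Already on the node the policy asks for: nothing to do. */
	if (page->nid == policy_nid)
		return page;

	/* Migration not possible: just use the original page. */
	if (!newpage)
		return page;

	/* "Migrate": copy the page state and record the new node. */
	*newpage = *page;
	newpage->nid = policy_nid;
	return newpage;
}
```

The property this sketch is meant to show is that every exit returns a usable page, so the caller never has to handle an error from the check.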
Note that when CONFIG_NUMA or CONFIG_MIGRATION is not set, the
check_migrate_misplaced_page() function becomes a macro that
evaluates to its page argument.
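A minimal sketch of this fallback pattern is shown here. SKETCH_MIGRATION is a hypothetical switch standing in for the kernel configuration options, and fault_lookup() is an illustrative caller that is not part of the patch.

```c
#include <assert.h>

struct page;	/* opaque here; only pointers are passed around */

/* SKETCH_MIGRATION is deliberately left undefined in this sketch. */
#ifdef SKETCH_MIGRATION
struct page *check_migrate_misplaced_page(struct page *page, void *vma,
					  unsigned long address);
#else
/* Without migration support the check collapses to its page argument. */
#define check_migrate_misplaced_page(page, vma, address) (page)
#endif

/* A fault-handler-shaped caller that compiles the same either way. */
static struct page *fault_lookup(struct page *page, void *vma,
				 unsigned long address)
{
	assert(page != 0);
	return check_migrate_misplaced_page(page, vma, address);
}
```

Because the macro expands to "(page)", callers need no #ifdef of their own, and the disabled configuration adds no code to the fault path.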
Subsequent patches will hook the fault handlers [anon, file, shmem]
to check_migrate_misplaced_page().
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Index: linux-2.6.17-rc1-mm1/include/linux/mempolicy.h
===================================================================
--- linux-2.6.17-rc1-mm1.orig/include/linux/mempolicy.h 2006-04-05 10:14:39.000000000 -0400
+++ linux-2.6.17-rc1-mm1/include/linux/mempolicy.h 2006-04-05 10:14:41.000000000 -0400
@@ -34,6 +34,7 @@
#include <linux/rbtree.h>
#include <linux/spinlock.h>
#include <linux/nodemask.h>
+#include <linux/migrate.h>
struct vm_area_struct;
@@ -184,6 +185,31 @@ int do_migrate_pages(struct mm_struct *m
int mpol_misplaced(struct page *, struct vm_area_struct *,
unsigned long, int *);
+#if defined(CONFIG_MIGRATION) && defined(_LINUX_MM_H)
+/*
+ * called in fault path, where _LINUX_MM_H will be defined.
+ * page is uptodate and locked.
+ */
+static inline struct page *check_migrate_misplaced_page(struct page *page,
+ struct vm_area_struct *vma, unsigned long address)
+{
+ int polnid, misplaced;
+
+ if (page_mapcount(page) || PageWriteback(page))
+ return page;
+
+ misplaced = mpol_misplaced(page, vma, address, &polnid);
+ if (!misplaced)
+ return page;
+
+ return migrate_misplaced_page(page, polnid,
+ misplaced_is_interleaved(misplaced));
+
+}
+#else
+#define check_migrate_misplaced_page(page, vma, address) (page)
+#endif
+
extern void *cpuset_being_rebound; /* Trigger mpol_copy vma rebind */
#else
@@ -279,6 +305,8 @@ static inline int do_migrate_pages(struc
return 0;
}
+#define check_migrate_misplaced_page(page, vma, address) (page)
+
static inline void check_highest_zone(int k)
{
}
Index: linux-2.6.17-rc1-mm1/include/linux/fs.h
===================================================================
--- linux-2.6.17-rc1-mm1.orig/include/linux/fs.h 2006-04-05 10:14:36.000000000 -0400
+++ linux-2.6.17-rc1-mm1/include/linux/fs.h 2006-04-05 10:14:41.000000000 -0400
@@ -373,7 +373,7 @@ struct address_space_operations {
struct page* (*get_xip_page)(struct address_space *, sector_t,
int);
/* migrate the contents of a page to the specified target */
- int (*migratepage) (struct page *, struct page *);
+ int (*migratepage) (struct page *, struct page *, int);
};
struct backing_dev_info;
@@ -1760,7 +1760,7 @@ extern void simple_release_fs(struct vfs
extern ssize_t simple_read_from_buffer(void __user *, size_t, loff_t *, const void *, size_t);
#ifdef CONFIG_MIGRATION
-extern int buffer_migrate_page(struct page *, struct page *);
+extern int buffer_migrate_page(struct page *, struct page *, int);
#else
#define buffer_migrate_page NULL
#endif
Index: linux-2.6.17-rc1-mm1/include/linux/gfp.h
===================================================================
--- linux-2.6.17-rc1-mm1.orig/include/linux/gfp.h 2006-03-20 00:53:29.000000000 -0500
+++ linux-2.6.17-rc1-mm1/include/linux/gfp.h 2006-04-05 10:14:41.000000000 -0400
@@ -131,10 +131,13 @@ alloc_pages(gfp_t gfp_mask, unsigned int
}
extern struct page *alloc_page_vma(gfp_t gfp_mask,
struct vm_area_struct *vma, unsigned long addr);
+extern struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
+ unsigned nid);
#else
#define alloc_pages(gfp_mask, order) \
alloc_pages_node(numa_node_id(), gfp_mask, order)
#define alloc_page_vma(gfp_mask, vma, addr) alloc_pages(gfp_mask, 0)
+#define alloc_page_interleave(gfp_mask, order, nid) alloc_pages(gfp_mask, 0)
#endif
#define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
Index: linux-2.6.17-rc1-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.17-rc1-mm1.orig/mm/mempolicy.c 2006-04-05 10:14:39.000000000 -0400
+++ linux-2.6.17-rc1-mm1/mm/mempolicy.c 2006-04-05 10:14:41.000000000 -0400
@@ -1179,7 +1179,7 @@ struct zonelist *huge_zonelist(struct vm
/* Allocate a page in interleaved policy.
Own path because it needs to do special accounting. */
-static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
+struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
unsigned nid)
{
struct zonelist *zl;
Index: linux-2.6.17-rc1-mm1/mm/migrate.c
===================================================================
--- linux-2.6.17-rc1-mm1.orig/mm/migrate.c 2006-04-05 10:14:38.000000000 -0400
+++ linux-2.6.17-rc1-mm1/mm/migrate.c 2006-04-05 10:14:41.000000000 -0400
@@ -59,7 +59,8 @@ int isolate_lru_page(struct page *page,
del_page_from_active_list(zone, page);
else
del_page_from_inactive_list(zone, page);
- list_add_tail(&page->lru, pagelist);
+ if (pagelist)
+ list_add_tail(&page->lru, pagelist);
}
spin_unlock_irq(&zone->lru_lock);
}
@@ -88,9 +89,14 @@ int migrate_prep(void)
return 0;
}
-static inline void move_to_lru(struct page *page)
+/*
+ * Put a single page back to appropriate lru list via cache.
+ * Removes page reference added by isolate_lru_page, but
+ * the lru_cache_add*() will add a temporary ref while the
+ * pages resides in the cache [pagevec].
+ */
+static inline void putback_lru_page(struct page *page)
{
- list_del(&page->lru);
if (PageActive(page)) {
/*
* lru_cache_add_active checks that
@@ -104,6 +110,12 @@ static inline void move_to_lru(struct pa
put_page(page);
}
+static inline void move_to_lru(struct page *page)
+{
+ list_del(&page->lru);
+ putback_lru_page(page);
+}
+
/*
* Add isolated pages on the list back to the LRU.
*
@@ -125,7 +137,7 @@ int putback_lru_pages(struct list_head *
/*
* Non migratable page
*/
-int fail_migrate_page(struct page *newpage, struct page *page)
+int fail_migrate_page(struct page *newpage, struct page *page, int faulting)
{
return -EIO;
}
@@ -335,29 +347,35 @@ EXPORT_SYMBOL(migrate_page_copy);
*
* Pages are locked upon entry and exit.
*/
-int migrate_page(struct page *newpage, struct page *page)
+int migrate_page(struct page *newpage, struct page *page, int faulting)
{
- int rc;
- int nr_refs = 2; /* cache + current */
+ int rc = 0;
+ /*
+ * nr_refs: cache + current [+ fault path]
+ */
+ int nr_refs = 2 + !!faulting;
BUG_ON(PageWriteback(page)); /* Writeback must be complete */
- rc = migrate_page_unmap_and_replace(newpage, page, nr_refs);
-
+ if (!faulting)
+ rc = migrate_page_try_to_unmap(page, nr_refs);
+ if (!rc)
+ rc = migrate_page_replace_in_mapping(newpage, page, nr_refs);
if (rc)
return rc;
migrate_page_copy(newpage, page);
/*
- * Remove auxiliary swap entries and replace
- * them with real ptes.
+ * If we are not already in the fault path, remove auxiliary swap
+ * entries and replace them with real ptes.
*
* Note that a real pte entry will allow processes that are not
* waiting on the page lock to use the new page via the page tables
* before the new page is unlocked.
*/
- remove_from_swap(newpage);
+ if (!faulting)
+ remove_from_swap(newpage);
return 0;
}
EXPORT_SYMBOL(migrate_page);
@@ -468,7 +486,7 @@ redo:
* own migration function. This is the most common
* path for page migration.
*/
- rc = mapping->a_ops->migratepage(newpage, page);
+ rc = mapping->a_ops->migratepage(newpage, page, 0);
goto unlock_both;
}
@@ -498,7 +516,7 @@ redo:
*/
if (!page_has_buffers(page) ||
try_to_release_page(page, GFP_KERNEL)) {
- rc = migrate_page(newpage, page);
+ rc = migrate_page(newpage, page, 0);
goto unlock_both;
}
@@ -555,23 +573,28 @@ next:
* if the underlying filesystem guarantees that no other references to "page"
* exist.
*/
-int buffer_migrate_page(struct page *newpage, struct page *page)
+int buffer_migrate_page(struct page *newpage, struct page *page, int faulting)
{
struct address_space *mapping = page->mapping;
struct buffer_head *bh, *head;
- int nr_refs = 3; /* cache + bufs + current */
- int rc;
+ int rc = 0;
+ /*
+ * nr_refs: cache + bufs + current [+ fault path]
+ */
+ int nr_refs = 3 + !!faulting;
if (!mapping)
return -EAGAIN;
if (!page_has_buffers(page))
- return migrate_page(newpage, page);
+ return migrate_page(newpage, page, faulting);
head = page_buffers(page);
- rc = migrate_page_unmap_and_replace(newpage, page, nr_refs);
-
+ if (!faulting)
+ rc = migrate_page_try_to_unmap(page, nr_refs);
+ if (!rc)
+ rc = migrate_page_replace_in_mapping(newpage, page, nr_refs);
if (rc)
return rc;
@@ -683,3 +706,71 @@ out:
nr_pages++;
return nr_pages;
}
+
+/*
+ * attempt to migrate a misplaced page to the specified destination
+ * node. Page is already unmapped and locked by caller. Anon pages
+ * are in the swap cache.
+ *
+ * page refs on entry/exit: cache + fault path [+ bufs]
+ */
+struct page *migrate_misplaced_page(struct page *page,
+ int dest, int interleaved)
+{
+ struct page *newpage;
+ struct address_space *mapping = page_mapping(page);
+ unsigned int gfp;
+
+//TODO: explicit assertions during debug/testing
+ BUG_ON(!PageLocked(page));
+ BUG_ON(page_mapcount(page));
+ if (PageAnon(page))
+ BUG_ON(!PageSwapCache(page));
+ BUG_ON(!mapping);
+
+ if (isolate_lru_page(page, NULL)) /* incrs page count on success */
+ goto out_nolru; /* we lost */
+
+//TODO: or just use GFP_HIGHUSER ?
+ gfp = (unsigned int)mapping_gfp_mask(mapping);
+
+ if (interleaved)
+ newpage = alloc_page_interleave(gfp, 0, dest);
+ else
+ newpage = alloc_pages_node(dest, gfp, 0);
+
+ if (!newpage)
+ goto out; /* give up */
+ lock_page(newpage);
+
+ if (mapping->a_ops->migratepage) {
+ /*
+ * migrating in fault path.
+ * migrate a_op transfers cache [+ buf] refs
+ */
+ int rc = mapping->a_ops->migratepage(newpage, page, 1);
+ if (rc) {
+ unlock_page(newpage);
+ __free_page(newpage);
+ } else {
+ get_page(newpage); /* add isolate_lru_page ref */
+ put_page(page); /* drop " " */
+
+ unlock_page(page);
+ put_page(page); /* drop fault path ref & free */
+
+ page = newpage;
+ }
+ goto out;
+ } else {
+//TODO: for now, give up if no address space migrate op.
+// later, handle w/ default mechanism, like migrate_pages?
+ }
+
+out:
+ putback_lru_page(page); /* drops a page ref */
+
+out_nolru:
+ return page;
+
+}
Index: linux-2.6.17-rc1-mm1/include/linux/migrate.h
===================================================================
--- linux-2.6.17-rc1-mm1.orig/include/linux/migrate.h 2006-04-05 10:14:38.000000000 -0400
+++ linux-2.6.17-rc1-mm1/include/linux/migrate.h 2006-04-05 10:14:41.000000000 -0400
@@ -7,7 +7,7 @@
#ifdef CONFIG_MIGRATION
extern int isolate_lru_page(struct page *p, struct list_head *pagelist);
extern int putback_lru_pages(struct list_head *l);
-extern int migrate_page(struct page *, struct page *);
+extern int migrate_page(struct page *, struct page *, int);
extern void migrate_page_copy(struct page *, struct page *);
extern int migrate_page_try_to_unmap(struct page *, int);
extern int migrate_page_replace_in_mapping(struct page *, struct page *, int);
@@ -16,7 +16,8 @@ extern int migrate_pages(struct list_hea
struct list_head *moved, struct list_head *failed);
extern int migrate_pages_to(struct list_head *pagelist,
struct vm_area_struct *vma, int dest);
-extern int fail_migrate_page(struct page *, struct page *);
+struct page *migrate_misplaced_page(struct page *, int, int);
+extern int fail_migrate_page(struct page *, struct page *, int);
extern int migrate_prep(void);