From: John Stultz <john.stultz@linaro.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: John Stultz <john.stultz@linaro.org>,
Andrew Morton <akpm@linux-foundation.org>,
Android Kernel Team <kernel-team@android.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
Hugh Dickins <hughd@google.com>, Dave Hansen <dave@sr71.net>,
Rik van Riel <riel@redhat.com>,
Dmitry Adamushko <dmitry.adamushko@gmail.com>,
Neil Brown <neilb@suse.de>,
Andrea Arcangeli <aarcange@redhat.com>,
Mike Hommey <mh@glandium.org>, Taras Glek <tglek@mozilla.com>,
Jan Kara <jack@suse.cz>,
KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
Michel Lespinasse <walken@google.com>,
Minchan Kim <minchan@kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: [PATCH 4/5] vrange: Set affected pages referenced when marking volatile
Date: Fri, 21 Mar 2014 14:17:34 -0700 [thread overview]
Message-ID: <1395436655-21670-5-git-send-email-john.stultz@linaro.org> (raw)
In-Reply-To: <1395436655-21670-1-git-send-email-john.stultz@linaro.org>
One issue that some potential users were concerned about, was that
they wanted to ensure that all the pages from one volatile range
were purged before we purge pages from a different volatile range.
This would prevent the case where they have 4 large objects, and
the system purges one page from each object, casuing all of the
objects to have to be re-created.
The counter-point to this case, is when an application is using the
SIGBUS semantics to continue to access pages after they have been
marked volatile. In that case, the desire was that the most recently
touched pages be purged last, and only the "cold" pages be purged
from the specified range.
Instead of adding option flags for the various usage model (at least
initially), one way of getting a solutoin for both uses would be to
have the act of marking pages as volatile in effect mark the pages
as accessed. Since all of the pages in the range would be marked
together, they would be of the same "age" and would (approximately)
be purged together. Further, if any pages in the range were accessed
after being marked volatile, they would be moved to the end of the
lru and be purged later.
This patch provides this solution by walking the pages in the range
and setting them accessed when set volatile.
This does have a performance impact, as we have to touch each page
when setting them volatile. Additionally, while setting all the
pages to the same age solves the basic problem, there is still an
open question of: What age all the pages should be set to?
One could consider them all recently accessed, which would put them
at the end of the active lru. Or one could possibly move them all to
the end of the inactive lru, making them more likely to be purged
sooner.
Another possibility would be to not affect the pages at all when
marking them as volatile, and allow applications to use madvise
prior to marking any pages as volatile to age them together, if
that behavior was needed. In that case this patch would be
unnecessary.
Thoughts on the best approach would be greatly appreciated.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Android Kernel Team <kernel-team@android.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Robert Love <rlove@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Hommey <mh@glandium.org>
Cc: Taras Glek <tglek@mozilla.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
mm/vrange.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 71 insertions(+)
diff --git a/mm/vrange.c b/mm/vrange.c
index 28ceb6f..9be8f45 100644
--- a/mm/vrange.c
+++ b/mm/vrange.c
@@ -79,6 +79,73 @@ static int vrange_check_purged(struct mm_struct *mm,
}
+
+/**
+ * vrange_mark_accessed_pte - Marks pte pages in range accessed
+ *
+ * Iterates over the ptes in the pmd and marks the coresponding page
+ * as accessed. This ensures all the pages in the range are of the
+ * same "age", so that when pages are purged, we will most likely purge
+ * them together.
+ */
+static int vrange_mark_accessed_pte(pmd_t *pmd, unsigned long addr,
+ unsigned long end, struct mm_walk *walk)
+{
+ struct vm_area_struct *vma = walk->private;
+ pte_t *pte;
+ spinlock_t *ptl;
+
+ if (pmd_trans_huge(*pmd))
+ return 0;
+ if (pmd_trans_unstable(pmd))
+ return 0;
+
+ pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ for (; addr != end; pte++, addr += PAGE_SIZE) {
+ if (pte_present(*pte)) {
+ struct page *page;
+
+ page = vm_normal_page(vma, addr, *pte);
+ if (IS_ERR_OR_NULL(page))
+ break;
+ get_page(page);
+ /*
+ * XXX - So here we may want to do something
+ * else other then marking the page accessed.
+ * Setting them to all be the same "age" ensures
+ * they are pruged together, but its not clear
+ * what that "age" should be.
+ */
+ mark_page_accessed(page);
+ put_page(page);
+ }
+ }
+ pte_unmap_unlock(pte - 1, ptl);
+ cond_resched();
+
+ return 0;
+}
+
+
+/**
+ * vrange_mark_range_accessed - Sets up a mm_walk to mark pages accessed
+ *
+ * Sets up and calls wa_page_range() to mark affected pages as accessed.
+ */
+static void vrange_mark_range_accessed(struct vm_area_struct *vma,
+ unsigned long start,
+ unsigned long end)
+{
+ struct mm_walk vrange_walk = {
+ .pmd_entry = vrange_mark_accessed_pte,
+ .mm = vma->vm_mm,
+ .private = vma,
+ };
+
+ walk_page_range(start, end, &vrange_walk);
+}
+
+
/**
* do_vrange - Marks or clears VMAs in the range (start-end) as VM_VOLATILE
*
@@ -165,6 +232,10 @@ static ssize_t do_vrange(struct mm_struct *mm, unsigned long start,
success:
vma->vm_flags = new_flags;
+ /* Mark the vma range as accessed */
+ if (mode == VRANGE_VOLATILE)
+ vrange_mark_range_accessed(vma, start, tmp);
+
/* update count to distance covered so far*/
count = tmp - orig_start;
--
1.8.3.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-03-21 21:18 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-21 21:17 [PATCH 0/5] Volatile Ranges (v12) & LSF-MM discussion fodder John Stultz
2014-03-21 21:17 ` [PATCH 1/5] vrange: Add vrange syscall and handle splitting/merging and marking vmas John Stultz
2014-03-23 12:20 ` Jan Kara
2014-03-23 20:34 ` John Stultz
2014-03-23 16:50 ` KOSAKI Motohiro
2014-04-08 18:52 ` John Stultz
2014-03-21 21:17 ` [PATCH 2/5] vrange: Add purged page detection on setting memory non-volatile John Stultz
2014-03-23 12:29 ` Jan Kara
2014-03-23 20:21 ` John Stultz
2014-03-23 17:42 ` KOSAKI Motohiro
2014-04-07 18:37 ` John Stultz
2014-04-07 22:14 ` KOSAKI Motohiro
2014-04-08 3:09 ` John Stultz
2014-03-23 17:50 ` KOSAKI Motohiro
2014-03-23 20:26 ` John Stultz
2014-03-23 21:50 ` KOSAKI Motohiro
2014-04-09 18:29 ` John Stultz
2014-03-21 21:17 ` [PATCH 3/5] vrange: Add page purging logic & SIGBUS trap John Stultz
2014-03-23 23:44 ` KOSAKI Motohiro
2014-04-10 18:49 ` John Stultz
2014-03-21 21:17 ` John Stultz [this message]
2014-03-24 0:01 ` [PATCH 4/5] vrange: Set affected pages referenced when marking volatile KOSAKI Motohiro
2014-03-21 21:17 ` [PATCH 5/5] vmscan: Age anonymous memory even when swap is off John Stultz
2014-03-24 17:33 ` Rik van Riel
2014-03-24 18:04 ` John Stultz
2014-04-01 21:21 ` [PATCH 0/5] Volatile Ranges (v12) & LSF-MM discussion fodder Johannes Weiner
2014-04-01 21:34 ` H. Peter Anvin
2014-04-01 21:35 ` H. Peter Anvin
2014-04-01 23:01 ` Dave Hansen
2014-04-02 4:12 ` John Stultz
2014-04-02 16:36 ` Johannes Weiner
2014-04-02 17:40 ` John Stultz
2014-04-02 17:58 ` Johannes Weiner
2014-04-02 19:01 ` John Stultz
2014-04-02 19:47 ` Johannes Weiner
2014-04-02 20:13 ` John Stultz
2014-04-02 22:44 ` Jan Kara
2014-04-11 19:32 ` John Stultz
2014-04-07 5:48 ` Minchan Kim
2014-04-08 4:32 ` Kevin Easton
2014-04-08 3:38 ` John Stultz
2014-04-07 5:24 ` Minchan Kim
2014-04-02 4:03 ` John Stultz
2014-04-02 4:07 ` H. Peter Anvin
2014-04-02 16:30 ` Johannes Weiner
2014-04-02 16:32 ` H. Peter Anvin
2014-04-02 16:37 ` H. Peter Anvin
2014-04-02 17:18 ` Johannes Weiner
2014-04-02 17:40 ` Dave Hansen
2014-04-02 17:48 ` John Stultz
2014-04-02 18:07 ` Johannes Weiner
2014-04-02 19:37 ` John Stultz
2014-04-02 18:31 ` Andrea Arcangeli
2014-04-02 19:27 ` Johannes Weiner
2014-04-07 6:19 ` Minchan Kim
2014-04-02 19:51 ` John Stultz
2014-04-07 6:11 ` Minchan Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1395436655-21670-5-git-send-email-john.stultz@linaro.org \
--to=john.stultz@linaro.org \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=dave@sr71.net \
--cc=dmitry.adamushko@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=kernel-team@android.com \
--cc=kosaki.motohiro@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mel@csn.ul.ie \
--cc=mh@glandium.org \
--cc=minchan@kernel.org \
--cc=neilb@suse.de \
--cc=riel@redhat.com \
--cc=rlove@google.com \
--cc=tglek@mozilla.com \
--cc=walken@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox