From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mel@csn.ul.ie,
clameter@sgi.com, riel@redhat.com, andrea@suse.de,
a.p.zijlstra@chello.nl, eric.whitney@hp.com, npiggin@suse.de
Subject: Re: [PATCH/RFC 6/14] Reclaim Scalability: "No Reclaim LRU Infrastructure"
Date: Wed, 19 Sep 2007 11:30:16 +0530 [thread overview]
Message-ID: <46F0BAF0.2020806@linux.vnet.ibm.com> (raw)
In-Reply-To: <20070914205438.6536.49500.sendpatchset@localhost>
Lee Schermerhorn wrote:
> PATCH/RFC 06/14 Reclaim Scalability: "No Reclaim LRU Infrastructure"
>
> Against: 2.6.23-rc4-mm1
>
> Infrastructure to manage pages excluded from reclaim--i.e., hidden
> from vmscan. Based on a patch by Larry Woodman of Red Hat. Reworked
> to maintain "nonreclaimable" pages on a separate per-zone LRU list,
> to "hide" them from vmscan. A separate noreclaim pagevec is provided
> for shrink_active_list() to move nonreclaimable pages to the noreclaim
> list without over burdening the zone lru_lock.
>
> Pages on the noreclaim list have both PG_noreclaim and PG_lru set.
> Thus, PG_noreclaim is analogous to and mutually exclusive with
> PG_active--it specifies which LRU list the page is on.
>
> The noreclaim infrastructure is enabled by a new mm Kconfig option
> [CONFIG_]NORECLAIM.
>
Could we use a different name. CONFIG_NORECLAIM could be misunderstood
to be that reclaim is disabled on the system all together.
>
> 4. TODO: Memory Controllers maintain separate active and inactive lists.
> Need to consider whether they should also maintain a noreclaim list.
> Also, convert to use Christoph's array of indexed lru variables?
>
> See //TODO note in mm/memcontrol.c re: isolating non-reclaimable
> pages.
>
Thanks, I'll look into exploiting this in the memory controller.
> Index: Linux/mm/swap.c
> ===================================================================
> --- Linux.orig/mm/swap.c 2007-09-14 10:21:45.000000000 -0400
> +++ Linux/mm/swap.c 2007-09-14 10:21:48.000000000 -0400
> @@ -116,14 +116,14 @@ int rotate_reclaimable_page(struct page
> return 1;
> if (PageDirty(page))
> return 1;
> - if (PageActive(page))
> + if (PageActive(page) | PageNoreclaim(page))
Did you intend to make this bitwise or?
> - if (PageLRU(page) && !PageActive(page)) {
> + if (PageLRU(page) && !PageActive(page) && !PageNoreclaim(page)) {
Since we use this even below, does it make sense to wrap it into an
inline function and call it check_page_lru_inactive_reclaimable()?
> void lru_add_drain(void)
> @@ -277,14 +312,18 @@ void release_pages(struct page **pages,
>
> if (PageLRU(page)) {
> struct zone *pagezone = page_zone(page);
> + int is_lru_page;
> +
> if (pagezone != zone) {
> if (zone)
> spin_unlock_irq(&zone->lru_lock);
> zone = pagezone;
> spin_lock_irq(&zone->lru_lock);
> }
> - VM_BUG_ON(!PageLRU(page));
> - __ClearPageLRU(page);
> + is_lru_page = PageLRU(page);
> + VM_BUG_ON(!(is_lru_page));
> + if (is_lru_page)
This is a little confusing, after asserting that the page
is indeed in LRU, why add the check for is_lru_page again?
Comments will be helpful here.
> +#ifdef CONFIG_NORECLAIM
> +void __pagevec_lru_add_noreclaim(struct pagevec *pvec)
> +{
> + int i;
> + struct zone *zone = NULL;
> +
> + for (i = 0; i < pagevec_count(pvec); i++) {
> + struct page *page = pvec->pages[i];
> + struct zone *pagezone = page_zone(page);
> +
> + if (pagezone != zone) {
> + if (zone)
> + spin_unlock_irq(&zone->lru_lock);
> + zone = pagezone;
> + spin_lock_irq(&zone->lru_lock);
> + }
> + VM_BUG_ON(PageLRU(page));
> + SetPageLRU(page);
> + VM_BUG_ON(PageActive(page) || PageNoreclaim(page));
> + SetPageNoreclaim(page);
> + add_page_to_noreclaim_list(zone, page);
These two calls seem to be the only difference between __pagevec_lru_add
and this routine, any chance we could refactor to reuse most of the
code? Something like __pagevec_lru_add_prepare(), do the stuff and
then call __pagevec_lru_add_finish()
> +/*
> + * move_to_lru() - place @page onto appropriate lru list
> + * based on preserved page flags: active, noreclaim, none
> + */
> static inline void move_to_lru(struct page *page)
> {
> - if (PageActive(page)) {
> + if (PageNoreclaim(page)) {
> + VM_BUG_ON(PageActive(page));
> + ClearPageNoreclaim(page);
> + lru_cache_add_noreclaim(page);
I know that lru_cache_add_noreclaim() does the right thing
by looking at PageNoReclaim(), but the sequence is a little
confusing to read.
> -int __isolate_lru_page(struct page *page, int mode)
> +int __isolate_lru_page(struct page *page, int mode, int take_nonreclaimable)
> {
> int ret = -EINVAL;
>
> @@ -652,12 +660,27 @@ int __isolate_lru_page(struct page *page
> return ret;
>
> /*
> - * When checking the active state, we need to be sure we are
> - * dealing with comparible boolean values. Take the logical not
> - * of each.
> + * Non-reclaimable pages shouldn't make it onto the inactive list,
> + * so if we encounter one, we should be scanning either the active
> + * list--e.g., after splicing noreclaim list to end of active list--
> + * or nearby pages [lumpy reclaim]. Take it only if scanning active
> + * list.
> */
> - if (mode != ISOLATE_BOTH && (!PageActive(page) != !mode))
> - return ret;
> + if (PageNoreclaim(page)) {
> + if (!take_nonreclaimable)
> + return -EBUSY; /* lumpy reclaim -- skip this page */
> + /*
> + * else fall thru' and try to isolate
> + */
I think we need to distinguish between the types of nonreclaimable
pages. Is it the heavily mapped pages that you pass on further?
A casual reader like me finds it hard to understand how lumpy reclaim
might try to reclaim a non-reclaimable page :-)
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-09-19 6:00 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-09-14 20:53 [PATCH/RFC 0/14] Page Reclaim Scalability Lee Schermerhorn
2007-09-14 20:54 ` [PATCH/RFC 1/14] Reclaim Scalability: Convert anon_vma lock to read/write lock Lee Schermerhorn
2007-09-17 11:02 ` Mel Gorman
2007-09-18 2:41 ` KAMEZAWA Hiroyuki
2007-09-18 11:01 ` Mel Gorman
2007-09-18 14:57 ` Rik van Riel
2007-09-18 15:37 ` Lee Schermerhorn
2007-09-18 20:17 ` Lee Schermerhorn
2007-09-20 10:19 ` Mel Gorman
2007-09-14 20:54 ` [PATCH/RFC 2/14] Reclaim Scalability: convert inode i_mmap_lock to reader/writer lock Lee Schermerhorn
2007-09-17 12:53 ` Mel Gorman
2007-09-20 1:24 ` Andrea Arcangeli
2007-09-20 14:10 ` Lee Schermerhorn
2007-09-20 14:16 ` Andrea Arcangeli
2007-09-14 20:54 ` [PATCH/RFC 3/14] Reclaim Scalability: move isolate_lru_page() to vmscan.c Lee Schermerhorn
2007-09-14 21:34 ` Peter Zijlstra
2007-09-15 1:55 ` Rik van Riel
2007-09-17 14:11 ` Lee Schermerhorn
2007-09-17 9:20 ` Balbir Singh
2007-09-17 19:19 ` Lee Schermerhorn
2007-09-14 20:54 ` [PATCH/RFC 4/14] Reclaim Scalability: Define page_anon() function Lee Schermerhorn
2007-09-15 2:00 ` Rik van Riel
2007-09-17 13:19 ` Mel Gorman
2007-09-18 1:58 ` KAMEZAWA Hiroyuki
2007-09-18 2:27 ` Rik van Riel
2007-09-18 2:40 ` KAMEZAWA Hiroyuki
2007-09-18 15:04 ` Lee Schermerhorn
2007-09-18 19:41 ` Christoph Lameter
2007-09-19 0:30 ` KAMEZAWA Hiroyuki
2007-09-19 16:58 ` Lee Schermerhorn
2007-09-20 0:56 ` KAMEZAWA Hiroyuki
2007-09-14 20:54 ` [PATCH/RFC 5/14] Reclaim Scalability: Use an indexed array for LRU variables Lee Schermerhorn
2007-09-17 13:40 ` Mel Gorman
2007-09-17 14:17 ` Lee Schermerhorn
2007-09-17 14:39 ` Lee Schermerhorn
2007-09-17 18:58 ` Balbir Singh
2007-09-17 19:12 ` Lee Schermerhorn
2007-09-17 19:36 ` Balbir Singh
2007-09-17 19:36 ` Rik van Riel
2007-09-17 20:21 ` Balbir Singh
2007-09-17 21:01 ` Rik van Riel
2007-09-14 20:54 ` [PATCH/RFC 6/14] Reclaim Scalability: "No Reclaim LRU Infrastructure" Lee Schermerhorn
2007-09-14 22:47 ` Christoph Lameter
2007-09-17 15:17 ` Lee Schermerhorn
2007-09-17 18:41 ` Christoph Lameter
2007-09-18 9:54 ` Mel Gorman
2007-09-18 19:45 ` Christoph Lameter
2007-09-19 11:11 ` Mel Gorman
2007-09-19 18:03 ` Christoph Lameter
2007-09-19 6:00 ` Balbir Singh [this message]
2007-09-19 14:47 ` Lee Schermerhorn
2007-09-14 20:54 ` [PATCH/RFC 7/14] Reclaim Scalability: Non-reclaimable page statistics Lee Schermerhorn
2007-09-17 1:56 ` Rik van Riel
2007-09-14 20:54 ` [PATCH/RFC 8/14] Reclaim Scalability: Ram Disk Pages are non-reclaimable Lee Schermerhorn
2007-09-17 1:57 ` Rik van Riel
2007-09-17 14:40 ` Lee Schermerhorn
2007-09-17 18:42 ` Christoph Lameter
2007-09-14 20:54 ` [PATCH/RFC 9/14] Reclaim Scalability: SHM_LOCKED pages are nonreclaimable Lee Schermerhorn
2007-09-17 2:18 ` Rik van Riel
2007-09-14 20:55 ` [PATCH/RFC 10/14] Reclaim Scalability: track anon_vma "related vmas" Lee Schermerhorn
2007-09-17 2:52 ` Rik van Riel
2007-09-17 15:52 ` Lee Schermerhorn
2007-09-14 20:55 ` [PATCH/RFC 11/14] Reclaim Scalability: swap backed pages are nonreclaimable when no swap space available Lee Schermerhorn
2007-09-17 2:53 ` Rik van Riel
2007-09-18 17:46 ` Lee Schermerhorn
2007-09-18 20:01 ` Rik van Riel
2007-09-19 14:55 ` Lee Schermerhorn
2007-09-18 2:59 ` KAMEZAWA Hiroyuki
2007-09-18 15:47 ` Lee Schermerhorn
2007-09-14 20:55 ` [PATCH/RFC 12/14] Reclaim Scalability: Non-reclaimable Mlock'ed pages Lee Schermerhorn
2007-09-14 20:55 ` [PATCH/RFC 13/14] Reclaim Scalability: Handle Mlock'ed pages during map/unmap and truncate Lee Schermerhorn
2007-09-14 20:55 ` [PATCH/RFC 14/14] Reclaim Scalability: cull non-reclaimable anon pages in fault path Lee Schermerhorn
2007-09-14 21:11 ` [PATCH/RFC 0/14] Page Reclaim Scalability Peter Zijlstra
2007-09-14 21:42 ` Linus Torvalds
2007-09-14 22:02 ` Peter Zijlstra
2007-09-15 0:07 ` Linus Torvalds
2007-09-17 6:44 ` Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=46F0BAF0.2020806@linux.vnet.ibm.com \
--to=balbir@linux.vnet.ibm.com \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=andrea@suse.de \
--cc=clameter@sgi.com \
--cc=eric.whitney@hp.com \
--cc=lee.schermerhorn@hp.com \
--cc=linux-mm@kvack.org \
--cc=mel@csn.ul.ie \
--cc=npiggin@suse.de \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox