From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Hirokazu Takahashi <taka@valinux.co.jp>
Cc: haveblue@us.ibm.com, akpm@osdl.org, linux-mm@kvack.org,
piggin@cyberone.com.au, arjanv@redhat.com,
linux-kernel@vger.kernel.org
Subject: Re: [RFC] memory defragmentation to satisfy high order allocations
Date: Sat, 2 Oct 2004 15:33:49 -0300
Message-ID: <20041002183349.GA7986@logos.cnet>
In-Reply-To: <20041002.183015.41630389.taka@valinux.co.jp>

On Sat, Oct 02, 2004 at 06:30:15PM +0900, Hirokazu Takahashi wrote:
> Hello, Marcelo.
>
> Generic memory defragmentation will be very nice for me to implement
> hugetlbpage migration, as allocating a new hugetlbpage is a hard job.
>
> > For the "defragmentation" operation we want to do an "easy" try - ie if we
> > can't remap giveup.
> >
> > I feel we should try to "untie" the code which checks for remapping availability /
> > does the remapping from the page migration - so to be able to share the most
> > code between it and other users of the same functionality.
>
> I think it's possible to introduce non-wait mode to the migration code,
> as you may expect. Shall I implement it?
>
> > Curiosity: How did you guys test the migration operation? Several threads on
> > several processors operating on the memory, etc?
>
> I always test it with the zone hotplug emulation patch, which Mr.Iwamoto
> has made. I usually run following jobs concurrently while zones are added
> and removed repeatedly on a SMP machine.
> - making linux kernel
> - copying file trees.
> - overwriting file trees.
> - removing file trees
> - some pages are swapped out automatically:)
>
> And Mr.Iwamoto has some small programs to check any kind of page
> can be migrated. The programs repeat one of following actions:
> - read/write files .
> - use MAP_SHARED and MAP_PRIVATE mmap()'s and read/write there.
> - use Direct I/O.
> - use AIO.
> - fork to have COW pages.
> - use shmem.
> - use sendfile.
>
> > Cool. I'll take a closer look at the relevant parts of memory hotplug patches
> > this weekend, hopefully. See if I can help with testing of these patches too.
>
> Any comments are very welcome.

I have a few comments about the code:

1)
I'm pretty sure you should transfer the radix tree tags in radix_tree_replace().
If, for example, you transfer a page tagged dirty to another zone, mpage_writepages()
will miss it (because it finds pages with pagevec_lookup_tag() and PAGECACHE_TAG_DIRTY).
It should be quite trivial to do (save the tags before deleting and set them on the
new entry, all in radix_tree_replace()).

My implementation also contained the same bug.
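
To make the tag problem concrete, here is a minimal userspace sketch of the idea. It is a toy model of a single slot, not kernel code: every toy_* name and the TOY_TAG_* bits are hypothetical stand-ins, and the real radix tree keeps its tags per node rather than per slot. It only illustrates why a delete-then-insert replace loses the dirty tag, and how saving the tags first avoids that.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration; none of these are the kernel's definitions. */
enum { TOY_TAG_DIRTY = 1 << 0, TOY_TAG_WRITEBACK = 1 << 1 };

struct toy_page { int id; };

struct toy_slot {
	struct toy_page *page;	/* item stored at this index */
	unsigned int tags;	/* tag bits attached to the item */
};

/* Deleting an item also clears its tags. */
static struct toy_page *toy_delete(struct toy_slot *slot)
{
	struct toy_page *old = slot->page;

	slot->page = NULL;
	slot->tags = 0;
	return old;
}

/* Inserting into an empty slot leaves the new item untagged. */
static void toy_insert(struct toy_slot *slot, struct toy_page *page)
{
	assert(slot->page == NULL);
	slot->page = page;
}

/* Naive replace: delete + insert, so the old item's tags are lost. */
static struct toy_page *toy_replace_naive(struct toy_slot *slot,
					  struct toy_page *newpage)
{
	struct toy_page *old = toy_delete(slot);

	toy_insert(slot, newpage);
	return old;
}

/* Replace that saves the tags before deleting and sets them on the new entry. */
static struct toy_page *toy_replace_keep_tags(struct toy_slot *slot,
					      struct toy_page *newpage)
{
	unsigned int saved = slot->tags;
	struct toy_page *old = toy_delete(slot);

	toy_insert(slot, newpage);
	slot->tags = saved;
	return old;
}

/* Tagged lookup: only returns the item if the given tag is set. */
static struct toy_page *toy_lookup_tag(const struct toy_slot *slot,
				       unsigned int tag)
{
	return (slot->tags & tag) ? slot->page : NULL;
}
```

With the naive variant, a tagged lookup for TOY_TAG_DIRTY finds nothing after the replace, which is the toy equivalent of writeback skipping the migrated page.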

2)
In migrate_onepage() you add anonymous pages which have no swap space allocated
to the swap cache:

+	/*
+	 * Put the page in a radix tree if it isn't in the tree yet.
+	 */
+#ifdef CONFIG_SWAP
+	if (PageAnon(page) && !PageSwapCache(page))
+		if (!add_to_swap(page, GFP_KERNEL)) {
+			unlock_page(page);
+			return ERR_PTR(-ENOSPC);
+		}
+#endif /* CONFIG_SWAP */

Why is that? You can copy anonymous pages without adding them to swap (that's
what the patch I posted does).

3)
In migrate_page_common() you assume that additional page references
(page_migratable() returning -EAGAIN) mean the code should try to write out
the page.

Is that assumption always valid?

In theory there is no need to write out pages when migrating them to
other zones - they will be copied and the dirty information retained (either
in the PageDirty bit or in the radix tree tag).

I just noticed you do that in later patches (migrate_page_buffer), but AFAICS
the writeout remains. Why aren't you using migrate_page_buffer yet?

I think the final aim should be to remove the need for "pageout()"
completely.

4)
About implementing a nonblocking version of it: the easiest way, it
seems to me, is to pass a "block" argument to generic_migrate_page() and
use that.
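
As a rough illustration of the "block" argument, here is a minimal userspace sketch. It is not the patch's code: struct toy_page, toy_wait_for_ref_drop() and toy_migrate_page() are hypothetical names, and the "wait" is simulated by dropping one reference per call. The only point is the control flow: with block unset the function gives up with -EAGAIN as soon as it sees extra references, which is the "easy try" that defragmentation wants.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy page: counts references beyond the ones migration expects. */
struct toy_page {
	int extra_refs;
};

/* Hypothetical stand-in for sleeping until a transient reference goes away. */
static void toy_wait_for_ref_drop(struct toy_page *page)
{
	if (page->extra_refs > 0)
		page->extra_refs--;
}

/*
 * Returns 0 once the page could be migrated, or -EAGAIN if it is still
 * referenced and the caller asked not to block.
 */
static int toy_migrate_page(struct toy_page *page, bool block)
{
	assert(page);

	while (page->extra_refs > 0) {
		if (!block)
			return -EAGAIN;	/* nonblocking mode: give up right away */
		toy_wait_for_ref_drop(page);
	}

	/* The actual copy and remap would happen here. */
	return 0;
}
```

A hotplug-style caller would pass block = true and wait for the page, while a defragmentation caller would pass block = false and simply move on to another page on -EAGAIN.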

Questions: are there any documents on the memory hotplug userspace tools?
Where can I find them?

Are Iwamoto's test programs available?

In general the code looks nice to me! I'll jump in and help with
testing.