From: Andrea Arcangeli <andrea@e-mind.com>
To: Chuck Lever <cel@monkey.org>
Cc: linux-kernel@vger.rutgers.edu, linux-mm@kvack.org,
"Stephen C. Tweedie" <sct@redhat.com>,
Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [patch] arca-vm-2.2.5
Date: Mon, 5 Apr 1999 02:22:35 +0200 (CEST)
Message-ID: <Pine.LNX.4.05.9904050033340.779-100000@laser.random>
In-Reply-To: <Pine.BSF.4.03.9904041657210.15836-100000@funky.monkey.org>
On Sun, 4 Apr 1999, Chuck Lever wrote:
>> ftp://e-mind.com/pub/linux/arca-tree/2.2.5_arca2.gz
*snip*
>first, i notice you've altered the page hash function and quadrupled the
The page hash function change is from Stephen (I did it here too because I
completely agreed with it). The point is that shm entries use the lower
bits of the pagemap->offset field.
>size of the hash table. do you have measurements/benchmarks that show
>that the page hash was not working well? can you say how a plain 2.2.5
The page_hash change looked to me like a quite obvious improvement while
swapping shm entries in/out (it will improve the swap cache queries), but
see my comment below...
>kernel compares to one that has just the page hash changes without the
>rest of your VM modifications? the reason i ask is because i've played
The reason is that it's an obvious improvement. And since the table is
statically allocated (not dynamically allocated at boot as a function of
the memory size) a slightly larger default can be desirable; I can safely
allocate a bit more memory (some tens of kbytes) without harming the
available mm here. Well, as I have said many times, I think someday we'll
need RB-trees instead of a fuzzy hash, but it's not a big issue right now
due to the low number of pages available.
Returning to your question: in my tree I enlarged the hashtable to 13 bits.
This means that in the best case I'll be able to address up to 8192 pages
in O(1). Here I have 32752 pages, so if they spread evenly I'll have about
4 pages chained on every hash entry. A 13-bit hash table takes 32k of
memory (really not an issue ;).
In the stock kernel instead the hash size is 2^11 = 2048, so under the same
even spread I would have 16 pages chained on the same hash entry.
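
The arithmetic above can be checked with a minimal sketch. The helper
names are hypothetical, and the 4-byte pointer size is an assumption
(32-bit x86) that happens to match the 32k figure; chain lengths assume a
perfectly even spread, which a real hash will not guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Number of buckets in a hash table indexed by `bits` bits. */
static unsigned long hash_buckets(unsigned int bits)
{
    assert(bits < 8 * sizeof(unsigned long));
    return 1UL << bits;
}

/* Pages per bucket if the pages spread perfectly evenly, rounded up. */
static unsigned long even_chain_length(unsigned long pages, unsigned int bits)
{
    unsigned long buckets = hash_buckets(bits);

    return (pages + buckets - 1) / buckets;
}

/* Bytes used by the table itself: one head pointer per bucket. */
static unsigned long table_bytes(unsigned int bits, size_t pointer_size)
{
    return hash_buckets(bits) * (unsigned long)pointer_size;
}
```

With 32752 pages, even_chain_length() gives 4 for 13 bits and 16 for 11
bits, and table_bytes(13, 4) gives 32768 bytes.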
>with that hash table, and found most changes to it cause undesirable
>increases in system CPU utilization. although, it *is* highly interesting
Swapping shm entries out/in is not as frequent a task as doing normal
queries on the page cache. So I am removing the patch here. Thanks for the
info, I really hadn't thought about this...
For the record this is the hash-function change we are talking about:
Index: pagemap.h
===================================================================
RCS file: /var/cvs/linux/include/linux/pagemap.h,v
retrieving revision 1.1.1.2
retrieving revision 1.1.2.12
diff -u -r1.1.1.2 -r1.1.2.12
--- pagemap.h 1999/01/23 16:29:55 1.1.1.2
+++ pagemap.h 1999/04/01 23:12:37 1.1.2.12
@@ -36,7 +35,7 @@
 #define i (((unsigned long) inode)/(sizeof(struct inode) & ~ (sizeof(struct inode) - 1)))
 #define o (offset >> PAGE_SHIFT)
 #define s(x) ((x)+((x)>>PAGE_HASH_BITS))
-	return s(i+o) & (PAGE_HASH_SIZE-1);
+	return s(i+o+offset) & (PAGE_HASH_SIZE-1);
 #undef i
 #undef o
 #undef s
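
To make the effect of the one-line change visible, here is a user-space
sketch of both variants. It is not the kernel code: PAGE_SHIFT = 12 and
INODE_SIZE = 392 are assumptions chosen for illustration (the real
sizeof(struct inode) differs), the inode is passed as a plain address, and
the function names are hypothetical.

```c
#define PAGE_SHIFT      12                  /* assumed: 4 KB pages */
#define PAGE_HASH_BITS  11                  /* stock table size from the mail */
#define PAGE_HASH_SIZE  (1UL << PAGE_HASH_BITS)

/* Hypothetical stand-in for sizeof(struct inode). */
#define INODE_SIZE      392UL

/* Lowest set bit of INODE_SIZE: the largest power of two dividing it. */
#define INODE_DIV       (INODE_SIZE & ~(INODE_SIZE - 1))

/* Same folding step as the s(x) macro in the diff. */
static unsigned long fold(unsigned long x)
{
    return x + (x >> PAGE_HASH_BITS);
}

/* Old variant: only the page index of the offset feeds the hash. */
static unsigned long page_hash_old(unsigned long inode_addr,
                                   unsigned long offset)
{
    unsigned long i = inode_addr / INODE_DIV;
    unsigned long o = offset >> PAGE_SHIFT;

    return fold(i + o) & (PAGE_HASH_SIZE - 1);
}

/* Patched variant: the raw offset is added too, so its low bits count. */
static unsigned long page_hash_new(unsigned long inode_addr,
                                   unsigned long offset)
{
    unsigned long i = inode_addr / INODE_DIV;
    unsigned long o = offset >> PAGE_SHIFT;

    return fold(i + o + offset) & (PAGE_HASH_SIZE - 1);
}
```

Two offsets that differ only below PAGE_SHIFT (for example 1 and 2) land
in the same bucket with page_hash_old() and in different buckets with
page_hash_new(), which is the case the shm entries care about. The extra
term also changes the distribution of ordinary page cache lookups, which
is the cost discussed in this thread.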
>that the buffer hash table is orders of magnitude larger, yet hashes about
>the same number of objects. can someone provide history on the design of
>the page hash function?
I can't help you with this, but it looks OK to me ;). If somebody did the
math on it I'd like to try to understand it.
>also, can you tell what improvement you expect from the additional logic
>in try_to_free_buffers() ?
Eh, my shrink_mmap() is black magic and it would take long to explain what
I thought ;). Well, one of the reasons is that ext2 keeps the superblock in
use all the time, so when I reach a used buffer I put it back at the top
of the lru list, since I don't want to go into swap just because there are
some unfreeable superblocks that live forever at the end of the pagemap
lru_list.
Note also (you didn't ask about that but I bet you noticed it ;) that
in my tree I also made every pagemap entry L1 cacheline aligned. I asked
the people who were complaining about page colouring (and I still don't
know exactly what page colouring is; I only have a guess, but I would like
to read something about implementation details, pointers???) to try out my
patch to see if it made a difference, but I had no feedback :(. I also
made the irq_state entry cacheline aligned (when I understood the
cacheline issue I agreed with it).
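
For readers unfamiliar with the idea, cacheline alignment of an array
entry can be sketched in user space with C11 alignas. Everything here is
an assumption for illustration: the 32-byte line size, the struct
page_entry layout and the helper name are hypothetical and are not taken
from the patch.

```c
#include <stdalign.h>
#include <stdint.h>

#define L1_CACHE_BYTES 32   /* assumption: 32-byte L1 cache lines */

/* Hypothetical, heavily simplified stand-in for a page map entry. */
struct page_entry {
    alignas(L1_CACHE_BYTES) struct page_entry *next_hash;
    unsigned long offset;
    unsigned int count;
};

/* Returns 1 when p starts exactly on a cacheline boundary. */
static int is_cacheline_aligned(const void *p)
{
    return ((uintptr_t)p % L1_CACHE_BYTES) == 0;
}
```

Aligning the first member forces the whole struct to that alignment, and
the compiler pads sizeof(struct page_entry) up to a multiple of
L1_CACHE_BYTES, so every element of an array of entries starts on its own
line. The price is the padding memory per entry.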
Many thanks for reading and commenting on my new experimental code (rock
solid here). I'll now release:
ftp://e-mind.com/pub/linux/arca-tree/2.2.5_arca4.bz2
It will have the latest stuff I did in the last days (flushtime-bugfix and
sane sysctl values included) plus the old hash function, for the reasons
you just pointed out to me.
If you find some spare time to try out the new patch let me know the
numbers! ;))
Andrea Arcangeli
--
To unsubscribe, send a message with 'unsubscribe linux-mm my@address'
in the body to majordomo@kvack.org. For more info on Linux MM,
see: http://humbolt.geo.uu.nl/Linux-MM/