From: "Juan J. Quintela" <quintela@fi.udc.es>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Rik van Riel <riel@conectiva.com.br>,
"Stephen C. Tweedie" <sct@redhat.com>,
Zlatko Calusic <zlatko@iskon.hr>,
alan@redhat.com, Linux MM List <linux-mm@kvack.org>,
Linux Kernel List <linux-kernel@vger.rutgers.edu>,
Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [patch] improve streaming I/O [bug in shrink_mmap()]
Date: 14 Jun 2000 01:41:42 +0200
Message-ID: <ytthfax87yh.fsf@serpe.mitica>
In-Reply-To: Andrea Arcangeli's message of "Wed, 14 Jun 2000 01:07:23 +0200 (CEST)"
>>>>> "andrea" == Andrea Arcangeli <andrea@suse.de> writes:
Hi
andrea> How can you be sure of that? So I'll make you an obvious case where
andrea> it will shrink not twice, not three times but _forever_.
andrea> Assume the pages_min of the normal zone watermark triggers when the normal
andrea> zone is allocated at 95% and assume that all such 95% of the normal zone
andrea> is been allocated all in mlocked memory and kernel mem_map_t array. Can't
andrea> somebody (for example an oracle database) allocate 95% of the normal zone
andrea> in mlocked shm memory? Do you agree? Or you are telling me it can't or
andrea> that if it does so it should then expect the linux kernel to explode
andrea> (actually it would cause kswapd to loop forever trying to free the normal
andrea> zone even if there's still 15mbyte of ZONE_DMA memory free).
andrea> So let's make the whole picture from the start starting with all the
andrea> memory free: assume oracle allocates all the normal zone in shm mlocked
andrea> memory. You still have 15mbyte free for the cache in the ZONE_DMA, OK?
andrea> Then you allocate the 95% of such 15mbyte in the cache and then kswapd
andrea> triggers and it will never stop because it will try to free the
andrea> zone_normal forever, even if it just recycled enough memory from the
andrea> ZONE_DMA (so even if __alloc_pages wouldn't start memory balancing
andrea> anymore!). See????
andrea> The classzone patch will fix the above bad behaviour completely because
andrea> kswapd in classzone will notice that there's enough memory for allocation
andrea> from both ZONE_DMA and ZONE_NORMAL because the cache in the ZONE_DMA is
andrea> been recycled successfully.
andrea> Without classzone you'll always get the above case wrong and I don't mind
andrea> if it's a corner case or not, we have to handle it right! I will hate a
andrea> kernel that works fine only as far as you only compile kernels on it.
I think that if you have a program that mlocked 95% of your normal
memory you have two options:
- tweak the values of freepages.{min,low,high}
- buy more memory
How is that different from the case where we mlock *all* memory?
If we allocate all memory we don't expect the system to work. Past
a certain limit, there is no way to solve the problem. Right now that
limit is freepages.high. If you don't like that limit, change it.
Notice also that the current allocator will give the shm segment pages
from both the DMA zone and the NORMAL zone. The case where it
allocates all of the NORMAL zone but nothing from the DMA zone is not
the normal case, nor should it happen. It should get its pages from
both zones. If we have 95% of the DMA zone and 95% of the NORMAL zone
mlocked, we are really in trouble....
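
To make the scenario we are arguing about concrete, here is a toy
model of it. This is only a sketch, not kernel code: the zone sizes,
the watermarks, the batch size and every function name are made up for
illustration, and classzone_needs_work() is my simplified reading of
what the classzone patch checks (free pages summed over DMA + NORMAL
against the summed watermarks). The NORMAL zone is fully mlocked and
the DMA zone is 95% page cache, as in Andrea's example.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    total: int      # pages in the zone (hypothetical sizes)
    locked: int     # mlocked/kernel pages, can never be freed
    cache: int      # reclaimable page-cache pages
    pages_low: int  # hypothetical watermark kswapd tries to reach

    @property
    def free(self) -> int:
        return self.total - self.locked - self.cache


def make_zones():
    # 16 MB DMA zone, 95% of it in page cache; NORMAL zone fully mlocked.
    dma = Zone("DMA", total=4096, locked=0, cache=3891, pages_low=64)
    normal = Zone("NORMAL", total=28672, locked=28672, cache=0, pages_low=1434)
    return [dma, normal]


def reclaim(zone, nr_pages):
    """Free up to nr_pages of cache from the zone; locked pages stay put."""
    freed = min(nr_pages, zone.cache)
    zone.cache -= freed
    return freed


def per_zone_needs_work(zones):
    # Per-zone check: keep going while any single zone is below its watermark.
    return any(z.free < z.pages_low for z in zones)


def classzone_needs_work(zones):
    # Simplified classzone check: look at the zones together.
    return sum(z.free for z in zones) < sum(z.pages_low for z in zones)


def run_kswapd(zones, needs_work, max_passes=500, batch=32):
    """Return how many reclaim passes ran before the check was satisfied."""
    passes = 0
    while needs_work(zones) and passes < max_passes:
        for z in zones:
            reclaim(z, batch)
        passes += 1
    return passes


if __name__ == "__main__":
    print("per-zone passes: ", run_kswapd(make_zones(), per_zone_needs_work))
    print("classzone passes:", run_kswapd(make_zones(), classzone_needs_work))
```

In this model the per-zone check only stops because of the max_passes
cap, since the NORMAL zone can never get above its watermark, while the
combined check stops once enough cache has been recycled from the DMA
zone. Lowering pages_low for the NORMAL zone is the model's equivalent
of tweaking freepages.*.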
>> I think you're overlooking the fact that kswapd's freeing of
>> pages is something that occurs only *once*...
andrea> Since the normal zone will never return over pages_low it will run more
andrea> than once.
As I said before, if you want to have 95% of your memory mlocked, you
should tweak the values of freepages.*
andrea> My argument of the classzone design is to get correctness in the corner
andrea> case: to fix the drawbacks.
andrea> Then I also included into such patch some performance stuff and that's why
andrea> it also improves performance significantly but I'm not interested in
andrea> such part for now. Since such part is stable as well you can get both
andrea> correctness and improvement at the same time but I can drop the
andrea> performance part if there will be an interest only on the other part.
At least _I_ am interested in *only* the performance part. I would
like to test the current approach with your performance improvements and
then compare the designs. Conceptually I prefer the zones design, but
I can be proved wrong.
andrea> I don't mind about the other part of the email at this moment, I only mind
andrea> about the global design of the allocator at this moment.
Later, Juan.
--
In theory, practice and theory are the same, but in practice they
are different -- Larry McVoy