From: Ingo Molnar <mingo@chiara.csoma.elte.hu>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Kanoj Sarcar <kanoj@google.engr.sgi.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
Andrea Arcangeli <andrea@suse.de>,
Rik van Riel <riel@nl.linux.org>,
linux-mm@kvack.org, linux-kernel@vger.rutgers.edu
Subject: Re: [RFC] 2.3.39 zone balancing
Date: Fri, 14 Jan 2000 00:53:00 +0100 (CET)
Message-ID: <Pine.LNX.4.10.10001140040040.6274-100000@chiara.csoma.elte.hu>
In-Reply-To: <Pine.LNX.4.10.10001131428250.2250-100000@penguin.transmeta.com>
On Thu, 13 Jan 2000, Linus Torvalds wrote:
> > more_work = 0;
> > for (i = 0; i < MAX_NR_ZONES; i++) {
> > if (i != ZONE_HIGHMEM)
> > more_work |= balance_zone(zone+i)
>
> No, the other reason for kswapd is to get "smoother" behaviour, by trying
> to keep some memory free. Also, while we don't use high-memory pages right
> now in BH and irq contexts, I don't think that is something we need to
> codify, and it may change in the future. There's no real reason per se for
> not using them (except for complexity), so I'd hate to have a special case
> for that case.
one more thing, i think there is a real possibility of the following
scenario: on a well-used server the pagecache takes up all the RAM, as
it should. An application happens to run out of free RAM and we allocate
from the DMA zone. The application then uses these DMA pages heavily, so
they become unlikely to get freed. I.e. kswapd will feel the memory
pressure in the DMA zone without being able to relieve it. Just running
kswapd for a long time will not help, because the DMA pages are heavily
used.
so why can't swap_out (conceptually) accept a 'zones under pressure'
bitmask as input, and calculate the zone from the physical address it sees
in the page table? Some per-architecture thing like:
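static inline int pte_in_zonemask (pte_t pte, unsigned long mask)
{
	unsigned long idx = pte_to_pagenr(pte);
	int i;

	/*
	 * Pages are more likely to be in the highest zone,
	 * so scan from the top zone downwards.
	 */
	for (i = ZONE_MAX-1; i >= 0; i--) {
		struct zone_t *zone = zones + i;

		/* the first zone starting at or below idx contains the page */
		if (zone->offset <= idx)
			return ((1UL << i) & mask) != 0;
	}
	return 0;	/* only reached if idx is below the lowest zone */
}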
since ZONE_MAX is 2 or 3 typically, this will likely be unrolled. It's not
going to be as fast as now, but it's simple nevertheless. (and swapping
out is never fast in the first place)
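
For illustration, the zone lookup can be modelled in plain user-space C. This is only a sketch under stated assumptions: the `zone_model` struct, the `pagenr_to_zone` and `pagenr_in_zonemask` helpers, and the zone boundaries (4 KB pages, DMA below 16 MB, highmem above 896 MB) are hypothetical stand-ins, not the kernel's actual data structures.

```c
/* Hypothetical zone indices; the kernel's real definitions may differ. */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MAX };

struct zone_model {
	unsigned long offset;	/* first page frame number in this zone */
};

/* Assumed example layout with 4 KB pages: 16 MB = 4096 pages, 896 MB = 229376 pages. */
static const struct zone_model zones[ZONE_MAX] = {
	[ZONE_DMA]     = { .offset = 0 },
	[ZONE_NORMAL]  = { .offset = 4096 },
	[ZONE_HIGHMEM] = { .offset = 229376 },
};

/* Return the index of the zone that contains page frame idx. */
int pagenr_to_zone(unsigned long idx)
{
	int i;

	/* Scan from the highest zone down, since most pages live there. */
	for (i = ZONE_MAX - 1; i > 0; i--)
		if (zones[i].offset <= idx)
			break;
	return i;	/* falls through to zone 0 when nothing higher matches */
}

/* Nonzero if page frame idx lies in one of the zones selected by mask. */
int pagenr_in_zonemask(unsigned long idx, unsigned long mask)
{
	return ((1UL << pagenr_to_zone(idx)) & mask) != 0;
}
```

With this layout, a scan that only cares about the DMA zone would pass `1UL << ZONE_DMA` as the mask and skip every page whose frame number is 4096 or higher.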
so if kswapd generated a memory pressure 'zone bitmask' instead of a
single zone (single zone is definitely broken), then we could solve such
situations as well. This comes at the price of kswapd looping through
pagetables, but i think we should be ready to pay this price for
predictability. Only GFP_DMA16 will pay this price; GFP_NORMAL is likely
to succeed on typical systems. As highmem_pages/normal_pages gets
larger, this cost goes up as well.
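
One way the 'zone bitmask' could be generated is sketched below. The `zone_state` struct, its `pages_low` watermark field, and the `zone_pressure_mask` function are assumptions made for illustration; the message does not specify how pressure would be measured.

```c
/* Hypothetical per-zone accounting; field names are illustrative only. */
struct zone_state {
	unsigned long free_pages;	/* pages currently free in the zone */
	unsigned long pages_low;	/* assumed low watermark for the zone */
};

/*
 * Build the memory pressure bitmask: bit i is set when zone i has
 * fewer free pages than its low watermark.
 */
unsigned long zone_pressure_mask(const struct zone_state *z, int nr_zones)
{
	unsigned long mask = 0;
	int i;

	for (i = 0; i < nr_zones; i++)
		if (z[i].free_pages < z[i].pages_low)
			mask |= 1UL << i;
	return mask;
}
```

A mask built this way can name several zones at once, which a single-zone argument cannot express. A zero mask would mean no zone is under pressure and the pagetable scan can be skipped.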
Ingo