From: Peter Zijlstra <peterz@infradead.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, riel <riel@redhat.com>
Subject: Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks
Date: Thu, 23 Aug 2007 09:39:00 +0200
Message-ID: <1187854740.6114.319.camel@twins>
In-Reply-To: <Pine.LNX.4.64.0708221306080.15775@schroedinger.engr.sgi.com>
On Wed, 2007-08-22 at 13:16 -0700, Christoph Lameter wrote:
> On Wed, 22 Aug 2007, Peter Zijlstra wrote:
> > > > As shown, there are cases where there just isn't any memory to reclaim.
                                                                       ^^^^^^^
> > > > Please accept this.
> > > That is an extreme case that AFAIK we currently ignore and could be
> > > avoided with some effort.
> >
> > It's not extreme, not even rare, and it's handled now. It's what
> > PF_MEMALLOC is for.
>
> No it's not. If you have all pages allocated as anonymous pages and your
> writeout requires more pages than available in the reserves then you are
> screwed either way regardless if you have PF_MEMALLOC set or not.
Christoph, we were talking about memory to reclaim, not about exhausting
the reserves.
> > > The initial PF_MEMALLOC patchset seems to be
> > > still enough to deal with your issues.
> >
> > Take the anonymous workload, user-space will block once the page
> > allocator hits ALLOC_MIN. Network will be able to receive until
> > ALLOC_MIN|ALLOC_HIGH - if the completion doesn't arrive by then it will
> > start dropping all packets until there is memory again. But userspace is
> > wedged and hence will not consume the network traffic, hence we
> > deadlock.
> >
> > Even if there is something to reclaim initially, if the pressure
> > persists that can eventually be exhausted.
>
> Sure ultimately you will end up with pages that are all unreclaimable if
> you reclaim all reclaimable memory.
>
> > > multiple critical tasks on various devices that have various memory needs.
> > > So multiple critical spots can happen concurrently in multiple
> > > application contexts.
> >
> > yes, reclaim can be unbounded concurrent, and that is one of the
> > (theoretically) major problems we currently have.
>
> So your patchset is not fixing it?
No, and I never said it would. I've been meaning to do one that does
though. Just haven't gotten around to actually doing it :-/
> > > We have that with PF_MEMALLOC.
> >
> > Exactly. But if you recognise the need for PF_MEMALLOC then what is this
> > argument about?
>
> The PF_MEMALLOC patchset f.e. is about avoiding to go out of
> memory when there is still memory available even if we are doing a
> PF_MEMALLOC allocation and would OOM otherwise.
Right, but as long as there is a need for PF_MEMALLOC there is a need
for the patches I proposed.
> > Networking can currently be seen as having two states:
> >
> > 1 receive packets and consume memory
> > 2 drop all packets (when out of memory)
> >
> > I need a 3rd state:
> >
> > 3 receiving packets but not consuming memory
>
> So far a good idea. If you are not consuming memory then why are the
> allocators involved?
Because I do need to receive some packets, it's just that I'll free them
again. So it won't keep consuming memory. This needs a little pool of
memory in order to operate in a stable state.

It's: alloc, receive, inspect, free

  total memory use: 0
  memory delta: a little

(it's just that you need to be able to receive a significant number of
packets, not 1, due to funny things like ip-defragmentation before you
can be sure to actually receive 1 whole tcp packet - but the idea is the
same)
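
To make the alloc, receive, inspect, free cycle concrete, here is a rough
userspace sketch. It is not code from the patches: POOL_SIZE, struct
pkt_pool, pool_alloc(), pool_free() and receive_stream() are all made-up
names, and "every Nth packet is critical" is just a stand-in for a real
inspection step. It only shows that a fixed pool can chew through an
arbitrary number of packets while never holding more than POOL_SIZE buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4	/* hypothetical size of the little pool */

struct pkt {
	bool in_use;
	bool critical;	/* stand-in for "carries the writeout completion" */
};

struct pkt_pool {
	struct pkt slots[POOL_SIZE];
	size_t used;	/* buffers currently held */
	size_t peak;	/* highest value 'used' ever reached */
};

/* Take a buffer from the pool; NULL means the pool is exhausted. */
static struct pkt *pool_alloc(struct pkt_pool *p)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!p->slots[i].in_use) {
			p->slots[i].in_use = true;
			p->used++;
			if (p->used > p->peak)
				p->peak = p->used;
			return &p->slots[i];
		}
	}
	return NULL;
}

/* Hand a buffer back so the next packet can reuse it. */
static void pool_free(struct pkt_pool *p, struct pkt *pkt)
{
	assert(pkt->in_use);
	pkt->in_use = false;
	p->used--;
}

/*
 * Run 'n' packets through the pool. Every 'critical_every'-th packet
 * (must be non-zero) counts as critical and is "delivered"; all others
 * are dropped after inspection. Returns the number delivered.
 */
static size_t receive_stream(struct pkt_pool *p, size_t n,
			     size_t critical_every)
{
	size_t delivered = 0;

	for (size_t i = 1; i <= n; i++) {
		struct pkt *pkt = pool_alloc(p);		/* alloc */
		if (!pkt)
			continue;				/* pool empty: drop */
		pkt->critical = (i % critical_every == 0);	/* receive */
		if (pkt->critical)				/* inspect */
			delivered++;
		pool_free(p, pkt);				/* free */
	}
	return delivered;
}
```

In this toy version only one buffer is ever held at a time; with
defragmentation you would hold several fragments before deciding, but the
bound is still the pool size, not the amount of traffic.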
> > Now, I need this state when we're in PF_MEMALLOC territory, because I
> > need to be able to process an unspecified amount of network traffic in
> > order to receive the writeout completion.
> >
> > In order to operate this 3rd network state, some memory is needed in
> > which packets can be received and when deemed not important freed and
> > reused.
> >
> > It needs a bounded amount of memory in order to process an unbounded
> > amount of network traffic.
> >
> > What exactly is not clear about this? If you accept the need for
> > PF_MEMALLOC you surely must also agree that at the point you're using it
> > running reclaim is useless.
>
> Yes looks like you would like to add something to the network layer to
> filter important packets. As long as you stay within PF_MEMALLOC
> boundaries you can allocate and throw packets away. If you want to have a
> reserve that is secure and just for you then you need to take it away from
> the reserves (which in turn will lead reclaim to restore them).
Ah, but also note that _using_ PF_MEMALLOC is the trigger to enter that
3rd network state. These two are tightly coupled. You only need this 3rd
state when under PF_MEMALLOC, otherwise we could just receive normally.
So, my thinking was that, if the current reserves are good enough to
keep the system 'deadlock' free, I can just enlarge the reserves by
whatever it is I need for that network state and we're all good, no?
Why separate these two? If the current reserve is large enough (and
theoretically it is not - but I'm meaning to fix that) it will not
consume the extra memory I added below.
Note how:
[PATCH 09/10] mm: emergency pool
pushes up the current reserves in a fashion so as to maintain the
relative operating range of the page allocator (distance between
min,low,high and scaling of the wmarks under ALLOC_HIGH|ALLOC_HARDER).
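
As a toy illustration of the "distance between min,low,high" part only
(this is not the code from patch 09/10, and struct wmarks and
wmarks_push_up() are names invented here): if the extra reserve is added
to all three marks alike, the gaps between them stay what they were. The
scaling under ALLOC_HIGH|ALLOC_HARDER is not modelled.

```c
/* Hypothetical stand-in for a zone's three watermarks, in pages. */
struct wmarks {
	unsigned long min;
	unsigned long low;
	unsigned long high;
};

/*
 * Raise all three marks by 'extra' pages. Since each mark moves by the
 * same amount, low - min and high - low are unchanged.
 */
static struct wmarks wmarks_push_up(struct wmarks w, unsigned long extra)
{
	w.min += extra;
	w.low += extra;
	w.high += extra;
	return w;
}
```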
> > > > Also, failing a memory allocation isn't bad, why are you so worried
> > > > about that? It happens all the time.
> > >
> > > It's a performance impact and plainly does not make sense if there is
> > > reclaimable memory available. The common action of the vm is to reclaim if
> > > there is a demand for memory. Now we suddenly abandon that approach?
> >
> > I'm utterly confused by this, on one hand you recognise the need for
> > PF_MEMALLOC but on the other hand you're saying it's not needed and
> > anybody needing memory (even reclaim itself) should use reclaim.
>
> The VM reclaims memory on demand but in exceptional limited cases where we
> cannot do so we use the reserves. I am sure you know this.
It's the abandon part I got confused about. I'm not at all abandoning
reclaim, it's just that I must operate under PF_MEMALLOC, so reclaim is
pointless.