* Re: The long, long life of an inactive_dirty page
@ 2004-05-12 19:18 Andrew Crawford
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Crawford @ 2004-05-12 19:18 UTC
To: linux-mm
> That information is not achievable in a reliable way, ever. Simply because
> it takes a not-even-near-infinitely small amount of time to gather all the
> stats, during which the other cpu can change all the underlying data away
> under your nose.
Although that is of course true, it's also well understood and a factor with
any operating system. Nevertheless, it would be useful to have a picture of
how much memory is available right now. An inaccurate (within reason)
indication *would* be better than none at all.
Just to gather opinions, and since we've been having a bit of a heated
discussion about it here at work, how would you good people define "available
RAM", and which memory metrics would make it up?
My definition would be "RAM which can be allocated and used without the need
to write any pages". I.e., if the memory needs to be laundered first but no actual
writes need to be done, that's fine and I'll count it as available.
So I'd be counting
Free
+
Inactive_Clean
+
The clean part of inactive_dirty (which can't be measured at present)
Is there anything else that should be on there?
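As a rough illustration of that sum: since the clean part of inactive_dirty can't be measured, the best a script can report is a range. The sketch below is only that, a sketch. It assumes a 2.4-style /proc/meminfo with "MemFree", "Inact_dirty" and "Inact_clean" fields (the last name is an assumption, as is the sample Inact_clean value; the other two figures are taken from this thread), and the helper names are made up for the example.

```python
def parse_meminfo(text):
    """Return a dict mapping each meminfo field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_bounds(fields):
    """Lower and upper bound (kB) on 'available RAM' as defined above.

    Lower bound: none of inactive_dirty is really clean.
    Upper bound: all of inactive_dirty has already been written out.
    """
    lower = fields["MemFree"] + fields["Inact_clean"]
    upper = lower + fields["Inact_dirty"]
    return lower, upper


# Sample input; the Inact_clean figure is hypothetical.
SAMPLE = """\
MemFree:       7065484 kB
Inact_dirty:    463680 kB
Inact_clean:     10000 kB
"""

lower, upper = available_bounds(parse_meminfo(SAMPLE))
print(f"available: between {lower} kB and {upper} kB")
```

On a live system the text would come from reading /proc/meminfo, and the two numbers can of course drift while they are being read.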
Cheers,
Andrew
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
* Re: The long, long life of an inactive_dirty page
2004-05-12 18:24 Andrew Crawford
@ 2004-05-12 18:46 ` Arjan van de Ven
0 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2004-05-12 18:46 UTC
To: linux-mm
On Wed, 2004-05-12 at 20:24, Andrew Crawford wrote:
> Thanks for all your replies so far, and the helpful information.
>
> > well you may IF you fix your mail setup to not send me evil mails about
> > having to confirm something.
>
> Just to clarify, you received that mail because you replied to this address
> directly; this account doesn't accept emails from unverified addresses. This
> account is not subscribed to linux-mm, which I read elsewhere.
I find that truly obnoxious.
> > One thing to realize is that after bdflush has written the pages out, they
> > can become dirty AGAIN for a variety of reasons, and as such the accounting
> > is not quite straightforward.
>
> Is it possible for a page to become dirty again while still remaining
> inactive? Could you give an example? (genuinely curious, hope this doesn't
> sound like I'm arguing!)
If someone has that page mmap'd, for example, the app that has it mmap'd
can just write to the memory (which makes the page dirty in the
pagetable). The kernel doesn't get involved in that at all.
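A minimal sketch of that case, for illustration only. It is user-space Python, not kernel code, and the temporary file and the bytes written are arbitrary choices for the example. The store into the mapping is an ordinary memory write with no write() call; the data still ends up in the file.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    # Give the file one page of zeros and force it to disk,
    # so the cached page starts out clean.
    os.write(fd, b"\0" * mmap.PAGESIZE)
    os.fsync(fd)

    # Map the file shared and writable.
    with mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_WRITE) as m:
        # A plain store into the mapped memory: no write() system call.
        m[0:5] = b"dirty"
        # Explicitly request writeback of the mapping.
        m.flush()

    # Read the file back through the normal file interface.
    with open(path, "rb") as f:
        data = f.read(5)
finally:
    os.close(fd)
    os.unlink(path)

print(data)
```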
> > the problem is that the "becoming clean" is basically asynchronous
>
> Isn't this equally true for page_launder? Even if bdflush would wait until the
> next "pass" to move pages to the "clean" list it would be better than the
> current situation. There must be some mechanism that bdflush uses to avoid
> writing the same page twice in a row; couldn't it say "oh, already wrote that
> one, into inactive_clean it goes".
It's slightly more subtle in the kernel you looked at. There is a list
for "write out pending" and a "clean" list.
Between all these lists there is a strict LRU order. You don't move it
to clean once it gets clean, you move it to "write out pending" when you
start writeout, and the VM moves the *other* side of the writeout list
to the clean list when it's clean.
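A toy model of that ordering, as a sketch only. The class and method names are invented for the example and do not correspond to kernel functions; the point is just that an out-of-order I/O completion does not let a page jump the queue.

```python
from collections import deque


class ToyLists:
    """Toy model: dirty -> write-out pending -> clean, in strict FIFO order."""

    def __init__(self, pages):
        self.dirty = deque(pages)   # oldest page at the left
        self.writeout = deque()     # pages whose writeout has been started
        self.clean = deque()
        self.io_done = set()        # pages whose write has completed

    def start_writeout(self):
        """Move the oldest dirty page to the tail of the writeout list."""
        page = self.dirty.popleft()
        self.writeout.append(page)
        return page

    def complete_io(self, page):
        """Asynchronous completion: record it, but move nothing."""
        self.io_done.add(page)

    def reap(self):
        """Move pages off the old end of the writeout list while they are clean."""
        while self.writeout and self.writeout[0] in self.io_done:
            page = self.writeout.popleft()
            self.io_done.discard(page)
            self.clean.append(page)


lists = ToyLists(["A", "B", "C"])
for _ in range(3):
    lists.start_writeout()

lists.complete_io("C")      # the newest page finishes first
lists.reap()
print(list(lists.clean))    # C has to wait behind A and B

lists.complete_io("A")
lists.complete_io("B")
lists.reap()
print(list(lists.clean))    # order of the original list is preserved
```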
> You will probably appreciate that I am coming at this from the point of view
> of performance measurement and capacity planning; I want to know how much
> actual memory is free or immediately reusable at a point in time.
That information is not achievable in a reliable way, ever. Simply
because it takes a not-even-near-infinitely small amount of time to
gather all the stats, during which the other cpu can change all the
underlying data away under your nose.
* Re: The long, long life of an inactive_dirty page
@ 2004-05-12 18:24 Andrew Crawford
2004-05-12 18:46 ` Arjan van de Ven
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Crawford @ 2004-05-12 18:24 UTC
To: linux-mm
Thanks for all your replies so far, and the helpful information.
> well you may IF you fix your mail setup to not send me evil mails about
> having to confirm something.
Just to clarify, you received that mail because you replied to this address
directly; this account doesn't accept emails from unverified addresses. This
account is not subscribed to linux-mm, which I read elsewhere.
> One thing to realize is that after bdflush has written the pages out, they
> can become dirty AGAIN for a variety of reasons, and as such the accounting
> is not quite straightforward.
Is it possible for a page to become dirty again while still remaining
inactive? Could you give an example? (genuinely curious, hope this doesn't
sound like I'm arguing!)
> the problem is that the "becoming clean" is basically asynchronous
Isn't this equally true for page_launder? Even if bdflush would wait until the
next "pass" to move pages to the "clean" list it would be better than the
current situation. There must be some mechanism that bdflush uses to avoid
writing the same page twice in a row; couldn't it say "oh, already wrote that
one, into inactive_clean it goes".
You will probably appreciate that I am coming at this from the point of view
of performance measurement and capacity planning; I want to know how much
actual memory is free or immediately reusable at a point in time.
With thanks for all help,
Andrew
* Re: The long, long life of an inactive_dirty page
2004-05-12 15:28 Andrew Crawford
@ 2004-05-12 16:29 ` Arjan van de Ven
0 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2004-05-12 16:29 UTC
To: linux-mm
On Wed, 2004-05-12 at 17:28, Andrew Crawford wrote:
> Arjan van de Ven wrote:
>
> >bdflush and co WILL commit the data to disk after like 30 seconds.
> >They will not move it to inactive_clean; that will happen at the first
> >sight of memory pressure. The code that does that notices that the data
> >isn't dirty and won't do a write-out, just a move.
>
> Thanks for that. I have a couple of follow-up questions if I may be so bold:
well you may IF you fix your mail setup to not send me evil mails about
having to confirm something.
>
> 1. Is there any way, from user space, to distinguish inactive_dirty pages
> which have actually been written from those which haven't?
no, in fact the kernel doesn't know until it looks at the pages (which
it only does on demand). One thing to realize is that after bdflush has
written the pages out, they can become dirty AGAIN for a variety of
reasons, and as such the accounting is not quite straightforward.
> 2. Is there any reason, conceptually, that bdflush shouldn't move the pages to
> the inactive_clean list as page_launder does? After all, they become "known
> clean" at that point, not X hours later when there is a memory shortfall.
the problem is that the "becoming clean" is basically asynchronous,
which would mean the LRU order (FIFO basically) would be destroyed.
(there's implementation issues as well wrt lock ranking etc etc but
that's details)
* Re: The long, long life of an inactive_dirty page
@ 2004-05-12 15:28 Andrew Crawford
2004-05-12 16:29 ` Arjan van de Ven
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Crawford @ 2004-05-12 15:28 UTC
To: linux-mm
Arjan van de Ven wrote:
>bdflush and co WILL commit the data to disk after like 30 seconds.
>They will not move it to inactive_clean; that will happen at the first
>sight of memory pressure. The code that does that notices that the data
>isn't dirty and won't do a write-out, just a move.
Thanks for that. I have a couple of follow-up questions if I may be so bold:
1. Is there any way, from user space, to distinguish inactive_dirty pages
which have actually been written from those which haven't?
2. Is there any reason, conceptually, that bdflush shouldn't move the pages to
the inactive_clean list as page_launder does? After all, they become "known
clean" at that point, not X hours later when there is a memory shortfall.
Cheers,
Andrew
* Re: The long, long life of an inactive_dirty page
2004-05-12 14:11 Andrew Crawford
@ 2004-05-12 14:51 ` Arjan van de Ven
0 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2004-05-12 14:51 UTC
To: Andrew Crawford; +Cc: linux-mm
> It is my understanding that the next thing that should happen is that
> page_launder(), which is invoked when memory gets low, should come along and
> get those pages written, and then, on its next pass mark them inactive_clean.
>
> But in this case, we have plenty of memory available and absolutely nothing
> using it. So there's never any memory pressure, page_launder is never called,
> and the data is never written to disk. This is arguably a bad thing; an
> entirely idle system should not be sitting for hours or days with uncommitted
> data in RAM for the obvious reason.
bdflush and co WILL commit the data to disk after like 30 seconds.
They will not move it to inactive_clean; that will happen at the first
sight of memory pressure. The code that does that notices that the data
isn't dirty and won't do a write-out, just a move.
> > grep Inact_dirty /proc/meminfo
> Inact_dirty: 492240 kB
>
> [ ~5 minutes later ]
>
> > grep Inact_dirty /proc/meminfo
> Inact_dirty: 463680 kB
Inact_dirty isn't guaranteed to be dirty, it's the list of pages that
CAN be dirty.
> That's 460MB of uncommitted data hanging around on a completely idle machine.
>
it's not uncommitted, as I said there are other methods that make sure
that doesn't happen.
* The long, long life of an inactive_dirty page
@ 2004-05-12 14:11 Andrew Crawford
2004-05-12 14:51 ` Arjan van de Ven
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Crawford @ 2004-05-12 14:11 UTC
To: linux-mm
Please forgive this beginner-level question, but I am trying to understand the
process by which dirty pages become clean when there is no memory pressure. I
have been unable to find any useful answers in a very thorough Google session.
All of these questions are based on 2.4.21 (-9.ELhugemem).
Imagine that I have a process which writes a large amount of data to a file,
then exits. It is clear and demonstrable that those pages, as yet unwritten to
disk, end up on the inactive_dirty list.
Nothing else is running on the box. There are several GBs of free RAM.
It is my understanding that the next thing that should happen is that
page_launder(), which is invoked when memory gets low, should come along and
get those pages written, and then, on its next pass mark them inactive_clean.
But in this case, we have plenty of memory available and absolutely nothing
using it. So there's never any memory pressure, page_launder is never called,
and the data is never written to disk. This is arguably a bad thing; an
entirely idle system should not be sitting for hours or days with uncommitted
data in RAM for the obvious reason.
Now my understanding might be naive, out of date, or just plain wrong, but
nevertheless this is happening in real life on our servers. I can produce what
appears to be the same behaviour at will.
After I create a large file with dd, I find that the size of inactive_dirty
reduces at a steady rate - exactly 260K per two seconds - until it reaches a
level where it remains indefinitely.
> grep Inact_dirty /proc/meminfo
Inact_dirty: 480 kB
> dd if=/dev/zero of=/tmp/ac1 bs=1048576 count=500
500+0 records in
500+0 records out
> grep Inact_dirty /proc/meminfo
Inact_dirty: 510684 kB
> grep MemFree /proc/meminfo
MemFree: 7065484 kB
[ ~5 minutes later ]
> grep Inact_dirty /proc/meminfo
Inact_dirty: 492240 kB
[ ~5 minutes later ]
> grep Inact_dirty /proc/meminfo
Inact_dirty: 463680 kB
[ ~1 hr later ]
> grep Inact_dirty /proc/meminfo
Inact_dirty: 463688 kB
[ ~5 hrs later ]
> grep Inact_dirty /proc/meminfo
Inact_dirty: 463682 kB
.. and indeed, the next day the number is basically the same, as long as
updatedb and similar aren't run overnight.
That's 460MB of uncommitted data hanging around on a completely idle machine.
Are there any proc/sysctl parameters that can influence this behaviour?
With thanks for any insights,
Yours,
Andrew