From: Ingo Molnar <mingo@redhat.com>
To: "Martin J. Bligh" <mbligh@aracnet.com>
Cc: Andrew Morton <akpm@digeo.com>, Andrea Arcangeli <andrea@suse.de>,
	mingo@elte.hu, hugh@veritas.com, dmccr@us.ibm.com,
	Linus Torvalds <torvalds@transmeta.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: objrmap and vmtruncate
Date: Tue, 22 Apr 2003 11:07:32 -0400 (EDT)
Message-ID: <Pine.LNX.4.44.0304221032560.10400-100000@devserv.devel.redhat.com>
In-Reply-To: <170570000.1051021741@[10.10.2.4]>

On Tue, 22 Apr 2003, Martin J. Bligh wrote:

> [...] I think we're optimising for the wrong case here - isn't the
> 100x100 mappings case exactly what we have sys_remap_file_pages for?

i'm inherently uncomfortable about adding any unbounded component to the
VM scanning code that deteriorates this badly at such a relatively low
level of sharing. Even just 10 processes mapping the same inode 10 times
each (easily possible) cause an iteration of 100 steps for every page,
with 100+ cachelines touched. This brings us back to the Linux 2.0 days.
It also sends the wrong message: 'the more you share, the more we will
punish you'.
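
To make the 10 x 10 arithmetic concrete, here is a rough sketch of the
walk. It is a toy model, not kernel code: the Vma class and both
functions are hypothetical names, and it assumes every mapping sits on
one per-inode list that is inspected in full for each page.

```python
from dataclasses import dataclass


@dataclass
class Vma:
    pid: int     # owning process (illustrative only)
    start: int   # first file page covered by this mapping
    length: int  # number of file pages covered


def build_mappings(processes, mappings_per_process, file_pages):
    """Every process maps the whole file `mappings_per_process` times."""
    return [Vma(pid, 0, file_pages)
            for pid in range(processes)
            for _ in range(mappings_per_process)]


def steps_to_scan_page(vmas, page_index):
    """Return (vmas visited, vmas that cover the page) for one page."""
    steps = 0
    hits = 0
    for vma in vmas:
        steps += 1  # every vma on the inode's list is inspected
        if vma.start <= page_index < vma.start + vma.length:
            hits += 1
    return steps, hits


vmas = build_mappings(processes=10, mappings_per_process=10, file_pages=64)
print(steps_to_scan_page(vmas, page_index=3))  # (100, 100)
```

In this model the per-page cost is processes times mappings per process,
so doubling both quadruples the work for every page scanned.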

Also, the overhead pops up at the wrong place, not in the application
itself: in a central algorithm that otherwise _needs_ timely operation
just for the sake of generic system health. I might be wrong, but i very
much believe that first-class support for 'sharing as much stuff as
possible' should be a central design thing in the VM.

also, it's an inherent DoS thing. Any application that has the 'privilege'
of creating 1000 mappings in 1000 processes to the same inode (this is
already being done, and it will be commonplace in a few years) will
immediately DoS the whole VM. I might be repeating myself, but quadratic
algorithms do get linearly _worse_ as the hw evolves. The pidhash
quadratic behavior triggering the NMI watchdog on the biggest boxes is
just one example of this.

all the VM work in 2.5 has proven that the path to a good and reliable VM
is no-nonsense simplicity and basic robustness, both in algorithms and in
data structures. Queueing stuff as much as possible, avoiding extra
scanning as much as possible. And for God's sake, do not reintroduce any
quadratic algorithm, anywhere.

all this loss in quality and predictability just to avoid some easily
calculable RAM overhead? [which RAM overhead can be avoided by smart
applications if they want.]

where does sys_remap_file_pages() stand in this picture?

sys_remap_file_pages() could be fully substituted with mmap(): if the same
file is mapped within the same vma, using the same permissions but with a
nonlinear offset, then mmap() could decide to use the techniques of
sys_remap_file_pages() to create nonlinear ptes in that range. It's a
vma-overhead optimization for highly granular mappings.

so in theory we could do the following: if the sharing factor is less than
... 4-5 or so, then use objrmap, otherwise use nonlinear mappings. There
are a couple of problems with this hybrid approach: there is cost
associated with a 'flipover' from objrmap to nonlinear (the vmas have to
be merged, and all non-present ptes have to be fixed up to their pre-merge
offset), but it could probably be reduced and delegated to the app doing
the mapping, which would remove this cost component from the generic
scanning code.

doing the 'sharing factor of 5' flipover would address all my complaints:
nonlinear mappings can automatically solve the quadratic-algorithm
problems, and objrmap can be used whenever the sharing factor is low
enough.
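
A rough sketch of that flipover policy follows. It rests on assumptions
that are not spelled out above: 'sharing factor' is taken to mean the
number of mappings one process has on the inode, the cutoff is fixed at
5, and a flipover is modelled as merging that process's mappings into a
single nonlinear vma. All names are hypothetical.

```python
OBJRMAP_SHARING_LIMIT = 5  # assumed cutoff, taken from "4-5 or so"


def choose_strategy(mappings_in_process, limit=OBJRMAP_SHARING_LIMIT):
    """Pick objrmap below the cutoff, nonlinear mappings at or above it."""
    return "objrmap" if mappings_in_process < limit else "nonlinear"


def vmas_walked_per_page(mappings_per_process):
    """Total vmas a per-page scan would visit once the policy is applied.

    `mappings_per_process` holds one entry per process: how many times
    that process maps the inode.
    """
    total = 0
    for count in mappings_per_process:
        if choose_strategy(count) == "nonlinear":
            total += 1      # merged into one nonlinear vma
        else:
            total += count  # left as separate linear vmas
    return total


print(vmas_walked_per_page([10] * 10))  # 10
print(vmas_walked_per_page([3] * 10))   # 30
```

Under these assumptions the per-page walk is bounded by roughly the
cutoff times the number of processes, rather than growing with the
product of processes and mappings.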

ie. an app creating many mappings in many processes to the same inode
would 'magically' be presented with nonlinear mappings - without it having
to care.

can anyone see any problem with this approach?

	Ingo
