From: Benjamin Herrenschmidt <benh@au1.ibm.com>
To: Christoph Lameter <cl@gentwo.org>
Cc: ksummit-discuss@lists.linuxfoundation.org
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Redesign Memory Management layer and more core subsystem
Date: Sat, 14 Jun 2014 07:36:06 +1000
Message-ID: <1402695366.20360.14.camel@pasglop>
In-Reply-To: <alpine.DEB.2.10.1406131158490.913@gentwo.org>

On Fri, 2014-06-13 at 12:02 -0500, Christoph Lameter wrote:
> On Thu, 12 Jun 2014, Phillip Lougher wrote:
> 
> > > 1. The need to use larger order pages, and the resulting problems with
> > > fragmentation. Memory sizes grow and therefore the number of page structs
> > > where state has to be maintained. Maybe there is something different? If
> > > we use hugepages then we have 511 useless page structs. Some apps need
> > > linear memory where we have trouble and are creating numerous memory
> > > allocators (recently the new bootmem allocator and CMA. Plus lots of
> > > specialized allocators in various subsystems).
> > >
> >
> > This was never solved to my knowledge, there is no panacea here.
> > Even in the 90s we had video subsystems wanting to allocate in units
> > of 1Mbyte, and others in units of 4k.  The "solution" was so called
> > split-level allocators, each specialised to deal with a particular
> > "first class media", with them giving back memory to the underlying
> > allocator when memory got tight in another specialised allocator.
> > Not much different to the ad-hoc solutions being adopted in Linux,
> > except the general idea was each specialised allocator had the same
> > API.
> 
> It is solvable if the objects are inherently movable. If every object
> allocated provides a function that makes the object movable, then
> defragmentation is possible and therefore large contiguous areas of memory
> can be created at any time.

Another interesting thing is migration of pages with mapped DMA on
them :-)

Our IOMMUs support that, but there isn't a way to hook that up into
Linux page migration that wouldn't suck massively at this point.
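
To make the interaction between the two points above concrete, here is a
minimal toy sketch in Python, not kernel code. Everything in it is an
assumption made for illustration: `Page`, `make_frames`, `compact`,
`largest_free_run` and the `iommu_remap` callback are hypothetical names and
do not correspond to any real Linux or IOMMU interface. It only models the
idea that a frame with a live DMA mapping pins itself in place unless
something can retarget the device's mapping when the frame moves.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Page:
    """Toy page frame (hypothetical model, not the kernel's struct page)."""
    owner: Optional[str] = None   # None means the frame is free
    dma_mapped: bool = False      # a device currently targets this frame


def make_frames() -> List[Page]:
    """Fragmented 8-frame layout: free, A, free, B (DMA), free, C, free, free."""
    return [Page(), Page("A"), Page(), Page("B", dma_mapped=True),
            Page(), Page("C"), Page(), Page()]


def largest_free_run(frames: List[Page]) -> int:
    """Length of the longest run of contiguous free frames."""
    best = run = 0
    for page in frames:
        run = run + 1 if page.owner is None else 0
        best = max(best, run)
    return best


def compact(frames: List[Page],
            iommu_remap: Optional[Callable[[int, int], None]] = None) -> int:
    """Slide used frames toward index 0 and return how many were moved.

    A DMA-mapped frame is moved only if iommu_remap(old, new) is supplied
    (an assumed hook that retargets the device mapping); otherwise it stays.
    """
    moved = 0
    free = 0  # lowest index that may still be free
    for src in range(len(frames)):
        page = frames[src]
        if page.owner is None:
            continue
        # Find the lowest free slot below src, if any.
        while free < src and frames[free].owner is not None:
            free += 1
        if free >= src:
            continue
        if page.dma_mapped:
            if iommu_remap is None:
                continue  # pinned: nobody can update the device's mapping
            iommu_remap(src, free)
        frames[free], frames[src] = page, Page()
        moved += 1
    return moved


if __name__ == "__main__":
    pinned = make_frames()
    compact(pinned)
    print("largest free run, DMA page pinned:", largest_free_run(pinned))

    remapped = make_frames()
    compact(remapped, iommu_remap=lambda old, new: None)
    print("largest free run, DMA page remapped:", largest_free_run(remapped))
```

In this toy model the pinned DMA frame splits the free space, while passing a
remap hook lets compaction produce a longer contiguous run. How such a hook
would actually be wired into Linux page migration is exactly the open
question.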

> > > Can we develop the notion that subsystems own certain cores so that their
> > > execution is restricted to a subset of the system avoiding data
> > > replication and keeping subsystem data hot? I.e. have a device driver
> > > and subsystems driving those devices just run on the NUMA node to which
> > > the PCI-E root complex is attached. Restricting to NUMA node reduces data
> > > locality complexity and increases performance due to cache hot data.
> >
> > Lots of academic hot-air was expended here when designing distributed
> > systems which could scale seamlessly across heterogeneous CPUs connected
> > via different levels of interconnects (bus, ATM, ethernet etc.), zoning,
> > migration, replication etc.  The "solution" is probably out there somewhere
> > forgotten about.
> 
> We have the issue with homogeneous CPUs due to the proliferation of cores
> on processors now. Maybe that is solvable?
> 
> > Case in point, many years ago I was the lead Linux guy for a company
> > designing a SOC for digital TV.  Just before I left I had an interesting
> > "conversation" with the chief hardware guy of the team who designed the SOC.
> > Turns out they'd budgeted for the RAM bandwidth needed to decode a typical
> > MPEG stream, but they'd not reckoned on all the memcopies Linux needs to do
> > between its "separate address space" processes.  He'd been used to embedded
> > oses which run in a single address space.
> 
> Well maybe that is appropriate for some processes? And we could carve out
> subsections of the hardware where single address space stuff is possible?
