linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Haggai Eran <haggaie@mellanox.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	Shachar Raindel <raindel@mellanox.com>,
	Sagi Grimberg <sagig@mellanox.com>,
	Or Gerlitz <ogerlitz@mellanox.com>
Subject: Re: [PATCH V1 0/2] Enable clients to schedule in mmu_notifier methods
Date: Wed, 5 Sep 2012 17:01:36 +0300	[thread overview]
Message-ID: <50475B40.1060407@mellanox.com> (raw)
In-Reply-To: <20120904150615.f6c1a618.akpm@linux-foundation.org>

On 05/09/2012 01:06, Andrew Morton wrote:
> Exactly why do we want on-demand paging for Infiniband?  

Applications register memory with an RDMA adapter using system calls,
and subsequently post IO operations that refer to the corresponding
virtual addresses directly to HW. Until now, this was achieved by
pinning the memory during the registration calls. The goal of on demand
paging is to avoid pinning the pages of registered memory regions (MRs).
This will allow users the same flexibility they get when swapping any
other part of their processes address spaces. Instead of requiring the
entire MR to fit in physical memory, we can allow the MR to be larger,
and only fit the current working set in physical memory.

> Why should anyone care?  What problems are users currently experiencing?

This can make programming with RDMA much simpler. Today, developers that
are working with more data than their RAM can hold need either to
deregister and reregister memory regions throughout their process's
life, or keep a single memory region and copy the data to it. On demand
paging will allow these developers to register a single MR at the
beginning of their process's life, and let the operating system manage
which pages needs to be fetched at a given time. In the future, we might
be able to provide a single memory access key for each process that
would provide the entire process's address as one large memory region,
and the developers wouldn't need to register memory regions at all.

> Is there any prospect that any other subsystems will utilise these
> infrastructural changes?  If so, which and how, etc?

As for other subsystems, I understand that XPMEM wanted to sleep in MMU
notifiers, as Christoph Lameter wrote at
http://lkml.indiana.edu/hypermail/linux/kernel/0802.1/0460.html
and perhaps Andrea knows about other use cases.

Scheduling in mmu notifications is required since we need to sync the
hardware with the secondary page tables change. A TLB flush of an IO
device is inherently slower than a CPU TLB flush, so our design works by
sending the invalidation request to the device, and waiting for an
interrupt before exiting the mmu notifier handler.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

      parent reply	other threads:[~2012-09-05 14:06 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-04  8:41 Haggai Eran
2012-09-04  8:41 ` [PATCH V1 1/2] mm: Move all mmu notifier invocations to be done outside the PT lock Haggai Eran
2012-09-04 22:07   ` Andrew Morton
2012-09-06 14:34     ` [PATCH V2 0/2] Enable clients to schedule in mmu_notifier methods Haggai Eran
2012-09-06 14:34       ` [PATCH V2 1/2] mm: Move all mmu notifier invocations to be done outside the PT lock Haggai Eran
2012-09-06 14:34       ` [PATCH V2 2/2] mm: Wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end Haggai Eran
2012-09-06 20:08       ` [PATCH V2 0/2] Enable clients to schedule in mmu_notifier methods Andrew Morton
2012-09-04  8:41 ` [PATCH V1 2/2] mm: Wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end Haggai Eran
2012-09-04 22:07   ` Andrew Morton
2012-09-04 22:06 ` [PATCH V1 0/2] Enable clients to schedule in mmu_notifier methods Andrew Morton
2012-09-05 12:55   ` Avi Kivity
2012-09-05 14:01   ` Haggai Eran [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50475B40.1060407@mellanox.com \
    --to=haggaie@mellanox.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=ogerlitz@mellanox.com \
    --cc=raindel@mellanox.com \
    --cc=sagig@mellanox.com \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox