From: Keith Busch <keith.busch@intel.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: ksummit-discuss@lists.linuxfoundation.org,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [Ksummit-discuss] [TECH TOPIC] IRQ affinity
Date: Wed, 15 Jul 2015 17:19:33 +0000 (UTC)
Message-ID: <alpine.LNX.2.00.1507151700300.15930@localhost.lm.intel.com>
In-Reply-To: <55A67F11.1030709@sandisk.com>

On Wed, 15 Jul 2015, Bart Van Assche wrote:
> * With blk-mq and scsi-mq optimal performance can only be achieved if
> the relationship between MSI-X vector and NUMA node does not change
> over time. This is necessary to allow a blk-mq/scsi-mq driver to
> ensure that interrupts are processed on the same NUMA node as the
> node on which the data structures for a communication channel have
> been allocated. However, today there is no API that allows
> blk-mq/scsi-mq drivers and irqbalanced to exchange information
> about the relationship between MSI-X vector ranges and NUMA nodes.

We could have low-level drivers provide blk-mq the controller's IRQ
associated with a particular h/w context, and the block layer can provide
the context's cpumask to irqbalance with the SMP affinity hint.

The nvme driver already uses the hwctx cpumask to set hints, but this
doesn't seem like it should be a driver responsibility. It currently
doesn't work correctly anyway with CPU hotplug, since blk-mq could
rebalance the h/w contexts without syncing with the low-level driver.

If we can add this to blk-mq, one additional case to consider is if the
same interrupt vector is used with multiple h/w contexts. blk-mq's CPU
assignment needs to be aware of this to prevent sharing a vector across
NUMA nodes.
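
As a rough illustration of the shared-vector case (a userspace model only,
not an existing blk-mq interface), the check could look something like the
Python sketch below. The topology, the dictionary layout, and the function
name are all hypothetical and chosen just for this example:

```python
from collections import defaultdict


def vectors_spanning_nodes(cpu_to_node, hctx_cpus, hctx_vector):
    """Return {vector: [nodes]} for every vector whose h/w contexts
    are mapped to CPUs on more than one NUMA node.

    cpu_to_node: {cpu: numa_node}            (hypothetical topology)
    hctx_cpus:   {hctx: [cpus it serves]}    (the context's cpumask)
    hctx_vector: {hctx: interrupt vector}    (may be shared)
    """
    nodes_per_vector = defaultdict(set)
    for hctx, cpus in hctx_cpus.items():
        vector = hctx_vector[hctx]
        for cpu in cpus:
            nodes_per_vector[vector].add(cpu_to_node[cpu])
    # Keep only the vectors that cross a node boundary.
    return {v: sorted(n) for v, n in nodes_per_vector.items() if len(n) > 1}


# Hypothetical machine: 4 CPUs, 2 NUMA nodes, 4 h/w contexts, 2 vectors.
cpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}
hctx_cpus = {0: [0], 1: [1], 2: [2], 3: [3]}

# Contexts sharing a vector stay within one node.
node_local = {0: 10, 1: 10, 2: 11, 3: 11}
# Contexts sharing a vector are interleaved across nodes.
interleaved = {0: 10, 1: 11, 2: 10, 3: 11}
```

With the node-local assignment the function reports nothing; with the
interleaved one it flags both vectors, which is the situation the CPU
assignment would need to avoid.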
> The only approach I know of that works today to define IRQ affinity
> for blk-mq/scsi-mq drivers is to disable irqbalanced and to run a
> custom script that defines IRQ affinity (see e.g. the
> spread-mlx4-ib-interrupts attachment of
> http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/21312/focus=98409).
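
For context, a script of that kind comes down to writing a CPU bitmask to
/proc/irq/<N>/smp_affinity for each vector. Below is a minimal Python
sketch, assuming irqbalance is stopped (otherwise it may overwrite the
setting) and that the IRQ numbers are already known. The function names
are hypothetical, and this is not the script from the attachment:

```python
def cpus_to_mask(cpus):
    """Format a set of CPU numbers as the hex bitmask accepted by
    /proc/irq/<N>/smp_affinity (comma-separated 32-bit groups,
    most significant group first)."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    groups = []
    while True:
        groups.append(bits & 0xFFFFFFFF)
        bits >>= 32
        if bits == 0:
            break
    groups.reverse()
    # Leading group is unpadded; the rest are zero-padded to 8 hex digits.
    return ",".join([f"{groups[0]:x}"] + [f"{g:08x}" for g in groups[1:]])


def set_irq_affinity(irq, cpus):
    """Pin one IRQ to the given CPUs. Needs root; not called here."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpus_to_mask(cpus))
```

A caller would loop over a device's vectors and pass each one the CPUs of
the NUMA node it should stay on, e.g. set_irq_affinity(irq, [0, 1, 2, 3])
for a vector that belongs to a node holding CPUs 0-3.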