* Re: [Ksummit-discuss] [CORE TOPIC] lightweight per-cpu locks / restartable sequences
2015-07-13 9:57 ` Peter Zijlstra
@ 2015-07-13 14:01 ` Christoph Lameter
2015-07-14 20:00 ` Andy Lutomirski
2015-07-22 14:22 ` Lai Jiangshan
2015-07-22 14:34 ` Lai Jiangshan
2 siblings, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2015-07-13 14:01 UTC (permalink / raw)
To: Peter Zijlstra
Cc: ksummit-discuss, linux-kernel, Jens Axboe, Mathieu Desnoyers, Shaohua Li
On Mon, 13 Jul 2015, Peter Zijlstra wrote:
> Now the 'problem' is finding these special regions fast, the easy
> solution is the same as the one proposed for userspace, one big section.
> That way the interrupt only has to check if the IP is inside this
> section which is minimal effort.
>
> The down side is that all percpu ops would then end up being full
> function calls. Which on some archs is indeed faster than disabling
> interrupts, but not by much I'm afraid.
Well one could move the entire functions that are using these ops into the
special sections. That is certainly an area requiring much more thought.
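A minimal sketch of the "one big section" check discussed above, in plain
C rather than kernel code. The struct, the function names and the single
restart address are hypothetical and only illustrate why the test in the
interrupt path is cheap: it is one range comparison on the interrupted IP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds of the single section holding all percpu ops. */
struct percpu_section {
	uintptr_t start;	/* first byte of the section */
	uintptr_t end;		/* one past the last byte of the section */
};

/* Return true if the interrupted instruction pointer lies in the section. */
static bool ip_in_percpu_section(const struct percpu_section *s, uintptr_t ip)
{
	assert(s->start <= s->end);
	/* Unsigned subtraction folds both bound checks into one compare. */
	return (ip - s->start) < (s->end - s->start);
}

/*
 * Pick the IP to resume at: the (assumed) restart point if the interrupt
 * hit inside the section, otherwise the interrupted IP unchanged.
 */
static uintptr_t percpu_resume_ip(const struct percpu_section *s,
				  uintptr_t ip, uintptr_t restart)
{
	return ip_in_percpu_section(s, ip) ? restart : ip;
}
```

A real implementation would need a restart point per operation rather
than one per section; the sketch only covers the detection step.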
> > optimize the x86 variants if interrupts also can detect critical sections
> > and restart at defined points.
>
> I really don't see how we can beat %GS prefixes with any such scheme.
We may be able to avoid RMW (read-modify-write) sequences, which would
allow the processor to schedule operations better.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] lightweight per-cpu locks / restartable sequences
2015-07-13 14:01 ` Christoph Lameter
@ 2015-07-14 20:00 ` Andy Lutomirski
2015-07-14 21:15 ` Christoph Lameter
0 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2015-07-14 20:00 UTC (permalink / raw)
To: Christoph Lameter
Cc: ksummit-discuss, Peter Zijlstra, linux-kernel, Jens Axboe,
Mathieu Desnoyers, Shaohua Li
On Mon, Jul 13, 2015 at 7:01 AM, Christoph Lameter <cl@linux.com> wrote:
> On Mon, 13 Jul 2015, Peter Zijlstra wrote:
>
>> Now the 'problem' is finding these special regions fast, the easy
>> solution is the same as the one proposed for userspace, one big section.
>> That way the interrupt only has to check if the IP is inside this
>> section which is minimal effort.
>>
>> The down side is that all percpu ops would then end up being full
>> function calls. Which on some archs is indeed faster than disabling
>> interrupts, but not by much I'm afraid.
>
> Well one could move the entire functions that are using these ops into the
> special sections. That is certainly an area requiring much more thought.
Hmm.
>
>> > optimize the x86 variants if interrupts also can detect critical sections
>> > and restart at defined points.
>>
>> I really don't see how we can beat %GS prefixes with any such scheme.
>
> We may be able to avoid RMW sequences which allows the processor to better
> schedule operations.
True, but cmpxchg is, surprisingly, pretty fast.
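For readers unfamiliar with the pattern, here is a user-space sketch of a
read-modify-write done with a compare-and-exchange retry loop, using C11
atomics. It is an analogue only: the function name is made up, and the
kernel's this_cpu_cmpxchg() on x86 uses a %gs-prefixed cmpxchg without the
lock prefix, which this portable version cannot express.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Add delta to *counter with a compare-and-exchange retry loop and
 * return the new value. Relaxed ordering is assumed to be sufficient
 * for a plain counter.
 */
static long cmpxchg_add(_Atomic long *counter, long delta)
{
	long old;

	assert(counter != NULL);
	old = atomic_load_explicit(counter, memory_order_relaxed);
	/* On failure 'old' is reloaded with the current value; retry. */
	while (!atomic_compare_exchange_weak_explicit(counter, &old,
						      old + delta,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;
	return old + delta;
}
```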
Crazy thought: At the risk of proposing something ridiculous, what if
we had per-cpu memory mappings? We could do this at the cost of up to
2kB of memcpy whenever we switch mms. Expensive but maybe not a
showstopper.
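One plausible reading of the 2kB figure, stated here as an assumption and
not something the message spells out: with x86-64 4-level paging the
top-level page table has 512 eight-byte entries, and the lower 256 of them
cover user space, so copying the user half into a per-cpu top-level table
is 256 * 8 = 2048 bytes. The type and function names below are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PGD_ENTRIES		512	/* entries in a 4-level top-level table */
#define USER_PGD_ENTRIES	256	/* lower half: user address space */

typedef uint64_t pgd_entry;		/* one 8-byte top-level entry */

/*
 * Copy the user half of an mm's top-level table into a per-cpu copy and
 * return the number of bytes moved.
 */
static size_t copy_user_half(pgd_entry *percpu_pgd, const pgd_entry *mm_pgd)
{
	size_t bytes = USER_PGD_ENTRIES * sizeof(pgd_entry);

	assert(bytes == 2048);		/* the "up to 2kB" per mm switch */
	memcpy(percpu_pgd, mm_pgd, bytes);
	return bytes;
}
```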
--Andy
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] lightweight per-cpu locks / restartable sequences
2015-07-14 20:00 ` Andy Lutomirski
@ 2015-07-14 21:15 ` Christoph Lameter
0 siblings, 0 replies; 10+ messages in thread
From: Christoph Lameter @ 2015-07-14 21:15 UTC (permalink / raw)
To: Andy Lutomirski
Cc: ksummit-discuss, Peter Zijlstra, linux-kernel, Jens Axboe,
Mathieu Desnoyers, Shaohua Li
On Tue, 14 Jul 2015, Andy Lutomirski wrote:
> Crazy thought: At the risk of proposing something ridiculous, what if
> we had per-cpu memory mappings? We could do this at the cost of up to
> 2kB of memcpy whenever we switch mms. Expensive but maybe not a
> showstopper.
This is not crazy and actually was done before. Itanium has that, and
it's doable there since the TLB insertion can be handled in software.
The problem on x86 is that one would need a separate page table for each
processor for each task. There is no way to handle TLB faults in
software to my knowledge.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] lightweight per-cpu locks / restartable sequences
2015-07-13 9:57 ` Peter Zijlstra
2015-07-13 14:01 ` Christoph Lameter
@ 2015-07-22 14:22 ` Lai Jiangshan
2015-07-22 14:34 ` Lai Jiangshan
2 siblings, 0 replies; 10+ messages in thread
From: Lai Jiangshan @ 2015-07-22 14:22 UTC (permalink / raw)
To: Peter Zijlstra
Cc: ksummit-discuss, linux-kernel, Jens Axboe, Mathieu Desnoyers,
Shaohua Li, Christoph Lameter
On Mon, Jul 13, 2015 at 5:57 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Jul 10, 2015 at 12:26:21PM -0500, Christoph Lameter wrote:
>> On Thu, 9 Jul 2015, Chris Mason wrote:
>>
>> > I think the topic is really interesting and we'll be able to get numbers
>> > from production workloads to help justify and compare different
>> > approaches.
>>
>> Ok that would be important. I also think that the approach may be used
>> in kernel to reduce the overhead of CONFIG_PREEMPT and also to implement
>> fast versions of this_cpu_ops for non x86 architectures and maybe even
>
>
> Also, I don't think we need a schedule check for the in-kernel usage,
> pure interrupt should be good enough, nobody should (want to) call
> schedule() while inside such a critical section, which leaves us with
> involuntary preemption, and those are purely interrupt driven.
>
> Now the 'problem' is finding these special regions fast, the easy
> solution is the same as the one proposed for userspace, one big section.
> That way the interrupt only has to check if the IP is inside this
> section which is minimal effort.
>
> The down side is that all percpu ops would then end up being full
> function calls. Which on some archs is indeed faster than disabling
> interrupts, but not by much I'm afraid.
Another downside is that the percpu ops can't call any function outside
the section. Otherwise it would be hard, or impossible, to detect whether
the interrupted code is in a special region.
If we disallow the percpu ops from calling any function, I think we can
insert some special instructions into the generated code along with
an entry in a table (like the exception table for copy_to_user()).
Then the interrupt only has to check for the special instructions
near the IP and confirm the match by looking it up in the table.
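A sketch of the table-confirmation half of this idea, in plain C. It
assumes a table sorted by start address with non-overlapping regions,
loosely modelled on how exception tables are searched; the entry layout
and function names are hypothetical, and the "special instruction" marker
check is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry describing one restartable region. */
struct restart_entry {
	uintptr_t start;	/* first instruction of the region */
	uintptr_t end;		/* one past the last instruction */
	uintptr_t restart;	/* where to resume if interrupted inside */
};

/* Binary search; the table must be sorted by start, without overlaps. */
static const struct restart_entry *
find_restart_entry(const struct restart_entry *tbl, size_t n, uintptr_t ip)
{
	size_t lo = 0, hi = n;

	assert(tbl != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < tbl[mid].start)
			hi = mid;
		else if (ip >= tbl[mid].end)
			lo = mid + 1;
		else
			return &tbl[mid];
	}
	return NULL;
}

/* Return the restart address if ip is inside a region, else ip itself. */
static uintptr_t fixup_ip(const struct restart_entry *tbl, size_t n,
			  uintptr_t ip)
{
	const struct restart_entry *e = find_restart_entry(tbl, n, ip);

	return e ? e->restart : ip;
}
```

Compared with the single-section check, the lookup costs O(log n) per
interrupt, but the regions can then be inlined anywhere in the kernel
text.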
>
>> optimize the x86 variants if interrupts also can detect critical sections
>> and restart at defined points.
>
> I really don't see how we can beat %GS prefixes with any such scheme.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] lightweight per-cpu locks / restartable sequences
2015-07-13 9:57 ` Peter Zijlstra
2015-07-13 14:01 ` Christoph Lameter
2015-07-22 14:22 ` Lai Jiangshan
@ 2015-07-22 14:34 ` Lai Jiangshan
2 siblings, 0 replies; 10+ messages in thread
From: Lai Jiangshan @ 2015-07-22 14:34 UTC (permalink / raw)
To: Peter Zijlstra
Cc: ksummit-discuss, linux-kernel, Jens Axboe, Mathieu Desnoyers,
Shaohua Li, Christoph Lameter
On Mon, Jul 13, 2015 at 5:57 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Jul 10, 2015 at 12:26:21PM -0500, Christoph Lameter wrote:
>> On Thu, 9 Jul 2015, Chris Mason wrote:
>>
>> > I think the topic is really interesting and we'll be able to get numbers
>> > from production workloads to help justify and compare different
>> > approaches.
>>
>> Ok that would be important. I also think that the approach may be used
>> in kernel to reduce the overhead of CONFIG_PREEMPT and also to implement
>> fast versions of this_cpu_ops for non x86 architectures and maybe even
>
> There is nothing stopping people from trying this in-kernel, in fact
> that would be lots easier as we do not have to commit to any one
> specific ABI for that.
It also gives us a nicer way to deal with NMIs and
to modify a slightly bigger struct irq-safely
if we have it in-kernel.
>
> Also, I don't think we need a schedule check for the in-kernel usage,
> pure interrupt should be good enough, nobody should (want to) call
> schedule() while inside such a critical section, which leaves us with
> involuntary preemption, and those are purely interrupt driven.
>
> Now the 'problem' is finding these special regions fast, the easy
> solution is the same as the one proposed for userspace, one big section.
> That way the interrupt only has to check if the IP is inside this
> section which is minimal effort.
>
> The down side is that all percpu ops would then end up being full
> function calls. Which on some archs is indeed faster than disabling
> interrupts, but not by much I'm afraid.
>
>> optimize the x86 variants if interrupts also can detect critical sections
>> and restart at defined points.
>
> I really don't see how we can beat %GS prefixes with any such scheme.
^ permalink raw reply [flat|nested] 10+ messages in thread