* Linus rollup
@ 2003-01-29 6:07 Andrew Morton
2003-01-29 6:53 ` David Mosberger
` (4 more replies)
0 siblings, 5 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-29 6:07 UTC (permalink / raw)
To: Russell King, Andi Kleen, David S. Miller, David Mosberger,
Anton Blanchard
Cc: linux-mm
Gents,
I've sifted out all the things which I intend to send to the boss soon. It
would be good if you could perform some quick non-ia32 testing please.
Possible breakage would be in the new frlock-for-xtime_lock code and the
get_order() cleanup.
The frlock code is showing nice speedups, but I think the main reason we want
this is to fix the problem wherein an application spinning on gettimeofday()
can make time stop.
It's all at
http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-lt1/
Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 6:07 Linus rollup Andrew Morton
@ 2003-01-29 6:53 ` David Mosberger
2003-01-29 7:25 ` David S. Miller
` (3 subsequent siblings)
4 siblings, 0 replies; 36+ messages in thread
From: David Mosberger @ 2003-01-29 6:53 UTC (permalink / raw)
To: Andrew Morton
Cc: Russell King, Andi Kleen, David S. Miller, David Mosberger,
Anton Blanchard, linux-mm
The patch looks good to me. I applied it on top of 2.5.59+ia64 and it
checked out fine on the Ski simulator and a dual-Itanium2 zx6000
workstation.
Thanks,
--david
>>>>> On Tue, 28 Jan 2003 22:07:29 -0800, Andrew Morton <akpm@digeo.com> said:
Andrew> Gents,
Andrew> I've sifted out all the things which I intend to send to the
Andrew> boss soon. It would be good if you could perform some quick
Andrew> non-ia32 testing please.
Andrew> Possible breakage would be in the new frlock-for-xtime_lock
Andrew> code and the get_order() cleanup.
Andrew> The frlock code is showing nice speedups, but I think the
Andrew> main reason we want this is to fix the problem wherein an
Andrew> application spinning on gettimeofday() can make time stop.
Andrew> It's all at
Andrew> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-lt1/
Andrew> Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 6:07 Linus rollup Andrew Morton
2003-01-29 6:53 ` David Mosberger
@ 2003-01-29 7:25 ` David S. Miller
2003-01-29 9:33 ` Andrew Morton
` (2 subsequent siblings)
4 siblings, 0 replies; 36+ messages in thread
From: David S. Miller @ 2003-01-29 7:25 UTC (permalink / raw)
To: akpm; +Cc: rmk, ak, davidm, anton, linux-mm
I've sifted out all the things which I intend to send to the boss soon. It
would be good if you could perform some quick non-ia32 testing please.
Possible breakage would be in the new frlock-for-xtime_lock code and the
get_order() cleanup.
It all looks fine, I didn't sanity build/boot it on sparc but if
there are any problems they will be minor and I'll just fix it up
when Linus grabs your stuff.
BTW, that:
do {
seq = fr_read_begin(...);
sec = foo;
usec = bar;
} while (seq != fr_read_end(...))
loop is duplicated nearly identically perhaps 10 to 15 times, would
be nice to put it in one spot if possible. :-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 6:07 Linus rollup Andrew Morton
2003-01-29 6:53 ` David Mosberger
2003-01-29 7:25 ` David S. Miller
@ 2003-01-29 9:33 ` Andrew Morton
2003-01-29 9:35 ` David S. Miller
2003-01-29 9:54 ` Anton Blanchard
2003-01-29 9:59 ` Russell King
4 siblings, 1 reply; 36+ messages in thread
From: Andrew Morton @ 2003-01-29 9:33 UTC (permalink / raw)
To: rmk, ak, davem, davidm, anton, linux-mm
Andrew Morton <akpm@digeo.com> wrote:
>
> Gents,
>
> I've sifted out all the things which I intend to send to the boss soon
Forgot to mention:
- sys_semtimedop() is only wired up for ia32/ia64 at present.
- This rollup contains the new sys_fadvise(), so unless Linus bounces
my first syscall, that will need hooking up for non-ia32 too.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 9:33 ` Andrew Morton
@ 2003-01-29 9:35 ` David S. Miller
0 siblings, 0 replies; 36+ messages in thread
From: David S. Miller @ 2003-01-29 9:35 UTC (permalink / raw)
To: akpm; +Cc: rmk, ak, davidm, anton, linux-mm
Forgot to mention:
I wish there was some way people could easily add an
#error to the build so that arch's know which syscalls
need to be added next time someone tries to run make
for that platform.
Nothing comes immediately to mind as an idea however.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 9:59 ` Russell King
@ 2003-01-29 9:51 ` David S. Miller
2003-01-29 10:26 ` Andrew Morton
2003-01-29 10:16 ` Andrew Morton
2003-01-29 17:04 ` David Mosberger
2 siblings, 1 reply; 36+ messages in thread
From: David S. Miller @ 2003-01-29 9:51 UTC (permalink / raw)
To: rmk; +Cc: akpm, ak, davidm, anton, linux-mm
/* This function must be called with interrupts disabled
which hasn't been true for some time, and is even less true now that
local IRQs don't get disabled. Does this matter... for UP?
I disable local IRQs during gettimeofday() on sparc.
These locks definitely need to be taken with IRQs disabled.
Why isn't x86 doing that?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 6:07 Linus rollup Andrew Morton
` (2 preceding siblings ...)
2003-01-29 9:33 ` Andrew Morton
@ 2003-01-29 9:54 ` Anton Blanchard
2003-01-29 9:59 ` Russell King
4 siblings, 0 replies; 36+ messages in thread
From: Anton Blanchard @ 2003-01-29 9:54 UTC (permalink / raw)
To: Andrew Morton
Cc: Russell King, Andi Kleen, David S. Miller, David Mosberger, linux-mm
> I've sifted out all the things which I intend to send to the boss soon. It
> would be good if you could perform some quick non-ia32 testing please.
>
> Possible breakage would be in the new frlock-for-xtime_lock code and the
> get_order() cleanup.
>
> The frlock code is showing nice speedups, but I think the main reason we want
> this is to fix the problem wherein an application spinning on gettimeofday()
> can make time stop.
Checks out OK on ppc64 bar some problems with the get_order patch.
(should include linux/bitops.h instead of asm/bitops.h).
It passed some stress tests (sdet, tpc-h)
Anton
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 6:07 Linus rollup Andrew Morton
` (3 preceding siblings ...)
2003-01-29 9:54 ` Anton Blanchard
@ 2003-01-29 9:59 ` Russell King
2003-01-29 9:51 ` David S. Miller
` (2 more replies)
4 siblings, 3 replies; 36+ messages in thread
From: Russell King @ 2003-01-29 9:59 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, David S. Miller, David Mosberger, Anton Blanchard, linux-mm
On Tue, Jan 28, 2003 at 10:07:29PM -0800, Andrew Morton wrote:
> Possible breakage would be in the new frlock-for-xtime_lock code and the
> get_order() cleanup.
>
> The frlock code is showing nice speedups, but I think the main reason we want
> this is to fix the problem wherein an application spinning on gettimeofday()
> can make time stop.
I'm slightly concerned about this. With this patch, we generally seem
to do:
do {
seq = fr_read_begin(&xtime_lock);
*tv = xtime;
tv->tv_usec += do_fast_gettimeoffset();
tv->tv_usec += lost_ticks;
} while (seq != fr_read_end(&xtime_lock));
This is fine when considering xtime. However, considering the
implementation of do_fast_gettimeoffset(). Notice that
linux/arch/i386/kernel/timers/timer_pit.c:do_offset_pit specifically
makes the comment:
/* This function must be called with interrupts disabled
which hasn't been true for some time, and is even less true now that
local IRQs don't get disabled. Does this matter... for UP?
The same is true for other architectures; their gettimeoffset
implementations need to be audited by the architecture maintainers
to ensure that they are safe to run with (local) interrupts enabled.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 9:59 ` Russell King
2003-01-29 9:51 ` David S. Miller
@ 2003-01-29 10:16 ` Andrew Morton
2003-01-29 17:04 ` David Mosberger
2 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-29 10:16 UTC (permalink / raw)
To: Russell King; +Cc: ak, davem, davidm, anton, linux-mm
Russell King <rmk@arm.linux.org.uk> wrote:
>
> On Tue, Jan 28, 2003 at 10:07:29PM -0800, Andrew Morton wrote:
> > Possible breakage would be in the new frlock-for-xtime_lock code and the
> > get_order() cleanup.
> >
> > The frlock code is showing nice speedups, but I think the main reason we want
> > this is to fix the problem wherein an application spinning on gettimeofday()
> > can make time stop.
>
> I'm slightly concerned about this. With this patch, we generally seem
> to do:
>
> do {
> seq = fr_read_begin(&xtime_lock);
> *tv = xtime;
> tv->tv_usec += do_fast_gettimeoffset();
> tv->tv_usec += lost_ticks;
> } while (seq != fr_read_end(&xtime_lock));
>
> This is fine when considering xtime. However, considering the
> implementation of do_fast_gettimeoffset(). Notice that
> linux/arch/i386/kernel/timers/timer_pit.c:do_offset_pit specifically
> makes the comment:
>
> /* This function must be called with interrupts disabled
>
> which hasn't been true for some time, and is even less true now that
> local IRQs don't get disabled. Does this matter... for UP?
>
> The same is true for other architectures; their gettimeoffset
> implementations need to be audited by the architecture maintainers
> to ensure that they are safe to run with (local) interrupts enabled.
Thanks for spotting that. Bug.
How about I make all instances of the above be:
do {
seq = fr_read_begin_irqsave(&xtime_lock, flags);
*tv = xtime;
tv->tv_usec += do_fast_gettimeoffset();
tv->tv_usec += lost_ticks;
} while (seq != fr_read_end_irqrestore(&xtime_lock, flags));
?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 9:51 ` David S. Miller
@ 2003-01-29 10:26 ` Andrew Morton
2003-01-29 22:35 ` Stephen Hemminger
2003-01-30 0:43 ` Andrea Arcangeli
0 siblings, 2 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-29 10:26 UTC (permalink / raw)
To: David S. Miller; +Cc: rmk, ak, davidm, anton, linux-mm, Andrea Arcangeli
"David S. Miller" <davem@redhat.com> wrote:
>
> From: Russell King <rmk@arm.linux.org.uk>
> Date: Wed, 29 Jan 2003 09:59:49 +0000
>
> /* This function must be called with interrupts disabled
>
> which hasn't been true for some time, and is even less true now that
> local IRQs don't get disabled. Does this matter... for UP?
>
> I disable local IRQs during gettimeofday() on sparc.
>
> These locks definitely need to be taken with IRQs disabled.
> Why isn't x86 doing that?
Darned if I know. Looks like Andrea's kernel will deadlock if
arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock
while that cpu is holding the same lock in do_slow_gettimeoffset().
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 9:59 ` Russell King
2003-01-29 9:51 ` David S. Miller
2003-01-29 10:16 ` Andrew Morton
@ 2003-01-29 17:04 ` David Mosberger
2003-01-29 17:25 ` Russell King
2 siblings, 1 reply; 36+ messages in thread
From: David Mosberger @ 2003-01-29 17:04 UTC (permalink / raw)
To: Russell King
Cc: Andrew Morton, Andi Kleen, David S. Miller, David Mosberger,
Anton Blanchard, linux-mm
>>>>> On Wed, 29 Jan 2003 09:59:49 +0000, Russell King <rmk@arm.linux.org.uk> said:
Russell> On Tue, Jan 28, 2003 at 10:07:29PM -0800, Andrew Morton
Russell> wrote:
>> The frlock code is showing nice speedups, but I think the main
>> reason we want this is to fix the problem wherein an application
>> spinning on gettimeofday() can make time stop.
Russell> I'm slightly concerned about this. With this patch, we
Russell> generally seem to do:
Russell> [snip...]
Russell> The same is true for other architectures; their
Russell> gettimeoffset implementations need to be audited by the
Russell> architecture maintainers to ensure that they are safe to
Russell> run with (local) interrupts enabled.
Should be fine as far as ia64 is concerned, since gettimeoffset()
currently simply reads the cycle-counter (and I think even HPET-based
interpolation would be lock-free).
--david
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 17:04 ` David Mosberger
@ 2003-01-29 17:25 ` Russell King
2003-01-29 19:05 ` David Mosberger
0 siblings, 1 reply; 36+ messages in thread
From: Russell King @ 2003-01-29 17:25 UTC (permalink / raw)
To: davidm
Cc: Andrew Morton, Andi Kleen, David S. Miller, David Mosberger,
Anton Blanchard, linux-mm
On Wed, Jan 29, 2003 at 09:04:37AM -0800, David Mosberger wrote:
> Should be fine as far as ia64 is concerned, since gettimeoffset()
> currently simply reads the cycle-counter (and I think even HPET-based
> interpolation would be lock-free).
If you're happy, then I'm happy.
I was only concerned because it looks like it might be a problem on
some implementations, and I was wondering what would happen on ia64
if a timer interrupt occurs between reading jiffies and itm_next in
gettimeoffset.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 17:25 ` Russell King
@ 2003-01-29 19:05 ` David Mosberger
0 siblings, 0 replies; 36+ messages in thread
From: David Mosberger @ 2003-01-29 19:05 UTC (permalink / raw)
To: Russell King
Cc: davidm, Andrew Morton, Andi Kleen, David S. Miller,
David Mosberger, Anton Blanchard, linux-mm
>>>>> On Wed, 29 Jan 2003 17:25:19 +0000, Russell King <rmk@arm.linux.org.uk> said:
Russell> I was only concerned because it looks like it might be a
Russell> problem on some implementations, and I was wondering what
Russell> would happen on ia64 if a timer interrupt occurs between
Russell> reading jiffies and itm_next in gettimeoffset.
Perhaps I'm missing something, but I thought that in this case,
fr_write_unlock() would increment the counter and cause the reader(s)
to restart the reading of the time-related variables.
I do agree that, in general, there is a potential deadlock if
gettimeoffset() is not lock-free.
--david
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 10:26 ` Andrew Morton
@ 2003-01-29 22:35 ` Stephen Hemminger
2003-01-29 23:12 ` Andrew Morton
2003-01-30 0:46 ` Andrea Arcangeli
2003-01-30 0:43 ` Andrea Arcangeli
1 sibling, 2 replies; 36+ messages in thread
From: Stephen Hemminger @ 2003-01-29 22:35 UTC (permalink / raw)
To: Andrew Morton
Cc: David S. Miller, rmk, ak, davidm, anton, linux-mm, Andrea Arcangeli
On Wed, 2003-01-29 at 02:26, Andrew Morton wrote:
> "David S. Miller" <davem@redhat.com> wrote:
> >
> > From: Russell King <rmk@arm.linux.org.uk>
> > Date: Wed, 29 Jan 2003 09:59:49 +0000
> >
> > /* This function must be called with interrupts disabled
> >
> > which hasn't been true for some time, and is even less true now that
> > local IRQs don't get disabled. Does this matter... for UP?
> >
> > I disable local IRQs during gettimeofday() on sparc.
> >
> > These locks definitely need to be taken with IRQs disabled.
> > Why isn't x86 doing that?
>
> Darned if I know. Looks like Andrea's kernel will deadlock if
> arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock
> while that cpu is holding the same lock in do_slow_gettimeoffset().
Rather than disabling interrupts in the i386 do_gettimeofday
why not just change spin_lock(&i8253_lock) to spin_lock_irqsave
in timer_pit.c
--
Stephen Hemminger <shemminger@osdl.org>
Open Source Devlopment Lab
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 22:35 ` Stephen Hemminger
@ 2003-01-29 23:12 ` Andrew Morton
2003-01-30 0:30 ` David S. Miller
2003-01-30 0:46 ` Andrea Arcangeli
1 sibling, 1 reply; 36+ messages in thread
From: Andrew Morton @ 2003-01-29 23:12 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: davem, rmk, ak, davidm, anton, linux-mm, andrea
Stephen Hemminger <shemminger@osdl.org> wrote:
>
> On Wed, 2003-01-29 at 02:26, Andrew Morton wrote:
> > "David S. Miller" <davem@redhat.com> wrote:
> > >
> > > From: Russell King <rmk@arm.linux.org.uk>
> > > Date: Wed, 29 Jan 2003 09:59:49 +0000
> > >
> > > /* This function must be called with interrupts disabled
> > >
> > > which hasn't been true for some time, and is even less true now that
> > > local IRQs don't get disabled. Does this matter... for UP?
> > >
> > > I disable local IRQs during gettimeofday() on sparc.
> > >
> > > These locks definitely need to be taken with IRQs disabled.
> > > Why isn't x86 doing that?
> >
> > Darned if I know. Looks like Andrea's kernel will deadlock if
> > arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock
> > while that cpu is holding the same lock in do_slow_gettimeoffset().
>
> Rather than disabling interrupts in the i386 do_gettimeofday
> why not just change spin_lock(&i8253_lock) to spin_lock_irqsave
> in timer_pit.c
That's probably a legitimate fix, subject to audit of the other
implementations of ->get_offset().
But that would be a separate patch. _all_ we are doing here is fixing and
optimising the xtime_lock problems. We should seek to do that with
"equivalent transformations".
Fine-tuning the ia32 timer implementation is all well and good, but should
not be kerfuddled with kernel-wide xtime_lock rework.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 23:12 ` Andrew Morton
@ 2003-01-30 0:30 ` David S. Miller
2003-01-30 1:27 ` Andrew Morton
0 siblings, 1 reply; 36+ messages in thread
From: David S. Miller @ 2003-01-30 0:30 UTC (permalink / raw)
To: akpm; +Cc: shemminger, rmk, ak, davidm, anton, linux-mm, andrea
But that would be a separate patch. _all_ we are doing here is fixing and
optimising the xtime_lock problems. We should seek to do that with
"equivalent transformations".
I agree, do this and arch people can tweak later.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 10:26 ` Andrew Morton
2003-01-29 22:35 ` Stephen Hemminger
@ 2003-01-30 0:43 ` Andrea Arcangeli
1 sibling, 0 replies; 36+ messages in thread
From: Andrea Arcangeli @ 2003-01-30 0:43 UTC (permalink / raw)
To: Andrew Morton; +Cc: David S. Miller, rmk, ak, davidm, anton, linux-mm
On Wed, Jan 29, 2003 at 02:26:17AM -0800, Andrew Morton wrote:
> "David S. Miller" <davem@redhat.com> wrote:
> >
> > From: Russell King <rmk@arm.linux.org.uk>
> > Date: Wed, 29 Jan 2003 09:59:49 +0000
> >
> > /* This function must be called with interrupts disabled
> >
> > which hasn't been true for some time, and is even less true now that
> > local IRQs don't get disabled. Does this matter... for UP?
> >
> > I disable local IRQs during gettimeofday() on sparc.
> >
> > These locks definitely need to be taken with IRQs disabled.
> > Why isn't x86 doing that?
>
> Darned if I know. Looks like Andrea's kernel will deadlock if
> arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock
> while that cpu is holding the same lock in do_slow_gettimeoffset().
yes thanks! mostly theorical though, 486 SMP boxes aren't that common ;)
This should fix it.
--- 2.4.21pre3aa1/arch/i386/kernel/time.c.~1~ 2003-01-21 03:43:59.000000000 +0100
+++ 2.4.21pre3aa1/arch/i386/kernel/time.c 2003-01-30 01:40:37.000000000 +0100
@@ -159,6 +159,7 @@ extern spinlock_t i8259A_lock;
static unsigned long do_slow_gettimeoffset(void)
{
int count;
+ unsigned long flags;
static int count_p = LATCH; /* for the first call after boot */
static unsigned long jiffies_p = 0;
@@ -169,7 +170,7 @@ static unsigned long do_slow_gettimeoffs
unsigned long jiffies_t;
/* gets recalled with irq locally disabled */
- spin_lock(&i8253_lock);
+ spin_lock_irqsave(&i8253_lock, flags);
/* timer count may underflow right here */
outb_p(0x00, 0x43); /* latch the count ASAP */
@@ -191,7 +192,7 @@ static unsigned long do_slow_gettimeoffs
count = LATCH - 1;
}
- spin_unlock(&i8253_lock);
+ spin_unlock_irqrestore(&i8253_lock, flags);
/*
* avoiding timer inconsistencies (they are rare, but they happen)...
@@ -212,13 +213,13 @@ static unsigned long do_slow_gettimeoffs
int i;
- spin_lock(&i8259A_lock);
+ spin_lock_irqsave(&i8259A_lock, flags);
/*
* This is tricky when I/O APICs are used;
* see do_timer_interrupt().
*/
i = inb(0x20);
- spin_unlock(&i8259A_lock);
+ spin_unlock_irqrestore(&i8259A_lock, flags);
/* assumption about timer being IRQ0 */
if (i & 0x01) {
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-29 22:35 ` Stephen Hemminger
2003-01-29 23:12 ` Andrew Morton
@ 2003-01-30 0:46 ` Andrea Arcangeli
1 sibling, 0 replies; 36+ messages in thread
From: Andrea Arcangeli @ 2003-01-30 0:46 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Andrew Morton, David S. Miller, rmk, ak, davidm, anton, linux-mm
On Wed, Jan 29, 2003 at 02:35:52PM -0800, Stephen Hemminger wrote:
> On Wed, 2003-01-29 at 02:26, Andrew Morton wrote:
> > "David S. Miller" <davem@redhat.com> wrote:
> > >
> > > From: Russell King <rmk@arm.linux.org.uk>
> > > Date: Wed, 29 Jan 2003 09:59:49 +0000
> > >
> > > /* This function must be called with interrupts disabled
> > >
> > > which hasn't been true for some time, and is even less true now that
> > > local IRQs don't get disabled. Does this matter... for UP?
> > >
> > > I disable local IRQs during gettimeofday() on sparc.
> > >
> > > These locks definitely need to be taken with IRQs disabled.
> > > Why isn't x86 doing that?
> >
> > Darned if I know. Looks like Andrea's kernel will deadlock if
> > arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock
> > while that cpu is holding the same lock in do_slow_gettimeoffset().
>
> Rather than disabling interrupts in the i386 do_gettimeofday
> why not just change spin_lock(&i8253_lock) to spin_lock_irqsave
> in timer_pit.c
as you can see from my patch, I agree
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:27 ` Andrew Morton
@ 2003-01-30 1:24 ` Andi Kleen
2003-01-30 2:01 ` Andrew Morton
2003-01-30 1:35 ` Andrea Arcangeli
1 sibling, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2003-01-30 1:24 UTC (permalink / raw)
To: Andrew Morton
Cc: David S. Miller, shemminger, rmk, ak, davidm, anton, linux-mm, andrea
> This is an optimisation to the ia64, ia32 and x86_64 do_gettimeofday() code
> above and beyond the base frlock work.
Hmm? No x86_64 changes in there.
> arch/i386/kernel/time.c | 7 +++----
> arch/i386/kernel/timers/timer_pit.c | 7 +++----
> arch/ia64/kernel/time.c | 5 ++---
> include/linux/frlock.h | 8 ++++++--
-Andi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 0:30 ` David S. Miller
@ 2003-01-30 1:27 ` Andrew Morton
2003-01-30 1:24 ` Andi Kleen
2003-01-30 1:35 ` Andrea Arcangeli
0 siblings, 2 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-30 1:27 UTC (permalink / raw)
To: David S. Miller; +Cc: shemminger, rmk, ak, davidm, anton, linux-mm, andrea
"David S. Miller" <davem@redhat.com> wrote:
>
> From: Andrew Morton <akpm@digeo.com>
> Date: Wed, 29 Jan 2003 15:12:06 -0800
>
> But that would be a separate patch. _all_ we are doing here is fixing and
> optimising the xtime_lock problems. We should seek to do that with
> "equivalent transformations".
>
> I agree, do this and arch people can tweak later.
Stephen has already tweaked. Hopefully Andi and David can review this
sometime:
This is an optimisation to the ia64, ia32 and x86_64 do_gettimeofday() code
above and beyond the base frlock work.
Patch from Stephen Hemminger <shemminger@osdl.org>
* Don't need to disable interrupts on ia64, x86_64 or ia32 with TSC
getttimeofday. Disabling interrupts is noticeably slower ~3% so
would really like to only do it if necessary.
* Some more cleanup of frlock.h in the macro's that aren't being used yet.
arch/i386/kernel/time.c | 7 +++----
arch/i386/kernel/timers/timer_pit.c | 7 +++----
arch/ia64/kernel/time.c | 5 ++---
include/linux/frlock.h | 8 ++++++--
i386/kernel/apm.c | 0
5 files changed, 14 insertions(+), 13 deletions(-)
diff -puN arch/i386/kernel/apm.c~do_gettimeofday-speedup arch/i386/kernel/apm.c
diff -puN arch/i386/kernel/time.c~do_gettimeofday-speedup arch/i386/kernel/time.c
--- 25/arch/i386/kernel/time.c~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003
+++ 25-akpm/arch/i386/kernel/time.c Wed Jan 29 16:39:41 2003
@@ -86,22 +86,21 @@ struct timer_opts* timer = &timer_none;
*/
void do_gettimeofday(struct timeval *tv)
{
- unsigned long flags;
unsigned long seq;
unsigned long usec, sec;
do {
- seq = fr_read_begin_irqsave(&xtime_lock, flags);
+ seq = fr_read_begin(&xtime_lock);
usec = timer->get_offset();
{
unsigned long lost = jiffies - wall_jiffies;
- if (lost)
+ if (unlikely(lost != 0))
usec += lost * (1000000 / HZ);
}
sec = xtime.tv_sec;
usec += (xtime.tv_nsec / 1000);
- } while (unlikely(seq != fr_read_end_irqrestore(&xtime_lock, flags)));
+ } while (unlikely(seq != fr_read_end(&xtime_lock)));
while (usec >= 1000000) {
usec -= 1000000;
diff -puN arch/i386/kernel/timers/timer_pit.c~do_gettimeofday-speedup arch/i386/kernel/timers/timer_pit.c
--- 25/arch/i386/kernel/timers/timer_pit.c~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003
+++ 25-akpm/arch/i386/kernel/timers/timer_pit.c Wed Jan 29 16:39:41 2003
@@ -76,7 +76,7 @@ static void delay_pit(unsigned long loop
static unsigned long get_offset_pit(void)
{
int count;
-
+ unsigned long flags;
static int count_p = LATCH; /* for the first call after boot */
static unsigned long jiffies_p = 0;
@@ -85,8 +85,7 @@ static unsigned long get_offset_pit(void
*/
unsigned long jiffies_t;
- /* gets recalled with irq locally disabled */
- spin_lock(&i8253_lock);
+ spin_lock_irqsave(&i8253_lock, flags);
/* timer count may underflow right here */
outb_p(0x00, 0x43); /* latch the count ASAP */
@@ -108,7 +107,7 @@ static unsigned long get_offset_pit(void
count = LATCH - 1;
}
- spin_unlock(&i8253_lock);
+ spin_unlock_irqrestore(&i8253_lock, flags);
/*
* avoiding timer inconsistencies (they are rare, but they happen)...
diff -puN arch/ia64/kernel/time.c~do_gettimeofday-speedup arch/ia64/kernel/time.c
--- 25/arch/ia64/kernel/time.c~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003
+++ 25-akpm/arch/ia64/kernel/time.c Wed Jan 29 16:39:41 2003
@@ -117,11 +117,10 @@ do_settimeofday (struct timeval *tv)
void
do_gettimeofday (struct timeval *tv)
{
- unsigned long flags;
unsigned long seq, usec, sec, old;
do {
- seq = fr_read_begin_irqsave(&xtime_lock, flags);
+ seq = fr_read_begin(&xtime_lock);
usec = gettimeoffset();
/*
@@ -138,7 +137,7 @@ do_gettimeofday (struct timeval *tv)
sec = xtime.tv_sec;
usec += xtime.tv_nsec / 1000;
- } while (seq != fr_read_end_irqrestore(&xtime_lock, flags));
+ } while (seq != fr_read_end(&xtime_lock));
while (usec >= 1000000) {
diff -puN include/linux/frlock.h~do_gettimeofday-speedup include/linux/frlock.h
--- 25/include/linux/frlock.h~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003
+++ 25-akpm/include/linux/frlock.h Wed Jan 29 16:42:15 2003
@@ -64,6 +64,9 @@ static inline void fr_write_end(frlock_t
preempt_enable();
}
+/* Lock out other writers and update the count.
+ * Acts like a normal spin_lock/unlock.
+ */
static inline void fr_write_lock(frlock_t *rw)
{
spin_lock(&rw->lock);
@@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc
if (ret) {
++rw->pre_sequence;
- wmb();
+ mb();
}
return ret;
}
+/* Start of read calculation -- fetch last complete writer token */
static inline unsigned fr_read_begin(const frlock_t *rw)
{
unsigned ret = rw->post_sequence;
@@ -95,6 +99,7 @@ static inline unsigned fr_read_begin(con
}
+/* End of reader calculation -- fetch last writer start token */
static inline unsigned fr_read_end(const frlock_t *rw)
{
rmb();
@@ -128,4 +133,3 @@ static inline unsigned fr_read_end(const
})
#endif /* __LINUX_FRLOCK_H */
-
_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:27 ` Andrew Morton
2003-01-30 1:24 ` Andi Kleen
@ 2003-01-30 1:35 ` Andrea Arcangeli
2003-01-30 2:00 ` Andrew Morton
1 sibling, 1 reply; 36+ messages in thread
From: Andrea Arcangeli @ 2003-01-30 1:35 UTC (permalink / raw)
To: Andrew Morton
Cc: David S. Miller, shemminger, rmk, ak, davidm, anton, linux-mm
On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote:
> @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc
>
> if (ret) {
> ++rw->pre_sequence;
> - wmb();
> + mb();
> }
this isn't needed
if we hold the spinlock, the serialized memory can't be change under us,
so there's no need to put a read barrier, we only care that pre_sequence
is visible before the chagnes are visible and before post_sequence is
visible, hence only wmb() (after spin_lock and pre_sequence++) is
needed there and only rmb() is needed in the read-side.
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 2:00 ` Andrew Morton
@ 2003-01-30 1:50 ` Jeff Garzik
2003-01-30 2:03 ` Andrea Arcangeli
2003-01-30 1:52 ` Richard Henderson
` (2 subsequent siblings)
3 siblings, 1 reply; 36+ messages in thread
From: Jeff Garzik @ 2003-01-30 1:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Andrea Arcangeli, davem, shemminger, rmk, ak, davidm, anton,
linux-mm, rth
Andrew Morton wrote:
> #ifndef __LINUX_FRLOCK_H
> #define __LINUX_FRLOCK_H
>
> /*
> * Fast read-write spinlocks.
Can we please pick a unique name whose meaning will not change over
time? Even "andre[wa]_rw_lock" would be better, because its meaning is
not tied to performance at this specific point in time, on today's ia32
flavor-of-the-month.
If we discover even-yet-faster read-write spinlocks tomorrow, this name
is going to become a joke :)
Jeff
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 2:00 ` Andrew Morton
2003-01-30 1:50 ` Jeff Garzik
@ 2003-01-30 1:52 ` Richard Henderson
2003-01-30 2:06 ` Andrea Arcangeli
2003-01-30 1:54 ` Andrea Arcangeli
2003-01-30 2:00 ` Richard Henderson
3 siblings, 1 reply; 36+ messages in thread
From: Richard Henderson @ 2003-01-30 1:52 UTC (permalink / raw)
To: Andrew Morton
Cc: Andrea Arcangeli, davem, shemminger, rmk, ak, davidm, anton, linux-mm
On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote:
> Andrea Arcangeli <andrea@suse.de> wrote:
> > if we hold the spinlock, the serialized memory can't be change under us,
> > so there's no need to put a read barrier, we only care that pre_sequence
> > is visible before the chagnes are visible and before post_sequence is
> > visible, hence only wmb() (after spin_lock and pre_sequence++) is
> > needed there and only rmb() is needed in the read-side.
Hmm. Perhaps I was confused about how these things are intended
to be used. If indeed the writer doesn't care about the order
in which pre/post_sequence are accessed, then wmb is sufficient
to keep their updates ordered.
r~
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 2:00 ` Andrew Morton
2003-01-30 1:50 ` Jeff Garzik
2003-01-30 1:52 ` Richard Henderson
@ 2003-01-30 1:54 ` Andrea Arcangeli
2003-01-30 2:19 ` Andrew Morton
2003-01-30 17:37 ` Stephen Hemminger
2003-01-30 2:00 ` Richard Henderson
3 siblings, 2 replies; 36+ messages in thread
From: Andrea Arcangeli @ 2003-01-30 1:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth
On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote:
> Andrea Arcangeli <andrea@suse.de> wrote:
> >
> > On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote:
> > > @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc
> > >
> > > if (ret) {
> > > ++rw->pre_sequence;
> > > - wmb();
> > > + mb();
> > > }
> >
> > this isn't needed
> >
> >
> > if we hold the spinlock, the serialized memory can't be change under us,
> > so there's no need to put a read barrier, we only care that pre_sequence
> > is visible before the chagnes are visible and before post_sequence is
> > visible, hence only wmb() (after spin_lock and pre_sequence++) is
> > needed there and only rmb() is needed in the read-side.
> >
>
> OK, thanks muchly.
>
> Lots more updates. Here's the version which I currently have. Looks like
> fr_write_lock() and fr_write_unlock() need to be switched back to rmb()?
you certainly mean wmb() not rmb(), right? If yes, then yes.
I actually didn't notice the write_begin/end, not sure who could need
them, I would suggest removing them, rather than to revert the mb()
there too.
>
>
>
> #ifndef __LINUX_FRLOCK_H
> #define __LINUX_FRLOCK_H
>
> /*
> * Fast read-write spinlocks.
> *
> * Fast reader/writer locks without starving writers. This type of
> * lock for data where the reader wants a consitent set of information
> * and is willing to retry if the information changes. Readers never
> * block but they may have to retry if a writer is in
> * progress. Writers do not wait for readers.
> *
> * Generalization on sequence variables used for gettimeofday on x86-64
> * by Andrea Arcangeli
> *
> * This is not as cache friendly as brlock. Also, this will not work
> * for data that contains pointers, because any writer could
> * invalidate a pointer that a reader was following.
> *
> * Expected reader usage:
> * do {
> * seq = fr_read_begin();
> * ...
> * } while (seq != fr_read_end());
> *
> * On non-SMP the spin locks disappear but the writer still needs
> * to increment the sequence variables because an interrupt routine could
> * change the state of the data.
> */
>
> #include <linux/config.h>
> #include <linux/spinlock.h>
> #include <linux/preempt.h>
>
> typedef struct {
> unsigned pre_sequence;
> unsigned post_sequence;
> spinlock_t lock;
> } frlock_t;
>
> /*
> * These macros triggered gcc-3.x compile-time problems. We think these are
> * OK now. Be cautious.
> */
> #define FR_LOCK_UNLOCKED { 0, 0, SPIN_LOCK_UNLOCKED }
> #define frlock_init(x) do { *(x) = (frlock_t) FR_LOCK_UNLOCKED; } while (0)
>
> /* Update sequence count only
> * Assumes caller is doing own mutual exclusion with other lock
> * or semaphore.
> */
> static inline void fr_write_begin(frlock_t *rw)
> {
> preempt_disable();
> rw->pre_sequence++;
> mb();
> }
>
> static inline void fr_write_end(frlock_t *rw)
> {
> mb();
> rw->post_sequence++;
> BUG_ON(rw->post_sequence != rw->pre_sequence);
> preempt_enable();
> }
>
> /* Lock out other writers and update the count.
> * Acts like a normal spin_lock/unlock.
> */
> static inline void fr_write_lock(frlock_t *rw)
> {
> spin_lock(&rw->lock);
> rw->pre_sequence++;
> mb();
> }
>
> static inline void fr_write_unlock(frlock_t *rw)
> {
> mb();
> rw->post_sequence++;
> spin_unlock(&rw->lock);
> }
>
> static inline int fr_write_trylock(frlock_t *rw)
> {
> int ret = spin_trylock(&rw->lock);
>
> if (ret) {
> ++rw->pre_sequence;
> wmb();
> }
> return ret;
> }
>
> static inline unsigned fr_read_begin(const frlock_t *rw)
> {
> unsigned ret = rw->post_sequence;
> rmb();
> return ret;
>
> }
>
> /* End of reader calculation -- fetch last writer start token */
> static inline unsigned fr_read_end(const frlock_t *rw)
> {
> rmb();
> return rw->pre_sequence;
> }
>
> /*
> * Possible sw/hw IRQ protected versions of the interfaces.
> */
> #define fr_write_lock_irqsave(lock, flags) \
> do { local_irq_save(flags); fr_write_lock(lock); } while (0)
> #define fr_write_lock_irq(lock) \
> do { local_irq_disable(); fr_write_lock(lock); } while (0)
> #define fr_write_lock_bh(lock) \
> do { local_bh_disable(); fr_write_lock(lock); } while (0)
>
> #define fr_write_unlock_irqrestore(lock, flags) \
> do { fr_write_unlock(lock); local_irq_restore(flags); } while(0)
> #define fr_write_unlock_irq(lock) \
> do { fr_write_unlock(lock); local_irq_enable(); } while(0)
> #define fr_write_unlock_bh(lock) \
> do { fr_write_unlock(lock); local_bh_enable(); } while(0)
>
> #define fr_read_begin_irqsave(lock, flags) \
> ({ local_irq_save(flags); fr_read_begin(lock); })
>
> #define fr_read_end_irqrestore(lock, flags) \
> ({ unsigned ret = fr_read_end(lock); \
> local_irq_save(flags); \
> ret; \
> })
>
> #endif /* __LINUX_FRLOCK_H */
>
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 2:00 ` Andrew Morton
` (2 preceding siblings ...)
2003-01-30 1:54 ` Andrea Arcangeli
@ 2003-01-30 2:00 ` Richard Henderson
3 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2003-01-30 2:00 UTC (permalink / raw)
To: Andrew Morton
Cc: Andrea Arcangeli, davem, shemminger, rmk, ak, davidm, anton, linux-mm
On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote:
> * Expected reader usage:
> * do {
> * seq = fr_read_begin();
> * ...
> * } while (seq != fr_read_end());
I think perhaps
do {
seq = fr_read_begin(&lock);
...
} while (fr_read_end(&lock, seq))
would be a better interface. This would allow you to change
the implementation as well. E.g.
unsigned fr_read_begin(frlock_t *lock)
{
unsigned s;
do
{
barrier ();
s = lock->sequence;
}
while (s & 1);
rmb();
return s;
}
int fr_read_end(frlock_t *lock, unsigned s)
{
rmb();
return s == lock->sequence;
}
void fr_write_begin(frlock_t *rw)
{
rw->sequence++;
wmb();
}
void fr_write_begin(frlock_t *rw)
{
wmb();
rw->sequence++;
}
which doesn't require as much memory.
r~
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:35 ` Andrea Arcangeli
@ 2003-01-30 2:00 ` Andrew Morton
2003-01-30 1:50 ` Jeff Garzik
` (3 more replies)
0 siblings, 4 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-30 2:00 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth
Andrea Arcangeli <andrea@suse.de> wrote:
>
> On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote:
> > @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc
> >
> > if (ret) {
> > ++rw->pre_sequence;
> > - wmb();
> > + mb();
> > }
>
> this isn't needed
>
>
> if we hold the spinlock, the serialized memory can't be change under us,
> so there's no need to put a read barrier, we only care that pre_sequence
> is visible before the chagnes are visible and before post_sequence is
> visible, hence only wmb() (after spin_lock and pre_sequence++) is
> needed there and only rmb() is needed in the read-side.
>
OK, thanks muchly.
Lots more updates. Here's the version which I currently have. Looks like
fr_write_lock() and fr_write_unlock() need to be switched back to rmb()?
#ifndef __LINUX_FRLOCK_H
#define __LINUX_FRLOCK_H
/*
* Fast read-write spinlocks.
*
* Fast reader/writer locks without starving writers. This type of
* lock for data where the reader wants a consitent set of information
* and is willing to retry if the information changes. Readers never
* block but they may have to retry if a writer is in
* progress. Writers do not wait for readers.
*
* Generalization on sequence variables used for gettimeofday on x86-64
* by Andrea Arcangeli
*
* This is not as cache friendly as brlock. Also, this will not work
* for data that contains pointers, because any writer could
* invalidate a pointer that a reader was following.
*
* Expected reader usage:
* do {
* seq = fr_read_begin();
* ...
* } while (seq != fr_read_end());
*
* On non-SMP the spin locks disappear but the writer still needs
* to increment the sequence variables because an interrupt routine could
* change the state of the data.
*/
#include <linux/config.h>
#include <linux/spinlock.h>
#include <linux/preempt.h>
typedef struct {
unsigned pre_sequence;
unsigned post_sequence;
spinlock_t lock;
} frlock_t;
/*
* These macros triggered gcc-3.x compile-time problems. We think these are
* OK now. Be cautious.
*/
#define FR_LOCK_UNLOCKED { 0, 0, SPIN_LOCK_UNLOCKED }
#define frlock_init(x) do { *(x) = (frlock_t) FR_LOCK_UNLOCKED; } while (0)
/* Update sequence count only
* Assumes caller is doing own mutual exclusion with other lock
* or semaphore.
*/
static inline void fr_write_begin(frlock_t *rw)
{
preempt_disable();
rw->pre_sequence++;
mb();
}
static inline void fr_write_end(frlock_t *rw)
{
mb();
rw->post_sequence++;
BUG_ON(rw->post_sequence != rw->pre_sequence);
preempt_enable();
}
/* Lock out other writers and update the count.
* Acts like a normal spin_lock/unlock.
*/
static inline void fr_write_lock(frlock_t *rw)
{
spin_lock(&rw->lock);
rw->pre_sequence++;
mb();
}
static inline void fr_write_unlock(frlock_t *rw)
{
mb();
rw->post_sequence++;
spin_unlock(&rw->lock);
}
static inline int fr_write_trylock(frlock_t *rw)
{
int ret = spin_trylock(&rw->lock);
if (ret) {
++rw->pre_sequence;
wmb();
}
return ret;
}
static inline unsigned fr_read_begin(const frlock_t *rw)
{
unsigned ret = rw->post_sequence;
rmb();
return ret;
}
/* End of reader calculation -- fetch last writer start token */
static inline unsigned fr_read_end(const frlock_t *rw)
{
rmb();
return rw->pre_sequence;
}
/*
* Possible sw/hw IRQ protected versions of the interfaces.
*/
#define fr_write_lock_irqsave(lock, flags) \
do { local_irq_save(flags); fr_write_lock(lock); } while (0)
#define fr_write_lock_irq(lock) \
do { local_irq_disable(); fr_write_lock(lock); } while (0)
#define fr_write_lock_bh(lock) \
do { local_bh_disable(); fr_write_lock(lock); } while (0)
#define fr_write_unlock_irqrestore(lock, flags) \
do { fr_write_unlock(lock); local_irq_restore(flags); } while(0)
#define fr_write_unlock_irq(lock) \
do { fr_write_unlock(lock); local_irq_enable(); } while(0)
#define fr_write_unlock_bh(lock) \
do { fr_write_unlock(lock); local_bh_enable(); } while(0)
#define fr_read_begin_irqsave(lock, flags) \
({ local_irq_save(flags); fr_read_begin(lock); })
#define fr_read_end_irqrestore(lock, flags) \
({ unsigned ret = fr_read_end(lock); \
local_irq_save(flags); \
ret; \
})
#endif /* __LINUX_FRLOCK_H */
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:24 ` Andi Kleen
@ 2003-01-30 2:01 ` Andrew Morton
0 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-30 2:01 UTC (permalink / raw)
To: Andi Kleen; +Cc: davem, shemminger, rmk, davidm, anton, linux-mm, andrea
Andi Kleen <ak@muc.de> wrote:
>
> > This is an optimisation to the ia64, ia32 and x86_64 do_gettimeofday() code
> > above and beyond the base frlock work.
>
> Hmm? No x86_64 changes in there.
>
> > arch/i386/kernel/time.c | 7 +++----
> > arch/i386/kernel/timers/timer_pit.c | 7 +++----
> > arch/ia64/kernel/time.c | 5 ++---
> > include/linux/frlock.h | 8 ++++++--
>
True. Hmm. Maybe Stephen made some x86_64 changes, but didn't send them?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:50 ` Jeff Garzik
@ 2003-01-30 2:03 ` Andrea Arcangeli
2003-01-30 17:09 ` Stephen Hemminger
0 siblings, 1 reply; 36+ messages in thread
From: Andrea Arcangeli @ 2003-01-30 2:03 UTC (permalink / raw)
To: Jeff Garzik
Cc: Andrew Morton, davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth
On Wed, Jan 29, 2003 at 08:50:18PM -0500, Jeff Garzik wrote:
> Andrew Morton wrote:
> >#ifndef __LINUX_FRLOCK_H
> >#define __LINUX_FRLOCK_H
> >
> >/*
> > * Fast read-write spinlocks.
>
>
> Can we please pick a unique name whose meaning will not change over
> time? Even "andre[wa]_rw_lock" would be better, because its meaning is
I was the first to use it in practice in linux for gettimeofday in the
x86-64 port, but it should be called "keith_owens_lock" if really you
like to use the inventor name. I was trying to find this kind of
readonly read-side lock while designing the vgettimeofday the first time
it was discussed ever on l-k a few years back (way before it seen the
light of the day), and during such discussion Keith shown me the light,
so it wouldn't be fair to take credit for it ;). If you search l-k
probably in y2k or near you should find that email from Keith in answer
to me.
> not tied to performance at this specific point in time, on today's ia32
> flavor-of-the-month.
>
> If we discover even-yet-faster read-write spinlocks tomorrow, this name
> is going to become a joke :)
it would become historical, like so many other things. Actually when
they say frlock I don't even think at fast read lock, I think at frlock
as a specific new name, so personally I'm fine either ways. I don't
dislike it, it's not worse than the big reader lock name that should be
replaced by RCU at large btw.
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:52 ` Richard Henderson
@ 2003-01-30 2:06 ` Andrea Arcangeli
0 siblings, 0 replies; 36+ messages in thread
From: Andrea Arcangeli @ 2003-01-30 2:06 UTC (permalink / raw)
To: Richard Henderson
Cc: Andrew Morton, davem, shemminger, rmk, ak, davidm, anton, linux-mm
On Wed, Jan 29, 2003 at 05:52:32PM -0800, Richard Henderson wrote:
> On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote:
> > Andrea Arcangeli <andrea@suse.de> wrote:
> > > if we hold the spinlock, the serialized memory can't be change under us,
> > > so there's no need to put a read barrier, we only care that pre_sequence
> > > is visible before the chagnes are visible and before post_sequence is
> > > visible, hence only wmb() (after spin_lock and pre_sequence++) is
> > > needed there and only rmb() is needed in the read-side.
>
> Hmm. Perhaps I was confused about how these things are intended
> to be used. If indeed the writer doesn't care about the order
> in which pre/post_sequence are accessed, then wmb is sufficient
> to keep their updates ordered.
yep, IMHO it should be enough.
btw, I'm speaking only about the fr_write_lock/unlock, not sure what the
write_begin/end are meant for, there are no fr_write_begin/end in the
patch I was reading (in 2.4). If fr_write_begin/end have to be retained
because they can do something useful, and they don't take the spinlock,
it could be needed there to use mb(), I just have no idea of where
write_begin/end could be useful so it's hard to say.
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:54 ` Andrea Arcangeli
@ 2003-01-30 2:19 ` Andrew Morton
2003-01-30 17:37 ` Stephen Hemminger
1 sibling, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-30 2:19 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth
Andrea Arcangeli <andrea@suse.de> wrote:
>
> On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote:
> > Andrea Arcangeli <andrea@suse.de> wrote:
> > >
> > > On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote:
> > > > @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc
> > > >
> > > > if (ret) {
> > > > ++rw->pre_sequence;
> > > > - wmb();
> > > > + mb();
> > > > }
> > >
> > > this isn't needed
> > >
> > >
> > > if we hold the spinlock, the serialized memory can't be change under us,
> > > so there's no need to put a read barrier, we only care that pre_sequence
> > > is visible before the chagnes are visible and before post_sequence is
> > > visible, hence only wmb() (after spin_lock and pre_sequence++) is
> > > needed there and only rmb() is needed in the read-side.
> > >
> >
> > OK, thanks muchly.
> >
> > Lots more updates. Here's the version which I currently have. Looks like
> > fr_write_lock() and fr_write_unlock() need to be switched back to rmb()?
>
> you certainly mean wmb() not rmb(), right? If yes, then yes.
Yup.
> I actually didn't notice the write_begin/end, not sure who could need
> them, I would suggest removing them, rather than to revert the mb()
> there too.
The intent here was to use them for i_size updates. In situations where
writer serialisation was provided by external means (i_sem), and the spinlock
is not needed.
It's causing confusion so yeah, I'll probably pull them out.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 2:03 ` Andrea Arcangeli
@ 2003-01-30 17:09 ` Stephen Hemminger
2003-01-30 17:15 ` Randy.Dunlap
0 siblings, 1 reply; 36+ messages in thread
From: Stephen Hemminger @ 2003-01-30 17:09 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Jeff Garzik, Andrew Morton, David Miller, rmk, ak, davidm, anton,
linux-mm, rth
> > not tied to performance at this specific point in time, on today's ia32
> > flavor-of-the-month.
> >
> > If we discover even-yet-faster read-write spinlocks tomorrow, this name
> > is going to become a joke :)
>
> it would become historical, like so many other things. Actually when
> they say frlock I don't even think at fast read lock, I think at frlock
> as a specific new name, so personally I'm fine either ways. I don't
> dislike it, it's not worse than the big reader lock name that should be
> replaced by RCU at large btw.
Don't read too much into the name. It was just a 30 second effort.
Just didn't want a name like:
ThingToDoReadConsitentDataUsingSequenceNumbers
So if there is a standard or better name in a reasonable length,
then let's change it. Marketing always changes the name of everything
prior to release anyway ;-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 17:09 ` Stephen Hemminger
@ 2003-01-30 17:15 ` Randy.Dunlap
2003-01-30 17:25 ` Andi Kleen
0 siblings, 1 reply; 36+ messages in thread
From: Randy.Dunlap @ 2003-01-30 17:15 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Andrea Arcangeli, Jeff Garzik, Andrew Morton, David Miller, rmk,
ak, davidm, anton, linux-mm, rth
On 30 Jan 2003, Stephen Hemminger wrote:
| > > not tied to performance at this specific point in time, on today's ia32
| > > flavor-of-the-month.
| > >
| > > If we discover even-yet-faster read-write spinlocks tomorrow, this name
| > > is going to become a joke :)
| >
| > it would become historical, like so many other things. Actually when
| > they say frlock I don't even think at fast read lock, I think at frlock
| > as a specific new name, so personally I'm fine either ways. I don't
| > dislike it, it's not worse than the big reader lock name that should be
| > replaced by RCU at large btw.
|
| Don't read too much into the name. It was just a 30 second effort.
| Just didn't want a name like:
| ThingToDoReadConsitentDataUsingSequenceNumbers
|
| So if there is a standard or better name in a reasonable length,
| then let's change it. Marketing always changes the name of everything
| prior to release anyway ;-)
You can follow Andrea's suggestion and call it a kaos_lock
(for Keith Owens).
--
~Randy
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 17:25 ` Andi Kleen
@ 2003-01-30 17:23 ` Randy.Dunlap
0 siblings, 0 replies; 36+ messages in thread
From: Randy.Dunlap @ 2003-01-30 17:23 UTC (permalink / raw)
To: Andi Kleen
Cc: Stephen Hemminger, Andrea Arcangeli, Jeff Garzik, Andrew Morton,
David Miller, rmk, davidm, anton, linux-mm, rth
On Thu, 30 Jan 2003, Andi Kleen wrote:
| > You can follow Andrea's suggestion and call it a kaos_lock
| > (for Keith Owens).
|
| How about just seq_lock (sequence lock) ?
Hey, that makes some sense...
--
~Randy
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 17:15 ` Randy.Dunlap
@ 2003-01-30 17:25 ` Andi Kleen
2003-01-30 17:23 ` Randy.Dunlap
0 siblings, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2003-01-30 17:25 UTC (permalink / raw)
To: Randy.Dunlap
Cc: Stephen Hemminger, Andrea Arcangeli, Jeff Garzik, Andrew Morton,
David Miller, rmk, ak, davidm, anton, linux-mm, rth
> You can follow Andrea's suggestion and call it a kaos_lock
> (for Keith Owens).
How about just seq_lock (sequence lock) ?
-Andi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 1:54 ` Andrea Arcangeli
2003-01-30 2:19 ` Andrew Morton
@ 2003-01-30 17:37 ` Stephen Hemminger
2003-01-30 17:50 ` Andrea Arcangeli
1 sibling, 1 reply; 36+ messages in thread
From: Stephen Hemminger @ 2003-01-30 17:37 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Andrew Morton, David Miller, rmk, ak, davidm, anton, linux-mm, rth
> you certainly mean wmb() not rmb(), right? If yes, then yes.
>
> I actually didn't notice the write_begin/end, not sure who could need
> them, I would suggest removing them, rather than to revert the mb()
> there too.
The write_begin/end was suggested by Andrew as a simplification for use
when using this to update values already write-locked by other means.
One possible usage was to fix the race issues with non-atomic update
of 64 bit i_size.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup
2003-01-30 17:37 ` Stephen Hemminger
@ 2003-01-30 17:50 ` Andrea Arcangeli
0 siblings, 0 replies; 36+ messages in thread
From: Andrea Arcangeli @ 2003-01-30 17:50 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Andrew Morton, David Miller, rmk, ak, davidm, anton, linux-mm, rth
On Thu, Jan 30, 2003 at 09:37:06AM -0800, Stephen Hemminger wrote:
>
> > you certainly mean wmb() not rmb(), right? If yes, then yes.
> >
> > I actually didn't notice the write_begin/end, not sure who could need
> > them, I would suggest removing them, rather than to revert the mb()
> > there too.
>
> The write_begin/end was suggested by Andrew as a simplification for use
> when using this to update values already write-locked by other means.
>
> One possible usage was to fix the race issues with non-atomic update
> of 64 bit i_size.
It looks overdesign to me, you don't need the spinlock for that, the
i_sem is explicit too. The generalized abstraction is worthwhile when
you have to use it 99% of the time, the missing 1% doesn't need to be
abstracted, forward porting my implementation is the best for such
specific case IMHO.
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2003-01-30 17:50 UTC | newest]
Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-01-29 6:07 Linus rollup Andrew Morton
2003-01-29 6:53 ` David Mosberger
2003-01-29 7:25 ` David S. Miller
2003-01-29 9:33 ` Andrew Morton
2003-01-29 9:35 ` David S. Miller
2003-01-29 9:54 ` Anton Blanchard
2003-01-29 9:59 ` Russell King
2003-01-29 9:51 ` David S. Miller
2003-01-29 10:26 ` Andrew Morton
2003-01-29 22:35 ` Stephen Hemminger
2003-01-29 23:12 ` Andrew Morton
2003-01-30 0:30 ` David S. Miller
2003-01-30 1:27 ` Andrew Morton
2003-01-30 1:24 ` Andi Kleen
2003-01-30 2:01 ` Andrew Morton
2003-01-30 1:35 ` Andrea Arcangeli
2003-01-30 2:00 ` Andrew Morton
2003-01-30 1:50 ` Jeff Garzik
2003-01-30 2:03 ` Andrea Arcangeli
2003-01-30 17:09 ` Stephen Hemminger
2003-01-30 17:15 ` Randy.Dunlap
2003-01-30 17:25 ` Andi Kleen
2003-01-30 17:23 ` Randy.Dunlap
2003-01-30 1:52 ` Richard Henderson
2003-01-30 2:06 ` Andrea Arcangeli
2003-01-30 1:54 ` Andrea Arcangeli
2003-01-30 2:19 ` Andrew Morton
2003-01-30 17:37 ` Stephen Hemminger
2003-01-30 17:50 ` Andrea Arcangeli
2003-01-30 2:00 ` Richard Henderson
2003-01-30 0:46 ` Andrea Arcangeli
2003-01-30 0:43 ` Andrea Arcangeli
2003-01-29 10:16 ` Andrew Morton
2003-01-29 17:04 ` David Mosberger
2003-01-29 17:25 ` Russell King
2003-01-29 19:05 ` David Mosberger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox