* Linus rollup
@ 2003-01-29 6:07 Andrew Morton
2003-01-29 6:53 ` David Mosberger
` (4 more replies)
0 siblings, 5 replies; 36+ messages in thread
From: Andrew Morton @ 2003-01-29 6:07 UTC (permalink / raw)
To: Russell King, Andi Kleen, David S. Miller, David Mosberger,
Anton Blanchard
Cc: linux-mm
Gents,
I've sifted out all the things which I intend to send to the boss soon. It
would be good if you could perform some quick non-ia32 testing please.
Possible breakage would be in the new frlock-for-xtime_lock code and the
get_order() cleanup.
The frlock code is showing nice speedups, but I think the main reason we want
this is to fix the problem wherein an application spinning on gettimeofday()
can make time stop.
It's all at
http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-lt1/
Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: Linus rollup 2003-01-29 6:07 Linus rollup Andrew Morton @ 2003-01-29 6:53 ` David Mosberger 2003-01-29 7:25 ` David S. Miller ` (3 subsequent siblings) 4 siblings, 0 replies; 36+ messages in thread From: David Mosberger @ 2003-01-29 6:53 UTC (permalink / raw) To: Andrew Morton Cc: Russell King, Andi Kleen, David S. Miller, David Mosberger, Anton Blanchard, linux-mm The patch looks good to me. I applied it on top of 2.5.59+ia64 and it checked out fine on the Ski simulator and a dual-Itanium2 zx6000 workstation. Thanks, --david >>>>> On Tue, 28 Jan 2003 22:07:29 -0800, Andrew Morton <akpm@digeo.com> said: Andrew> Gents, Andrew> I've sifted out all the things which I intend to send to the Andrew> boss soon. It would be good if you could perform some quick Andrew> non-ia32 testing please. Andrew> Possible breakage would be in the new frlock-for-xtime_lock Andrew> code and the get_order() cleanup. Andrew> The frlock code is showing nice speedups, but I think the Andrew> main reason we want this is to fix the problem wherein an Andrew> application spinning on gettimeofday() can make time stop. Andrew> It's all at Andrew> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-lt1/ Andrew> Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 6:07 Linus rollup Andrew Morton 2003-01-29 6:53 ` David Mosberger @ 2003-01-29 7:25 ` David S. Miller 2003-01-29 9:33 ` Andrew Morton ` (2 subsequent siblings) 4 siblings, 0 replies; 36+ messages in thread From: David S. Miller @ 2003-01-29 7:25 UTC (permalink / raw) To: akpm; +Cc: rmk, ak, davidm, anton, linux-mm I've sifted out all the things which I intend to send to the boss soon. It would be good if you could perform some quick non-ia32 testing please. Possible breakage would be in the new frlock-for-xtime_lock code and the get_order() cleanup. It all looks fine, I didn't sanity build/boot it on sparc but if there are any problems they will be minor and I'll just fix it up when Linus grabs your stuff. BTW, that: do { seq = fr_read_begin(...); sec = foo; usec = bar; } while (seq != fr_read_end(...)) loop is duplicated nearly identically perhaps 10 to 15 times, would be nice to put it in one spot if possible. :-) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 6:07 Linus rollup Andrew Morton 2003-01-29 6:53 ` David Mosberger 2003-01-29 7:25 ` David S. Miller @ 2003-01-29 9:33 ` Andrew Morton 2003-01-29 9:35 ` David S. Miller 2003-01-29 9:54 ` Anton Blanchard 2003-01-29 9:59 ` Russell King 4 siblings, 1 reply; 36+ messages in thread From: Andrew Morton @ 2003-01-29 9:33 UTC (permalink / raw) To: rmk, ak, davem, davidm, anton, linux-mm Andrew Morton <akpm@digeo.com> wrote: > > Gents, > > I've sifted out all the things which I intend to send to the boss soon Forgot to mention: - sys_semtimedop() is only wired up for ia32/ia64 at present. - This rollup contains the new sys_fadvise(), so unless Linus bounces my first syscall, that will need hooking up for non-ia32 too. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 9:33 ` Andrew Morton @ 2003-01-29 9:35 ` David S. Miller 0 siblings, 0 replies; 36+ messages in thread From: David S. Miller @ 2003-01-29 9:35 UTC (permalink / raw) To: akpm; +Cc: rmk, ak, davidm, anton, linux-mm Forgot to mention: I wish there was some way people could easily add an #error to the build so that arch's know which syscalls need to be added next time someone tries to run make for that platform. Nothing comes immediately to mind as an idea however. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 6:07 Linus rollup Andrew Morton ` (2 preceding siblings ...) 2003-01-29 9:33 ` Andrew Morton @ 2003-01-29 9:54 ` Anton Blanchard 2003-01-29 9:59 ` Russell King 4 siblings, 0 replies; 36+ messages in thread From: Anton Blanchard @ 2003-01-29 9:54 UTC (permalink / raw) To: Andrew Morton Cc: Russell King, Andi Kleen, David S. Miller, David Mosberger, linux-mm > I've sifted out all the things which I intend to send to the boss soon. It > would be good if you could perform some quick non-ia32 testing please. > > Possible breakage would be in the new frlock-for-xtime_lock code and the > get_order() cleanup. > > The frlock code is showing nice speedups, but I think the main reason we want > this is to fix the problem wherein an application spinning on gettimeofday() > can make time stop. Checks out OK on ppc64 bar some problems with the get_order patch. (should include linux/bitops.h instead of asm/bitops.h). It passed some stress tests (sdet, tpc-h) Anton -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 6:07 Linus rollup Andrew Morton ` (3 preceding siblings ...) 2003-01-29 9:54 ` Anton Blanchard @ 2003-01-29 9:59 ` Russell King 2003-01-29 9:51 ` David S. Miller ` (2 more replies) 4 siblings, 3 replies; 36+ messages in thread From: Russell King @ 2003-01-29 9:59 UTC (permalink / raw) To: Andrew Morton Cc: Andi Kleen, David S. Miller, David Mosberger, Anton Blanchard, linux-mm On Tue, Jan 28, 2003 at 10:07:29PM -0800, Andrew Morton wrote: > Possible breakage would be in the new frlock-for-xtime_lock code and the > get_order() cleanup. > > The frlock code is showing nice speedups, but I think the main reason we want > this is to fix the problem wherein an application spinning on gettimeofday() > can make time stop. I'm slightly concerned about this. With this patch, we generally seem to do: do { seq = fr_read_begin(&xtime_lock); *tv = xtime; tv->tv_usec += do_fast_gettimeoffset(); tv->tv_usec += lost_ticks; } while (seq != fr_read_end(&xtime_lock)); This is fine when considering xtime. However, considering the implementation of do_fast_gettimeoffset(). Notice that linux/arch/i386/kernel/timers/timer_pit.c:do_offset_pit specifically makes the comment: /* This function must be called with interrupts disabled which hasn't been true for some time, and is even less true now that local IRQs don't get disabled. Does this matter... for UP? The same is true for other architectures; their gettimeoffset implementations need to be audited by the architecture maintainers to ensure that they are safe to run with (local) interrupts enabled. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 9:59 ` Russell King @ 2003-01-29 9:51 ` David S. Miller 2003-01-29 10:26 ` Andrew Morton 2003-01-29 10:16 ` Andrew Morton 2003-01-29 17:04 ` David Mosberger 2 siblings, 1 reply; 36+ messages in thread From: David S. Miller @ 2003-01-29 9:51 UTC (permalink / raw) To: rmk; +Cc: akpm, ak, davidm, anton, linux-mm /* This function must be called with interrupts disabled which hasn't been true for some time, and is even less true now that local IRQs don't get disabled. Does this matter... for UP? I disable local IRQs during gettimeofday() on sparc. These locks definitely need to be taken with IRQs disabled. Why isn't x86 doing that? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 9:51 ` David S. Miller @ 2003-01-29 10:26 ` Andrew Morton 2003-01-29 22:35 ` Stephen Hemminger 2003-01-30 0:43 ` Andrea Arcangeli 0 siblings, 2 replies; 36+ messages in thread From: Andrew Morton @ 2003-01-29 10:26 UTC (permalink / raw) To: David S. Miller; +Cc: rmk, ak, davidm, anton, linux-mm, Andrea Arcangeli "David S. Miller" <davem@redhat.com> wrote: > > From: Russell King <rmk@arm.linux.org.uk> > Date: Wed, 29 Jan 2003 09:59:49 +0000 > > /* This function must be called with interrupts disabled > > which hasn't been true for some time, and is even less true now that > local IRQs don't get disabled. Does this matter... for UP? > > I disable local IRQs during gettimeofday() on sparc. > > These locks definitely need to be taken with IRQs disabled. > Why isn't x86 doing that? Darned if I know. Looks like Andrea's kernel will deadlock if arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock while that cpu is holding the same lock in do_slow_gettimeoffset(). -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 10:26 ` Andrew Morton @ 2003-01-29 22:35 ` Stephen Hemminger 2003-01-29 23:12 ` Andrew Morton 2003-01-30 0:46 ` Andrea Arcangeli 2003-01-30 0:43 ` Andrea Arcangeli 1 sibling, 2 replies; 36+ messages in thread From: Stephen Hemminger @ 2003-01-29 22:35 UTC (permalink / raw) To: Andrew Morton Cc: David S. Miller, rmk, ak, davidm, anton, linux-mm, Andrea Arcangeli On Wed, 2003-01-29 at 02:26, Andrew Morton wrote: > "David S. Miller" <davem@redhat.com> wrote: > > > > From: Russell King <rmk@arm.linux.org.uk> > > Date: Wed, 29 Jan 2003 09:59:49 +0000 > > > > /* This function must be called with interrupts disabled > > > > which hasn't been true for some time, and is even less true now that > > local IRQs don't get disabled. Does this matter... for UP? > > > > I disable local IRQs during gettimeofday() on sparc. > > > > These locks definitely need to be taken with IRQs disabled. > > Why isn't x86 doing that? > > Darned if I know. Looks like Andrea's kernel will deadlock if > arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock > while that cpu is holding the same lock in do_slow_gettimeoffset(). Rather than disabling interrupts in the i386 do_gettimeofday why not just change spin_lock(&i8253_lock) to spin_lock_irqsave in timer_pit.c -- Stephen Hemminger <shemminger@osdl.org> Open Source Devlopment Lab -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 22:35 ` Stephen Hemminger @ 2003-01-29 23:12 ` Andrew Morton 2003-01-30 0:30 ` David S. Miller 2003-01-30 0:46 ` Andrea Arcangeli 1 sibling, 1 reply; 36+ messages in thread From: Andrew Morton @ 2003-01-29 23:12 UTC (permalink / raw) To: Stephen Hemminger; +Cc: davem, rmk, ak, davidm, anton, linux-mm, andrea Stephen Hemminger <shemminger@osdl.org> wrote: > > On Wed, 2003-01-29 at 02:26, Andrew Morton wrote: > > "David S. Miller" <davem@redhat.com> wrote: > > > > > > From: Russell King <rmk@arm.linux.org.uk> > > > Date: Wed, 29 Jan 2003 09:59:49 +0000 > > > > > > /* This function must be called with interrupts disabled > > > > > > which hasn't been true for some time, and is even less true now that > > > local IRQs don't get disabled. Does this matter... for UP? > > > > > > I disable local IRQs during gettimeofday() on sparc. > > > > > > These locks definitely need to be taken with IRQs disabled. > > > Why isn't x86 doing that? > > > > Darned if I know. Looks like Andrea's kernel will deadlock if > > arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock > > while that cpu is holding the same lock in do_slow_gettimeoffset(). > > Rather than disabling interrupts in the i386 do_gettimeofday > why not just change spin_lock(&i8253_lock) to spin_lock_irqsave > in timer_pit.c That's probably a legitimate fix, subject to audit of the other implementations of ->get_offset(). But that would be a separate patch. _all_ we are doing here is fixing and optimising the xtime_lock problems. We should seek to do that with "equivalent transformations". Fine-tuning the ia32 timer implementation is all well and good, but should not be kerfuddled with kernel-wide xtime_lock rework. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 23:12 ` Andrew Morton @ 2003-01-30 0:30 ` David S. Miller 2003-01-30 1:27 ` Andrew Morton 0 siblings, 1 reply; 36+ messages in thread From: David S. Miller @ 2003-01-30 0:30 UTC (permalink / raw) To: akpm; +Cc: shemminger, rmk, ak, davidm, anton, linux-mm, andrea But that would be a separate patch. _all_ we are doing here is fixing and optimising the xtime_lock problems. We should seek to do that with "equivalent transformations". I agree, do this and arch people can tweak later. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 0:30 ` David S. Miller @ 2003-01-30 1:27 ` Andrew Morton 2003-01-30 1:24 ` Andi Kleen 2003-01-30 1:35 ` Andrea Arcangeli 0 siblings, 2 replies; 36+ messages in thread From: Andrew Morton @ 2003-01-30 1:27 UTC (permalink / raw) To: David S. Miller; +Cc: shemminger, rmk, ak, davidm, anton, linux-mm, andrea "David S. Miller" <davem@redhat.com> wrote: > > From: Andrew Morton <akpm@digeo.com> > Date: Wed, 29 Jan 2003 15:12:06 -0800 > > But that would be a separate patch. _all_ we are doing here is fixing and > optimising the xtime_lock problems. We should seek to do that with > "equivalent transformations". > > I agree, do this and arch people can tweak later. Stephen has already tweaked. Hopefully Andi and David can review this sometime: This is an optimisation to the ia64, ia32 and x86_64 do_gettimeofday() code above and beyond the base frlock work. Patch from Stephen Hemminger <shemminger@osdl.org> * Don't need to disable interrupts on ia64, x86_64 or ia32 with TSC getttimeofday. Disabling interrupts is noticeably slower ~3% so would really like to only do it if necessary. * Some more cleanup of frlock.h in the macro's that aren't being used yet. arch/i386/kernel/time.c | 7 +++---- arch/i386/kernel/timers/timer_pit.c | 7 +++---- arch/ia64/kernel/time.c | 5 ++--- include/linux/frlock.h | 8 ++++++-- i386/kernel/apm.c | 0 5 files changed, 14 insertions(+), 13 deletions(-) diff -puN arch/i386/kernel/apm.c~do_gettimeofday-speedup arch/i386/kernel/apm.c diff -puN arch/i386/kernel/time.c~do_gettimeofday-speedup arch/i386/kernel/time.c --- 25/arch/i386/kernel/time.c~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003 +++ 25-akpm/arch/i386/kernel/time.c Wed Jan 29 16:39:41 2003 @@ -86,22 +86,21 @@ struct timer_opts* timer = &timer_none; */ void do_gettimeofday(struct timeval *tv) { - unsigned long flags; unsigned long seq; unsigned long usec, sec; do { - seq = fr_read_begin_irqsave(&xtime_lock, flags); + seq = fr_read_begin(&xtime_lock); usec = timer->get_offset(); { unsigned long lost = jiffies - wall_jiffies; - if (lost) + if (unlikely(lost != 0)) usec += lost * (1000000 / HZ); } sec = xtime.tv_sec; usec += (xtime.tv_nsec / 1000); - } while (unlikely(seq != fr_read_end_irqrestore(&xtime_lock, flags))); + } while (unlikely(seq != fr_read_end(&xtime_lock))); while (usec >= 1000000) { usec -= 1000000; diff -puN arch/i386/kernel/timers/timer_pit.c~do_gettimeofday-speedup arch/i386/kernel/timers/timer_pit.c --- 25/arch/i386/kernel/timers/timer_pit.c~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003 +++ 25-akpm/arch/i386/kernel/timers/timer_pit.c Wed Jan 29 16:39:41 2003 @@ -76,7 +76,7 @@ static void delay_pit(unsigned long loop static unsigned long get_offset_pit(void) { int count; - + unsigned long flags; static int count_p = LATCH; /* for the first call after boot */ static unsigned long jiffies_p = 0; @@ -85,8 +85,7 @@ static unsigned long get_offset_pit(void */ unsigned long jiffies_t; - /* gets recalled with irq locally disabled */ - spin_lock(&i8253_lock); + spin_lock_irqsave(&i8253_lock, flags); /* timer count may underflow right here */ outb_p(0x00, 0x43); /* latch the count ASAP */ @@ -108,7 +107,7 @@ static unsigned long get_offset_pit(void count = LATCH - 1; } - spin_unlock(&i8253_lock); + spin_unlock_irqrestore(&i8253_lock, flags); /* * avoiding timer inconsistencies (they are rare, but they happen)... diff -puN arch/ia64/kernel/time.c~do_gettimeofday-speedup arch/ia64/kernel/time.c --- 25/arch/ia64/kernel/time.c~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003 +++ 25-akpm/arch/ia64/kernel/time.c Wed Jan 29 16:39:41 2003 @@ -117,11 +117,10 @@ do_settimeofday (struct timeval *tv) void do_gettimeofday (struct timeval *tv) { - unsigned long flags; unsigned long seq, usec, sec, old; do { - seq = fr_read_begin_irqsave(&xtime_lock, flags); + seq = fr_read_begin(&xtime_lock); usec = gettimeoffset(); /* @@ -138,7 +137,7 @@ do_gettimeofday (struct timeval *tv) sec = xtime.tv_sec; usec += xtime.tv_nsec / 1000; - } while (seq != fr_read_end_irqrestore(&xtime_lock, flags)); + } while (seq != fr_read_end(&xtime_lock)); while (usec >= 1000000) { diff -puN include/linux/frlock.h~do_gettimeofday-speedup include/linux/frlock.h --- 25/include/linux/frlock.h~do_gettimeofday-speedup Wed Jan 29 16:39:41 2003 +++ 25-akpm/include/linux/frlock.h Wed Jan 29 16:42:15 2003 @@ -64,6 +64,9 @@ static inline void fr_write_end(frlock_t preempt_enable(); } +/* Lock out other writers and update the count. + * Acts like a normal spin_lock/unlock. + */ static inline void fr_write_lock(frlock_t *rw) { spin_lock(&rw->lock); @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc if (ret) { ++rw->pre_sequence; - wmb(); + mb(); } return ret; } +/* Start of read calculation -- fetch last complete writer token */ static inline unsigned fr_read_begin(const frlock_t *rw) { unsigned ret = rw->post_sequence; @@ -95,6 +99,7 @@ static inline unsigned fr_read_begin(con } +/* End of reader calculation -- fetch last writer start token */ static inline unsigned fr_read_end(const frlock_t *rw) { rmb(); @@ -128,4 +133,3 @@ static inline unsigned fr_read_end(const }) #endif /* __LINUX_FRLOCK_H */ - _ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:27 ` Andrew Morton @ 2003-01-30 1:24 ` Andi Kleen 2003-01-30 2:01 ` Andrew Morton 2003-01-30 1:35 ` Andrea Arcangeli 1 sibling, 1 reply; 36+ messages in thread From: Andi Kleen @ 2003-01-30 1:24 UTC (permalink / raw) To: Andrew Morton Cc: David S. Miller, shemminger, rmk, ak, davidm, anton, linux-mm, andrea > This is an optimisation to the ia64, ia32 and x86_64 do_gettimeofday() code > above and beyond the base frlock work. Hmm? No x86_64 changes in there. > arch/i386/kernel/time.c | 7 +++---- > arch/i386/kernel/timers/timer_pit.c | 7 +++---- > arch/ia64/kernel/time.c | 5 ++--- > include/linux/frlock.h | 8 ++++++-- -Andi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:24 ` Andi Kleen @ 2003-01-30 2:01 ` Andrew Morton 0 siblings, 0 replies; 36+ messages in thread From: Andrew Morton @ 2003-01-30 2:01 UTC (permalink / raw) To: Andi Kleen; +Cc: davem, shemminger, rmk, davidm, anton, linux-mm, andrea Andi Kleen <ak@muc.de> wrote: > > > This is an optimisation to the ia64, ia32 and x86_64 do_gettimeofday() code > > above and beyond the base frlock work. > > Hmm? No x86_64 changes in there. > > > arch/i386/kernel/time.c | 7 +++---- > > arch/i386/kernel/timers/timer_pit.c | 7 +++---- > > arch/ia64/kernel/time.c | 5 ++--- > > include/linux/frlock.h | 8 ++++++-- > True. Hmm. Maybe Stephen made some x86_64 changes, but didn't send them? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:27 ` Andrew Morton 2003-01-30 1:24 ` Andi Kleen @ 2003-01-30 1:35 ` Andrea Arcangeli 2003-01-30 2:00 ` Andrew Morton 1 sibling, 1 reply; 36+ messages in thread From: Andrea Arcangeli @ 2003-01-30 1:35 UTC (permalink / raw) To: Andrew Morton Cc: David S. Miller, shemminger, rmk, ak, davidm, anton, linux-mm On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote: > @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc > > if (ret) { > ++rw->pre_sequence; > - wmb(); > + mb(); > } this isn't needed if we hold the spinlock, the serialized memory can't be change under us, so there's no need to put a read barrier, we only care that pre_sequence is visible before the chagnes are visible and before post_sequence is visible, hence only wmb() (after spin_lock and pre_sequence++) is needed there and only rmb() is needed in the read-side. Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:35 ` Andrea Arcangeli @ 2003-01-30 2:00 ` Andrew Morton 2003-01-30 1:50 ` Jeff Garzik ` (3 more replies) 0 siblings, 4 replies; 36+ messages in thread From: Andrew Morton @ 2003-01-30 2:00 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth Andrea Arcangeli <andrea@suse.de> wrote: > > On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote: > > @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc > > > > if (ret) { > > ++rw->pre_sequence; > > - wmb(); > > + mb(); > > } > > this isn't needed > > > if we hold the spinlock, the serialized memory can't be change under us, > so there's no need to put a read barrier, we only care that pre_sequence > is visible before the chagnes are visible and before post_sequence is > visible, hence only wmb() (after spin_lock and pre_sequence++) is > needed there and only rmb() is needed in the read-side. > OK, thanks muchly. Lots more updates. Here's the version which I currently have. Looks like fr_write_lock() and fr_write_unlock() need to be switched back to rmb()? #ifndef __LINUX_FRLOCK_H #define __LINUX_FRLOCK_H /* * Fast read-write spinlocks. * * Fast reader/writer locks without starving writers. This type of * lock for data where the reader wants a consitent set of information * and is willing to retry if the information changes. Readers never * block but they may have to retry if a writer is in * progress. Writers do not wait for readers. * * Generalization on sequence variables used for gettimeofday on x86-64 * by Andrea Arcangeli * * This is not as cache friendly as brlock. Also, this will not work * for data that contains pointers, because any writer could * invalidate a pointer that a reader was following. * * Expected reader usage: * do { * seq = fr_read_begin(); * ... * } while (seq != fr_read_end()); * * On non-SMP the spin locks disappear but the writer still needs * to increment the sequence variables because an interrupt routine could * change the state of the data. */ #include <linux/config.h> #include <linux/spinlock.h> #include <linux/preempt.h> typedef struct { unsigned pre_sequence; unsigned post_sequence; spinlock_t lock; } frlock_t; /* * These macros triggered gcc-3.x compile-time problems. We think these are * OK now. Be cautious. */ #define FR_LOCK_UNLOCKED { 0, 0, SPIN_LOCK_UNLOCKED } #define frlock_init(x) do { *(x) = (frlock_t) FR_LOCK_UNLOCKED; } while (0) /* Update sequence count only * Assumes caller is doing own mutual exclusion with other lock * or semaphore. */ static inline void fr_write_begin(frlock_t *rw) { preempt_disable(); rw->pre_sequence++; mb(); } static inline void fr_write_end(frlock_t *rw) { mb(); rw->post_sequence++; BUG_ON(rw->post_sequence != rw->pre_sequence); preempt_enable(); } /* Lock out other writers and update the count. * Acts like a normal spin_lock/unlock. */ static inline void fr_write_lock(frlock_t *rw) { spin_lock(&rw->lock); rw->pre_sequence++; mb(); } static inline void fr_write_unlock(frlock_t *rw) { mb(); rw->post_sequence++; spin_unlock(&rw->lock); } static inline int fr_write_trylock(frlock_t *rw) { int ret = spin_trylock(&rw->lock); if (ret) { ++rw->pre_sequence; wmb(); } return ret; } static inline unsigned fr_read_begin(const frlock_t *rw) { unsigned ret = rw->post_sequence; rmb(); return ret; } /* End of reader calculation -- fetch last writer start token */ static inline unsigned fr_read_end(const frlock_t *rw) { rmb(); return rw->pre_sequence; } /* * Possible sw/hw IRQ protected versions of the interfaces. */ #define fr_write_lock_irqsave(lock, flags) \ do { local_irq_save(flags); fr_write_lock(lock); } while (0) #define fr_write_lock_irq(lock) \ do { local_irq_disable(); fr_write_lock(lock); } while (0) #define fr_write_lock_bh(lock) \ do { local_bh_disable(); fr_write_lock(lock); } while (0) #define fr_write_unlock_irqrestore(lock, flags) \ do { fr_write_unlock(lock); local_irq_restore(flags); } while(0) #define fr_write_unlock_irq(lock) \ do { fr_write_unlock(lock); local_irq_enable(); } while(0) #define fr_write_unlock_bh(lock) \ do { fr_write_unlock(lock); local_bh_enable(); } while(0) #define fr_read_begin_irqsave(lock, flags) \ ({ local_irq_save(flags); fr_read_begin(lock); }) #define fr_read_end_irqrestore(lock, flags) \ ({ unsigned ret = fr_read_end(lock); \ local_irq_save(flags); \ ret; \ }) #endif /* __LINUX_FRLOCK_H */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 2:00 ` Andrew Morton @ 2003-01-30 1:50 ` Jeff Garzik 2003-01-30 2:03 ` Andrea Arcangeli 2003-01-30 1:52 ` Richard Henderson ` (2 subsequent siblings) 3 siblings, 1 reply; 36+ messages in thread From: Jeff Garzik @ 2003-01-30 1:50 UTC (permalink / raw) To: Andrew Morton Cc: Andrea Arcangeli, davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth Andrew Morton wrote: > #ifndef __LINUX_FRLOCK_H > #define __LINUX_FRLOCK_H > > /* > * Fast read-write spinlocks. Can we please pick a unique name whose meaning will not change over time? Even "andre[wa]_rw_lock" would be better, because its meaning is not tied to performance at this specific point in time, on today's ia32 flavor-of-the-month. If we discover even-yet-faster read-write spinlocks tomorrow, this name is going to become a joke :) Jeff -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:50 ` Jeff Garzik @ 2003-01-30 2:03 ` Andrea Arcangeli 2003-01-30 17:09 ` Stephen Hemminger 0 siblings, 1 reply; 36+ messages in thread From: Andrea Arcangeli @ 2003-01-30 2:03 UTC (permalink / raw) To: Jeff Garzik Cc: Andrew Morton, davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth On Wed, Jan 29, 2003 at 08:50:18PM -0500, Jeff Garzik wrote: > Andrew Morton wrote: > >#ifndef __LINUX_FRLOCK_H > >#define __LINUX_FRLOCK_H > > > >/* > > * Fast read-write spinlocks. > > > Can we please pick a unique name whose meaning will not change over > time? Even "andre[wa]_rw_lock" would be better, because its meaning is I was the first to use it in practice in linux for gettimeofday in the x86-64 port, but it should be called "keith_owens_lock" if really you like to use the inventor name. I was trying to find this kind of readonly read-side lock while designing the vgettimeofday the first time it was discussed ever on l-k a few years back (way before it seen the light of the day), and during such discussion Keith shown me the light, so it wouldn't be fair to take credit for it ;). If you search l-k probably in y2k or near you should find that email from Keith in answer to me. > not tied to performance at this specific point in time, on today's ia32 > flavor-of-the-month. > > If we discover even-yet-faster read-write spinlocks tomorrow, this name > is going to become a joke :) it would become historical, like so many other things. Actually when they say frlock I don't even think at fast read lock, I think at frlock as a specific new name, so personally I'm fine either ways. I don't dislike it, it's not worse than the big reader lock name that should be replaced by RCU at large btw. Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 2:03 ` Andrea Arcangeli @ 2003-01-30 17:09 ` Stephen Hemminger 2003-01-30 17:15 ` Randy.Dunlap 0 siblings, 1 reply; 36+ messages in thread From: Stephen Hemminger @ 2003-01-30 17:09 UTC (permalink / raw) To: Andrea Arcangeli Cc: Jeff Garzik, Andrew Morton, David Miller, rmk, ak, davidm, anton, linux-mm, rth > > not tied to performance at this specific point in time, on today's ia32 > > flavor-of-the-month. > > > > If we discover even-yet-faster read-write spinlocks tomorrow, this name > > is going to become a joke :) > > it would become historical, like so many other things. Actually when > they say frlock I don't even think at fast read lock, I think at frlock > as a specific new name, so personally I'm fine either ways. I don't > dislike it, it's not worse than the big reader lock name that should be > replaced by RCU at large btw. Don't read too much into the name. It was just a 30 second effort. Just didn't want a name like: ThingToDoReadConsitentDataUsingSequenceNumbers So if there is a standard or better name in a reasonable length, then let's change it. Marketing always changes the name of everything prior to release anyway ;-) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 17:09 ` Stephen Hemminger @ 2003-01-30 17:15 ` Randy.Dunlap 2003-01-30 17:25 ` Andi Kleen 0 siblings, 1 reply; 36+ messages in thread From: Randy.Dunlap @ 2003-01-30 17:15 UTC (permalink / raw) To: Stephen Hemminger Cc: Andrea Arcangeli, Jeff Garzik, Andrew Morton, David Miller, rmk, ak, davidm, anton, linux-mm, rth On 30 Jan 2003, Stephen Hemminger wrote: | > > not tied to performance at this specific point in time, on today's ia32 | > > flavor-of-the-month. | > > | > > If we discover even-yet-faster read-write spinlocks tomorrow, this name | > > is going to become a joke :) | > | > it would become historical, like so many other things. Actually when | > they say frlock I don't even think at fast read lock, I think at frlock | > as a specific new name, so personally I'm fine either ways. I don't | > dislike it, it's not worse than the big reader lock name that should be | > replaced by RCU at large btw. | | Don't read too much into the name. It was just a 30 second effort. | Just didn't want a name like: | ThingToDoReadConsitentDataUsingSequenceNumbers | | So if there is a standard or better name in a reasonable length, | then let's change it. Marketing always changes the name of everything | prior to release anyway ;-) You can follow Andrea's suggestion and call it a kaos_lock (for Keith Owens). -- ~Randy -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 17:15 ` Randy.Dunlap @ 2003-01-30 17:25 ` Andi Kleen 2003-01-30 17:23 ` Randy.Dunlap 0 siblings, 1 reply; 36+ messages in thread From: Andi Kleen @ 2003-01-30 17:25 UTC (permalink / raw) To: Randy.Dunlap Cc: Stephen Hemminger, Andrea Arcangeli, Jeff Garzik, Andrew Morton, David Miller, rmk, ak, davidm, anton, linux-mm, rth > You can follow Andrea's suggestion and call it a kaos_lock > (for Keith Owens). How about just seq_lock (sequence lock) ? -Andi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 17:25 ` Andi Kleen @ 2003-01-30 17:23 ` Randy.Dunlap 0 siblings, 0 replies; 36+ messages in thread From: Randy.Dunlap @ 2003-01-30 17:23 UTC (permalink / raw) To: Andi Kleen Cc: Stephen Hemminger, Andrea Arcangeli, Jeff Garzik, Andrew Morton, David Miller, rmk, davidm, anton, linux-mm, rth On Thu, 30 Jan 2003, Andi Kleen wrote: | > You can follow Andrea's suggestion and call it a kaos_lock | > (for Keith Owens). | | How about just seq_lock (sequence lock) ? Hey, that makes some sense... -- ~Randy -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 2:00 ` Andrew Morton 2003-01-30 1:50 ` Jeff Garzik @ 2003-01-30 1:52 ` Richard Henderson 2003-01-30 2:06 ` Andrea Arcangeli 2003-01-30 1:54 ` Andrea Arcangeli 2003-01-30 2:00 ` Richard Henderson 3 siblings, 1 reply; 36+ messages in thread From: Richard Henderson @ 2003-01-30 1:52 UTC (permalink / raw) To: Andrew Morton Cc: Andrea Arcangeli, davem, shemminger, rmk, ak, davidm, anton, linux-mm On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote: > Andrea Arcangeli <andrea@suse.de> wrote: > > if we hold the spinlock, the serialized memory can't be change under us, > > so there's no need to put a read barrier, we only care that pre_sequence > > is visible before the chagnes are visible and before post_sequence is > > visible, hence only wmb() (after spin_lock and pre_sequence++) is > > needed there and only rmb() is needed in the read-side. Hmm. Perhaps I was confused about how these things are intended to be used. If indeed the writer doesn't care about the order in which pre/post_sequence are accessed, then wmb is sufficient to keep their updates ordered. r~ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:52 ` Richard Henderson @ 2003-01-30 2:06 ` Andrea Arcangeli 0 siblings, 0 replies; 36+ messages in thread From: Andrea Arcangeli @ 2003-01-30 2:06 UTC (permalink / raw) To: Richard Henderson Cc: Andrew Morton, davem, shemminger, rmk, ak, davidm, anton, linux-mm On Wed, Jan 29, 2003 at 05:52:32PM -0800, Richard Henderson wrote: > On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote: > > Andrea Arcangeli <andrea@suse.de> wrote: > > > if we hold the spinlock, the serialized memory can't be change under us, > > > so there's no need to put a read barrier, we only care that pre_sequence > > > is visible before the chagnes are visible and before post_sequence is > > > visible, hence only wmb() (after spin_lock and pre_sequence++) is > > > needed there and only rmb() is needed in the read-side. > > Hmm. Perhaps I was confused about how these things are intended > to be used. If indeed the writer doesn't care about the order > in which pre/post_sequence are accessed, then wmb is sufficient > to keep their updates ordered. yep, IMHO it should be enough. btw, I'm speaking only about the fr_write_lock/unlock, not sure what the write_begin/end are meant for, there are no fr_write_begin/end in the patch I was reading (in 2.4). If fr_write_begin/end have to be retained because they can do something useful, and they don't take the spinlock, it could be needed there to use mb(), I just have no idea of where write_begin/end could be useful so it's hard to say. Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 2:00 ` Andrew Morton 2003-01-30 1:50 ` Jeff Garzik 2003-01-30 1:52 ` Richard Henderson @ 2003-01-30 1:54 ` Andrea Arcangeli 2003-01-30 2:19 ` Andrew Morton 2003-01-30 17:37 ` Stephen Hemminger 2003-01-30 2:00 ` Richard Henderson 3 siblings, 2 replies; 36+ messages in thread From: Andrea Arcangeli @ 2003-01-30 1:54 UTC (permalink / raw) To: Andrew Morton; +Cc: davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote: > Andrea Arcangeli <andrea@suse.de> wrote: > > > > On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote: > > > @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc > > > > > > if (ret) { > > > ++rw->pre_sequence; > > > - wmb(); > > > + mb(); > > > } > > > > this isn't needed > > > > > > if we hold the spinlock, the serialized memory can't be change under us, > > so there's no need to put a read barrier, we only care that pre_sequence > > is visible before the chagnes are visible and before post_sequence is > > visible, hence only wmb() (after spin_lock and pre_sequence++) is > > needed there and only rmb() is needed in the read-side. > > > > OK, thanks muchly. > > Lots more updates. Here's the version which I currently have. Looks like > fr_write_lock() and fr_write_unlock() need to be switched back to rmb()? you certainly mean wmb() not rmb(), right? If yes, then yes. I actually didn't notice the write_begin/end, not sure who could need them, I would suggest removing them, rather than to revert the mb() there too. > > > > #ifndef __LINUX_FRLOCK_H > #define __LINUX_FRLOCK_H > > /* > * Fast read-write spinlocks. > * > * Fast reader/writer locks without starving writers. This type of > * lock for data where the reader wants a consitent set of information > * and is willing to retry if the information changes. Readers never > * block but they may have to retry if a writer is in > * progress. Writers do not wait for readers. > * > * Generalization on sequence variables used for gettimeofday on x86-64 > * by Andrea Arcangeli > * > * This is not as cache friendly as brlock. Also, this will not work > * for data that contains pointers, because any writer could > * invalidate a pointer that a reader was following. > * > * Expected reader usage: > * do { > * seq = fr_read_begin(); > * ... > * } while (seq != fr_read_end()); > * > * On non-SMP the spin locks disappear but the writer still needs > * to increment the sequence variables because an interrupt routine could > * change the state of the data. > */ > > #include <linux/config.h> > #include <linux/spinlock.h> > #include <linux/preempt.h> > > typedef struct { > unsigned pre_sequence; > unsigned post_sequence; > spinlock_t lock; > } frlock_t; > > /* > * These macros triggered gcc-3.x compile-time problems. We think these are > * OK now. Be cautious. > */ > #define FR_LOCK_UNLOCKED { 0, 0, SPIN_LOCK_UNLOCKED } > #define frlock_init(x) do { *(x) = (frlock_t) FR_LOCK_UNLOCKED; } while (0) > > /* Update sequence count only > * Assumes caller is doing own mutual exclusion with other lock > * or semaphore. > */ > static inline void fr_write_begin(frlock_t *rw) > { > preempt_disable(); > rw->pre_sequence++; > mb(); > } > > static inline void fr_write_end(frlock_t *rw) > { > mb(); > rw->post_sequence++; > BUG_ON(rw->post_sequence != rw->pre_sequence); > preempt_enable(); > } > > /* Lock out other writers and update the count. > * Acts like a normal spin_lock/unlock. > */ > static inline void fr_write_lock(frlock_t *rw) > { > spin_lock(&rw->lock); > rw->pre_sequence++; > mb(); > } > > static inline void fr_write_unlock(frlock_t *rw) > { > mb(); > rw->post_sequence++; > spin_unlock(&rw->lock); > } > > static inline int fr_write_trylock(frlock_t *rw) > { > int ret = spin_trylock(&rw->lock); > > if (ret) { > ++rw->pre_sequence; > wmb(); > } > return ret; > } > > static inline unsigned fr_read_begin(const frlock_t *rw) > { > unsigned ret = rw->post_sequence; > rmb(); > return ret; > > } > > /* End of reader calculation -- fetch last writer start token */ > static inline unsigned fr_read_end(const frlock_t *rw) > { > rmb(); > return rw->pre_sequence; > } > > /* > * Possible sw/hw IRQ protected versions of the interfaces. > */ > #define fr_write_lock_irqsave(lock, flags) \ > do { local_irq_save(flags); fr_write_lock(lock); } while (0) > #define fr_write_lock_irq(lock) \ > do { local_irq_disable(); fr_write_lock(lock); } while (0) > #define fr_write_lock_bh(lock) \ > do { local_bh_disable(); fr_write_lock(lock); } while (0) > > #define fr_write_unlock_irqrestore(lock, flags) \ > do { fr_write_unlock(lock); local_irq_restore(flags); } while(0) > #define fr_write_unlock_irq(lock) \ > do { fr_write_unlock(lock); local_irq_enable(); } while(0) > #define fr_write_unlock_bh(lock) \ > do { fr_write_unlock(lock); local_bh_enable(); } while(0) > > #define fr_read_begin_irqsave(lock, flags) \ > ({ local_irq_save(flags); fr_read_begin(lock); }) > > #define fr_read_end_irqrestore(lock, flags) \ > ({ unsigned ret = fr_read_end(lock); \ > local_irq_save(flags); \ > ret; \ > }) > > #endif /* __LINUX_FRLOCK_H */ > Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:54 ` Andrea Arcangeli @ 2003-01-30 2:19 ` Andrew Morton 2003-01-30 17:37 ` Stephen Hemminger 1 sibling, 0 replies; 36+ messages in thread From: Andrew Morton @ 2003-01-30 2:19 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: davem, shemminger, rmk, ak, davidm, anton, linux-mm, rth Andrea Arcangeli <andrea@suse.de> wrote: > > On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote: > > Andrea Arcangeli <andrea@suse.de> wrote: > > > > > > On Wed, Jan 29, 2003 at 05:27:43PM -0800, Andrew Morton wrote: > > > > @@ -82,11 +85,12 @@ static inline int fr_write_trylock(frloc > > > > > > > > if (ret) { > > > > ++rw->pre_sequence; > > > > - wmb(); > > > > + mb(); > > > > } > > > > > > this isn't needed > > > > > > > > > if we hold the spinlock, the serialized memory can't be change under us, > > > so there's no need to put a read barrier, we only care that pre_sequence > > > is visible before the chagnes are visible and before post_sequence is > > > visible, hence only wmb() (after spin_lock and pre_sequence++) is > > > needed there and only rmb() is needed in the read-side. > > > > > > > OK, thanks muchly. > > > > Lots more updates. Here's the version which I currently have. Looks like > > fr_write_lock() and fr_write_unlock() need to be switched back to rmb()? > > you certainly mean wmb() not rmb(), right? If yes, then yes. Yup. > I actually didn't notice the write_begin/end, not sure who could need > them, I would suggest removing them, rather than to revert the mb() > there too. The intent here was to use them for i_size updates. In situations where writer serialisation was provided by external means (i_sem), and the spinlock is not needed. It's causing confusion so yeah, I'll probably pull them out. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 1:54 ` Andrea Arcangeli 2003-01-30 2:19 ` Andrew Morton @ 2003-01-30 17:37 ` Stephen Hemminger 2003-01-30 17:50 ` Andrea Arcangeli 1 sibling, 1 reply; 36+ messages in thread From: Stephen Hemminger @ 2003-01-30 17:37 UTC (permalink / raw) To: Andrea Arcangeli Cc: Andrew Morton, David Miller, rmk, ak, davidm, anton, linux-mm, rth > you certainly mean wmb() not rmb(), right? If yes, then yes. > > I actually didn't notice the write_begin/end, not sure who could need > them, I would suggest removing them, rather than to revert the mb() > there too. The write_begin/end was suggested by Andrew as a simplification for use when using this to update values already write-locked by other means. One possible usage was to fix the race issues with non-atomic update of 64 bit i_size. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 17:37 ` Stephen Hemminger @ 2003-01-30 17:50 ` Andrea Arcangeli 0 siblings, 0 replies; 36+ messages in thread From: Andrea Arcangeli @ 2003-01-30 17:50 UTC (permalink / raw) To: Stephen Hemminger Cc: Andrew Morton, David Miller, rmk, ak, davidm, anton, linux-mm, rth On Thu, Jan 30, 2003 at 09:37:06AM -0800, Stephen Hemminger wrote: > > > you certainly mean wmb() not rmb(), right? If yes, then yes. > > > > I actually didn't notice the write_begin/end, not sure who could need > > them, I would suggest removing them, rather than to revert the mb() > > there too. > > The write_begin/end was suggested by Andrew as a simplification for use > when using this to update values already write-locked by other means. > > One possible usage was to fix the race issues with non-atomic update > of 64 bit i_size. It looks overdesign to me, you don't need the spinlock for that, the i_sem is explicit too. The generalized abstraction is worthwhile when you have to use it 99% of the time, the missing 1% doesn't need to be abstracted, forward porting my implementation is the best for such specific case IMHO. Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-30 2:00 ` Andrew Morton ` (2 preceding siblings ...) 2003-01-30 1:54 ` Andrea Arcangeli @ 2003-01-30 2:00 ` Richard Henderson 3 siblings, 0 replies; 36+ messages in thread From: Richard Henderson @ 2003-01-30 2:00 UTC (permalink / raw) To: Andrew Morton Cc: Andrea Arcangeli, davem, shemminger, rmk, ak, davidm, anton, linux-mm On Wed, Jan 29, 2003 at 06:00:54PM -0800, Andrew Morton wrote: > * Expected reader usage: > * do { > * seq = fr_read_begin(); > * ... > * } while (seq != fr_read_end()); I think perhaps do { seq = fr_read_begin(&lock); ... } while (fr_read_end(&lock, seq)) would be a better interface. This would allow you to change the implementation as well. E.g. unsigned fr_read_begin(frlock_t *lock) { unsigned s; do { barrier (); s = lock->sequence; } while (s & 1); rmb(); return s; } int fr_read_end(frlock_t *lock, unsigned s) { rmb(); return s == lock->sequence; } void fr_write_begin(frlock_t *rw) { rw->sequence++; wmb(); } void fr_write_begin(frlock_t *rw) { wmb(); rw->sequence++; } which doesn't require as much memory. r~ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 22:35 ` Stephen Hemminger 2003-01-29 23:12 ` Andrew Morton @ 2003-01-30 0:46 ` Andrea Arcangeli 1 sibling, 0 replies; 36+ messages in thread From: Andrea Arcangeli @ 2003-01-30 0:46 UTC (permalink / raw) To: Stephen Hemminger Cc: Andrew Morton, David S. Miller, rmk, ak, davidm, anton, linux-mm On Wed, Jan 29, 2003 at 02:35:52PM -0800, Stephen Hemminger wrote: > On Wed, 2003-01-29 at 02:26, Andrew Morton wrote: > > "David S. Miller" <davem@redhat.com> wrote: > > > > > > From: Russell King <rmk@arm.linux.org.uk> > > > Date: Wed, 29 Jan 2003 09:59:49 +0000 > > > > > > /* This function must be called with interrupts disabled > > > > > > which hasn't been true for some time, and is even less true now that > > > local IRQs don't get disabled. Does this matter... for UP? > > > > > > I disable local IRQs during gettimeofday() on sparc. > > > > > > These locks definitely need to be taken with IRQs disabled. > > > Why isn't x86 doing that? > > > > Darned if I know. Looks like Andrea's kernel will deadlock if > > arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock > > while that cpu is holding the same lock in do_slow_gettimeoffset(). > > Rather than disabling interrupts in the i386 do_gettimeofday > why not just change spin_lock(&i8253_lock) to spin_lock_irqsave > in timer_pit.c as you can see from my patch, I agree Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 10:26 ` Andrew Morton 2003-01-29 22:35 ` Stephen Hemminger @ 2003-01-30 0:43 ` Andrea Arcangeli 1 sibling, 0 replies; 36+ messages in thread From: Andrea Arcangeli @ 2003-01-30 0:43 UTC (permalink / raw) To: Andrew Morton; +Cc: David S. Miller, rmk, ak, davidm, anton, linux-mm On Wed, Jan 29, 2003 at 02:26:17AM -0800, Andrew Morton wrote: > "David S. Miller" <davem@redhat.com> wrote: > > > > From: Russell King <rmk@arm.linux.org.uk> > > Date: Wed, 29 Jan 2003 09:59:49 +0000 > > > > /* This function must be called with interrupts disabled > > > > which hasn't been true for some time, and is even less true now that > > local IRQs don't get disabled. Does this matter... for UP? > > > > I disable local IRQs during gettimeofday() on sparc. > > > > These locks definitely need to be taken with IRQs disabled. > > Why isn't x86 doing that? > > Darned if I know. Looks like Andrea's kernel will deadlock if > arch/i386/kernel/time.c:timer_interrupt() takes i8253_lock > while that cpu is holding the same lock in do_slow_gettimeoffset(). yes thanks! mostly theorical though, 486 SMP boxes aren't that common ;) This should fix it. --- 2.4.21pre3aa1/arch/i386/kernel/time.c.~1~ 2003-01-21 03:43:59.000000000 +0100 +++ 2.4.21pre3aa1/arch/i386/kernel/time.c 2003-01-30 01:40:37.000000000 +0100 @@ -159,6 +159,7 @@ extern spinlock_t i8259A_lock; static unsigned long do_slow_gettimeoffset(void) { int count; + unsigned long flags; static int count_p = LATCH; /* for the first call after boot */ static unsigned long jiffies_p = 0; @@ -169,7 +170,7 @@ static unsigned long do_slow_gettimeoffs unsigned long jiffies_t; /* gets recalled with irq locally disabled */ - spin_lock(&i8253_lock); + spin_lock_irqsave(&i8253_lock, flags); /* timer count may underflow right here */ outb_p(0x00, 0x43); /* latch the count ASAP */ @@ -191,7 +192,7 @@ static unsigned long do_slow_gettimeoffs count = LATCH - 1; } - spin_unlock(&i8253_lock); + spin_unlock_irqrestore(&i8253_lock, flags); /* * avoiding timer inconsistencies (they are rare, but they happen)... @@ -212,13 +213,13 @@ static unsigned long do_slow_gettimeoffs int i; - spin_lock(&i8259A_lock); + spin_lock_irqsave(&i8259A_lock, flags); /* * This is tricky when I/O APICs are used; * see do_timer_interrupt(). */ i = inb(0x20); - spin_unlock(&i8259A_lock); + spin_unlock_irqrestore(&i8259A_lock, flags); /* assumption about timer being IRQ0 */ if (i & 0x01) { Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 9:59 ` Russell King 2003-01-29 9:51 ` David S. Miller @ 2003-01-29 10:16 ` Andrew Morton 2003-01-29 17:04 ` David Mosberger 2 siblings, 0 replies; 36+ messages in thread From: Andrew Morton @ 2003-01-29 10:16 UTC (permalink / raw) To: Russell King; +Cc: ak, davem, davidm, anton, linux-mm Russell King <rmk@arm.linux.org.uk> wrote: > > On Tue, Jan 28, 2003 at 10:07:29PM -0800, Andrew Morton wrote: > > Possible breakage would be in the new frlock-for-xtime_lock code and the > > get_order() cleanup. > > > > The frlock code is showing nice speedups, but I think the main reason we want > > this is to fix the problem wherein an application spinning on gettimeofday() > > can make time stop. > > I'm slightly concerned about this. With this patch, we generally seem > to do: > > do { > seq = fr_read_begin(&xtime_lock); > *tv = xtime; > tv->tv_usec += do_fast_gettimeoffset(); > tv->tv_usec += lost_ticks; > } while (seq != fr_read_end(&xtime_lock)); > > This is fine when considering xtime. However, considering the > implementation of do_fast_gettimeoffset(). Notice that > linux/arch/i386/kernel/timers/timer_pit.c:do_offset_pit specifically > makes the comment: > > /* This function must be called with interrupts disabled > > which hasn't been true for some time, and is even less true now that > local IRQs don't get disabled. Does this matter... for UP? > > The same is true for other architectures; their gettimeoffset > implementations need to be audited by the architecture maintainers > to ensure that they are safe to run with (local) interrupts enabled. Thanks for spotting that. Bug. How about I make all instances of the above be: do { seq = fr_read_begin_irqsave(&xtime_lock, flags); *tv = xtime; tv->tv_usec += do_fast_gettimeoffset(); tv->tv_usec += lost_ticks; } while (seq != fr_read_end_irqrestore(&xtime_lock, flags)); ? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 9:59 ` Russell King 2003-01-29 9:51 ` David S. Miller 2003-01-29 10:16 ` Andrew Morton @ 2003-01-29 17:04 ` David Mosberger 2003-01-29 17:25 ` Russell King 2 siblings, 1 reply; 36+ messages in thread From: David Mosberger @ 2003-01-29 17:04 UTC (permalink / raw) To: Russell King Cc: Andrew Morton, Andi Kleen, David S. Miller, David Mosberger, Anton Blanchard, linux-mm >>>>> On Wed, 29 Jan 2003 09:59:49 +0000, Russell King <rmk@arm.linux.org.uk> said: Russell> On Tue, Jan 28, 2003 at 10:07:29PM -0800, Andrew Morton Russell> wrote: >> The frlock code is showing nice speedups, but I think the main >> reason we want this is to fix the problem wherein an application >> spinning on gettimeofday() can make time stop. Russell> I'm slightly concerned about this. With this patch, we Russell> generally seem to do: Russell> [snip...] Russell> The same is true for other architectures; their Russell> gettimeoffset implementations need to be audited by the Russell> architecture maintainers to ensure that they are safe to Russell> run with (local) interrupts enabled. Should be fine as far as ia64 is concerned, since gettimeoffset() currently simply reads the cycle-counter (and I think even HPET-based interpolation would be lock-free). --david -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 17:04 ` David Mosberger @ 2003-01-29 17:25 ` Russell King 2003-01-29 19:05 ` David Mosberger 0 siblings, 1 reply; 36+ messages in thread From: Russell King @ 2003-01-29 17:25 UTC (permalink / raw) To: davidm Cc: Andrew Morton, Andi Kleen, David S. Miller, David Mosberger, Anton Blanchard, linux-mm On Wed, Jan 29, 2003 at 09:04:37AM -0800, David Mosberger wrote: > Should be fine as far as ia64 is concerned, since gettimeoffset() > currently simply reads the cycle-counter (and I think even HPET-based > interpolation would be lock-free). If you're happy, then I'm happy. I was only concerned because it looks like it might be a problem on some implementations, and I was wondering what would happen on ia64 if a timer interrupt occurs between reading jiffies and itm_next in gettimeoffset. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linus rollup 2003-01-29 17:25 ` Russell King @ 2003-01-29 19:05 ` David Mosberger 0 siblings, 0 replies; 36+ messages in thread From: David Mosberger @ 2003-01-29 19:05 UTC (permalink / raw) To: Russell King Cc: davidm, Andrew Morton, Andi Kleen, David S. Miller, David Mosberger, Anton Blanchard, linux-mm >>>>> On Wed, 29 Jan 2003 17:25:19 +0000, Russell King <rmk@arm.linux.org.uk> said: Russell> I was only concerned because it looks like it might be a Russell> problem on some implementations, and I was wondering what Russell> would happen on ia64 if a timer interrupt occurs between Russell> reading jiffies and itm_next in gettimeoffset. Perhaps I'm missing something, but I thought that in this case, fr_write_unlock() would increment the counter and cause the reader(s) to restart the reading of the time-related variables. I do agree that, in general, there is a potential deadlock if gettimeoffset() is not lock-free. --david -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2003-01-30 17:50 UTC | newest] Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2003-01-29 6:07 Linus rollup Andrew Morton 2003-01-29 6:53 ` David Mosberger 2003-01-29 7:25 ` David S. Miller 2003-01-29 9:33 ` Andrew Morton 2003-01-29 9:35 ` David S. Miller 2003-01-29 9:54 ` Anton Blanchard 2003-01-29 9:59 ` Russell King 2003-01-29 9:51 ` David S. Miller 2003-01-29 10:26 ` Andrew Morton 2003-01-29 22:35 ` Stephen Hemminger 2003-01-29 23:12 ` Andrew Morton 2003-01-30 0:30 ` David S. Miller 2003-01-30 1:27 ` Andrew Morton 2003-01-30 1:24 ` Andi Kleen 2003-01-30 2:01 ` Andrew Morton 2003-01-30 1:35 ` Andrea Arcangeli 2003-01-30 2:00 ` Andrew Morton 2003-01-30 1:50 ` Jeff Garzik 2003-01-30 2:03 ` Andrea Arcangeli 2003-01-30 17:09 ` Stephen Hemminger 2003-01-30 17:15 ` Randy.Dunlap 2003-01-30 17:25 ` Andi Kleen 2003-01-30 17:23 ` Randy.Dunlap 2003-01-30 1:52 ` Richard Henderson 2003-01-30 2:06 ` Andrea Arcangeli 2003-01-30 1:54 ` Andrea Arcangeli 2003-01-30 2:19 ` Andrew Morton 2003-01-30 17:37 ` Stephen Hemminger 2003-01-30 17:50 ` Andrea Arcangeli 2003-01-30 2:00 ` Richard Henderson 2003-01-30 0:46 ` Andrea Arcangeli 2003-01-30 0:43 ` Andrea Arcangeli 2003-01-29 10:16 ` Andrew Morton 2003-01-29 17:04 ` David Mosberger 2003-01-29 17:25 ` Russell King 2003-01-29 19:05 ` David Mosberger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox