From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <419EF257.8010103@yahoo.com.au> Date: Sat, 20 Nov 2004 18:29:27 +1100 From: Nick Piggin MIME-Version: 1.0 Subject: Re: page fault scalability patch V11 [0/7]: overview References: <20041120020306.GA2714@holomorphy.com> <419EBBE0.4010303@yahoo.com.au> <20041120035510.GH2714@holomorphy.com> <419EC205.5030604@yahoo.com.au> <20041120042340.GJ2714@holomorphy.com> <419EC829.4040704@yahoo.com.au> <20041120053802.GL2714@holomorphy.com> <419EDB21.3070707@yahoo.com.au> <20041120062341.GM2714@holomorphy.com> <419EE911.20205@yahoo.com.au> <20041120071514.GO2714@holomorphy.com> In-Reply-To: <20041120071514.GO2714@holomorphy.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: William Lee Irwin III Cc: Linus Torvalds , Christoph Lameter , akpm@osdl.org, Benjamin Herrenschmidt , Hugh Dickins , linux-mm@kvack.org, linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org List-ID: William Lee Irwin III wrote: > William Lee Irwin III wrote: > >>>touch_nmi_watchdog() is only "protection" against local interrupt >>>disablement triggering the NMI oopser because alert_counter[] >>>increments are not atomic. Yet even supposing they were made so, the > > > On Sat, Nov 20, 2004 at 05:49:53PM +1100, Nick Piggin wrote: > >>That would be a bug in touch_nmi_watchdog then, because you're >>racy against your own NMI too. >>So I'm actually not very very wrong at all. I'm technically wrong >>because touch_nmi_watchdog has a theoretical 'bug'. In practice, >>multiple races with the non atomic increments to the same counter, >>and in an unbroken sequence would be about as likely as hardware >>failure. >>Anyway, this touch nmi thing is going off topic, sorry list. > > > No, it's on-topic. > (1) The issue is not theoretical. e.g. sysrq t does trigger NMI oopses, > merely not every time, and not on every system. It is not > associated with hardware failure. It is, however, tolerable > because sysrq's require privilege to trigger and are primarly > used when the box is dying anyway. OK then put a touch_nmi_watchdog in there if you must. > (2) NMI's don't nest. There is no possibility of NMI's racing against > themselves while the data is per-cpu. > Your point was that touch_nmi_watchdog() which resets alert_counter, is racy when resetting the counter of other CPUs. Yes it is racy. It is also racy against the NMI on the _current_ CPU. This has nothing whatsoever to do with NMIs racing against themselves, I don't know how you got that idea when you were the one to bring up this race anyway. [ snip back-and-forth that is going nowhere ] I'll bow out of the argument here. I grant you raise valid concens WRT the /proc issues, of course. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: aart@kvack.org