* Re: Enhanced profiling support (was Re: vm lock contention reduction)
@ 2002-07-10 14:28 Richard J Moore
  2002-07-10 20:30 ` Karim Yaghmour
  0 siblings, 1 reply; 13+ messages in thread
From: Richard J Moore @ 2002-07-10 14:28 UTC (permalink / raw)
  To: John Levon
  Cc: Andrew Morton, Andrea Arcangeli, bob, Karim Yaghmour,
	linux-kernel, linux-mm, mjbligh, John Levon, Rik van Riel,
	Linus Torvalds

>Sure, there are all sorts of things where some tracing can come in
>useful. The question is whether it's really something the mainline
>kernel should be doing, and if the gung-ho approach is nice or not.
>
>> The fact that so many kernel subsystems already have their own tracing
>> built-in (see other posting)
>
>Your list was almost entirely composed of per-driver debug routines.
>This is not the same thing as logging trap entry/exits, syscalls etc
>etc, on any level, and I'm a bit perplexed that you're making such an
>association.

There's a balance to be struck with tracing. First we should point out that
the recording mechanism doesn't have to intrude within the kernel unless you
want init-time tracing. The bigger point of contention seems to be that of
instrumentation. Yes, it is very ugly to have thousands of trace points
littering the source. On the other hand, for basic serviceability a minimal
set should be present in a production system - these would typically allow
the external interface of any component to be traced.  For low-level
tracing - i.e. internal routines etc. - dynamic trace can be used. This
requires no modification to source. The tracepoint is implemented
dynamically in executing code. DProbes+LTT provides this capability.
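
The "minimal set" of static tracepoints described above could look roughly
like the sketch below. It is a user-space illustration only, not LTT's or
DProbes' actual interface: TRACE_POINT, trace_write, trace_enabled, the
event IDs and component_request are all hypothetical names. It shows a
tracepoint at a component's external interface that costs one flag test when
tracing is off and writes to a ring buffer when it is on. Dynamic probes,
which patch executing code, are not attempted here.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_BUF_ENTRIES 8        /* hypothetical ring size */

/* One fixed-size trace record: which event fired and one argument. */
struct trace_record {
    uint32_t event_id;
    uint32_t arg;
};

static struct trace_record trace_buf[TRACE_BUF_ENTRIES];
static unsigned int trace_head;    /* total records written so far */
static int trace_enabled;          /* runtime switch: 0 makes tracepoints no-ops */

/* Recording mechanism: overwrite the oldest slot once the ring is full. */
static void trace_write(uint32_t event_id, uint32_t arg)
{
    struct trace_record *r = &trace_buf[trace_head % TRACE_BUF_ENTRIES];

    r->event_id = event_id;
    r->arg = arg;
    trace_head++;
}

/* Static tracepoint: a single flag test when tracing is disabled. */
#define TRACE_POINT(id, arg) \
    do { if (trace_enabled) trace_write((id), (arg)); } while (0)

enum { EV_COMPONENT_ENTRY = 1, EV_COMPONENT_EXIT = 2 };

/* Hypothetical component whose external interface is instrumented. */
static int component_request(int n)
{
    TRACE_POINT(EV_COMPONENT_ENTRY, (uint32_t)n);
    int result = n * 2;            /* stand-in for the real work */
    TRACE_POINT(EV_COMPONENT_EXIT, (uint32_t)result);
    return result;
}
```

Only the entry and exit of the interface are instrumented; anything inside
the component is left to ad hoc probes added later.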

Some level of tracing (along with other complementary PD tools, e.g. crash
dump) needs to be readily available to deal with the types of problem we
see with mature systems employed in the production environment. Typically
such problems are neither readily recreatable nor even predictable. I've
often had to solve problems which impact a business environment severely,
where one server out of 2000 gets hit each day, but it's a different one
each day. It's under those circumstances that trace, along with other
automated data-capturing problem determination tools, becomes invaluable.
And it's a fact of life that only those types of difficult problem remain
once we've beaten a system to death in development and test. Being able to
use a common set of tools whatever the components under investigation
greatly eases problem determination. This is especially so where you have
the ability to use dprobes with LTT to provide ad hoc tracepoints that were
not originally included by the developers.



Richard J Moore CEng, MIEE, Consulting IT Specialist, TSM
RAS Project Lead - Linux Technology Centre (ATS-PIC).
http://oss.software.ibm.com/developerworks/opensource/linux
Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
IBM UK Ltd,  MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK
The IBM Academy will hold a Conference on Performance Engineering in
Toronto July 8-10. A High Availability Conference follows July 10-12.
Details on http://w3.ibm.com/academy/


From:     John Levon <movement@marcelothewonderpenguin.com>
Sent by:  John Levon <moz@compsoc.man.ac.uk>
Date:     10/07/2002 00:38
Please respond to John Levon
To:       Karim Yaghmour <karim@opersys.com>
cc:       Linus Torvalds <torvalds@transmeta.com>, Andrew Morton
          <akpm@zip.com.au>, Andrea Arcangeli <andrea@suse.de>, Rik van Riel
          <riel@conectiva.com.br>, "linux-mm@kvack.org" <linux-mm@kvack.org>,
          mjbligh@linux.ibm.com, linux-kernel@vger.kernel.org, Richard J
          Moore/UK/IBM@IBMGB, bob <bob@watson.ibm.com>
Subject:  Re: Enhanced profiling support (was Re: vm lock contention reduction)



On Wed, Jul 10, 2002 at 12:16:05AM -0400, Karim Yaghmour wrote:

[snip]

> And the list goes on.

Sure, there are all sorts of things where some tracing can come in
useful. The question is whether it's really something the mainline
kernel should be doing, and if the gung-ho approach is nice or not.

> The fact that so many kernel subsystems already have their own tracing
> built-in (see other posting)

Your list was almost entirely composed of per-driver debug routines.
This is not the same thing as logging trap entry/exits, syscalls etc
etc, on any level, and I'm a bit perplexed that you're making such an
association.

> expect user-space developers to efficiently use the kernel if they
> have
> absolutely no idea about the dynamic interaction their processes have
> with the kernel and how this interaction is influenced by and
> influences
> the interaction with other processes?

This is clearly an exaggeration. And seeing as something like LTT
doesn't (and cannot) tell the "whole story" either, I could throw the
same argument directly back at you. The point is, there comes a point of
no return where usefulness gets outweighed by ugliness. For the very few
cases that such detailed information is really useful, the user can
usually install the needed special-case tools.

In contrast a profiling mechanism that improves on the poor lot that
currently exists (gprof, readprofile) has a truly general utility, and
can hopefully be done without too much ugliness.

The primary reason I want to see something like this is to kill the ugly
code I have to maintain.

> > The entry.S examine-the-registers approach is simple enough, but
> > it's
> > not much more tasteful than sys_call_table hackery IMHO
>
> I guess we won't agree on this. From my point of view it is much
> better
> to have the code directly within entry.S for all to see instead of
> having some external software play around with the syscall table in a
> way kernel users can't trace back to the kernel's own code.

Eh ? I didn't say sys_call_table hackery was better. I said the entry.S
thing wasn't much better ...

regards
john





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/

* Re: vm lock contention reduction
@ 2002-07-07  2:50 Andrew Morton
  2002-07-07  3:05 ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2002-07-07  2:50 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linus Torvalds, Rik van Riel, linux-mm, Martin J. Bligh

Andrea Arcangeli wrote:
> 
> On Thu, Jul 04, 2002 at 11:33:45PM -0700, Andrew Morton wrote:
> > Well.  First locks first.  kmap_lock is a bad one on x86.
> 
> Actually I thought about kmap_lock and the per-process kmaps a bit more
> with Martin (cc'ed) during OLS and there is an easy process-scalable
> solution to drop:

Martin is being bitten by the global invalidate more than by the lock.
He increased the size of the kmap pool just to reduce the invalidate
frequency and saw 40% speedups of some stuff.

Those invalidates don't show up nicely on profiles.

>         the kmap_lock
>         in turn the global pool
>         in turn the global tlb flush
> 
> The only problem is that it's not anymore both atomic *and* persistent,
> it's only persistent. It's also atomic if the mm_count == 1, but the
> kernel cannot rely on it, it has to assume it's a blocking operation
> always (you find it out if it's blocking only at runtime).

I was discussing this with sct a few days back.  iiuc, the proposal
was to create a small per-cpu pool (say, 4-8 pages) which is a
"front-end" to regular old kmap().

Any time you have one of these pages in use, the process gets
pinned onto the current CPU. If we run out of per-cpu kmaps,
just fall back to traditional kmap().

It does mean that this variant of kmap() couldn't just return
a `struct page *' - it would have to return something richer
than that.
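
As a rough illustration of the front-end idea above, here is a user-space
sketch, not kernel code. All names in it (kmap_fast, kunmap_fast, struct
kmap_handle, percpu_pool, pinned_count, NR_CPUS set to 2) are hypothetical,
and the real kmap() and CPU pinning are only simulated with counters. It
shows a small per-CPU pool that is tried first, a fallback when the pool is
exhausted, and a "richer" return value that records which path was taken so
the unmap side knows what to undo.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS       2   /* hypothetical CPU count for the simulation */
#define PERCPU_SLOTS  4   /* "say, 4-8 pages" per CPU */

struct page { int id; };  /* stand-in for the kernel's struct page */

/* The richer return value: remembers which path produced the mapping. */
struct kmap_handle {
    struct page *page;
    int cpu;              /* CPU whose pool was used, -1 for the fallback */
    int slot;             /* slot in that CPU's pool, -1 for the fallback */
};

static struct page *percpu_pool[NR_CPUS][PERCPU_SLOTS];
static int pinned_count[NR_CPUS]; /* > 0: the task must stay on this CPU */
static int global_kmaps;          /* mappings that fell back to the slow path */

/* Try the per-CPU pool first; fall back when every slot is in use. */
static struct kmap_handle kmap_fast(struct page *page, int cpu)
{
    struct kmap_handle h = { page, -1, -1 };

    assert(cpu >= 0 && cpu < NR_CPUS);
    for (int i = 0; i < PERCPU_SLOTS; i++) {
        if (percpu_pool[cpu][i] == NULL) {
            percpu_pool[cpu][i] = page;
            pinned_count[cpu]++;  /* simulated "pin to the current CPU" */
            h.cpu = cpu;
            h.slot = i;
            return h;
        }
    }
    global_kmaps++;               /* simulated traditional kmap() */
    return h;
}

/* Release whichever kind of mapping the handle describes. */
static void kunmap_fast(struct kmap_handle h)
{
    if (h.cpu < 0) {
        global_kmaps--;           /* simulated traditional kunmap() */
        return;
    }
    percpu_pool[h.cpu][h.slot] = NULL;
    pinned_count[h.cpu]--;        /* pin is dropped when this reaches zero */
}
```

The handle is what makes the unmap side cheap: without it, the caller could
not tell a per-CPU slot from a global mapping.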

> In short the same design of the per-process kmaps will work just fine if
> we add a semaphore to the mm_struct. then before starting using the kmap
> entry we must acquire the semaphore. This way all the global locking and
> global tlb flush goes away completely for normal tasks, but still
> remains the contention of that per-mm semaphore with threads doing
> simultaneous pte manipulation or simultaneous pagecache I/O though.
> Furthermore this I/O will be serialized, threaded benchmark like dbench
> may perform poorly that way I suspect, or we should add a pool of
> userspace pages so more than 1 thread is allowed to go ahead, but still
> we may cacheline-bounce in the synchronization of the pool across
> threads (similar to what we do now in the global pool).
> 
> Then there's the problem the pagecache/FS API should be changed to pass
> the vaddr through the stack because page->virtual would go away, the
> virtual address would be per-process protected by the mm->kmap_sem so we
> couldn't store it in a global, all tasks can kmap the same page at the
> same time at virtual vaddr. This as well will break some common code.
> 
> Last but not least, I hope in 2.6 production I won't be running
> benchmarks and profiling using a 32bit cpu anymore anyways.
> 
> So I'm not very motivated anymore in doing that change after the comment
> from Linus about the issue with threads.

I believe that IBM have 32gig, 8- or 16-CPU ia32 machines just
coming into production now.  Presumably, they're not the only
ones.  We're stuck with this mess for another few years.

