- * Writing better performing .NET and Mono applications
- <center>
- Miguel de Icaza ([email protected])<br>
- Ben Maurer ([email protected])
- </center>
- The following document contains a few hints on how to improve
- the performance of your Mono/.NET applications.
- These are just guidelines, and you should still profile your
- code to find the actual performance problems in your
- application. It is never a smart idea to make a change with the
- hopes of improving the performance of your code without first
- measuring. In general, these guidelines should serve as ideas
- to help you figure out `how can I make this method run faster'.
-
- It is up to you to figure out `which method is running slowly'.
- ** Using the Mono profiler
-
- So, how does one measure which methods are running slowly? A profiler
- helps with this task. Mono includes a profiler that is built
- into the runtime system. You can invoke this profiler on your program
- by running with the --profile flag.
- <pre class="shell">
- mono --profile program.exe
- </pre>
- The above will instruct Mono to instrument your application
- for profiling. The default Mono profiler will record the time
- spent on a routine, the number of times the routine was called,
- the memory consumed by each method broken down by invoker, and
- the total amount of memory consumed.
- It does this by asking the JIT to insert a call to the profiler
- every time a method is entered or left. The profiler measures the
- time elapsed between the beginning and the end of the
- call. The profiler is also notified of allocations.
-
- When the program has finished executing, the profiler prints the
- data in human readable format. It looks like:
- <pre class="shell">
- Total time spent compiling 227 methods (sec): 0.07154
- Slowest method to compile (sec): 0.01893: System.Console::.cctor()
- Time(ms) Count P/call(ms) Method name
- ########################
- 91.681 1 91.681 .DebugOne::Main()
- Callers (with count) that contribute at least for 1%:
- 1 100 % .DebugOne::Main(object,intptr,intptr)
- ...
- Total number of calls: 3741
- ...
- Allocation profiler
- Total mem Method
- ########################
- 406 KB .DebugOne::Main()
- 406 KB 1000 System.Int32[]
- Callers (with count) that contribute at least for 1%:
- 1 100 % .DebugOne::Main(object,intptr,intptr)
- Total memory allocated: 448 KB
- </pre>
- At the top, it shows each method that is called. The data is sorted
- by the total time that the program spent within the method. Then
- it shows how many times the method was called, and the average time
- per call.
-
- Below this, it shows the top callers of the method. This is very useful
- data. If you find, for example, that the method Data::Computate () takes
- a very long time to run, you can look to see if any of the calls can be
- avoided.
-
- Two warnings must be given about the method data. First,
- the profiler has an overhead associated with it. As such,
- a high number of calls to a method may show up as consuming
- lots of time, when in reality they do not consume much time
- at all. If you see a method that has a very high number of
- calls, you may be able to ignore it. However, do consider
- removing calls if possible, as that will sometimes help
- performance. This problem is often seen with the use
- of built-in collection types.
-
- Secondly, due to the nature of the profiler, recursive calls
- have extremely large times (because the profiler double counts
- when the method calls itself). One easy way to see this problem
- is that if a method is shown as taking more time than the Main
- method, it is very likely recursive, and causing this problem.
-
- Below the method data, allocation data is shown. This shows
- how much memory each method allocates. The number beside
- the method is the total amount of memory. Below that, it
- is broken down into types. Then, the caller data is given. This
- data is again useful when you want to figure out how to eliminate calls.
-
- You might want to keep a close eye on the memory consumption
- and on the method invocation counts. A lot of the
- performance gains in MCS, for example, came from reducing its
- memory usage, as opposed to changes in the execution path.
- ** Profiling without JIT instrumentation
- You might also be interested in using mono --aot to generate
- precompiled code, and then using a system like `oprofile' to
- profile your programs.
- ** Memory Management in the .NET/Mono world
- Since Mono and .NET offer automatic garbage collection, the
- programmer is freed from having to track and dispose of the
- objects they consume (except for classes implementing IDisposable). This
- is a great productivity gain, but if you create thousands of
- objects, that will make the garbage collector do more work,
- and it might slow down your application.
-
- Remember, each time you allocate an object, the GC is forced
- to find space for the object. Each object has an 8 byte overhead
- (4 to tell what type it is, then 4 for a sync block). If
- the GC finds that it is running out of room, it will scan every
- object for pointers, looking for unreferenced objects. If you allocate
- unnecessary objects, the GC must then do the extra work of freeing them.
-
- Mono uses the Boehm GC, which is a conservative collector.
- This might lead to some memory fragmentation, and, unlike
- generational GC systems, it has to scan the entire allocated
- memory pool.
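-
- As an illustrative sketch (the names and sizes here are not from the
- original text), the following compares building a string with repeated
- concatenation, which allocates a new string on every iteration, against
- <code>System.Text.StringBuilder</code>, which reuses an internal buffer
- and so creates far fewer objects for the GC to track:
- <pre class="shell">
- using System;
- using System.Text;
-
- class AllocationDemo {
-     // Allocates a new string on every iteration; all of the
-     // intermediate strings become garbage for the GC.
-     static string Concatenate (int n)
-     {
-         string s = "";
-         for (int i = 0; i < n; i++)
-             s += i.ToString ();
-         return s;
-     }
-
-     // Appends into a single growing buffer instead.
-     static string Build (int n)
-     {
-         StringBuilder sb = new StringBuilder ();
-         for (int i = 0; i < n; i++)
-             sb.Append (i);
-         return sb.ToString ();
-     }
-
-     static void Main ()
-     {
-         // Both produce the same string; the second allocates much less.
-         Console.WriteLine (Concatenate (1000) == Build (1000));
-     }
- }
- </pre>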
-
- *** Boxing
- The .NET framework provides a rich hierarchy of object types.
- Each object not only has value information, but also type
- information associated with it. This type information makes
- many types of programs easier to write. It also has a cost
- associated with it. The type information takes up space.
-
- In order to reduce the cost of type information, almost every
- object-oriented language has the concept of `primitives'.
- They usually map to types such as integers and booleans. These
- types do not have any type information associated with them.
-
- However, the language also must be able to treat primitives
- as first-class datums -- in the same class as objects. Languages
- handle this issue in different ways. Some choose to make a
- special class for each primitive, and force the user to do an
- operation such as:
- <pre class="shell">
- // This is Java (before generics); a cast back from Object is required
- list.add (new Integer (1));
- System.out.println (((Integer) list.get (0)).intValue ());
- </pre>
- The C# design team was not satisfied with this type
- of construct. They added a notion of `boxing' to the language.
-
- Boxing does the same thing as Java's <code>new Integer (1)</code>.
- The user is not forced to write the extra code. However,
- behind the scenes the <em>same thing</em> is being done
- by the runtime. Each time a primitive is cast to an object,
- a new object is allocated.
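- For instance (a small sketch, not present in the original text),
- assigning an <code>int</code> to an <code>object</code> variable
- boxes it, and casting it back unboxes it:
- <pre class="shell">
- int i = 42;
- object o = i;      // boxing: a new object is allocated on the heap
- int j = (int) o;   // unboxing: the value is copied back out
- </pre>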
-
- You must be careful when casting a primitive to an object.
- Note that because it is an implicit conversion, you will
- not see it in your code. For example, boxing is happening here:
- <pre class="shell">
- ArrayList foo = new ArrayList ();
- foo.Add (1);
- </pre>
-
- In high performance code, this operation can be very costly.
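- One way to avoid this cost (an illustrative sketch, not from the
- original text, assuming a <code>using System.Collections;</code>
- directive) is to use a strongly typed container, such as an array
- of the primitive type, so that no boxing takes place:
- <pre class="shell">
- // Boxes every element: each int is wrapped in a new heap object.
- ArrayList list = new ArrayList ();
- for (int i = 0; i < 1000; i++)
-     list.Add (i);
-
- // No boxing: the values are stored directly in the array.
- int [] values = new int [1000];
- for (int i = 0; i < 1000; i++)
-     values [i] = i;
- </pre>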
- *** Using structs instead of classes for small objects
- For small objects, you might want to consider using value
- types (structs) instead of objects (classes).
-
- However, you must be careful that you do not use the struct
- as an object; in that case it will actually be more costly.
-
- As a rule of thumb, only use structs if you have a small
- number of fields (totaling less than 32 bytes), and
- need to pass the item `by value'. You should not box the object.
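- As a hypothetical sketch (the type and fields are not from the
- original text), a small point type with two integer fields is a
- good candidate for a struct: it is well under 32 bytes and can be
- passed by value without any heap allocation:
- <pre class="shell">
- using System;
-
- struct Point {
-     public int X, Y;
-
-     public Point (int x, int y)
-     {
-         X = x;
-         Y = y;
-     }
- }
-
- class PointDemo {
-     // The struct is copied by value; no object is allocated and
-     // the garbage collector is never involved.
-     static int ManhattanDistance (Point a, Point b)
-     {
-         return Math.Abs (a.X - b.X) + Math.Abs (a.Y - b.Y);
-     }
-
-     static void Main ()
-     {
-         Console.WriteLine (ManhattanDistance (new Point (0, 0), new Point (3, 4)));
-     }
- }
- </pre>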
- *** Assisting the Garbage Collector
- Although the Garbage Collector will do the right thing in
- terms of releasing and finalizing objects on time, you can
- assist the garbage collector by clearing the fields that
- point to objects once you no longer need them. This means that
- some objects might become eligible for collection earlier than
- they otherwise would, which can help reduce the memory consumption
- and the work that the GC has to do.
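- For example (a sketch built around a hypothetical
- <code>raw_data</code> field, not taken from the original text),
- setting a field to <code>null</code> once its contents are no
- longer needed lets the collector reclaim that data even while the
- containing object stays alive:
- <pre class="shell">
- class Report {
-     byte [] raw_data;
-
-     public Report (byte [] data)
-     {
-         raw_data = data;
-     }
-
-     public string Summarize ()
-     {
-         string summary = "bytes: " + raw_data.Length;
-
-         // The raw data is no longer needed; dropping the
-         // reference makes it eligible for collection.
-         raw_data = null;
-
-         return summary;
-     }
- }
- </pre>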
- ** Common problems with <tt>foreach</tt>
- The <tt>foreach</tt> C# statement handles several kinds of
- constructs (about seven different code patterns are
- generated). Typically foreach generates more efficient code
- than loops constructed manually, and also ensures that objects
- which implement IDisposable are properly released.
- But foreach might sometimes generate code that performs badly
- under stress. Foreach performs badly when it is used in tight
- loops and its use leads to the creation of many enumerators.
- Although technically obtaining an enumerator for some objects
- like ArrayList is more efficient than using the ArrayList
- indexer, the pressure introduced due to the extra memory
- requirements and the demands on the garbage collector makes it
- less efficient overall.
- There is no straightforward rule on when to use foreach and
- when to use a manual loop. The best thing to do is to always
- use foreach, and only when a profile shows a problem, replace
- foreach with for loops.
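- As an illustrative sketch (the collection and its contents are made
- up, not from the original text, and assume <code>using System;</code>
- and <code>using System.Collections;</code> directives), the two loops
- below are equivalent; the foreach version obtains an enumerator
- object for each traversal, while the indexed version does not:
- <pre class="shell">
- ArrayList items = new ArrayList ();
- for (int i = 0; i < 1000; i++)
-     items.Add (i.ToString ());
-
- // foreach: clear and safe, but each traversal creates an enumerator.
- foreach (string s in items)
-     Console.WriteLine (s);
-
- // Manual loop: no enumerator object is created, which can matter
- // in tight loops.
- for (int i = 0; i < items.Count; i++)
-     Console.WriteLine ((string) items [i]);
- </pre>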