- .. _doc_general_optimization:
- General optimization tips
- =========================
- Introduction
- ~~~~~~~~~~~~
- In an ideal world, computers would run at infinite speed, and the only limit to
- what we could achieve would be our imagination. In the real world, however, it
- is all too easy to produce software that will bring even the fastest computer to
- its knees.
- Designing games and other software is thus a compromise between what we would
- like to be possible, and what we can realistically achieve while maintaining
- good performance.
- To achieve the best results, we have two approaches:
- * Work faster
- * Work smarter
- And preferably, we will use a blend of the two.
- Smoke and mirrors
- ^^^^^^^^^^^^^^^^^
- Part of working smarter is recognizing that, especially in games, we can often
- get the player to believe they are in a world that is far more complex,
- interactive, and graphically exciting than it really is. A good programmer is a
- magician, and should strive to learn the tricks of the trade, and try to invent
- new ones.
- The nature of slowness
- ^^^^^^^^^^^^^^^^^^^^^^
- To the outside observer, performance problems are often lumped together. But in
- reality, there are several different kinds of performance problem:
- * A slow process that occurs every frame, leading to a continuously low frame
- rate
- * An intermittent process that causes 'spikes' of slowness, leading to
- stalls
- * A slow process that occurs outside of normal gameplay, for instance, on
- level load
- Each of these is annoying to the user, but in different ways.
- Measuring performance
- =====================
- Probably the most important tool for optimization is the ability to measure
- performance - to identify where bottlenecks are, and to measure the success of
- our attempts to speed them up.
- There are several methods of measuring performance, including:
- * Putting a start / stop timer around code of interest
- * Using the Godot profiler
- * Using external third party profilers
- * Using GPU profilers / debuggers
- * Checking the frame rate (with vsync disabled)
- Be aware that the relative performance of different areas can vary between
- hardware. It is often a good idea to take timings on more than one device,
- including mobile devices as well as desktop if you are targeting mobile.
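The first method in the list above, a start / stop timer around the code of interest, can be sketched in a few lines. This is a minimal illustration in Python rather than engine code; `stopwatch`, `update_enemies`, and `timings` are hypothetical names invented for the example.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label, results):
    """Store the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def update_enemies(count):
    # Hypothetical stand-in for the code of interest.
    return sum(i * i for i in range(count))


timings = {}
with stopwatch("update_enemies", timings):
    update_enemies(100_000)

print(f"update_enemies took {timings['update_enemies'] * 1000:.3f} ms")
```

A single measurement is noisy, so in practice it helps to repeat the timing several times and compare averages before and after a change.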
- Limitations
- ~~~~~~~~~~~
- CPU profilers are often the go-to method for measuring performance. However,
- they don't always tell the whole story.
- - Bottlenecks are often on the GPU, `as a result` of instructions given by the
- CPU
- - Spikes can occur in the Operating System processes (outside of Godot) `as a
- result` of instructions used in Godot (for example dynamic memory allocation)
- You may not be able to profile some devices, e.g. a mobile phone
- - You may have to solve performance problems that occur on hardware you don't
- have access to
- As a result of these limitations, you often need to use detective work to find
- out where bottlenecks are.
- Detective work
- ~~~~~~~~~~~~~~
- Detective work is a crucial skill for developers (both in terms of performance,
- and also in terms of bug fixing). This can include hypothesis testing, and
- binary search.
- Hypothesis testing
- ^^^^^^^^^^^^^^^^^^
- Say, for example, that you believe sprites are slowing down your game. You can
- test this hypothesis by:
- * Measuring the performance when you add more sprites, or take some away.
- This may lead to a further hypothesis - does the size of the sprite determine
- the performance drop?
- * You can test this by keeping everything the same, but changing the sprite
- size, and measuring performance
- Binary search
- ^^^^^^^^^^^^^
- Say you know that frames are taking much longer than they should, but you are
- not sure where the bottleneck lies. You could begin by commenting out
- approximately half the routines that occur on a normal frame. Has the
- performance improved more or less than expected?
- Once you know which of the two halves contains the bottleneck, you can then
- repeat this process, until you have pinned down the problematic area.
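The halving procedure described above can be sketched as follows. This is an illustration only: it assumes a single dominant bottleneck, and `measure_frame` is a hypothetical callback standing in for "run the game with only these routines enabled and time a frame". The routine names and costs are invented so the sketch is deterministic.

```python
def find_bottleneck(routines, measure_frame):
    """Narrow down the slowest routine by repeatedly keeping the slower half.

    routines: names of the routines normally run each frame.
    measure_frame: callable taking the list of enabled routines and
        returning the measured frame time (hypothetical helper).
    """
    candidates = list(routines)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half, second_half = candidates[:middle], candidates[middle:]
        # Keep whichever half accounts for more of the frame time.
        if measure_frame(first_half) >= measure_frame(second_half):
            candidates = first_half
        else:
            candidates = second_half
    return candidates[0]


# Simulated per-routine costs in milliseconds, invented for illustration.
SIMULATED_COST_MS = {
    "input": 0.2,
    "ai": 1.5,
    "physics": 2.0,
    "particles": 14.0,
    "audio": 0.3,
    "ui": 0.8,
}


def simulated_frame_time(enabled):
    return sum(SIMULATED_COST_MS[name] for name in enabled)


slowest = find_bottleneck(list(SIMULATED_COST_MS), simulated_frame_time)
print(slowest)  # particles
```

With six routines this needs only two rounds of measurement, which is the appeal of halving over testing each routine in turn.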
- Profilers
- =========
- Profilers allow you to time your program while running it. Profilers then
- provide results telling you what percentage of time was spent in different
- functions and areas, and how often functions were called.
- This can be very useful both to identify bottlenecks and to measure the results
- of your improvements. Sometimes attempts to improve performance can backfire and
- lead to slower performance, so always use profiling and timing to guide your
- efforts.
- For more information about using the profiler within Godot, see
- :ref:`doc_debugger_panel`.
- Principles
- ==========
- Donald Knuth:
- *Programmers waste enormous amounts of time thinking about, or worrying
- about, the speed of noncritical parts of their programs, and these attempts
- at efficiency actually have a strong negative impact when debugging and
- maintenance are considered. We should forget about small efficiencies, say
- about 97% of the time: premature optimization is the root of all evil. Yet
- we should not pass up our opportunities in that critical 3%.*
- The messages are very important:
- * Programmer / Developer time is limited. Instead of blindly trying to speed up
- all aspects of a program we should concentrate our efforts on the aspects that
- really matter.
- * Efforts at optimization often end up with code that is harder to read and
- debug than non-optimized code. It is in our interests to limit this to areas
- that will really benefit.
- Just because we `can` optimize a particular bit of code, it doesn't necessarily
- mean that we should. Knowing when to optimize, and when not to, is a great skill
- to develop.
- One misleading aspect of the quote is that people tend to focus on the subquote
- "premature optimization is the root of all evil". While `premature` optimization
- is (by definition) undesirable, performant software is the result of performant
- design.
- Performant design
- ~~~~~~~~~~~~~~~~~
- The danger with encouraging people to ignore optimization until necessary is
- that it overlooks the most important time to consider performance: the design
- stage, before a key has even hit a keyboard. If the design / algorithms of a
- program are inefficient, then no amount of polishing the details later will
- make it run fast. It may run `faster`, but it will never run as fast as a
- program designed for performance.
- This tends to be far more important in game / graphics programming than in
- general programming. A performant design, even without low level optimization,
- will often run many times faster than a mediocre design with low level
- optimization.
- Incremental design
- ~~~~~~~~~~~~~~~~~~
- Of course, in practice, unless you have prior knowledge, you are unlikely to
- come up with the best design first time. So you will often make a series of
- versions of a particular area of code, each taking a different approach to the
- problem, until you come to a satisfactory solution. It is important not to spend
- too much time on the details at this stage until you have finalized the overall
- design, otherwise much of your work will be thrown out.
- It is difficult to give general guidelines for performant design because this is
- so dependent on the problem. One point worth mentioning though, on the CPU
- side, is that modern CPUs are nearly always limited by memory bandwidth. This
- has led to a resurgence in data-oriented design, which involves designing data
- structures and algorithms for locality of data and linear access, rather than
- jumping around in memory.
- The optimization process
- ~~~~~~~~~~~~~~~~~~~~~~~~
- Assuming we have a reasonable design, and taking our lessons from Knuth, our
- first step in optimization should be to identify the biggest bottlenecks - the
- slowest functions, the low hanging fruit.
- Once we have successfully improved the speed of the slowest area, it may no
- longer be the bottleneck. So we should test / profile again, and find the next
- bottleneck on which to focus.
- The process is thus:
- 1. Profile / Identify bottleneck
- 2. Optimize bottleneck
- 3. Return to step 1
- Optimizing bottlenecks
- ~~~~~~~~~~~~~~~~~~~~~~
- Some profilers will even tell you which part of a function (which data accesses
- or calculations) is slowing things down.
- As with design, you should concentrate your efforts first on making sure the
- algorithms and data structures are the best they can be. Data access should be
- local (to make best use of CPU cache), and it can often be better to use compact
- storage of data (again, always profile to test results). You can often
- precalculate heavy computations ahead of time (e.g. at level load, or by
- loading precalculated data files).
- Once algorithms and data are good, you can often make small changes in routines
- that improve performance, such as moving calculations outside of loops.
- Always retest your timing / bottlenecks after making each change. Some changes
- will increase speed, others may have a negative effect. Sometimes a small
- positive effect will be outweighed by the negatives of more complex code, and
- you may choose to leave out that optimization.
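As one small example of moving calculations outside of loops, the sketch below rotates a list of 2D points. Both function names are hypothetical, and the example is written in Python purely for illustration. Whether the change pays off in a given project is something only a measurement can confirm.

```python
import math


def rotate_points_naive(points, angle_degrees):
    rotated = []
    for x, y in points:
        # The angle conversion and trig calls are repeated for every point,
        # even though their results never change inside the loop.
        radians = math.radians(angle_degrees)
        rotated.append((x * math.cos(radians) - y * math.sin(radians),
                        x * math.sin(radians) + y * math.cos(radians)))
    return rotated


def rotate_points_hoisted(points, angle_degrees):
    # Loop-invariant values are computed once, before the loop.
    radians = math.radians(angle_degrees)
    cos_a, sin_a = math.cos(radians), math.sin(radians)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in points]


sample = [(1.0, 0.0), (0.0, 2.0), (-3.0, 4.5)]
naive_result = rotate_points_naive(sample, 30.0)
hoisted_result = rotate_points_hoisted(sample, 30.0)
```

The two versions should return the same values, so the optimized one can be checked against the simple one before the simple one is retired.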
- Appendix
- ========
- Bottleneck math
- ~~~~~~~~~~~~~~~
- The proverb "a chain is only as strong as its weakest link" applies directly to
- performance optimization. If your project is spending 90% of the time in
- function 'A', then optimizing A can have a massive effect on performance.
- .. code-block:: none
- A: 9 ms
- Everything else: 1 ms
- Total frame time: 10 ms
- .. code-block:: none
- A: 1 ms
- Everything else: 1 ms
- Total frame time: 2 ms
- In this example, speeding up bottleneck A by a factor of 9 decreases overall
- frame time by a factor of 5, and so increases frames per second by a factor of 5.
- If however, something else is running slowly and also bottlenecking your
- project, then the same improvement can lead to less dramatic gains:
- .. code-block:: none
- A: 9 ms
- Everything else: 50 ms
- Total frame time: 59 ms
- .. code-block:: none
- A: 1 ms
- Everything else: 50 ms
- Total frame time: 51 ms
- So in this example, even though we have hugely optimized functionality A, the
- actual gain is quite small: frame time drops from 59 ms to 51 ms, a reduction of
- roughly 14%.
- In games, things become even more complicated because the CPU and GPU run
- independently of one another. Your total frame time is determined by the slower
- of the two.
- .. code-block:: none
- CPU: 9 ms
- GPU: 50 ms
- Total frame time: 50 ms
- .. code-block:: none
- CPU: 1 ms
- GPU: 50 ms
- Total frame time: 50 ms
- In this example, we optimized the CPU hugely again, but the frame time did not
- improve, because we are GPU-bottlenecked.
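The arithmetic behind the three examples in this appendix can be reproduced with a short sketch. The helper names are hypothetical. The model is deliberately simplified: it assumes work on one processor adds up, while the CPU and GPU overlap perfectly.

```python
def serial_frame_time(*stage_times_ms):
    """Stages that run one after another: their times add up."""
    return sum(stage_times_ms)


def parallel_frame_time(cpu_ms, gpu_ms):
    """CPU and GPU working independently: the slower one sets the frame time."""
    return max(cpu_ms, gpu_ms)


def speedup(before_ms, after_ms):
    """How many times faster a frame completes after the optimization."""
    return before_ms / after_ms


# A goes from 9 ms to 1 ms while everything else takes 1 ms.
print(speedup(serial_frame_time(9, 1), serial_frame_time(1, 1)))  # 5.0

# The same improvement to A while everything else takes 50 ms.
print(round(speedup(serial_frame_time(9, 50), serial_frame_time(1, 50)), 2))  # 1.16

# CPU goes from 9 ms to 1 ms while the GPU takes 50 ms.
print(speedup(parallel_frame_time(9, 50), parallel_frame_time(1, 50)))  # 1.0
```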