Shader Series 5: Multipass Lighting Sample
Overview
The number and types of lighting effects that can be applied to a given piece of geometry are limited only by the need to maintain an interactive framerate. In this sample, shader programs are run multiple times on each piece of geometry, which makes it possible to apply hundreds of lights to a single mesh.
Additive, multipass lighting techniques are not unique to programmable shader hardware, but shaders add flexibility in batching for performance. This sample uses blend states to add the result of each draw to the back buffer, so that multiple draws can light the same piece of geometry.
Minimum Shader Profile
- Vertex Shader Model 2.0
- Pixel Shader Model 2.0
Sample Controls
This sample uses the following keyboard and gamepad controls.
| Action | Keyboard Control | Gamepad Control |
|---|---|---|
| Rotate the camera. | W, A, S, and D | Right thumbstick |
| Rotate the meshes. | UP ARROW, DOWN ARROW, LEFT ARROW, and RIGHT ARROW | Left thumbstick |
| Rotate the lights. | PAGE UP, PAGE DOWN | Left and right triggers |
| Add lights to the scene. | NUMPAD ADD | Right shoulder button |
| Add lights per pass. (Pixel Shader 3.0 only) | P | Up D-Pad |
| Remove lights from the scene. | NUMPAD SUBTRACT | Left shoulder button |
| Cycle the materials. | TAB | X |
| Generate random light properties. | SPACEBAR | Y |
| Exit the sample. | ESC or ALT+F4 | BACK |
Blending
Until now, the shader series has avoided the subject of alpha blending, because it has dealt only with opaque geometry. However, blend state is still useful in opaque rendering, since it is flexible enough to achieve a variety of effects.
The goal in this sample is to sum the light contribution of multiple draws while preserving depth. Because the depth buffer is preserved, the geometry does not need to be sorted: additive draws are discarded wherever they are occluded. The render states required for this are set in Example 1.4 in MultipassLighting.cs.
In Example 1.4, the destination and source blend states are both set to Blend.One, and the blend function is a simple summation (BlendFunction.Add). The result of each draw can be expressed as: (backbufferColor * 1) + (drawnPixelColor * 1). Each additive draw covers the same geometry as an earlier draw, so its pixels have exactly the same Z values as those already in the depth buffer. A strict less-than depth comparison would reject them, so the depth test is set to CompareFunction.LessEqual, which lets pixels with the same Z value as the depth buffer still be drawn.
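The blend arithmetic and depth test described above can be modeled on the CPU for a single pixel. The following is a minimal sketch in Python, not the sample's code: `Pixel`, `additive_draw`, and the `depth_write` flag are hypothetical names introduced for illustration, and the clamp to 1.0 assumes a render target that saturates.

```python
from dataclasses import dataclass


@dataclass
class Pixel:
    color: tuple   # (r, g, b), each in [0, 1]
    depth: float   # smaller values are closer to the camera


def additive_draw(dst, src_color, src_depth, depth_write=False):
    """Model one fragment drawn with source=One, destination=One, function=Add.

    The depth test is "less than or equal", so a fragment at exactly the
    stored depth still passes and adds its light contribution.
    """
    if not (src_depth <= dst.depth):
        return dst  # occluded: the fragment is discarded, nothing is added
    # (backbufferColor * 1) + (drawnPixelColor * 1), saturated at 1.0
    blended = tuple(min(1.0, d * 1.0 + s * 1.0)
                    for d, s in zip(dst.color, src_color))
    new_depth = src_depth if depth_write else dst.depth
    return Pixel(blended, new_depth)
```

Starting from a pixel that already holds an ambient color at depth 0.5, repeated calls with the same depth accumulate light, while a call with depth 0.8 leaves the pixel unchanged.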
Rendering Multiple Passes
After the blend states are properly set, there are a few more considerations for rendering. It usually does not make sense to render a scene-wide ambient pass more than once per piece of geometry in the scene, so a completely separate pass is rendered first. This has the additional benefit of pre-populating the depth buffer while making completely opaque draws.
Example 2.1 in Material.cs shows how depth buffer writes are enabled for the ambient pass, and how alpha blending is disabled. Subsequent draws that account for the point light contributions do not need to write to the depth buffer, because the ambient pass has already written the same values. Depth buffer reads and writes consume GPU bandwidth, so they should be avoided when they are not needed.
In this sample, the outer for loop within MultipassLighting.Draw iterates through the objects in the scene. For each object, a material batch is started, which sets up state shared by all subsequent passes on that object.
The inner loops iterate through one or more lights, setting their parameters on the material effect, and calling the Material.DrawModel method. This method commits the parameter changes to the effect and calls DrawIndexedPrimitives.
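The loop structure described above can be outlined as follows. This is a hedged sketch in Python rather than the sample's C# code: `draw_scene` and the recorded call names are hypothetical, the function only records which draws would be issued, and placing the ambient draw inside the per-object batch is an assumption. The `lights_per_pass` parameter stands in for shaders that handle more than one light per draw.

```python
def draw_scene(objects, lights, lights_per_pass=1):
    """Return the sequence of draws a multipass renderer would issue."""
    calls = []
    for obj in objects:                      # outer loop: objects in the scene
        calls.append(("begin_batch", obj))   # state shared by all passes on obj
        # Opaque ambient pass: blending off, depth writes on.
        calls.append(("draw_ambient", obj))
        # Additive light passes: blending on, depth writes off.
        for start in range(0, len(lights), lights_per_pass):
            group = tuple(lights[start:start + lights_per_pass])
            calls.append(("draw_lights", obj, group))
        calls.append(("end_batch", obj))
    return calls
```

With three lights and two lights per pass, each object receives one ambient draw followed by two light draws, the second of which carries only the remaining light.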
Multiple Passes and Batching
Calculating the effects of hundreds of lights in a single scene puts a tremendous amount of pressure on the GPU's pixel stages. To improve framerates, it often becomes important to organize draws into "batches" of calls that share state. This generally reduces CPU overhead, because GPU commands are sent less frequently than when new GPU state is flushed for every draw. It also benefits the GPU by improving cache performance and reducing idle time.
In this sample, batching is used primarily to reduce the number of Texture, VertexBuffer, RenderState, and EffectParameter sets. Example 2.2 shows that some per-instance data must still be set for every call; in this case, the immediate parameters of the lights being drawn.
Performance analysis of the GPU can be highly complex. Without powerful GPU profiling tools, the next best approach is usually to apply common sense in reducing state changes and to measure the results of each change. There is no single best way to design a material lighting system, but some level of batching is common practice in commercial game engines.
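To make the saving concrete, the number of state sets can be counted for a batched and an unbatched draw order. The sketch below is a simple counting model in Python, not a measurement of the sample: `count_state_sets` is a hypothetical helper, it assumes three pieces of shared state per object and one parameter set per light draw, and real costs per set vary by state type and hardware.

```python
# Assumed shared state per object; the names mirror the prose, not an API.
SHARED_STATE = ("texture", "vertex_buffer", "render_state")


def count_state_sets(num_objects, num_lights, batched):
    """Count state sets issued for one light per draw on every object."""
    sets = 0
    for _ in range(num_objects):
        if batched:
            sets += len(SHARED_STATE)      # set once when the batch begins
        for _ in range(num_lights):
            if not batched:
                sets += len(SHARED_STATE)  # re-set before every single draw
            sets += 1                      # per-draw light parameters
    return sets
```

Under these assumptions, 4 objects lit by 100 lights need 1600 sets without batching and 412 with it; only the per-light parameters remain per draw.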
Extending the Sample
The following are some ideas for extending the sample.
- The amount of pixel fill could be drastically reduced by performing basic distance-based range checks between lights and geometry. When the scene becomes heavily fill-rate limited, spending CPU time to skip draws for lights that cannot reach an object can improve overall framerate.
- A simple optimization would be to render the ambient passes for all of the geometry before doing any point light passes. This pre-populates the depth buffer so overdrawn pixels are discarded earlier, sometimes resulting in better performance.
- The shaders are far from optimal due to the need for clarity in the examples. They could be optimized to further increase the number of lights possible in the scene.
- Try combining multipass lighting with more complex shading algorithms, such as normal-mapped surfaces or ambient occlusion.
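As a starting point for the first idea in the list, a distance-based range check might look like the following. This is a minimal sketch in Python under stated assumptions: `lights_in_range` is a hypothetical helper, each light is represented as a (position, range) pair, and each object is approximated by a bounding sphere with a center and radius.

```python
import math


def lights_in_range(lights, center, radius):
    """Keep only lights whose range overlaps the object's bounding sphere.

    lights: iterable of (position, light_range), position as (x, y, z)
    center, radius: bounding sphere of the object being drawn
    """
    kept = []
    for position, light_range in lights:
        # The light can reach the object only if the two spheres overlap.
        if math.dist(position, center) <= light_range + radius:
            kept.append((position, light_range))
    return kept
```

Only the lights returned for an object would then be passed to its additive light passes, so distant lights cost no draws or pixel fill for that object.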