Notes.txt 17 KB

  1. Reminders:
  2. - When I switch to sRGB writes, make sure GUI textures are handled properly. Right now they are read in gamma space and displayed
  3. as-is, but once sRGB write is enabled gamma would be applied twice to those textures.
  4. - Async callbacks. I'd like to be able to assign a callback to an async method, that will execute on the calling thread once the async operation is complete.
  5. - For example when setting PixelData for a cursor I need to get PixelData from a texture, which is an async operation, in which case I need to block the calling thread
  6. until I get the result. But I'd rather apply the result once render thread is finished.
  7. - GUI currently doesn't batch elements belonging to different GUIWidgets because each of them has its own transform. Implement some form of instancing for DX11 and GL
  8. so this isn't required. GUIManager already has the ability to properly group meshes, all that is needed is a shader.
  9. - When specifying GUIElement layout using (for example) GUILayoutOptions::expandableX, it would be useful if I didn't have to provide the height and could instead use
  10. the default value for that GUIElement type. For most elements I will only be changing width, and height will remain default (labels, buttons, toggles, drop down lists, etc.),
  11. so this would save me from looking up the actual element height in EngineGUI.
  12. - GUI currently ignores tooltips in GUIContent.
  13. - A way to initialize BansheeEngine without RenderSystem or any kind of UI. So that it may be used for server building as well.
  14. - Calls to "initialize()" should be protected. For example GUIWidget and all its derived classes require the user to separately call initialize() after construction.
  15. However a much better option would be to wrap construction and initialization in a create() method and make initialize protected.
  16. - GUIWidget needs to be added using addComponent which makes "create()" method not practical. However I have trouble seeing the need for initialize(),
  17. it could be replaced with a variable parameter version of addComponent.
  18. - Profiling: Create an easy to browse list of all loaded Resources (similar to Hierarchy in Unity, just for loaded resources)
  19. - Possibly also for all core objects
  20. - GUI ignores image in GUIContent for most elements
  21. - Each view (i.e. camera) of the scene should be put into its own thread
  22. - How do I handle multiple mesh formats? Some files need animation, others don't. Some would maybe like to use QTangent, others the proper tangent frame.
  23. - Asset postprocessor? Imports a regular mesh using normal importers and then postprocesses it into a specialized format?
  24. - Load texture mips separately so we can unload HQ textures from far away objects (like UE3)
  25. - Add Unified shader so I can easily switch between HLSL and GLSL shaders (they need same parameters usually, just different code)
  26. - UE4 has GLSL/HLSL shader cross compiler, so something similar
  27. - Remove HardwarePixelBuffer (DX11 doesn't use it, and DX9 and OpenGL textures can be rewritten so they have its methods internally)
  28. - Make sure my Log system uses XML + HTML
  29. - There is an issue that custom UIs won't have their meshes shared. For example most game UIs will be advanced and will
  30. likely use one GUIWidget per element. However currently I only perform batching within a single widget, which
  31. doesn't help in the mentioned case.
  32. - Input: Allow combinations like A+B+X on joystick to be a virtual key
  33. - Add a field that tracks % of resource deserialization in BinarySerializer
  34. - Add GL Texture buffers (They're equivalent to DX11 buffers) - http://www.opengl.org/wiki/Buffer_Texture
  35. - I should consider creating two special Mesh types:
  36. StreamMesh - constantly updated by CPU and read by GPU
  37. ReadMesh - written by GPU and easily read by CPU
  38. - OpenGL especially has no good way of reading or streaming data. It has special STREAM and COPY buffer types which I never use.
  39. - (EXTREMELY LOW PRIORITY) Scripting: It might be good to make Mono classes more generic and move them to BansheeEngine.
  40. e.g. MonoClass -> ScriptClass, where ScriptClass is just an abstract interface. Then I don't expose any Mono stuff to actually script libraries like
  41. SBansheeEngine. User could then fairly easily port the system to another scripting language just by implementing another ScriptSystem.
  42. - This would probably come with an overhead of at least one extra function call for each script call, which is currently unacceptable
  43. considering that most people definitely won't be writing new script systems.
  44. - Perhaps add code generation functionality to the engine through Mono? I know Mono has support for it through its Embed interface
  45. - Add instancing to engine: All I really need to add are per-instance attributes in VertexData (and MeshData). And then RenderSystem::renderInstance method that also accepts an instance count.
  46. - Debug console
  47. - Add ability to add color tags like <color=#123>
  48. - When showing a debug message, also provide a (clickable?) reference to Component it was triggered on (if applicable)
  49. - It really helps when you get an error on a Component that hundreds of SceneObjects use
  50. - When displaying an error with a callstack, make each line of the callstack clickable where it opens the external editor
  51. - std::function allocates memory but I have no good way of using custom allocators, as I'd have to wrap std::bind and that seems non-trivial
  52. - Add a TaskScheduler profiler that neatly shows time slices of each task and on which thread they are run on
  53. - Add support for BC6H and BC7 file format compression
  54. - D-Pad on non-XInput devices will not be handled properly by the Input system because OIS reports it separately from other buttons (Simple to fix)
  55. - Possible improvement: I can only update entire Mesh at once with writeSubresource
  56. - Possible improvement: I keep bounds for the entire mesh and not per-submesh
  57. - Possible improvement: I don't serialize Mesh bounds and they are recalculated whenever a mesh is loaded
  58. - Add better mip map and compression options to the texture importer. Right now I'm ignoring a lot of the options even though I support them.
  59. - Add separable pass sorting to RenderQueue
  60. - DDS file import
  61. - Make hierarchical documentation. Organize stuff based on type. Once I actually generate the documentation add Doxygen grouping tags (or whatever they're called)
  62. - Make a Getting Started guide, along with the example project. Or just finish up the manual.
  63. - I'm not too happy with HStrings using so many events. They use one internally which I think I can replace completely quite easily. And one externally for notifying GUI components, replacing
  64. which would require more thought.
  65. - GUIElementBase::_getElementAreas is currently only implemented for layouts. It will work for child elements of layouts and for layouts themselves, but will not
  66. work for child elements of custom element types. This method is used for calculating size and position of an element in its parent.
  67. - I shouldn't use WeakRef with GameObjects. They need to be deserialized in the order of their hierarchy and a weak ref can break that.
  68. - Make sure to add fixedUpdate() to run your game logic in. This should have an adjustable update rate. See: http://gameprogrammingpatterns.com/game-loop.html
  69. - Will also need GameObject::destroy and GameObject::destroyImmediate. So I can remove GameObjects that might still be referenced that same frame (destroy() would just queue for destruction)
  70. - When unloading unused resources I should do it recursively so it unloads any new ones that might have been released during the first unload.
  71. (e.g. unloading a sprite texture will probably release its atlas texture as well)
  72. - Make importer multithreaded. I can add a flag to SpecificImporter that tells me if a certain importer supports multiple threads or not, but even if it doesn't I can run
  73. different importers on different threads this way. And once I hook up progress dialog box, perhaps make it have multiple progress bars in a single window (per thread).
  74. - Add FolderManager, an extensible C# script that can be attached to a folder (can be in a folder .meta file). Handles processing of assets in that folder. Can get notified
  75. whenever anything in the folder changes. But is that necessary if I can have asset post- and pre-processors that can filter by folder programmatically?
  76. - Add option to optimize mesh for post-transform cache (triangles sharing vertices are sequential) on import (and for pre-transform cache as well: order vertices by their first reference in IB, and so on)
  77. - It would be nice to have automated creation of glue code between C# and C++ instead of writing all of these Script classes
  78. - Think about removing C# GUIPanel. If I replace current EditorWidget's GUIArea with GUIWidget then panels aren't needed as those widgets can just
  79. use areas directly. This means that the EditorWidgetContainer will also have a GUIWidget but it will be separate.
  80. - When doing this also consider refactoring how GUIArea anchoring works (currently it's limited and has an ugly interface)
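Rough sketch for the fixedUpdate() reminder above, using the accumulator pattern from the linked game-loop article. FixedStepper, advance(), setStepSeconds() and the per-frame step cap are hypothetical names and choices made up for this sketch, not existing engine API.

```cpp
#include <functional>

// Hypothetical helper: accumulates frame time and runs the fixed-rate
// callback zero or more times per rendered frame.
class FixedStepper
{
public:
    FixedStepper(double stepSeconds, int maxStepsPerFrame)
        : mStep(stepSeconds), mMaxSteps(maxStepsPerFrame) {}

    // Adjustable update rate, as asked for in the reminder.
    void setStepSeconds(double stepSeconds) { mStep = stepSeconds; }

    // Call once per frame with the elapsed time of that frame.
    // Returns how many fixed updates were executed.
    int advance(double frameSeconds, const std::function<void(double)>& fixedUpdate)
    {
        mAccumulator += frameSeconds;

        int steps = 0;
        while (mAccumulator >= mStep && steps < mMaxSteps)
        {
            fixedUpdate(mStep);
            mAccumulator -= mStep;
            ++steps;
        }

        // If the cap was hit, drop the remaining backlog so a slow frame
        // cannot snowball into ever more fixed updates.
        if (steps == mMaxSteps && mAccumulator >= mStep)
            mAccumulator = 0.0;

        return steps;
    }

private:
    double mStep;
    int mMaxSteps;
    double mAccumulator = 0.0;
};
```

Game logic would go into the callback passed to advance(); rendering keeps running once per frame regardless of how many fixed steps were taken.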
  81. Potential optimizations:
  82. - bulkPixelConversion is EXTREMELY poorly optimized. For each pixel it calls a separate method that repeats redundant operations.
  83. - UI shader resolution params for gui should be in a separate constant buffer
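One possible direction for the bulkPixelConversion note above: pick the conversion routine once per call and then run a tight loop per row, instead of dispatching per pixel. The two formats, the RowConverter type and all function names below are assumptions made for this sketch; the real bulkPixelConversion handles many more formats.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical subset of formats, both 4 bytes per pixel.
enum class PixelFormat { RGBA8, BGRA8 };

using RowConverter = void (*)(const std::uint8_t* src, std::uint8_t* dst, std::size_t pixelCount);

// Same layout on both sides: plain byte copy.
static void copyRow(const std::uint8_t* src, std::uint8_t* dst, std::size_t pixelCount)
{
    for (std::size_t i = 0; i < pixelCount * 4; ++i)
        dst[i] = src[i];
}

// RGBA8 <-> BGRA8: swap the first and third channel of every pixel.
static void swapRedBlueRow(const std::uint8_t* src, std::uint8_t* dst, std::size_t pixelCount)
{
    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        const std::uint8_t* s = src + i * 4;
        std::uint8_t* d = dst + i * 4;
        d[0] = s[2];
        d[1] = s[1];
        d[2] = s[0];
        d[3] = s[3];
    }
}

// The format check happens once here, not once per pixel.
RowConverter selectConverter(PixelFormat from, PixelFormat to)
{
    return from == to ? copyRow : swapRedBlueRow;
}

void convertPixels(const std::uint8_t* src, PixelFormat srcFormat,
                   std::uint8_t* dst, PixelFormat dstFormat,
                   std::size_t width, std::size_t height)
{
    const RowConverter convert = selectConverter(srcFormat, dstFormat);
    const std::size_t rowBytes = width * 4;

    for (std::size_t y = 0; y < height; ++y)
        convert(src + y * rowBytes, dst + y * rowBytes, width);
}
```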
  84. ----------------------------------------------------------------------------------------------
  85. More detailed, thought-out system descriptions:
  86. <<<<Memory allocation critical areas>>>>
  87. - Binding gpu params. It gets copied in DeferredRenderContext
  88. - GameObjectHandle often allocates its internal data
  89. - ResourceHandle often allocates its internal data
  90. - AsyncOp allocates AsyncOpData internally
  91. - Deserialization, a lot of temporary allocations going on - But how much impact on performance will allocations have considering this is probably limited by disk read?
  92. - Creating SceneObjects and Components - I might want to pool them, as I suspect user might alloc many per frame
  93. - Log logMsg
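Sketch of the pooling idea for SceneObjects and Components mentioned above. ObjectPool and its methods are hypothetical; this version still does one heap allocation the first time a slot is needed and only avoids allocations on reuse, so a real pool would probably allocate slots in larger chunks.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical free-list pool: destroyed objects leave their storage behind
// so the next construct() can reuse it without touching the heap.
template <class T>
class ObjectPool
{
public:
    template <class... Args>
    T* construct(Args&&... args)
    {
        void* slot = nullptr;
        if (!mFree.empty())
        {
            slot = mFree.back();
            mFree.pop_back();
        }
        else
        {
            mSlots.push_back(std::make_unique<Slot>());
            slot = mSlots.back().get();
        }

        // Placement new: build the object inside the pooled storage.
        return new (slot) T(std::forward<Args>(args)...);
    }

    void destroy(T* object)
    {
        object->~T();
        mFree.push_back(object);
    }

    std::size_t allocatedSlots() const { return mSlots.size(); }
    std::size_t freeSlots() const { return mFree.size(); }

private:
    struct Slot
    {
        alignas(T) unsigned char bytes[sizeof(T)];
    };

    std::vector<std::unique_ptr<Slot>> mSlots; // owns all storage
    std::vector<void*> mFree;                  // storage ready for reuse
};
```

All objects must be destroyed through destroy() before the pool itself goes away, since the pool only frees raw storage and never runs destructors on its own.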
  94. <<<<Multithreaded GUI rendering>>>>
  95. - Event handling and normal "update" will still be done on the main thread
  96. - At the beginning of each frame a GUI mesh update is queued on the GUI thread
  97. - Since we're queuing the update at the beginning of the frame we will be using last frame's transform and GUI element states.
  98. - When queuing we need to make sure to store GUIWidget transform, and specific element states (e.g. "text" in GUILabel)
  99. - At the end of simulation frame wait until GUI update is complete. After both simulation and GUI updates are complete, proceed with submitting it to render system.
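A minimal sketch of the flow described above, using std::async as a stand-in for a dedicated GUI thread. GUIElementSnapshot, GUIMeshData, buildGUIMesh and GUIFrameUpdate are hypothetical names for this sketch only; the point is that begin() copies last frame's state, so the main thread can keep changing elements while the mesh is built.

```cpp
#include <future>
#include <string>
#include <vector>

// Hypothetical copy of the per-element state needed to build the mesh.
struct GUIElementSnapshot
{
    float x;
    float y;
    std::string text;
};

struct GUIMeshData
{
    std::vector<float> vertices;
};

// Stand-in for the real mesh build: emits one (x, y) pair per element.
GUIMeshData buildGUIMesh(std::vector<GUIElementSnapshot> snapshot)
{
    GUIMeshData mesh;
    for (const GUIElementSnapshot& element : snapshot)
    {
        mesh.vertices.push_back(element.x);
        mesh.vertices.push_back(element.y);
    }
    return mesh;
}

class GUIFrameUpdate
{
public:
    // Beginning of frame: std::async copies the state before the worker
    // starts, so later edits on the main thread cannot race with the build.
    void begin(const std::vector<GUIElementSnapshot>& lastFrameState)
    {
        mPending = std::async(std::launch::async, buildGUIMesh, lastFrameState);
    }

    // End of simulation frame: block until the GUI update is complete,
    // then hand the result over for submission to the render system.
    GUIMeshData finish() { return mPending.get(); }

private:
    std::future<GUIMeshData> mPending;
};
```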
  100. <<<<Figure out how to store texture references in a font>>>>
  101. - Currently I store a copy of the textures but how do I automatically update the font if they change?
  102. - Flesh out the dependencies system?
  103. - I can import texture as normal, and keep it as an actual TextureHandle, only keep it hidden
  104. if it was created automatically (by FontImporter) for example?
  105. - But then who deletes the texture?
  106. - Set up an "internalResource" system where resources hold references to each other and also release them?
  107. - In inspector they can be expanded as children of the main resource, but cannot be directly modified?
  108. - Deleting the main resource deletes the children too
  109. <<<<Reducing render state changes>>>>
  110. - Transparent objects get sorted back to front, always
  111. - Opaque objects I can choose between front to back, no sort or back to front
  112. - Then sort based on material-pass combo, rendering all passes of the same material at once, then moving to next pass, then to next material, etc.
  113. - For translucent objects I need to render the entire material at once, and not group by pass
  114. - Ignore individual state and texture changes, just sort based on material
  115. - Use key-based approach as described here: http://realtimecollisiondetection.net/blog/?p=86
  116. Questions/Notes:
  117. 1. Could I make use of multiple texture slots so I don't have to re-assign textures for every material when rendering translucent objects pass by pass?
  118. - When sorting back to front (or front to back) it's highly unlikely that there will be many objects sharing the same material at nearly the same depth level anyway.
  119. So probably ignore this problem for now, and just change the states.
  120. 2. Should sorting be done on main or render thread?
  121. - Main thread. It's highly unlikely main thread will be using as much CPU as render thread will, so free up render thread as much as possible.
  122. 3. Should oct-tree queries be done on main or render thread?
  123. - Main thread as I would need to save and copy the state of the entire scene, in order to pass it to the render thread. Otherwise we risk race conditions.
  124. 4. Since render state and shader changes are much more expensive than shader constant/buffer/mesh (and even texture) changes, it might be a good idea to sort based on these,
  125. instead of exact material? A lot of materials might share shaders and render states but not textures.
  126. 5. This guy: http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BRenderstate%20change%20costs%5D%5D (who's a driver programmer) sorts all opaque objects based on shader/state
  127. into buckets, and then orders elements in these buckets in front to back order. This gives him the best of both worlds, early z rejection and low state changes.
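One way the key-based approach could look, combining the points above: translucent objects always back to front, opaque objects bucketed by shader/state and then front to back inside each bucket. The RenderItem fields, the bit layout and the assumption that depth is already normalized to [0, 1] are all made up for this sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical per-object data used only for sorting.
struct RenderItem
{
    bool translucent;
    std::uint16_t shaderStateId; // objects sharing shader + render states
    std::uint16_t materialId;
    float viewDepth;             // assumed normalized: 0 = near, 1 = far
};

// Assumed key layout (most significant bits first):
//   opaque:      [0][shader/state 16][material 16][depth 24]
//   translucent: [1][inverted depth 24][material 16]
std::uint64_t makeSortKey(const RenderItem& item)
{
    const double clamped = std::min(1.0, std::max(0.0, static_cast<double>(item.viewDepth)));
    const std::uint64_t depth = static_cast<std::uint64_t>(clamped * 16777215.0); // 24 bits

    if (item.translucent)
    {
        // Inverting depth makes far objects sort first (back to front).
        const std::uint64_t backToFront = 0xFFFFFFull - depth;
        return (1ull << 63) | (backToFront << 16) | item.materialId;
    }

    // Opaque: shader/state bucket first, then material, then front to back.
    return (static_cast<std::uint64_t>(item.shaderStateId) << 40)
         | (static_cast<std::uint64_t>(item.materialId) << 24)
         | depth;
}

void sortForRendering(std::vector<RenderItem>& items)
{
    std::sort(items.begin(), items.end(),
        [](const RenderItem& a, const RenderItem& b)
        {
            return makeSortKey(a) < makeSortKey(b);
        });
}
```

Because the translucent flag is the top bit, all opaque items end up before all translucent ones after a single sort.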
  128. <<<<DirectDraw>>>>
  129. - Used for quickly drawing something, usually for debug and editor purposes.
  130. - It consists of methods like: DrawLine, DrawPolygon, DrawCube, DrawSphere, etc.
  131. - It may also contain other fancier methods like DrawWireframeMesh, DrawWorldGrid etc.
  132. - Commands get queued from various Component::update methods and get executed at the end of frame. After they're executed they are cleared and need to be re-queued next frame.
  133. - Internally DirectDraw manages dynamic meshes so it can merge multiple DrawLine calls into one and such. This can help performance, but generally performance of this class should not be a major concern.
  134. - Example uses for it:
  135. - Drawing GUI element bounds when debugging GUI
  136. - Drawing a wireframe selection effect when a mesh is selected in the scene
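Sketch of the queue-then-flush behaviour described above. DebugDrawQueue, Vec3 and the submit callback are hypothetical; all lines queued during a frame land in one shared vertex list (the "merge multiple DrawLine calls" part), and flush() empties it so commands have to be re-queued next frame.

```cpp
#include <cstddef>
#include <vector>

struct Vec3
{
    float x;
    float y;
    float z;
};

class DebugDrawQueue
{
public:
    // Queues one line segment (two vertices) for this frame.
    void drawLine(const Vec3& a, const Vec3& b)
    {
        mLineVertices.push_back(a);
        mLineVertices.push_back(b);
    }

    // Queues the 12 edges of an axis-aligned box as lines.
    void drawWireCube(const Vec3& min, const Vec3& max)
    {
        Vec3 corners[8];
        for (int i = 0; i < 8; ++i)
        {
            corners[i].x = (i & 1) ? max.x : min.x;
            corners[i].y = (i & 2) ? max.y : min.y;
            corners[i].z = (i & 4) ? max.z : min.z;
        }

        // Connect every corner to the neighbours that differ in one axis.
        for (int i = 0; i < 8; ++i)
            for (int bit = 1; bit <= 4; bit <<= 1)
                if ((i & bit) == 0)
                    drawLine(corners[i], corners[i | bit]);
    }

    // End of frame: submit everything as a single line list, then clear.
    template <class SubmitFn>
    void flush(SubmitFn&& submit)
    {
        if (!mLineVertices.empty())
            submit(mLineVertices);
        mLineVertices.clear();
    }

    std::size_t pendingLineCount() const { return mLineVertices.size() / 2; }

private:
    std::vector<Vec3> mLineVertices;
};
```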
  137. <<<<RenderSystem needed modifications>>>>
  138. - Texture resource views (Specifying just a subresource of a texture as a shader parameter)
  139. - UAV for textures
  140. - Stream out (write vertex buffers) (DX11 and GL)
  141. - Texture buffers
  142. - Just add a special texture type? OpenGL doesn't support getting offset from within a texture buffer anyway
  143. - Tessellation (hull/domain) shader
  144. - Detachable and readable depthstencil buffer (Window buffers not required as they behave a bit differently in OpenGL)
  145. - OpenGL provides image load/store which seems to be GL UAV equivalent (http://www.opengl.org/wiki/Image_Load_Store)
  146. - Resolving MSAA textures (i.e. copying them to non-MSAA so they can be displayed on-screen). DX has ResolveSubresource, and OpenGL might have something similar.
  147. - Single and dual channel textures (especially render textures, which are very important for effects like SSAO)
  148. - Compute pipeline
  149. - Instancing (DrawInstanced) (DX11 and GL)
  150. - OpenGL append/consume buffers
  151. - Indirect drawing via indirect argument buffers
  152. - Texture arrays
  153. - Rendertargets that aren't just 2D (Volumetric (3D) render targets in particular)
  154. - Shader support for doubles
  155. - Dynamic shader linkage (Interfaces and similar)
  156. - Multisampled texture resources
  157. - Multiple adapters (multi gpu)
  158. - Passing initial data when creating a resource (DX11, but possibly GL too)
  159. - Sample mask when setting blend state (DX11, check if equivalent exists in GL)
  160. - RGBA blend factor when setting blend state(DX11, check if equivalent exists in GL)
  161. - HLSL9/HLSL11/GLSL/Cg shaders need preprocessor defines & includes
  162. - DirectX11 supports concurrent drawing and resource creation so all my resource updates should be direct calls to DX methods (I'll need a deferred context?)
  163. - One camera -> one task (thread) approach for multithreading
  164. - Also make sure to run off a thread pool (WorkQueue class already exists that provides needed interface)
  165. - The way I handle rendering currently is to discard simulation results if gpu thread isn't finished.
  166. - This reduces input lag but in the worst case the effect of multithreading might be completely eliminated as
  167. the GPU ends up waiting for the CPU, just because it was a few milliseconds late. Maybe better to wait for GPU?
  168. <<<<Localization notes for MUCH LATER>>>>
  169. - It would be nice if HString identifier hash was being generated at compile time
  170. - I still need an easy way to edit the string table (Editor, importer or similar)
  171. - I might need font localization for non-standard character sets (e.g. Russian, Greek, Asian, etc.)
  172. - I probably don't want to use one huge set of textures containing both Latin and Asian characters but want to keep them separate
  173. - Also Asian sets might be too large for textures, in which case generating them at runtime might be necessary (or parsing string table and
  174. generating textures from only used characters)
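For the compile-time HString identifier hash mentioned above, a constexpr function is one option. The sketch below uses 32-bit FNV-1a purely as an example; which hash HString actually uses is not stated in these notes, and hashStringId is a hypothetical name.

```cpp
#include <cstdint>

// 32-bit FNV-1a, written recursively so it also works as C++11 constexpr.
// Assumption: any hash would do here, FNV-1a is just short to write.
constexpr std::uint32_t hashStringId(const char* str, std::uint32_t hash = 2166136261u)
{
    return *str == '\0'
        ? hash
        : hashStringId(str + 1,
              (hash ^ static_cast<std::uint32_t>(static_cast<unsigned char>(*str))) * 16777619u);
}

// Evaluated by the compiler: no hashing cost at runtime for literal identifiers.
static_assert(hashStringId("") == 2166136261u, "empty string hashes to the FNV offset basis");
static_assert(hashStringId("a") == 0xE40C292Cu, "known FNV-1a value for \"a\"");
```

A string table lookup could then take the precomputed std::uint32_t instead of hashing the identifier on every access.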