Pārlūkot izejas kodu

Fix inefficient upload in Mobile Shadows

Clustered performs the following shadow rendering steps

1. Process objects [0; 10) for cascade 0.
2. Process objects [10; 30) for cascade 1.
3. Process objects [30; 100) for cascade 2.
4. Upload objects [0; 100) to GPU.
5. Draw all cascades.

Mobile was supposed to be doing the same, but instead was doing:

1. Process objects [0; 10) for cascade 0.
2. Upload objects [0; 10) to GPU.
3. Process objects [10; 30) for cascade 1.
4. Upload objects [0; 30) to GPU.
5. Process objects [30; 100) for cascade 2.
6. Upload objects [0; 100) to GPU.
7. Draw all cascades.

That is, always reuploaded everything from scratch.
Therefore it pointlessly (and with geometric growth) wasted BW.
Matias N. Goldberg 6 mēneši atpakaļ
vecāks
revīzija
62c1a4782d

+ 2 - 1
servers/rendering/renderer_rd/forward_mobile/render_forward_mobile.cpp

@@ -1471,7 +1471,7 @@ void RenderForwardMobile::_render_shadow_append(RID p_framebuffer, const PagedAr
 	_fill_render_list(RENDER_LIST_SECONDARY, &render_data, pass_mode, true);
 	uint32_t render_list_size = render_list[RENDER_LIST_SECONDARY].elements.size() - render_list_from;
 	render_list[RENDER_LIST_SECONDARY].sort_by_key_range(render_list_from, render_list_size);
-	_fill_instance_data(RENDER_LIST_SECONDARY, render_list_from, render_list_size);
+	_fill_instance_data(RENDER_LIST_SECONDARY, render_list_from, render_list_size, false);
 
 	{
 		//regular forward for now
@@ -1498,6 +1498,7 @@ void RenderForwardMobile::_render_shadow_append(RID p_framebuffer, const PagedAr
 }
 
 void RenderForwardMobile::_render_shadow_process() {
+	_update_instance_data_buffer(RENDER_LIST_SECONDARY);
 	//render shadows one after the other, so this can be done un-barriered and the driver can optimize (as well as allow us to run compute at the same time)
 
 	for (uint32_t i = 0; i < scene_state.shadow_passes.size(); i++) {