Why shader switching is so costly (notes on a forum thread)
Published: 2019-06-26


Assume two shaders (that is, programs in OpenGL terms) are both fully compiled and have been assigned ids and so on; they simply have not been linked yet.

Source thread: http://www.opengpu.org/bbs/forum.php?mod=viewthread&tid=1300

 

Possible costs:

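1. Whether a shader writes z or uses alpha test determines whether hierarchical Z is enabled. When a switch between shaders that differ in this respect occurs, the existing z buffer has to be resolved first;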

2. Shaders that use the texture sampler slots differently may lead to different texture cache layouts, which often means dirtying the entire texture cache; (CPU API layer)

This can include updating and filling various shader resources, such as Constant Buffer data (the counterpart of a Uniform Buffer in OpenGL).

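(If the shader being switched to has been linked before, that is, all of its texture uniforms already have OpenGL ids, and the texture data has not changed, a check is made on whether the 16 textures held in the render context include the textures the current shader needs; if not, bind texture is called once. For non-texture uniforms whose uniform location has already been obtained, the actual value is re-uploaded only if it has changed.)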

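3. Some longer shaders have instructions that do not fit on chip, so they must be loaded from external memory on a switch; this cost is relatively small; (GPU layer)
The cost of a shader switch also includes binding the shader to its inputs/outputs and the associated error checking, which happens on the GPU.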
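
The redundancy check described in the note under item 2 can be sketched as a small state cache. This is only a toy model under stated assumptions: it makes no real OpenGL calls, the names `StateCache`, `bind_texture`, `set_uniform` and `use_shader` are hypothetical, and it simplifies the check to "is this texture already bound to this particular unit" rather than searching all 16 slots.

```python
class StateCache:
    """Toy model of a render context that skips redundant binds and uploads.

    Hypothetical illustration only: the counters stand in for calls such as
    glBindTexture and glUniform*, which are NOT actually issued here.
    """

    NUM_TEXTURE_UNITS = 16  # the note above mentions 16 textures per context

    def __init__(self):
        self.bound_textures = [None] * self.NUM_TEXTURE_UNITS
        self.uniform_values = {}  # (program_id, location) -> last uploaded value
        self.bind_calls = 0       # simulated texture binds
        self.upload_calls = 0     # simulated uniform uploads

    def bind_texture(self, unit, texture_id):
        """Bind only if the unit does not already hold this texture."""
        if self.bound_textures[unit] == texture_id:
            return False
        self.bound_textures[unit] = texture_id
        self.bind_calls += 1
        return True

    def set_uniform(self, program_id, location, value):
        """Upload only if the value differs from the last one uploaded."""
        key = (program_id, location)
        if key in self.uniform_values and self.uniform_values[key] == value:
            return False
        self.uniform_values[key] = value
        self.upload_calls += 1
        return True

    def use_shader(self, program_id, textures, uniforms):
        """Switch to a program: textures maps unit -> id, uniforms maps location -> value."""
        for unit, texture_id in textures.items():
            self.bind_texture(unit, texture_id)
        for location, value in uniforms.items():
            self.set_uniform(program_id, location, value)


if __name__ == "__main__":
    cache = StateCache()
    cache.use_shader(1, {0: 10, 1: 11}, {0: 1.0})
    cache.use_shader(2, {0: 10, 1: 12}, {0: 0.5})
    cache.use_shader(1, {0: 10, 1: 11}, {0: 1.0})  # uniform for program 1 is unchanged
    print(cache.bind_calls, cache.upload_calls)
```

Uniform values are keyed by program as well as location because, in OpenGL, default-block uniforms are per-program state; the texture units, by contrast, are shared by the whole context.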

 

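As I recall, the D3D runtime builds a command buffer and then sends it to the driver in batches; this involves a switch from user mode to kernel mode, so the cost is significant. The driver then processes the command buffer and feeds it to the graphics card step by step whenever the card is idle. The device state exposed by the API is maintained by the D3D runtime, which is why the pure device option exists: it lets the D3D runtime give up state tracking in exchange for higher performance. Redundant state changes are not always filtered out either, because the check itself has a cost and is often wasted effort. D3D leaves most decisions about redundant-state elimination to the driver, so that it can make the best choice for the characteristics of the card. (This paragraph is unverified and needs checking against the Windows Display Driver Model.)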
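
To make the batching argument concrete, here is a minimal sketch of a user-mode command buffer that hands commands to a simulated driver in batches. It is not the D3D runtime's actual mechanism: `CommandBuffer`, `capacity` and `kernel_transitions` are hypothetical names, and the "mode switch" is just a counter.

```python
class CommandBuffer:
    """Toy user-mode command buffer that submits commands in batches.

    Hypothetical illustration: each flush stands in for one
    user-mode to kernel-mode transition; nothing is sent to a real driver.
    """

    def __init__(self, capacity):
        self.capacity = capacity      # commands recorded before an automatic flush
        self.pending = []             # recorded but not yet submitted
        self.submitted = []           # what the simulated driver has received
        self.kernel_transitions = 0   # simulated mode switches

    def record(self, command):
        """Record one command; flush automatically when the buffer is full."""
        self.pending.append(command)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        """Submit everything pending in a single simulated transition."""
        if not self.pending:
            return
        self.kernel_transitions += 1
        self.submitted.extend(self.pending)
        self.pending = []


if __name__ == "__main__":
    for capacity in (1, 4):
        buf = CommandBuffer(capacity)
        for i in range(10):
            buf.record(("draw", i))
        buf.flush()  # submit whatever is left over
        print(capacity, buf.kernel_transitions)
```

With a capacity of 1 every command pays for its own transition, while a larger capacity amortizes that cost over the whole batch, which is the trade-off the paragraph above describes.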

 

"Why are cards so bad at shader changes? It's because many of them can only be running one set of shaders and shader constants at a time. Remember that video cards are not like a CPU - they are a very long pipeline - it may take hundreds of thousands of clock cycles for a triangle to get from being given to the card to being rendered. GPUs are so fast because they can have huge numbers of triangles and pixels being processed at once, so although they have very poor "latency" (the time taken from an input to go all the way through to the end and finish processing), they have extremely impressive "throughput" (how many things you can finish processing in a second).

So when you change shader, many cards have to wait for their entire pipeline to drain fully empty, then upload the new shader or set of shader constants into a completely idle pipeline, and then start feeding work into the start of the pipeline again. This draining is often called a pipeline "bubble" (though it's a slightly inaccurate term from the point of view of a hardware engineer), and in general you want to avoid them."
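
The latency/throughput trade-off in the quote can be put into a toy cost model. The assumptions here are mine, not the quoted author's: the pipeline accepts one item per cycle, every item takes `pipeline_depth` cycles to pass through, and a shader change forces a full drain. The function names are hypothetical, and sorting draws by shader is shown only as one common way to reduce the number of switches.

```python
from itertools import groupby


def batches_by_shader(draws):
    """Sizes of the runs of consecutive draws that share a shader id."""
    return [len(list(group)) for _, group in groupby(draws)]


def total_cycles(batch_sizes, pipeline_depth, drain_on_switch=True):
    """Cycles needed to push every batch through the toy pipeline.

    With drain_on_switch, each batch must fully leave the pipeline before the
    next one starts, so each batch pays the pipeline depth once. Without it,
    the depth is paid only once for the whole stream.
    """
    if not batch_sizes:
        return 0
    if drain_on_switch:
        return sum(n + pipeline_depth for n in batch_sizes)
    return sum(batch_sizes) + pipeline_depth


if __name__ == "__main__":
    draws = ["A", "B", "A", "B", "A", "B"]  # shader id used by each draw
    depth = 100
    print(total_cycles(batches_by_shader(draws), depth))          # interleaved
    print(total_cycles(batches_by_shader(sorted(draws)), depth))  # grouped by shader
    print(total_cycles(batches_by_shader(draws), depth, drain_on_switch=False))
```

In this model the per-switch penalty is the full pipeline depth, so the deeper the pipeline, the more a switch costs relative to the useful work in a small batch.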

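The CPU sends commands to the GPU and the GPU executes them asynchronously, so it may not actually have to wait for the whole pipeline to flush. This point is likewise unverified and needs checking against the Windows Display Driver Model.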

Reposted from: https://www.cnblogs.com/ActionFG/archive/2012/09/09/2677800.html
