Intel VTune is a paid tool for analyzing applications, but free tools with similar features are also available for analyzing desktop applications: Platform Analyzer, System Analyzer, Graphics Frame Analyzer, and Graphics Trace Analyzer. These tools are part of the free Intel Graphics Performance Analyzers (GPA) suite.
Intel Graphics Frame Analyzer is a powerful single-frame analysis and optimization tool with detailed metrics down to the draw call level, including shaders, render states, pixel history, and textures. On Windows, you can use it to experiment with performance and visual changes without recompiling your code. You can use it to:
Select a draw call and verify its contribution to the frame, alpha channel, color, format, and depth buffers.
Quantify performance optimization opportunities with render experiments per draw call.
Solve issues with shadowing, lighting, or color schemes by locating misplaced objects.
First, use the System Analyzer to determine whether the application is GPU bound. If it is, use the Graphics Frame Analyzer to run experiments that show where the overhead can be reduced.
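The GPU-bound check can be made concrete with a small heuristic. The sketch below is only an illustration: the function name, the 10% margin, and the idea of comparing per-frame CPU and GPU times are assumptions, and the numbers are expected to be read by hand from the System Analyzer rather than obtained through any Intel GPA API.

```python
def is_gpu_bound(cpu_ms: float, gpu_ms: float, margin: float = 0.10) -> bool:
    """Heuristic (assumption): treat the frame as GPU bound when the GPU time
    per frame exceeds the CPU time per frame by more than `margin`."""
    if cpu_ms <= 0 or gpu_ms <= 0:
        raise ValueError("frame times must be positive")
    return gpu_ms > cpu_ms * (1.0 + margin)


if __name__ == "__main__":
    # Hypothetical readings in milliseconds, not measured data.
    print(is_gpu_bound(cpu_ms=8.0, gpu_ms=14.0))   # True
    print(is_gpu_bound(cpu_ms=12.0, gpu_ms=12.5))  # False
```

The margin keeps near-equal CPU and GPU times from being classified as GPU bound; tune it to how noisy your measurements are.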
To identify bottlenecks, there are four different tests (performed from within the Experiments tab):
Disable Erg(s)
2x2 Textures
1x1 Scissor Rect
Simple Pixel Shader
Use the Disable Erg(s) experiment to keep the selected ergs from being rendered and to test Scene efficiency (an erg is Intel GPA's term for a unit of rendering work, such as a draw call). For example, you can disable all ergs that belong to a post-effect or to a specific model. This is similar to disabling the renderer on a GameObject.
Use the 2x2 Textures override mode to help identify potential performance bottlenecks with texture bandwidth. The Graphics Frame Analyzer replaces all textures for a Scene with simple 2x2 pixel textures. Usually, the Graphics Frame Analyzer uses a simple halftone or a colorized bitmap for this option.
If this override mode significantly improves the frame rate, the application is likely bound by texture bandwidth: the GPU is fetching textures from memory instead of using a copy that is already cached on the GPU. If the total size of the textures for a Scene is high, consider reducing the size of one or more textures so that all the texture maps fit into the GPU's texture cache for that Scene.
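To decide which textures to shrink first, it helps to estimate how much memory each one occupies. The following is a minimal sketch, assuming uncompressed textures and a full mip chain that adds roughly one third to the base level; the `Texture` class and the sample values are hypothetical and not part of Intel GPA or Unity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Texture:
    name: str
    width: int
    height: int
    bytes_per_pixel: int      # e.g. 4 for uncompressed RGBA8
    mipmapped: bool = True


def texture_bytes(tex: Texture) -> int:
    """Approximate memory footprint of one uncompressed texture."""
    base = tex.width * tex.height * tex.bytes_per_pixel
    # A full mip chain adds about one third on top of the base level.
    return base * 4 // 3 if tex.mipmapped else base


def total_bytes(textures: List[Texture]) -> int:
    """Approximate footprint of all textures used by a Scene."""
    return sum(texture_bytes(t) for t in textures)


def largest_first(textures: List[Texture]) -> List[Texture]:
    """Order textures so the best candidates for downsizing come first."""
    return sorted(textures, key=texture_bytes, reverse=True)


if __name__ == "__main__":
    scene = [
        Texture("terrain_albedo", 4096, 4096, 4),
        Texture("ui_atlas", 1024, 1024, 4, mipmapped=False),
        Texture("character_normal", 2048, 2048, 4),
    ]
    print(f"total: {total_bytes(scene) / 2**20:.1f} MiB")
    for tex in largest_first(scene):
        print(f"{tex.name}: {texture_bytes(tex) / 2**20:.1f} MiB")
```

Compressed formats occupy far less memory than this estimate, so treat the result as an upper bound when your textures are compressed.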
The 1x1 Scissor Rect override mode is a DirectX API override. However, the implementation of this override mode is highly dependent upon a specific graphics configuration. In particular, scissoring may occur either before or after the pixel shader stage. Using the 1x1 scissor rect nullifies the workload of units after the vertex shader by clipping all rasterization and shading work.
The Simple Pixel Shader experiment replaces the pixel shaders in your frame with a simple pixel shader that writes a constant color to the render target for every selected erg. If the frame rate significantly increases as a result of this experiment, the original pixel shaders are a likely bottleneck, and you may need to analyze them further to see whether you can reduce rendering time without detracting from the visual quality of the Scene.
One thing to keep in mind is that enabling this experiment for ergs that do not reference a pixel shader in the original Scene may result in slower rendering. This may seem counterintuitive, but those ergs are now forced to use a pixel shader, and this pixel shader may be slower than the fixed-function path they used before.
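The way these experiments are read can be summarized in a few lines of code. This is a sketch under stated assumptions: the frame times are noted by hand from the Graphics Frame Analyzer, the function name and the 1.2x speedup threshold are hypothetical, and the hint strings simply paraphrase the descriptions above.

```python
from typing import Dict, List, Tuple

# What a large speedup under each experiment suggests (paraphrased from the text).
HINTS = {
    "2x2 Textures": "texture bandwidth",
    "1x1 Scissor Rect": "rasterization and shading work after the vertex shader",
    "Simple Pixel Shader": "pixel shader cost",
}


def rank_experiments(
    baseline_ms: float,
    experiment_ms: Dict[str, float],
    min_speedup: float = 1.2,
) -> List[Tuple[str, float, str]]:
    """Return (experiment, speedup, hint) for every experiment that made the
    frame at least `min_speedup` times faster, largest speedup first."""
    if baseline_ms <= 0:
        raise ValueError("baseline frame time must be positive")
    ranked = []
    for name, ms in experiment_ms.items():
        if ms <= 0:
            raise ValueError(f"frame time for {name!r} must be positive")
        speedup = baseline_ms / ms
        if speedup >= min_speedup:
            ranked.append((name, speedup, HINTS.get(name, "unknown")))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical frame times in milliseconds, not measured data.
    results = {
        "2x2 Textures": 19.5,
        "1x1 Scissor Rect": 10.0,
        "Simple Pixel Shader": 12.0,
    }
    for name, speedup, hint in rank_experiments(20.0, results):
        print(f"{name}: {speedup:.2f}x faster, look at {hint}")
```

An experiment that barely changes the frame time is filtered out, which matches the idea that only a significant improvement points at a bottleneck.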
Intel GPA is a powerful tool and has many use cases. For more detailed information, please also see the official Intel documentation.