Android Profiling Tools

Checked with version: 2018.1

Difficulty: Advanced

Android supports a vast variety of devices, which comes with some constraints, such as specific tools for specific chipsets. Many platform-specific tools yield useful information about both the target device and app performance.

There are profiling tools available which help to profile the Java Managed Environment, such as the Memory Monitor in Android Studio. The Memory Monitor shows how an app allocates managed memory over the course of a single session. This tool, however, only provides data for the Java Environment and does not offer sufficient data on the native systems, which are essential for most applications made with Unity. Only a small portion of Unity’s code runs in a Java managed environment, with the majority being driven by systems in the native environment.

Android native profiling tools depend on the target chipset, and each chipset manufacturer has its own suite of tools for collecting data from the CPU, GPU, and memory. There are also tools that provide performance analysis across chipsets, such as Android Studio.

The following table includes a collection of tools available to profile Android devices.

Tool Target Devices GPU CPU System Description OS API
Systrace Any chipset Yes Yes Systrace is not a sampling profiler. The tracing tool captures system-wide traces, and Android Studio embeds it in 3.0. Android
Adreno Profiler Adreno GPU Yes Replaced by Snapdragon Profiler Requires old mono framework 2.10.5 and xquartz 2.7.11 Android DirectX OpenGL ES OpenCL
Snapdragon Profiler Snapdragon Chipset Yes Yes Yes Requires latest mono framework and Android 6.0+ (frame capture) Android OpenGL ES OpenCL Vulkan
Mali Graphics Debugger Mali GPU Yes Additional .so file added (via Plugin in Unity) otherwise requires source changes or a rooted device. Android Linux OpenGL ES OpenCL
Simpleperf Any chipset Command Line only, but Android Studio integrates it in 3.0. Android
Android Studio GPU Debugger Any chipset Yes Create Android Studio Project and build/run. Android OpenGL ES
DS-5 Streamline ARM Chipset (Cortex-M, Cortex-A) Yes Yes Streamline requires a custom kernel and a rooted device. Linux, Android
Tegra System Profiler Tegra K1, X1 Yes Yes A system trace and multi-core CPU call stack sampling profiler. Android Linux
Tegra Graphics Debugger Tegra K1, X1 Yes A tool for debugging and profiling applications. Android Linux OpenGL ES OpenGL
Intel VTune Amplifier Performance Analyzer Intel x86 Desktop Yes Yes Yes A tool for analysing and profiling processor performance. Windows Android Linux DirectX OpenGL ES OpenGL
Intel System Analyzer Intel x86, ARM Yes Yes A tool for analysing CPU live traces. Windows Android Linux DirectX OpenGL ES OpenGL
Intel Graphics Frame Analyzer Intel x86 Yes A tool for single-frame analysis and optimization. Windows Android Linux DirectX OpenGL ES OpenGL
Intel Graphics Trace Analyzer Intel x86 Yes Yes A trace tool for analyzing workload performance across the CPU and GPU. Windows Android Linux DirectX OpenGL ES OpenGL
PVRTrace Any chipset Yes Used for older GPUs and imgtec is not producing new GPUs right now. Android Linux OpenGL ES OpenGL
gapid Any chipset Yes You need to build it manually, but Google also offers pre-built executables. Android Windows Linux OpenGL ES Vulkan
vkTrace Any chipset Yes Command Line only. Linux Windows Vulkan

For other platforms, similar tools are available. It’s good practice to verify and compare results on different platforms, especially when they share similar hardware or the same graphics APIs. Systems often act similarly, even across different platforms. Profiling on non-target platforms is not ideal, but it is better than not profiling at all.

Each tool helps you identify bottlenecks and take appropriate action to resolve their causes. The following section shares tips on profiling Unity applications using specific tools.

Snapdragon Profiler

This profiler is for devices using Snapdragon chipsets and it can show a significant amount of performance data. All the data can be overwhelming at first, but you can filter data by bundle id.

There is an excellent talk from Unite 2015 called Uncover Your Game's Power and Performance Profile that covers the basics of Snapdragon Profiling and which filters are a great choice for profiling.

If the Snapdragon Profiler cannot find a device, go to the Snapdragon Settings and check whether the auto locator has set the adb path to the correct adb location. When a device is visible via the adb devices command, it should be visible for Snapdragon as well.
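
As a quick check before opening the Snapdragon Profiler, the output of adb devices can be filtered for devices in the ready state. The following is a minimal sketch, not part of any Snapdragon tooling: the function name count_ready_devices and the serial numbers in the sample input are made up for illustration, and the sample is hard-coded so the sketch runs without a device attached.

```shell
#!/bin/sh
# Count the devices that adb reports in the "device" (ready) state.
# Reads the output of `adb devices` on stdin.
count_ready_devices() {
    # A ready device has "device" in the second column; the header line,
    # daemon start-up messages, and "offline"/"unauthorized" entries do not.
    awk '$2 == "device" { n++ } END { print n + 0 }'
}

# Illustrative input with hypothetical serial numbers.
printf 'List of devices attached\n0123456789ABCDEF\tdevice\nemulator-5554\toffline\n' \
    | count_ready_devices

# With a real device attached:
#   adb devices | count_ready_devices
```

If the count is 0, the device is either not connected, not authorized for debugging, or a different adb binary is on the path than the one the profiler uses.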

Between profiling sessions, it is useful to shut down and restart the Snapdragon Profiler. Before you unplug a device, quit the Snapdragon Profiler so that it can clean up memory and network connections and reconnect to the device without issues the next time.

Simpleperf

Simpleperf is a sampling profiler and provides multiple CPU counters. Currently, it only exists as a command line tool.

Modern CPUs have a hardware component called the performance monitoring unit (PMU). The PMU has several hardware counters that count events like CPU cycles, executed instructions, and the number of cache misses.

On Android, Simpleperf uses the perf_event_open system call to read data from hardware perf events; the Linux kernel wraps the PMU's hardware counters into these events. The kernel also provides hardware-independent software events and tracepoint events, and exposes all of these events to user space through the same perf_event_open system call.

The best way to get the sampling profiler working with call graphs is to use DWARF debug info instead of frame pointers. For more information, see the Simpleperf readme file.

You can run Simpleperf from any host development platform that the NDK supports. You can get the Simpleperf tool from the Android NDK r13b and higher (under the ndk-location/simpleperf/ directory).

Note: You cannot use the run-as command on some Samsung devices. If you do, you will receive a Could not set capabilities: Operation not permitted error.

Example of obtaining sample data

Push Simpleperf from the NDK location to a temporary data folder on the device:

~$ adb push ndk-location/.../simpleperf /data/local/tmp

Google provides a way to access the internal storage of debuggable versions of their packages using the run-as command:

~$ adb shell run-as com.unity.androidtest cp /data/local/tmp/simpleperf .

Make Simpleperf executable for all users:

~$ adb shell run-as com.unity.androidtest chmod a+x simpleperf

Google has blocked access to the Perf tool by default since Android Nougat, so you need to change this flag to be able to access the tool:

~$ adb shell setprop security.perf_harden 0
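
The four setup commands above can be collected in one place so that they are easy to repeat for another package. The following is a minimal sketch under stated assumptions: print_simpleperf_setup is a hypothetical helper name, and it only prints the commands (a dry run) instead of executing them. The path to the Simpleperf binary is passed in as an argument because it depends on your NDK installation.

```shell
#!/bin/sh
# Print (do not run) the setup commands for one debuggable package.
# $1: bundle id of the app
# $2: host path to the simpleperf binary inside the NDK
print_simpleperf_setup() {
    pkg=$1
    binary=$2
    echo "adb push $binary /data/local/tmp"
    echo "adb shell run-as $pkg cp /data/local/tmp/simpleperf ."
    echo "adb shell run-as $pkg chmod a+x simpleperf"
    echo "adb shell setprop security.perf_harden 0"
}

# The binary path is left as the placeholder used in this guide.
print_simpleperf_setup com.unity.androidtest "ndk-location/.../simpleperf"
```

Printing the commands first makes it possible to review them, and to substitute the real NDK path, before running anything against a device.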

Now the device is ready to record data. In the following example, we:

  • Record on the process (-p) with an event (-e) for 20 seconds (--duration). You don’t need to add cpu-cycles explicitly because it is also the default event.

  • Record a DWARF-based call graph (-g) and use the --symfs argument to set the directory in which Simpleperf looks for symbol files.

  • Set the frequency to dump records (-f) with approximately 2000 records every second when the monitored thread runs.

  • Return the pid of the given bundle-id using `adb shell pidof com.unity.androidtest`.

~$ adb shell run-as com.unity.androidtest ./simpleperf record -p `adb shell pidof com.unity.androidtest` -e cpu-cycles:u -f 2000 -g --symfs . --duration 20
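
To get a feeling for how much data a capture produces, multiply the sampling frequency by the duration: at -f 2000 for 20 seconds, a thread that runs for the whole capture yields roughly 40000 records. The following is a minimal sketch of that arithmetic and of assembling the record command from its parts. The helper names max_samples_per_thread and build_record_command, and the pid 12345, are hypothetical; the command is printed, not executed.

```shell
#!/bin/sh
# Rough upper estimate of records for one thread that runs during the
# whole capture: frequency (records per second) * duration (seconds).
max_samples_per_thread() {
    echo $(( $1 * $2 ))
}

# Print the record command for a package, pid, frequency, and duration.
build_record_command() {
    echo "adb shell run-as $1 ./simpleperf record -p $2 -e cpu-cycles:u -f $3 -g --symfs . --duration $4"
}

max_samples_per_thread 2000 20
build_record_command com.unity.androidtest 12345 2000 20

# With a device attached, the pid could be looked up on the host first:
#   pid=$(adb shell pidof com.unity.androidtest | tr -d '\r')
```

Threads that sleep for part of the capture produce fewer records, because Simpleperf only samples a thread while it is running.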

Alternatively, you can record frame pointer-based call graphs:

~$ adb shell run-as com.unity.androidtest ./simpleperf record -p `adb shell pidof com.unity.androidtest` --call-graph fp --dump-symbols --symfs . --duration 20

You can also select which processes (-p) or threads (-t) to monitor. Monitoring a process is the same as monitoring all threads in the process:

~$ adb shell run-as com.unity.androidtest ./simpleperf record -t 7146,6471,7148,7147 --call-graph fp --symfs . --duration 20

Copy the perf.data onto the sd card:

~$ adb shell run-as com.unity.androidtest cp perf.data /sdcard/

Fetch data from the sd card to your local dir:

~$ adb pull /sdcard/perf.data

Write caller information from the perf.data on the device into the local perf.caller.report log file:

~$ adb shell run-as com.unity.androidtest ./simpleperf report -g caller -n --symfs . > perf.caller.report

Write callee information from the perf.data on the device into the local perf.callee.report log file:

~$ adb shell run-as com.unity.androidtest ./simpleperf report -g callee -n --symfs . > perf.callee.report

You can also flatten any of the reports:

~$ adb shell run-as com.unity.androidtest ./simpleperf report -n --symfs . > perf.flat.report
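
Flat reports can be long, so it can help to keep only the entries above a chosen overhead. The following is a minimal sketch that assumes each data row of the report starts with a percentage (for example 59.25%) and that header and summary lines do not. The function name filter_by_overhead is hypothetical, and the rows in the sample input are invented purely to exercise the filter; they are not real profiler output.

```shell
#!/bin/sh
# Print the report rows whose overhead is at least $1 percent.
# Rows that do not start with a percentage (headers, summaries) are skipped.
filter_by_overhead() {
    awk -v min="$1" '
        $1 ~ /^[0-9.]+%$/ {
            pct = $1
            sub(/%/, "", pct)             # "12.75%" -> "12.75"
            if (pct + 0 >= min + 0) print
        }'
}

# Invented sample rows with hypothetical symbol names.
filter_by_overhead 5 <<'EOF'
Overhead  Sample  Symbol
61.20%    2448    hypothetical_render_loop
4.10%     164     hypothetical_audio_mix
12.75%    510     hypothetical_physics_step
EOF

# With a real report:
#   filter_by_overhead 5 < perf.flat.report
```

The filter keeps the original row text untouched, so whatever columns the report contains remain available for further inspection.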

A flat report looks similar to the image below:

Android Studio

Android Studio 3.1 integrates Simpleperf into its Performance Profiler and offers a simple UI for capturing data, so you do not have to use the command line. For details, see the Android CPU Profiler documentation.

The Android Studio CPU Profiler works in a similar way to Native Debugging and works out of the box for many phones, such as the Samsung S8. To get native traces, follow these steps:

  1. Export your Unity Project as a Gradle project and open it in Android Studio. For more information, see Gradle for Android.

  2. Start profiling.

  3. Select the Sampled (Native) profile in the CPU Profiler.

  4. Select the thread you want to profile, for instance, UnityMain.

  5. Press the record button and stop recording when done.

When the Profiler stops recording, you can inspect the sampled data. Note: For very long sessions, you should increase Android Studio's memory limit.

The Call Chart is similar to the data in the Unity Timeline Profiler.

The Top Down view shows the native stack trace to identify bottlenecks. For more information on interpreting traces, see the dissecting native stack traces section.

The Profiler also offers a memory tab, which provides data on memory consumption. If this is relevant for you, read more about it in our guide on Memory Management in Unity.

Note: It is not possible to symbolicate user code and display function names when compiled with IL2CPP, although you can set the path of the symbols in the Android Studio Project Settings (Menu: Run > Edit Configurations > Debugger).

Dissecting native stack traces

It’s often useful to look into dissecting traces, for example, when looking at a trace of the start-up process to evaluate startup time. Read more about dissecting stack traces in the Profiling section of the Understanding optimization in Unity Best Practice Guide.

Debugging Android Crashes

Symbolicate with ndk-stack

You can use the NDK utility ndk-stack to symbolicate the call stack of an adb logcat output. You can find the tool in the root directory of your NDK installation. You need to ensure that your symbol files are called libunity.so instead of libunity.dbg.so, otherwise the ndk-stack tool will fail to locate them; simply rename the files if needed. You can call the tool using the following command in the terminal:

~$ ndk-stack -sym unity_path\build\AndroidPlayer\Variations\mono\Development\Symbols\armeabi-v7a -dump path_to_your_logcat

For more details and instructions, please visit the official ndk-stack documentation.
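
The renaming step described above can be scripted. The following is a minimal sketch, assuming the symbol files sit directly in one directory. The helper name strip_dbg_suffix is hypothetical, and it copies the files instead of renaming them so that the originals stay in place. The demonstration uses an empty placeholder file in a temporary directory, not real Unity symbols.

```shell
#!/bin/sh
# Copy every *.dbg.so in a symbols directory to the name ndk-stack expects,
# for example libunity.dbg.so -> libunity.so.
strip_dbg_suffix() {
    dir=$1
    for f in "$dir"/*.dbg.so; do
        [ -e "$f" ] || continue      # no match: the pattern stays literal
        cp "$f" "${f%.dbg.so}.so"
    done
}

# Demonstration with a placeholder file in a temporary directory.
demo=$(mktemp -d)
: > "$demo/libunity.dbg.so"
strip_dbg_suffix "$demo"
ls "$demo"
rm -r "$demo"

# With real symbols and a connected device, logcat output can be piped
# straight into ndk-stack instead of being saved to a file first:
#   adb logcat -d | ndk-stack -sym path_to_your_symbols
```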

Symbolicate Script

As an alternative to the ndk-stack tool, there is a tool available on Bitbucket that allows you to symbolicate the release crash dumps available on the Developer Console from the Play Store.

It does not provide the full line numbers from the call stack, but it lets you see the method names and gives you a much better idea of what could be going wrong. Full instructions on using the tool are in the tool's readme file on Bitbucket.