Taking the Mystery out of Middleware Performance

Sleuthing out code bottlenecks and unnecessary frame activity is possible only with sophisticated tools. PC game developers rely on graphics performance analyzers (GPA) to visualize how the tasks in their code execute over time on the heterogeneous (CPU+GPU) PC platform.

But until recently, the game development community has had limited GPA coverage for middleware code, which is why Autodesk Scaleform could pay off big for many development teams.

Introducing Autodesk Scaleform

Autodesk Scaleform, which has about 30 employees in its home office, was founded in 2005 by Brendan Iribe and Michael Antonov, who met at the University of Maryland. After creating a user interface (UI) middleware company, they got busy making an engine and library. Their first big success was the UI for the blockbuster title Civilization IV.

Autodesk Scaleform is a leading provider of Flash-based middleware and UI solutions for the video game and consumer electronics industries. Perfecting the UI is a key ingredient for software success, but it requires a different mindset from designing fiendish levels and compelling characters. Autodesk Scaleform 4.0 gives game-development artists and user-centered design experts the ability to build out interfaces with dropdown menus, radio buttons, even a heads-up display. Background tools like these, which work between other game subsystems and link separate software applications, are referred to as “middleware.”

When game developers use Autodesk Scaleform and GPA together, it’s easy to see how the middleware and their game code perform. Middleware tends to be a “black box” and difficult to understand when performance-tuning. Generally, the code takes some CPU time and adds draw calls, but beyond that, middleware code is opaque.

Getting the System-wide Picture

GPA tools visualize the execution profile of tasks over time. They collect trace data during the application run, so they provide a detailed analysis of how the code executes across all threads and correlate the CPU work with tasks performed on the GPU. GPA also aligns clocks across all cores in the entire system so that developers can analyze CPU-based workloads together with GPU-based workloads on the timeline.
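To make the idea of trace collection concrete, here is a minimal sketch in Python. It is not GPA's API; the `traced_task` helper, the in-memory `_events` buffer, and the task names are all hypothetical. It shows only the general technique: each thread records begin and end timestamps for a task against one shared monotonic clock, so the events can later be laid out on a single timeline.

```python
import threading
import time
from contextlib import contextmanager

# Hypothetical in-memory trace buffer; a real analyzer uses its own trace format.
_events = []
_lock = threading.Lock()


@contextmanager
def traced_task(name):
    """Record one task as (name, thread id, start ns, end ns)."""
    start = time.perf_counter_ns()  # one monotonic clock shared by all threads
    try:
        yield
    finally:
        end = time.perf_counter_ns()
        with _lock:
            _events.append((name, threading.get_ident(), start, end))


def worker(label):
    with traced_task(label):
        time.sleep(0.01)  # stand-in for real work


threads = [threading.Thread(target=worker, args=(f"task-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Sort by start time to place every thread's events on one timeline.
timeline = sorted(_events, key=lambda e: e[2])
for name, tid, start, end in timeline:
    print(f"{name:8s} thread={tid} duration={(end - start) / 1e6:.2f} ms")
```

Because every timestamp comes from the same clock, tasks from different threads can be compared directly, which is the property the clock alignment described above provides across cores.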

The visualization of the execution profile gives developers a system-wide picture of the way code executes on the CPU and GPU cores. GPA is built on a docking interface arranged around a task timeline. Panels in the interface present the trace data and the task selection set in a format that assists in performing a detailed analysis. Different analyses are available through the range of panels, such as a bar chart containing the sums of task durations that correspond to common workloads.
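The bar-chart analysis mentioned above boils down to grouping trace records by task and summing their durations. The sketch below illustrates that aggregation on made-up data; the record layout, task names, and timings are assumptions for illustration, not GPA output.

```python
from collections import defaultdict

# Hypothetical trace records: (task name, start ms, end ms).
trace = [
    ("render_ui", 0.0, 2.5),
    ("physics", 1.0, 4.0),
    ("render_ui", 16.0, 18.0),
    ("audio", 5.0, 5.5),
    ("physics", 17.0, 21.0),
]


def sum_durations(records):
    """Return the total duration per task name."""
    totals = defaultdict(float)
    for name, start, end in records:
        totals[name] += end - start
    return dict(totals)


totals = sum_durations(trace)

# Print a crude text bar chart, largest workload first.
for name, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    bar = "#" * round(total * 4)
    print(f"{name:10s} {total:5.1f} ms {bar}")
```

Sorting the totals puts the most expensive workload at the top, which is usually the first place to look when tuning.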

Developers get details on parsing errors, see the tasks that executed within a thread, and can identify how much time the application spends preparing each frame, which lets them see where in the application that work occurs relative to trace instrumentation. In DirectX applications, GPA tools automatically provide the DX CPU and DX GPU tracks, as well as Microsoft DirectX call instrumentation, without additional work. While most tracks show the activity of traced code within a particular thread, the DX CPU and DX GPU tracks highlight the work performed by the graphics driver (DX CPU) and the graphics hardware (DX GPU).
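Measuring the time spent preparing each frame can be sketched as follows. The frame-start timestamps and the 60 fps budget are assumptions chosen for illustration; a real trace would supply the frame markers.

```python
# Hypothetical timestamps (ms) at which each frame began;
# the last entry marks the end of the final frame.
frame_starts = [0.0, 16.0, 33.0, 70.0, 86.0]

BUDGET_MS = 1000.0 / 60.0  # assumed 60 fps target, about 16.7 ms per frame


def frame_durations(starts):
    """Duration of each frame, taken as the gap between consecutive markers."""
    return [b - a for a, b in zip(starts, starts[1:])]


durations = frame_durations(frame_starts)

# Flag frames that took longer than the assumed budget.
slow = [(i, d) for i, d in enumerate(durations) if d > BUDGET_MS]

for i, d in slow:
    print(f"frame {i} took {d:.1f} ms (budget {BUDGET_MS:.1f} ms)")
```

Once the slow frames are known, the tasks that ran inside those intervals are the natural candidates for closer inspection.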

One problem that large studios typically face is that while they can instrument their own code to work with GPA, very few computer games are produced with 100 percent proprietary code. Instead, competitive titles on hyper-schedules require developers to combine various technologies. Studios typically use off-the-shelf code such as Unreal Engine for key foundation work, but they might also incorporate Havok Physics for collisions and explosions, use Geomerics Enlighten for lighting, and so forth. Scaleform 4.0 enters the picture during the production of the UI.

Streamlining UI Builds on the Optimized Scaleform 4.0

Scaleform 4.0 includes an all-new, high-performance, multithreaded rendering engine, Flash 10/AS3 support, and iOS and Android mobile compatibility. The multithreading code was rewritten from the ground up, designed to make the new version future-proof in anticipation of powerful new processors. GPA helped Autodesk Scaleform confirm that its new renderer is faster than previous versions and that its ActionScript 3 virtual machine is very efficient. Autodesk Scaleform engineers reported a huge performance gain.

Alexis Mantzaris, principal engineer at Autodesk M&E Games Technology, is enthusiastic about how Scaleform 4.0 works with the GPA tools his team used. “Scaleform 4.0 customers who are familiar with GPA can now immediately evaluate the performance of their UI without learning to use a new tool like AMP,” he says. “Once confident that Scaleform 4.0 is highly optimized, they can focus on optimizing their Flash UI content and game code.”

Mantzaris says that the optimizations Autodesk Scaleform made to its core product include faster rendering and data loading. Those improvements make a big difference on mobile and low-end devices, which don’t have the computing power of a PC. Flash content can load faster, run more smoothly, and even take advantage of advanced rendering capabilities, such as 3D, when running on low-end devices.