Optimizing to Prevent Nerd Rage, part 2

By Rocco Balsamo on Nov 15, 2012 in Let's Talk Tech

On Tuesday we looked at optimizing for programmer time and usability. Now on to the final component:

Optimize for Speed
Fortunately, today the tools for profiling and finding bottlenecks are quite good. The choices are varied: Firebug, Chrome Developer Tools, and the IE developer tools. For the uber ‘leet, there’s Speed Tracer for Chrome. For this post, I’ll focus on the standard Chrome Developer Tools.

Manual Profile
The simplest way to run the profiler in Chrome is to fire off a manual profile from the developer tools.

  • Open developer tools (View->Developer->Developer tools)
  • Select “Profiles” and use profiling type “Collect JavaScript CPU Profile”
  • Click Start
  • Click Stop

Automatic Profile
Manual profiling can work well enough, but I wanted to turn the profiler on and off automatically from JavaScript so that I could see exactly what was going on during load / render. I also wanted a printout of the total elapsed time.

I wrote some helper functions for gathering this data. The key here is using console.profile() and console.profileEnd():

var profileNum = 0;   // a counter
var timeStart = {};   // key: profile name, value: profile start time
var timeOnly = false; // false: also collect a profile, true: only capture time

// Start a profile. You can give it a specific name or it will be named
// profile_0, profile_1...
var startProfile = function (name) {
    if (window.console && console.profile) {
        if (name === undefined) {
            name = 'profile_' + profileNum;
        }
        timeStart[name] = (new Date()).getTime();
        if (!timeOnly) {
            console.profile(name);
        }
    }
};

var endProfile = function (name) {
    if (window.console && console.profile) {
        if (name === undefined) {
            name = 'profile_' + profileNum;
            profileNum++;
        }
        if (!timeOnly) {
            console.profileEnd();
        }
        var time = name + ' took ' + ((new Date()).getTime() - timeStart[name]) + ' ms.';
        console.log(time);
    }
};
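For example, wrapping a render pass with these helpers might look like the sketch below. Here renderDiagram is a hypothetical stand-in for your load/render code, and the helpers are restated in a minimal timer-only form so the snippet runs on its own:

```javascript
// Minimal timer-only restatement of the helpers above, so this snippet is
// self-contained; in the real page you'd use the full versions.
var timeStart = {};
var startProfile = function (name) {
    timeStart[name] = Date.now();
};
var endProfile = function (name) {
    var elapsed = Date.now() - timeStart[name];
    console.log(name + ' took ' + elapsed + ' ms.');
    return elapsed;
};

// Hypothetical render function standing in for the real load/render code.
var renderDiagram = function () {
    var total = 0;
    for (var i = 0; i < 1e6; i++) {
        total += i;
    }
    return total;
};

startProfile('render');
renderDiagram();
var ms = endProfile('render');
```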

A screenshot of the profiler in Heavy (Bottom Up) view.

Next, I inserted startProfile and endProfile calls at the points in the code where the render starts and ends. For benchmarking I usually use timers only, to avoid overhead from the profiler (set timeOnly to true). I usually run the same test four or five times to get an average time for the load operation.
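The repeat-and-average step can be sketched like this; benchmark, fakeRender, and the run count are my own illustration, not part of the Gliffy code:

```javascript
// Run fn `runs` times and return the average elapsed time in ms.
var benchmark = function (fn, runs) {
    var total = 0;
    for (var i = 0; i < runs; i++) {
        var start = Date.now();
        fn();
        total += Date.now() - start;
    }
    return total / runs;
};

// Hypothetical stand-in for the load/render operation under test.
var fakeRender = function () {
    var s = '';
    for (var i = 0; i < 5000; i++) {
        s += i;
    }
    return s.length;
};

var avg = benchmark(fakeRender, 5);
console.log('average: ' + avg.toFixed(1) + ' ms');
```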

Once I know the average length of time for load/render, I can run the test again with timeOnly set to false, enabling the full profile. After the load, I go into Developer Tools->Profiler->Profiles. At this point I see profile_0 in the left panel.

Interpreting the Results
From there I drill down into the results. I find the Heavy (Bottom Up) view to be the most useful. Google has a decent tutorial on interpreting the profiler results.

Generally my approach is to drill down on the function with the heaviest CPU usage. Eventually, with a little voodoo and smarts, you can find the functions that are the bottlenecks.

In Gliffy, we found three major bottlenecks:

  • AffineTransform::checkFiniteNumbers
    • a function used to catch errors in matrix calculations
    • used about 5% CPU
  • jQuery::camelCase
    • an internal jQuery function. It is used to normalize CSS names such as text-align to textAlign
    • used about 3% CPU
  • Text::__calculateWidth, Text::__calculateHeight
    • functions that calculate the rendered width and height of text
    • used about 33% CPU

AffineTransform::checkFiniteNumbers was the easiest to fix. As this was purely debugging functionality for checking errors in matrix transformations, we simply wrapped the calls in if (_gliffy.isDebug()).
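The fix looks roughly like this; isDebug, checkFiniteNumbers, and transform here are simplified stand-ins for the Gliffy internals, not the actual code:

```javascript
// Simplified stand-ins for the real Gliffy internals.
var _gliffy = {
    debug: false,
    isDebug: function () {
        return this.debug;
    }
};

// Debug-only validation: throws if the matrix contains NaN or Infinity.
var checkFiniteNumbers = function (matrix) {
    for (var i = 0; i < matrix.length; i++) {
        if (!isFinite(matrix[i])) {
            throw new Error('non-finite value in matrix at index ' + i);
        }
    }
};

var transform = function (matrix) {
    // The expensive validation only runs in debug builds.
    if (_gliffy.isDebug()) {
        checkFiniteNumbers(matrix);
    }
    // ... actual matrix math would go here ...
    return matrix;
};

transform([1, 0, 0, 1, 0, 0]); // skips validation when debug is off
```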

For jQuery::camelCase, I saw that jQuery does some pretty expensive regular expression calculations. And it was doing these same calculations repeatedly on fewer than 50 strings. Because JavaScript allows you to redefine functions, I rewrote jQuery.camelCase to cache up to 1000 of these values. Here is the code, ready for integration into your project:

// Performance optimization: replace jQuery's camelCase with our own version
// that caches values rather than re-running regexes on every CSS name.
var camelMap = {};
var camelSize = 0;

// Inspired by the jQuery function.
var getCamel = function (string) {
    return string.replace(/^-ms-/, "ms-").replace(/-([a-z]|[0-9])/ig, function (all, letter) {
        return (letter + "").toUpperCase();
    });
};

$.camelCase = function (string) {
    if (camelMap[string] === undefined) {
        // Impose a limit on the cache so that it doesn't take infinite memory.
        // Reset *before* inserting, so the value we are about to add survives.
        if (camelSize >= 1000) {
            camelSize = 0;
            camelMap = {};
        }
        camelMap[string] = getCamel(string);
        camelSize++;
    }
    return camelMap[string];
};

Note that redefining an internal jQuery function is a little risky, because there is no guarantee that future jQuery versions will define it the same way. But the performance increase merited the risk.

Finally, the big boys, using a whopping 33% of CPU, were our __calculateWidth and __calculateHeight text functions. Since these are very Gliffy-specific, I will spare you the details, but I will say that we came up with a caching strategy similar to that of jQuery.camelCase.
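The same idea can be captured as a generic bounded cache. This sketch is my own generalization of the pattern, not the actual Gliffy code:

```javascript
// Memoize a single-argument function, resetting the cache once it reaches
// `limit` entries so memory use stays bounded.
var memoize = function (fn, limit) {
    var cache = {};
    var size = 0;
    return function (key) {
        if (!(key in cache)) {
            if (size >= limit) {
                cache = {};
                size = 0;
            }
            cache[key] = fn(key);
            size++;
        }
        return cache[key];
    };
};

// Hypothetical expensive function, with a call counter to show caching.
var calls = 0;
var slowUpper = function (s) {
    calls++;
    return s.toUpperCase();
};
var fastUpper = memoize(slowUpper, 1000);

fastUpper('text-align'); // computes
fastUpper('text-align'); // served from the cache
console.log(calls); // 1
```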

The text dimension optimizations came back to bite us with hard-to-find bugs, but at a 33% performance increase, the tradeoff of a little lost programmer time was certainly worth it.

Good luck and happy coding,

Rocco