Better performance testing

My latest problem to solve was performance, particularly in the algorithm that calculates lighting. The tools I had were severely lacking, so when I made a tweak somewhere, it wasn’t clear whether it made things better or made no difference at all.

The main problem here was that, to assess performance, I only had my general visual impression of “how fast it went”, which is of course flawed, and an FPS counter, which oscillates so much that it isn’t much help either. To make matters worse, different scenarios have different effects on performance: standing still is considerably faster than moving around close to the screen center, which in turn is ridiculously faster than having to scroll. With all that, it’s hard to get a good measure of performance differences.

After thinking about this for a while, I found a workaround that’d let me be a bit more scientific in my measurements. Basically, I’d have the character automatically follow a certain walking path once the engine loaded, and I’d time how long it took to do that. This is also not 100% exact, as there is a setTimeout with 0 delay between frames, which can make things vary, but it’s reasonably realistic, and if I let it run a few times, I can get a decent measure, much more scientific than what I had so far at least.
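A minimal sketch of that idea, under some assumptions: the `engine.step(direction)` method, the path format and the `options` parameter are hypothetical names made up for illustration, not the engine's real API. It walks a fixed path one step per frame, yields with a zero-delay `setTimeout` between frames, and reports the total elapsed time.

```javascript
// Hypothetical harness: `engine.step` and the path format are assumptions.
function runPathBenchmark(engine, path, onDone, options) {
  const opts = options || {};
  // Zero-delay timeout between frames, as described above.
  const schedule = opts.schedule || ((fn) => setTimeout(fn, 0));
  const now = opts.now || Date.now;
  let index = 0;
  const start = now();

  function frame() {
    if (index >= path.length) {
      onDone(now() - start); // total elapsed milliseconds for the whole path
      return;
    }
    engine.step(path[index]); // move one step along the path and redraw
    index += 1;
    schedule(frame); // yield to the browser before the next frame
  }

  frame();
}

// Possible usage, once the engine has loaded:
// runPathBenchmark(engine, ['right', 'right', 'down'], (ms) => {
//   console.log('Path took ' + ms + ' ms');
// });
```

Passing the scheduler and clock in through `options` is only there so the harness can be bolted onto older versions of the code and exercised without a browser.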

I obviously also can use my timing function from the beginning of my experiments, where I turn the engine off, and just run one function in a loop thousands of times and time it, and I’m going to do that in some cases, but I also want to have a way to measure performance of “the whole thing”.
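For reference, a loop timer of that kind could look roughly like this. It is a sketch, not the original timing function: `timeLoop` and its optional `now` parameter are names I'm assuming here.

```javascript
// Runs `fn` a fixed number of times and reports total and per-call time.
// `now` is an optional clock (defaults to Date.now) so it can be swapped out.
function timeLoop(fn, iterations, now) {
  const clock = now || Date.now;
  const start = clock();
  for (let i = 0; i < iterations; i++) {
    fn(i);
  }
  const totalMs = clock() - start;
  return { totalMs: totalMs, perCallMs: totalMs / iterations };
}

// Possible usage with the engine turned off:
// const result = timeLoop(() => someFunctionUnderTest(), 100000);
// console.log(result.totalMs + ' ms total');
```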

Coding that was reasonably simple, and I intentionally coded it in a way that makes it easy to “add on” to the existing code (God, I love JavaScript). That way, I can add it to older versions of the code, and check whether some of the changes I made (particularly the ones where I was following the profilers, which made my code somewhat uglier) actually made a difference, and whether that difference is big enough to justify the uglier code.

In the table below, you’ll find the results of running all the relevant versions of the code, in each of the browsers I’m testing. You’ll notice I made “lighting on” and “lighting off” tests. Lighting off means that the global light level is the maximum, in which case the code knows it doesn’t need to care about darkening stuff, and it completely bypasses all the lighting and lightmap calculations. I did this because I believe the lighting code is part of the performance problem, and I want to measure to what extent.
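The “lighting off” bypass boils down to an early exit. A rough sketch, where `MAX_LIGHT`, `lightLevelForCell` and `computeCellLight` are hypothetical names and the 0 to 1 light scale is an assumption:

```javascript
const MAX_LIGHT = 1.0; // assumed scale: 1.0 means fully lit

// Returns the light level to draw a cell with. When the global light is at
// the maximum, nothing needs darkening, so the lighting work is skipped.
function lightLevelForCell(globalLight, cell, computeCellLight) {
  if (globalLight >= MAX_LIGHT) {
    return MAX_LIGHT; // bypass all lighting and lightmap calculations
  }
  return computeCellLight(cell, globalLight);
}
```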

Times with lighting (in milliseconds)

| Version | Desktop Chrome | Desktop Firefox | Laptop Chrome | Laptop Firefox | HTC Desire S Portrait | HTC Desire S Landscape | iPhone 4 Portrait |
|---|---|---|---|---|---|---|---|
| 1 | 24954 ±26 | 23264 ±80 | 38934 ±106 | 44639 ±298 | 91412 ±3945 | 90239 ±495 | 176235 ±3842 |
| 2 | 13577 ±114 | 8026 ±47 | 34776 ±174 | 28582 ±149 | 75171 ±1257 | 75891 ±1342 | 133525 ±1643 |
| 3 | 13655 ±104 | 7985 ±48 | 35336 ±98 | 29336 ±42 | 74852 ±362 | 83093 ±146 | 131648 ±865 |
| 4 | 14889 ±66 | 19095 ±205 | 38262 ±248 | 35366 ±72 | 92587 ±386 | 100320 ±326 | 189879 ±1482 |
| 5 | 14817 ±49 | 17039 ±240 | 38533 ±217 | 34861 ±79 | 90699 ±608 | - | 180984 ±2326 |
| 6 | 14922 ±46 | 17842 ±192 | 38716 ±207 | 36049 ±146 | 94242 ±1103 | - | 201195 ±1992 |

Times without lighting (in milliseconds)

| Version | Desktop Chrome | Desktop Firefox | Laptop Chrome | Laptop Firefox | HTC Desire S Portrait | HTC Desire S Landscape | iPhone 4 Portrait |
|---|---|---|---|---|---|---|---|
| 1 | 24313 ±37 | 21117 ±190 | 37905 ±172 | 38476 ±91 | 62009 ±507 | 64704 ±552 | 96330 ±296 |
| 2 | 12877 ±126 | 5585 ±240 | 34577 ±270 | 24609 ±62 | 47220 ±217 | 53905 ±739 | 65196 ±690 |
| 3 | 12829 ±117 | 5795 ±267 | 34513 ±270 | 23971 ±66 | 47460 ±171 | 52511 ±292 | 65979 ±1485 |
| 4 | 14090 ±65 | 10828 ±185 | 37473 ±183 | 29390 ±240 | 68764 ±286 | - | 115799 ±493 |
| 5 | 13995 ±64 | 9154 ±204 | 36726 ±139 | 29172 ±140 | 66950 ±552 | - | 109763 ±599 |
| 6 | 14029 ±66 | 9141 ±229 | 36394 ±254 | 29408 ±93 | 66718 ±250 | - | 110656 ±1570 |

(I’m sorry for the holes, testing this took forever, so I didn’t finish all the “landscape” versions)
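Each cell above is a time plus a “+-” spread over several runs. One way to produce that kind of summary is sketched below, assuming the spread is a sample standard deviation (the post doesn't say exactly how it was computed, and `summarizeRuns` is a made-up name):

```javascript
// Summarizes several benchmark runs as a rounded mean and spread.
// Assumption: the spread is the sample standard deviation (n - 1 divisor).
function summarizeRuns(timesMs) {
  const n = timesMs.length;
  const mean = timesMs.reduce((sum, t) => sum + t, 0) / n;
  const variance = n > 1
    ? timesMs.reduce((sum, t) => sum + (t - mean) * (t - mean), 0) / (n - 1)
    : 0;
  return { mean: Math.round(mean), spread: Math.round(Math.sqrt(variance)) };
}

// Illustrative numbers only, not real measurements:
// summarizeRuns([100, 110, 120]) gives { mean: 110, spread: 10 }
```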

Versions of the code:

  • Code1: First version that included following a walking path. Includes lighting, right before doing delta drawing.
  • Code2: First version with delta drawing, incompletely done (it would pan the existing canvas and draw invalidated cells, but wouldn’t invalidate the new cols/rows when scrolling, nor the lightmap). It’s not useful to compare against Code 1, because it’s very incomplete, but it’s useful to compare to the next several versions, as I add things one by one, to see how long each of those takes.
  • Code3: As part of delta drawing, I had to adjust the viewport to the screen better, to be able to draw only a few columns/rows when scrolling. This is that change. I included this version of the code to see how much that change had impacted performance. Verdict: Not much really.
  • Code4: Delta drawing: Invalidate new columns/rows that show up while scrolling, to “fill the black gaps at the borders”
  • Code5: At one point I started playing around with the Chrome and Firebug Profilers, and I made a bunch of tweaks to the code based on where I was finding bottlenecks. These basically made the code a bit uglier, and inlined a few things, to gain performance. At the time, I wasn’t sure how effective they were. Verdict: More effective than I thought.
  • Code6: The final missing change for delta drawing to be done 100%: When the lightmap changes, invalidate the cells whose lighting changed. This was a good hit to performance. Now I need to figure out whether the impact comes from finding which cells to invalidate, or from actually having to draw more cells. That’s pretty easy to figure out.
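To make the Code4 step concrete, here is a sketch of how the newly exposed cells could be found after a scroll. The function name, the cell-based viewport and the sign convention for `dx`/`dy` are all assumptions for illustration, not the engine's actual code.

```javascript
// Returns the cells newly exposed when a viewport of `cols` x `rows` cells
// scrolls by (dx, dy) cells. Positive dx means the view moved right, so new
// columns appear at the right edge; positive dy means new rows at the bottom.
// It scans every cell for simplicity; a real version would only walk the edges.
function newlyExposedCells(cols, rows, dx, dy) {
  const cells = [];
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      const inNewColumn = dx > 0 ? x >= cols - dx : x < -dx;
      const inNewRow = dy > 0 ? y >= rows - dy : y < -dy;
      if (inNewColumn || inNewRow) {
        cells.push({ x: x, y: y });
      }
    }
  }
  return cells;
}
```

The rest of the canvas is just panned; only the cells returned here (plus any cells invalidated for other reasons) would need to be redrawn.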

Hardware:

  • Desktop: Quad-core i7-2600 @ 3.4 GHz, 8 GB RAM, running at 1920×1080 (both Firefox and Chrome results shown)
  • Laptop: Three-year-old Dell laptop: Core 2 Duo @ 2.00 GHz, 3 GB RAM (both Firefox and Chrome results shown)
  • Android: HTC Desire S, 768 MB RAM, Android v2.3.3 (480×720)
  • iPhone: iPhone 4, running inside Safari (not standalone app) (640×832)

Interesting things to note:

  1. First of all… How did I not have this before? I feel so idiotic. Just as I was running the tests, giving cursory glances to the data, I started seeing so many obvious patterns emerge I wanted to kick myself. Having a standard performance benchmark to run in all your devices (and actually running it all the time) is way more fundamental than I would’ve expected, mainly because performance-wise, browsers are way more different than I expected.
  2. Comparing equal to equal, lighting vs non-lighting: Lighting doesn’t take a lot of time in Chrome, but it does take a lot of time in Firefox, and it takes a huge amount of time on the cell phones. This was a big red herring for me, and the main reason why not having this test framework before was a huge mistake. When I did lighting, I was checking how it affected performance, but just in Chrome, and since it was really fast, the price I was paying in render time was absolutely worth it to get the gorgeous effect. Testing in other browsers shows I have considerably less leeway when it comes to what I can do in the lighting department. I really need to improve that.
  3. The performance gains from the first stage of delta drawing (which was thoroughly incomplete, and by far the fastest version of it) were far bigger on my desktop (for both Chrome and FF) than on all the others. In the worst case (the iPhone), it is actually a bit slower than just drawing the whole frame every time. I’m not sure what this means. Probably the performance ratio on the iPhone of blitting against the extra JS processing I need to do on invalidated cells is completely different than in Chrome on my desktop. (Which would mean blitting on my desktop is stupidly slow compared to JS processing. It could be…) I’m going to keep delta drawing anyway, since it’s stupidly fast if you’re not scrolling, and I can work on improving the performance of the code I’m running every frame, but this was a big disappointment to find out.
  4. The little tweaks I made based on the profiler indications were a nice win. Not in Chrome, and I now know that my computer was the worst possible choice to run the profiler on, but the tradeoff of performance vs slightly uglier code definitely paid off. I’ll be doing considerably more of this, if I can. By that I mean… In my computer, I took the profiler as far as I could, until it started giving me stupid, incorrect data. Hopefully running it in my laptop will give me new useful information.
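To put point 2 in numbers, the lighting overhead can be computed straight from the two tables above. The sketch below uses the Version 6 times for a few of the columns; the helper name `lightingOverheadPercent` is made up for this example.

```javascript
// Extra time spent with lighting on, as a percentage of the lighting-off time,
// rounded to one decimal place.
function lightingOverheadPercent(withLightingMs, withoutLightingMs) {
  const ratio = (withLightingMs - withoutLightingMs) / withoutLightingMs;
  return Math.round(ratio * 1000) / 10;
}

// Version 6 times (ms) taken from the tables above: [with lighting, without].
const version6 = {
  'Desktop Chrome': [14922, 14029],
  'Desktop Firefox': [17842, 9141],
  'HTC Desire S Portrait': [94242, 66718],
  'iPhone 4 Portrait': [201195, 110656]
};

const overhead = {};
Object.keys(version6).forEach((device) => {
  overhead[device] = lightingOverheadPercent(version6[device][0], version6[device][1]);
});
// overhead: Desktop Chrome 6.4, Desktop Firefox 95.2,
//           HTC Desire S Portrait 41.3, iPhone 4 Portrait 81.8
```

So lighting adds about 6% in desktop Chrome, but nearly doubles the run time in desktop Firefox and on the iPhone.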

The main thing I got out of all this is: a lot of the things I had tested and found to make no difference only made no difference in Chrome, or on my beast of a computer. They do make a big difference in other browsers/platforms.

For example, I had a bunch of constants to turn things on/off (for testing/debugging purposes), like drawing FPSs, drawing Viewport Data, etc. This meant a bunch of “if we should do x, do it” checks that, with the flags off, didn’t do anything. Just checking the flag, in Chrome, didn’t make any difference at all, so I never bothered with it. When I removed those checks, my test runs on the iPhone without lighting went from 110s to 93s. That’s a pretty fucking big change, whereas in Chrome the difference was absolutely zero.
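What I did was simply remove the checks. A different option, if the flags are still wanted for debugging, would be to evaluate them once at startup and build the list of per-frame steps from that, so the frame loop itself contains no flag checks. This is only a sketch of that alternative; all the names (`DEBUG_FLAGS`, `buildFrameSteps`, `renderFrame` and the step functions) are hypothetical.

```javascript
// Hypothetical debug switches, checked once at startup instead of every frame.
const DEBUG_FLAGS = { drawFps: false, drawViewportData: false };

// Builds the list of functions to run on each frame, based on the flags.
function buildFrameSteps(flags, steps) {
  const active = [steps.drawWorld];
  if (flags.drawFps) active.push(steps.drawFps);
  if (flags.drawViewportData) active.push(steps.drawViewportData);
  return active;
}

// The per-frame loop: no flag checks left in here.
function renderFrame(activeSteps) {
  for (let i = 0; i < activeSteps.length; i++) {
    activeSteps[i]();
  }
}
```

Whether this is actually faster than the inline checks on any given device is exactly the kind of thing the benchmark would have to confirm.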

The part that sucks is that this means I need to revisit pretty much *all* my assumptions, everything, from the beginning. All the little benchmark tests I made for “unit things” need to be re-run, in all platforms.

Shit.

The second big thing I learned is that lighting is stupidly fast in my computer, and stupidly slow in all the others. That’s cool, it wasn’t written to be fast initially, it was written to be somewhat elegant and I never revisited that because it seemed fast enough (in my computer). I’ll be working *a lot* on that now.

The upside is, obviously, that it’s pretty obvious I’ll be able to squeeze some good amount of extra performance out of these…


So the moral is… Different devices / browsers are not just a matter of faster or slower. Some things will be faster or slower relative to other things. On my desktop, it seems like JS execution is stupidly fast compared to drawing to screen. On the cell phones, it’s exactly the opposite. Things that didn’t make any difference at all on my computer made huge differences on the mobiles.

So benchmark on all the devices. All the time. Every time.
