Beware of Reflection

The pprof flamegraph didn't look good. My portion of the Go code was slow. Very slow. And I didn't immediately understand why.

A few months earlier, I was responsible for designing the system that would validate our legacy data format. Roughly 100 different xml file types each with their own rules and constraints. Unmarshaling the data using the standard approach, custom structs and encoding/xml, was easy enough. The result was an in-memory tree. The nodes were either our custom defined types, usually structs, or builtin types. The system needed to traverse the tree and for every type that implemented a Validate method, call it to check for errors. The simplest way to implement this is to manually chain together all of the Validate methods that traverse the tree. After some discussion, however, the manual approach seemed tedious and error prone. What if we forgot to add one of the calls and a portion of the tree wasn't being validated? In hindsight, misplaced fear.

The clever solution, or so I thought, was to use reflection. With a few dozen lines of code, we could recursively traverse the entire tree and guarantee that all Validate implementations would be called. Initially this approach worked great and the team seemed to like it. But months later, it was clearly causing problems. Why was it so slow?

I tried a few experiments to gain a better understanding but they yielded little insight. I was stumped. Eventually I stared at the benchmark file data and the structs when I finally spotted a lead. Sometime after the initial implementation, a []byte field was added the tree and it could potentially grow to 10+ MB. Sure enough, the "clever" reflection code was looping over every byte in the slice and checking if the byte implemented the validation interface. Face palm. The quick fix was check the kind of the slice's element type and early-exit for builtins. The result was ~75,000 times faster for the worst case scenario which would be impressive if it wasn't so sad.

Rob Pike gave a presentation where he made the points that:

I was aware of this advice going into the initial design and somehow I still thought my case was an exception where reflection would be appropriate. But even as the author, it was challenging to figure out what was happening. The recursive reflection based code obfuscated the pprof so the investigation took longer. Whereas the approach with manual calls would have been immediately obvious. Not to mention, we would never have written a loop over the []byte using the manual approach.

All of this leads back to Go's core belief of optimizing for clear code for all future readers over easy to write code for the author.

Hi, I'm Eddie Scholtz. These are my notes. You can reach me at Atom