The Mathenaeum – A story about testing and use cases

The importance of unit testing to the software development process is by now well established. Advantages include: a) demonstrating the functionality of code units, b) highlighting any unwanted side-effects caused by new changes, and c) providing a B. F. Skinner-esque positive feedback loop that reflects the progress and success of one's development work. Perhaps most importantly, a test that fails gives visibility into each successive point of failure and helps drive the development process. In general, you can't fix the bugs you can't see, and the importance of baking QA into the development workflow cannot be overstated. Unit testing, regression testing and continuous integration are an essential part of the software development process at Three Byte.

The Mathenaeum exhibit, built for the Museum of Mathematics that opened this past December, is a highly optimized, multi-threaded piece of 3D graphics software written in C++ with OpenGL and the Cinder framework. The algorithms for manipulating complex objects across a wide range of geometric manipulations, and across multiple threads, were demanding, but for me the most challenging and edifying part of this project was hardware integration and effective testing. More specifically, working on the Mathenaeum taught me about the difficulties associated with, and the creativity required for, effective testing.

Unlike some software deadlines, MoMath was going to open to the general public on December 15 whether we were ready or not. At Three Byte we were balancing the pressure of getting our product ready to deliver against the knowledge that long nights and stressful bouts of overtime can introduce more bugs than they fix. Just before opening day, functionality on the Mathenaeum was complete. And we delivered…and the museum opened…and things looked fine…but every so often it would freeze. The freezes were infrequent, most visitors had a successful experience, and the show control software we wrote made it trivial for the MoMath floor staff to restart a frozen exhibit from a smartphone, but even an infrequent crash means a frustrated user and a failed exhibit experience, which was devastating to me.

Visitors at work in the Mathenaeum

Effectively testing the Mathenaeum was a challenge. The first issue I solved was a slow leak of OpenGL display lists that weren't being disposed of properly. This leak was aggravated by a bug in the communications protocol we had set up with a set of five LCD screens embedded in the Mathenaeum control deck. To set the screen state for the Arduinos we were creating and opening Windows Sockets 2 objects (SOCKET) but failing to close them. This meant we were leaking object handles and fragmenting memory, which caused the Mathenaeum to crash after consuming only 100 MB of memory.
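
The fix itself was a one-liner. A sketch of the corrected pattern (connection details elided, names simplified):

#include <winsock2.h>

// Every SOCKET opened to push screen state to an Arduino must be closed,
// or its handle leaks and the heap slowly fragments.
void sendScreenState(const char* state, int length) {
    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET)
        return;
    // ...connect to the screen's Arduino and send(s, state, length, 0)...
    closesocket(s); // the call we were missing
}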

Visual Leak Detector for C++ was helpful in finding leaks, but in the end tracking the correlation between memory consumption in the task manager and various operations was sufficient for localizing all the memory leaks. Despite plugging all the memory leaks, the sporadic crash/freeze persisted, and no matter what I tried I could not reproduce the bug on my development machine. Visibility into this issue was basically zero.

Everyone knows that a developer cannot be an effective tester of his or her own software. Therefore, when trying to reproduce the Mathenaeum crash, I would try to inhabit the psyche of a person who had never seen this software before and was feeling their way around for the first time. Everyone at Three Byte tried to reproduce this bug, but to no avail. So I started spending time at MoMath observing the interactions that happened there. Lots of adults and kids took the time to build stunning creations in 3D and took care to stylize every vertex and face with artistic precision. Some people were motivated by the novelty of the physical interface or the excitement of experimenting with the various geometric manipulations; others seemed motivated by a desire to create a stunning piece of visual art to share with the world in a digital gallery. In addition, the most popular creations were printed on a nearby 3D printer and put on display for all to see. I saw a mother stand by in awe as her eleven-year-old son learned to navigate the software and spent hours building an amazing creation. Watching people engage with my exhibit inspired me in a way I had never felt before and made me extremely proud to be a software developer.

However, I also saw a second type of interaction which was equally interesting. MoMath hosts a lot of school trips, and it's not uncommon for the museum floor to be "overrun" by hundreds of girls and boys under the age of eight. For these kids, the Mathenaeum is an amazingly dynamic contraption. The trackball (an undrilled bowling ball) can be made to spin at great speeds, the gearshift is a noisemaker when banged from side to side, and the throttle generates exciting visual feedback when jammed in both directions. For this use case the Mathenaeum is being used to its fullest when two kids are spinning the trackball as fast as possible while two others work the gearshift and throttle with breakneck force. It soon became clear to me that the Mathenaeum was failing because it was never tested against this second use case.

The first step in stress testing the Mathenaeum was making sure that my development machine used the same threading context as the production machines. Concretely, the Mathenaeum explicitly spawns four distinct threads: a) a render-loop thread, b) a trackball polling thread, c) an input polling thread, and d) a local visitor/RFID tag polling thread. Because the physical interface on my development machine differed from the trackball, gearshift and throttle on the deployment machines, my build was using only one thread for trackball and input polling (both emulated with the mouse). Replicating the deployment environment meant enforcing a threading context that was consistent in both places. In retrospect, this change was obvious and easy to implement, but I hadn't yet realized the importance of automated stress testing.
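
Concretely, that meant the development build spawns the same four threads as the exhibit. A minimal sketch, assuming C++11 threads and hypothetical function names:

#include <thread>

// Hypothetical polling routines standing in for the real ones.
void runRenderLoop();
void pollTrackball();
void pollGearshiftAndThrottle();
void pollVisitorTags();

void startExhibitThreads() {
    // Spawn the same four threads on every machine, even when the trackball
    // and gearshift are emulated with the mouse on a development box.
    std::thread render(runRenderLoop);            // a) render loop
    std::thread trackball(pollTrackball);         // b) trackball polling
    std::thread input(pollGearshiftAndThrottle);  // c) input polling
    std::thread visitors(pollVisitorTags);        // d) visitor/RFID tag polling
    render.join(); trackball.join(); input.join(); visitors.join();
}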

My observations at the museum inspired the construction of a new module called fakePoll(), responsible for injecting method calls into the two input polling threads as fast as my 3.20 GHz Intel Xeon processor would allow. This flood of redundant calls (similar, perhaps, to a team of second graders) works both input threads simultaneously, triggering all types of operations (and combinations thereof) and navigating the Mathenaeum state machine graph at great speed. In short, fakePoll() made it possible to easily test every corner of Mathenaeum functionality and exercise all the locks, mutexes and potential race conditions. Unsurprisingly, I was now able to crash the Mathenaeum in a fraction of a second – a veritable triumph!
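
Schematically, fakePoll() looks something like this (the handler interface and event names are hypothetical stand-ins for the real ones):

#include <atomic>
#include <cstdlib>

// Hypothetical stand-in for the real input handling interface.
struct InputHandler {
    std::atomic<bool> running{true};
    void onTrackball(int dx, int dy);
    void onGearshift(int direction);   // -1 = left, +1 = right
    void onThrottle(int position);
    void onButton();
};

// Hammer the handler with random events as fast as the CPU allows;
// one instance of this loop runs on each input polling thread.
void fakePoll(InputHandler& h) {
    while (h.running) {
        switch (std::rand() % 4) {
            case 0: h.onTrackball(std::rand() % 201 - 100, std::rand() % 201 - 100); break;
            case 1: h.onGearshift(std::rand() % 2 ? -1 : +1); break;
            case 2: h.onThrottle(std::rand() % 201 - 100); break;
            case 3: h.onButton(); break;
        }
        // No sleep: maximum pressure on every lock and state transition.
    }
}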

Given a failing test, I had new visibility into the points of failure, and I started uncovering threading problem after threading problem: numerous deadlocks, inconsistent states, rendering routines that weren't thread safe, and more. With every fix I was able to prolong the load test – first by fractions of a second, then to a few seconds, then to a minute, then a few minutes. Seeing all the threading mistakes I had missed was a little disheartening, but it was an important learning experience. Injecting other operations into other threads, such as an idle timeout to the attract screen and various visitor identification conditions, exposed further bugs.

In a single-threaded environment a heap corruption bug can be difficult to fix; however, by peppering your code with _ASSERTE(_CrtCheckMemory()); it's possible to do a binary search over your source code and home in on the fault. In a multithreaded application, solving this problem is like finding a needle in a haystack.
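
For example, with the Visual C++ debug CRT the peppering looks like this; each assertion validates the entire debug heap, so the first one to fire brackets the corrupting code (stepOne() and stepTwo() are hypothetical):

#include <crtdbg.h>

void stepOne();
void stepTwo();

void suspectRoutine() {
    _ASSERTE(_CrtCheckMemory());  // heap still intact on entry?
    stepOne();
    _ASSERTE(_CrtCheckMemory());  // fires here if stepOne() corrupted the heap
    stepTwo();
    _ASSERTE(_CrtCheckMemory());  // fires here if stepTwo() corrupted the heap
}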

After spending hours poring over the most meticulous and painstaking logs I have ever produced, I finally found an unsafe state transition in the StylizeEdges::handleButton() method. This bug – the least reproducible and most elusive of all the Mathenaeum bugs I solved – exposed a weakness in the basic architectural choice on which the whole Mathenaeum was built.

The state machine pattern is characterized by a collection of states, each deriving from a single base class, where each state is uniquely responsible for determining a) how to handle user input in that state, b) which states can be reached next, and c) what to show on screen. The pattern is great because it enforces an architecture built on modular components connected in an extensible network. In this architecture, no individual component is aware of the global topology of states, and states can be added or removed without side-effects or a cascade of changes. In the Mathenaeum, the specific set of operations and manipulations a user can perform with the gearshift, button and throttle depends on where that person stands within the network of available states.
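
A simplified sketch of the shape of this pattern (the Mathenaeum's real interface is richer, and Input here is a hypothetical event type):

class StateMachine;
struct Input; // hypothetical user input event

// Each concrete state decides a) how to handle input, b) which state
// comes next, and c) what to show on screen.
class State {
public:
    explicit State(StateMachine* machine) : _machine(machine) {}
    virtual ~State() {}
    virtual void handleInput(const Input& input) = 0; // a)
    virtual void draw() = 0;                          // c)
protected:
    StateMachine* _machine; // used to request the next state: b)
};

class StateMachine {
public:
    void setState(State* next) {
        delete _current;  // the old state is destroyed here...
        _current = next;  // ...and the new one takes over
    }
private:
    State* _current = nullptr;
};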

When a user navigates to the stylizeEdges state, they are able to set the diameter of their selected edges and then change the color of those edges. Once the color is set, we navigate them to the main menu state with the call:

_machine->setState(new MainMenuState(_machine));

The setState() method is responsible for deleting the current state and replacing it with a newly created one. At some point I realized that if the user sets all selected edges to diameter zero, effectively making them invisible, it doesn't make sense to let the user set the color of these edges. Therefore, before letting the user set the edge color, I added a check to see whether the edges under inspection had any diameter. If they had none, the user would be taken directly to the main menu state without being prompted to set an edge color.

This change set introduced a catastrophic bug: now the call to _machine->setState() could delete the stylizeEdges state before the handleButton() method had returned. In other words, the stylizeEdges state commits premature suicide (by deleting itself), resulting in memory corruption and an eventual crash. To fix the bug, I just had to ensure that the handleButton() method would return as soon as _machine->setState() was called.
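
The fix, sketched against the pattern above: treat setState() as the last thing a state ever does, because by the time it returns, 'this' has been deleted (selectedEdgesHaveZeroDiameter() is a hypothetical helper):

void StylizeEdges::handleButton() {
    if (selectedEdgesHaveZeroDiameter()) {
        _machine->setState(new MainMenuState(_machine));
        return; // 'this' was just deleted by setState(); touching any
                // member beyond this point is a use-after-free
    }
    // ...otherwise proceed to edge color selection as before...
}

Another way out would be for setState() to defer the delete until the machine's next update tick, which removes the trap from every state at once.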

Now my load test wasn't failing, and I was able to watch colors and shapes spinning and morphing on screen at incredible speeds for a full hour. I triumphantly pushed my changes to the exhibit on site and announced to the office: "the Mathenaeum software is now perfect." Of course it wasn't. The Mathenaeum still crashes after about five hours of load testing, and I have my eye out for the cause, but I don't think that bug will reproduce on site anytime soon, so it's low priority.

Some Mathenaeum creations:

Amichai

Low Latency Synchronization over the Internet

Previously, ActiveDeck was able to stay in sync within about 4 seconds between the PowerPoint computer and its neighboring iPads. However, this wasn't good enough for us – our background is in show control systems and frame-accurate video playback, and we have a curse of over-analyzing every video playback system we see for raster tear, frame skips and sync problems. Given that ActiveDeck relies solely on the internet, we thought 4 seconds was pretty good, considering. But we wanted to make it much, much better.

We thought about the best way to improve the latency. Our initial thinking leaned towards a local network broadcast originating from the computer running PowerPoint, but that introduces issues on WiFi networks, especially the ones you would find in hotel ballrooms. A VPN to the cloud service would be another option, but it adds a lot of complexity.

We ended up using pure HTTPS communications (no sockets, no VPN, no broadcasts) to and from the cloud servers with the use of some clever coding. If the iPad has internet connectivity, it will be in sync.

Check out the video: this is over a cable modem internet connection and a plain Linksys WRT54G access point. Our Windows Azure servers are at least 13 router hops from our office. The beautiful part is that the sync messages are tiny, so this will scale to hundreds of iPads.

Kinect: Cheap Key

The 3Byte R&D lab recently purchased a Microsoft Kinect to play with. We didn't mind the fact that we don't have an Xbox to plug it into, because Code Laboratories has published an SDK which allows you to use C# (and several other high-level languages) to access the camera feed. In fact, the test app that they distribute makes it immediately clear why this device is different from a normal webcam:

In addition to providing a normal color camera video stream (with red, green, and blue pixels), it also provides another dimension (literally) of depth information in a separate parallel stream. The picture above is me sitting at my desk, and the depth feed has been colorized to give a rough indication of where different objects are in the frame.

So, how do we do something useful with our new toy?

One thing that we immediately decided to try is Kinect Keying. The concept is similar to chroma keying but instead of requiring a solid blue or green colored background, we use the depth information from the Kinect to extract only the elements at a certain physical depth. I tackled this problem in a proof-of-concept project using WPF.

The important transformations happen in two steps:

    First, I create a mask by capturing the depth frame from the camera and choosing a specific depth value to isolate (plus or minus a margin of error). For every pixel in the depth frame, if it is within the desired depth slice, I keep it; if it is closer or farther away, I set that pixel to 0 so that we ignore it.
    Second, I combine the new depth mask with the normal incoming video signal: if the pixel from the depth mask is greater than 0, I keep the video pixel; otherwise, I set the video pixel's alpha to 0.0 so that it is totally transparent. (Both steps are sketched in code after this list.)
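
The proof of concept is WPF, but the per-pixel logic is language-neutral. Here is a minimal sketch of both steps in C++, under some assumptions about buffer layout (one 16-bit depth value per pixel, 8-bit BGRA video, frames of equal size):

#include <cstdint>

// Sketch of the two-step key. Assumed layout: depth[] holds one raw
// 16-bit value per pixel, video[] is 8-bit BGRA, both frames the same size.
void depthKey(const std::uint16_t* depth, std::uint8_t* video, int pixelCount,
              int targetDepth, int tolerance)
{
    for (int i = 0; i < pixelCount; ++i) {
        // Step 1: the mask. Is this pixel inside the desired depth slice?
        int d = depth[i];
        bool inSlice = d >= targetDepth - tolerance && d <= targetDepth + tolerance;
        // Step 2: composite. Keep the video pixel, or make it transparent.
        if (!inSlice)
            video[i * 4 + 3] = 0; // alpha channel of BGRA pixel i
    }
}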

Combine this with a background image, and we can send Mr. Gingerbread man on a trip to the desert:

The upper left-hand corner is the normal video feed of G-Bread standing on his desk. To the right is the grayscale version of the simultaneous depth feed from the camera. Anything in black is either too close or too far away for the camera to perceive, but that is OK, because we care about a particular section of the mid-field here.

On the bottom left is the depth mask I created by specifying a specific depth slice. The sliders at the bottom of the screen let you easily adjust the desired depth and the tolerance (how thick a slice).

Finally, on the lower right is the composited image with a static background. As you can see, this is a bit primitive, because the incoming depth signal is somewhat noisy and it isn't perfectly registered with the video image (the two cameras are in slightly different positions). But it demonstrates that a cheap keying effect is possible without specialized hardware or sets.

The source code as a Visual Studio project is available here: KinectDepthSample

With thanks to Code Laboratories for their great SDK and managed libraries, and to Greg Schechter for his series of articles on leveraging GPU acceleration through pixel shaders in a managed environment.

100 iPads

Have you ever wondered what 106 iPads look like when packed as densely as possible? Here is a picture:

For a recent project, we developed a synchronized iPad display app. The goal was to support a presentation with a new method of interacting with the participants. The designers liked the idea of handing out iPads to which they could "push" the content they wanted, when they wanted.

So, we fired up Xcode and built the iPad app. The application has several modes of operation. The main mode displays content driven by the presenter, so that on cue all of the iPads show new screens without any interaction from the person holding the iPad. This looks pretty awesome when it gets triggered and you can see all of the iPads change their screens at once.

Other modes are sort of like tests or drills, where the users complete a quiz and then submit that data to the presenter. We have another application there that creates graphs based on the statistics from all of the iPad users, which the presenter can speak to when they are projected on a large screen in the center of the room.

When we set out to design the system, we had to think through the potential bottlenecks. Our main concern was network latency, so after some research we specified the best wireless access points we could find – Ruckus Wireless. See these links:

http://www.ruckuswireless.com/

http://www.tomshardware.com/reviews/beamforming-wifi-ruckus,2390.html

We ended up with 5 access points and a network controller on a gigabit network. It worked great (a little bumpy the first day of the presentation due to a faulty access point).

Next, we created a back-end system where the content would be stored locally, yet could be updated during the presentation. Using IIS, we posted the images and XML files on a web service local to the event network. We then wrote a multi-threaded socket server on another computer that was dedicated to triggering page turns, mode changes, and fresh content downloads to the iPads.
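
We haven't published that server, but the heart of the trigger path is small. A toy sketch of the broadcast side (the command strings here are invented for illustration):

#include <winsock2.h>
#include <string>
#include <vector>

// Keep one connected socket per iPad and broadcast one-line commands,
// e.g. "PAGE 12", "MODE quiz" or "RELOAD" (formats invented here).
void broadcast(const std::vector<SOCKET>& clients, const std::string& command) {
    std::string line = command + "\r\n";
    for (SOCKET s : clients)
        send(s, line.c_str(), (int)line.size(), 0);
}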

Here is a video from some initial sync testing; this is all running from one access point, triggered by Chris and his PC.

Network Shutdown

Computers in AV Systems

All of the AV systems I’ve worked on recently include at least one computer. Because Windows computers are so general-purpose and typically inexpensive, they can be used for interactive touchscreen kiosks, video playback, audio playback, or many other useful functions.

However, one thing computers don't do well, at least out of the box, is listen to control systems.
The most essential function of an AV control system is to turn everything on at the start of the day and turn everything off at the end of the day. Not only does this protect the equipment (especially monitors and projectors), but it is also the green thing to do. Everyone is paying more attention to reducing power consumption, particularly when a system is not even being used. So we want to be able to turn non-essential computers on and off, too. And it turns out this is not as easy as it should be. This post describes the ways I have developed to handle this problem gracefully.

Startup

This is the easier of the two problems. Most modern computers include a BIOS setting that keeps the Ethernet adapter powered even when the machine is off, so the network adapter can respond to a Wake-On-LAN "magic packet." Even when the computer is turned off, you can power it up by sending a special packet that includes the computer's MAC address. I have written a Crestron module and a C# library to perform this function, and you are free to use them and see how they work.
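
The magic packet format is simple: six 0xFF bytes followed by the target's MAC address repeated sixteen times, sent as a UDP broadcast. The Crestron module and C# library wrap that idea; here is a minimal C++ sketch (WSAStartup and error checks omitted):

#include <winsock2.h>
#include <cstring>

void sendWakeOnLan(const unsigned char mac[6]) {
    // Build the 102-byte magic packet: 6 x 0xFF, then the MAC 16 times.
    unsigned char packet[102];
    std::memset(packet, 0xFF, 6);
    for (int i = 0; i < 16; ++i)
        std::memcpy(packet + 6 + i * 6, mac, 6);

    // Broadcast it on the local subnet; ports 7 and 9 are both customary.
    SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    BOOL on = TRUE;
    setsockopt(s, SOL_SOCKET, SO_BROADCAST, (const char*)&on, sizeof(on));
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9);
    addr.sin_addr.s_addr = INADDR_BROADCAST;
    sendto(s, (const char*)packet, sizeof(packet), 0, (sockaddr*)&addr, sizeof(addr));
    closesocket(s);
}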

Some older computers do not have network adapters or power supplies that support Wake-On-LAN. In this case, you can punt and set the BIOS to turn the computer on at a specific time of day. Even if you don't know exactly when it needs to be on, you can still reduce the computer's duty cycle by judiciously setting a daily startup time.

Shutdown

This is actually the difficult part. I couldn't find any built-in way to tell a Windows computer to shut down on command. You would think this should be easy, but computers are designed to protect the user from the outside world by default, so they don't let just anybody tell them what to do.

So, I wrote a small C# console application that runs in the background and listens on a well-defined network port for incoming messages. When it gets a "SHUTDOWN\x0D\x0A" message (i.e. "SHUTDOWN" with CR and LF appended), it issues the shutdown command to the operating system. This could work on any operating system, but I've implemented it for Windows, and the critical line of code looks like this:

System.Diagnostics.Process.Start("shutdown", "/s /f /t 3 /c \"Control System Triggered Shutdown\" /d p:0:0");
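
The shipped application is C#, but the listening loop is only a few lines in any language. A sketch of the same logic in C++/Winsock, using the message from above and the firewall port (16009) noted below (WSAStartup and error checks omitted):

#include <winsock2.h>
#include <cstdlib>
#include <string>

void listenForShutdown() {
    // Bind a UDP socket to the well-defined port and wait for the message.
    SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(16009);
    addr.sin_addr.s_addr = INADDR_ANY;
    bind(s, (sockaddr*)&addr, sizeof(addr));

    char buf[64];
    for (;;) {
        int n = recvfrom(s, buf, sizeof(buf), 0, nullptr, nullptr);
        if (n > 0 && std::string(buf, n) == "SHUTDOWN\r\n") {
            std::system("shutdown /s /f /t 3 /c \"Control System Triggered Shutdown\" /d p:0:0");
            break;
        }
    }
}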

The compiled application is called NetworkShutdown. Unzip it and put a copy on the computer you want to control, and add a shortcut to it to the Startup folder. You also need to make sure that UDP port 16009 is open in the Windows Firewall.

Then, any control system that can send UDP packets can be used to control this computer. For example, from Crestron just send the ASCII string "SHUTDOWN\x0D\x0A" (the message defined above) to the computer's IP address on UDP port 16009.

Conclusion

At the end of the day, it doesn’t take that much more programming work to ensure that computers can be turned on and off with your media system, and you can save a lot of energy in the process. When being green is easy, why not?

PrePix Virtualization Software

I cannot speculate as to the implications for their sync mechanism, except to observe that the rasters of the three screen segments visible in this video seem to be in good sync, though the frames themselves do not.

A PrePix Screenshot

Different technologies are modeled by selecting a Pixel Model, which is nothing more than a macro image representing a single pixel of the intended display technology displaying white at full brightness. Some sample Pixel Model images representing LCD and various LED technologies are included, but additional Pixel Models may be added easily by the user.

PrePix includes a graphical interface for setting up a virtual sign, with parameters for Pixel Model selection, pixel dimensions, physical sign dimensions, brightness gain, RGB level adjustments, and source video selection. PrePix supports 30 fps WMV video playback as well as JPG, GIF and BMP images.

Source video is automatically up-sampled or down-sampled to match the pixel dimensions of the virtual video surface. Then, a Pixel Model is applied, followed by the Gain and Color Balance layers. The virtual display is viewable in a virtual 3D space.
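
Conceptually, each rendered sample is just the product of those layers. A sketch of the per-channel math (names are ours; given the OpenGL/FBO requirement below, the real pipeline presumably does this on the GPU):

// One output sample = resampled source color x Pixel Model macro image
// sample x brightness gain x per-channel color balance.
struct RGB { float r, g, b; };

RGB shadeSample(RGB source,     // video pixel after up-/down-sampling
                RGB pixelModel, // sample from the Pixel Model image
                float gain,     // brightness gain
                RGB balance)    // RGB level adjustments
{
    RGB out;
    out.r = source.r * pixelModel.r * gain * balance.r;
    out.g = source.g * pixelModel.g * gain * balance.g;
    out.b = source.b * pixelModel.b * gain * balance.b;
    return out;
}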

In the 3D virtual space, the user may select from preset views including “Pixel Match,” which attempts to match large virtual LED pixels onto the PrePix user’s computer display so that the user may get a sense of the look of a virtual video surface in a real, physical space. Viewing a real 11mm LED display from 10 feet away should be comparable to viewing such a virtual display Pixel Matched to a 24” LCD display from the same distance, with the only significant difference being brightness. If this kind of modeling is deemed comparable and effective, then PrePix can be used to determine optimal viewing distances for different pixel pitches and LED technologies without requiring a lot of sample hardware.

PixelMatch close-up

An even more concrete use of PrePix is determining the effectiveness of video or image content when displayed on certain low-resolution LED signs. The LED signs on the sides of MTA buses in NYC have pixel dimensions of 288×56. They are often sourced with video content that was obviously designed for a higher-resolution display. With PrePix, a content producer can easily preview video content on a virtual LED bus sign to check text legibility and graphical effectiveness.

288x56-pixel source image

Close-up of 3D video display model

Full view of virtual LED sign. The moiré pattern is comparable to what would be seen in a digital photo of a real LED display.

3Byte would like to develop PrePix further, depending on feedback from users. Please let us know what you think!

PrePix can be downloaded here.

PrePix System Requirements:

  • Windows XP, Vista or 7
  • A graphics card supporting OpenGL 3.0 and FBOs (nearly all cards less than 18 months old)

3Byte can be contacted at info@3-byte.com