5/22/2013

The New (Time-of-Flight) Kinect Sensor and Speculations

In the official Xbox One reveal yesterday (2013/5/21), it was mentioned that the new Kinect sensor employs a time-of-flight camera to acquire depth images, instead of the Light Coding technology of the original Kinect (which is patented by PrimeSense). This amazed me because I had heard how expensive time-of-flight cameras are: the SwissRanger 4000 (a popular choice in academic research), with 176x144 resolution at 50 FPS, costs more than $4,000. So I started digging and found how ignorant I was of recent developments in the field.

To understand how time-of-flight (or TOF) cameras acquire a depth image, the Wikipedia page is a good place to start. The basic idea is that you measure the round-trip time (RTT) of photons that are emitted by the sensor and reflected back. Given the speed of light of 3x10^8 m/s, the sensor needs to resolve a time difference of roughly 6.7 picoseconds to measure a 1 millimeter difference in depth (1mm is the best precision the original Kinect could achieve; note the round trip doubles the path, so a 1mm depth change means a 2mm path change). This high-precision requirement, i.e. high-frequency RF, makes the chip and circuitry design challenging and costly. The surprise for me is that 3DV Systems (and Canesta) seemed to have found a way to lower the cost drastically, and planned to release an RGB-Depth sensor, called ZCam, for under $100. (But before 3DV could sell it, the company was bought by Microsoft.)
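The back-of-the-envelope number above can be checked with a few lines of code. This is just the idealized direct-pulse-timing arithmetic, using the rounded speed of light from the paragraph; real TOF sensors use modulated illumination and phase measurement rather than timing individual pulses.

```python
# Timing precision a TOF sensor would need for a given depth resolution,
# assuming ideal direct pulse timing and the rounded speed of light.

C = 3.0e8  # speed of light in m/s (rounded, as in the post)

def rtt_precision_for_depth(depth_resolution_m):
    """Round-trip-time precision needed to resolve the given depth step.

    The photon travels to the scene and back, so a depth change of d
    changes the round-trip path length by 2*d.
    """
    return 2.0 * depth_resolution_m / C

dt = rtt_precision_for_depth(0.001)  # 1 mm depth resolution
print(f"Required timing precision: {dt * 1e12:.2f} ps")  # ~6.67 ps
```

A 1 cm resolution would relax this to about 67 ps, which hints at why coarser, cheaper TOF sensors came first.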

Let's venture deeper into the hardware. Thanks to WIRED Magazine's exclusive report, we can see the performance and internals of the new Kinect sensor.

We can see from the picture of the circuit board that the three crucial components, the RGB camera, infrared (IR) sensor, and IR illuminator, are all present, not unlike the original Kinect, except that the IR sensor and illuminator are no longer exposed.
The external of the new Kinect sensor unveiled with Xbox One

Photo of the internal of a new Kinect from WIRED with labels added by me

Photo of an original Kinect sensor with components labeled. 

Thus the IR sensor and RGB camera are still separate (also confirmed by the different fields of view when switching between the two streams in the WIRED video: http://youtu.be/Hi5kMNfgDS4?t=5m27s). If the sensor could capture both RGB and IR simultaneously from the same sensor (or switch quickly between the two at 60Hz/60Hz or 30Hz/30Hz), texture-mapping alignment in KinectFusion-type applications would improve.

In case you wonder, the reason I argue that the IR illuminator is the big rectangular block to the left of the IR sensor (instead of the other way around) is that the active IR image shows shadows from IR illumination on the left, but the screen we see is a mirror image: http://youtu.be/Hi5kMNfgDS4?t=5m8s. This shows that the IR illuminator is to the left of the IR sensor (which, by the way, should have a lens like the exposed RGB camera).

Compared to the original Kinect, the most noticeable improvement of the new Kinect is the almost shadow-less depth image (see video http://youtu.be/Hi5kMNfgDS4?t=37s), due to the closer placement of the IR sensor and illuminator. In fact, TOF technology allows more flexibility in illumination placement and design (and thus a better depth image). For comparison, google "kinect shadow" and take a look at the images.

Looking forward, the natural question to ask is: when will depth-sensing technology go mobile? Canesta seems to have had a prototype product that could fit into the form factor of a phone (https://www.youtube.com/watch?v=5_PVx1NbUZQ), and that video is two years old.

The trail that led to this new Kinect sensor design is clear in retrospect. Microsoft acquired 3DV Systems and Canesta a few years ago, both of which had worked extensively on TOF technologies. The acquisitions obviously cleared up some patent concerns for Microsoft (bye, PrimeSense...). The downside is that we, as developers and consumers, might not see a more open-source-friendly alternative with similar technology anytime soon. And we will have to rely on Microsoft to release a good SDK, and live with Windows when using the new Kinect in commercial applications.

3 comments:

  1. My gut feeling is Microsoft is going to be much more developer-friendly with Xbox One/Kinect 2 than in the past. This console is obviously much more ambitious than just a videogame system and media device (part of the reason they announced it prior to E3 and didn't speak too much about games): it's intended to truly be the computer that takes over and operates the living room. It's very important they succeed at this, or they risk becoming irrelevant (Windows mobile/tablets are a fail imo, and Microsoft Office is under increased siege). Google has so far failed at getting the living room, and Microsoft knows they will very soon be under threat here from Apple and their strong developer ecosystem. If they've learned anything, they're not going to neglect their developers this time.

    Replies
    1. I have never developed on Xbox before, but I would guess that the Xbox SDK is one of the most heavily invested projects at Microsoft. I should have qualified my last proposition. I was mainly thinking about the ecosystem outside of console games: commercial applications that might not fit into the living room but would benefit from incorporating 3D reconstruction technologies.

      This is a thought inspired by your comment. It could be the case that Xbox One (with its alleged integration of 3 OS's, including Windows 7) will expand to do away with PCs, or take over many of PCs' tasks. For example, you could set up a virtual fitting room with Kinect in your living room for clothing e-commerce websites. The standard Windows OS (along with Kinect) might provide unique opportunities for non-gaming businesses to reach into the living room. (One more burden for developers in this case: you have to stick with the Xbox platform...)
