

This image is of me looking into one end of a system we built to explore the question of just how good conventional video conferencing can be made. We used a two-way mirror so that we could mount a high-quality camera behind a glass surface, on which you see the reflected image from an LCD monitor (lying flat on the table). By connecting two of these boxes together, you are able to make normal eye contact while video conferencing: because the camera sits right ‘behind’ the image of the other person, you are able to look right at them.

The Problem
Using Skype or FaceTime is never quite right, because you can’t make eye contact – the camera is above the edge of the screen, but the image you are looking at is below it. Besides fixing eye contact with the mirror setup, we also used HDMI 1080p cameras and monitors to create a very high resolution, low latency connection.

What we found on using this system was that although real eye contact is very compelling, you still don’t enjoy using the system as much as you’d expect… as with typical video conferencing, it still feels very far from ‘being there’.

Digging into this, a few problems stand out:

  • Video latency – As surprising as it may seem (and John Carmack has written at length on the details behind this), a camcorder with HDMI output connected directly to an LCD TV monitor at 60 Hz still typically exhibits a video delay of 100 msec! This is really frustrating, as the 60 Hz camera/screen rate would lead you to expect an average delay of about 16 msec, and yet you see more than 5x that.

  • Lighting – Even when you aren’t lit by the ugly white glare of a laptop monitor, it is almost impossible to create a lighting setup that simultaneously makes you look good to the other person while enabling you to see them. The warm white ‘magic hour’ light that makes YOU look best is shining right in the corner of your eye while you are trying to make out the other person. It’s like stage lighting – you can’t look good on stage AND see the audience.

  • Audio – We used high quality XLR-linked sets of high-exclusion headphones, but a big factor that is missing from video conferencing as compared to face-to-face conversation is the spatialization of audio. For example, if you turn your head while talking to me so that you are no longer facing me directly, I can easily hear the effect – the frequencies and timing of the sound change. This information is missing when we are on headphones.
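As a rough check on the video latency figures above, here is a minimal sketch of the arithmetic. The 60 Hz rate and the 100 msec delay come from the text; the function and variable names are made up for illustration, and the one-frame baseline is only the naive expectation, not a model of where the delay actually comes from.

```python
def frame_period_ms(refresh_hz: float) -> float:
    """Duration of a single frame, in milliseconds, at the given rate."""
    return 1000.0 / refresh_hz


def delay_in_frames(delay_ms: float, refresh_hz: float) -> float:
    """Express a delay as a number of frame periods at the given rate."""
    return delay_ms / frame_period_ms(refresh_hz)


# Figures quoted in the post (the names are hypothetical).
REFRESH_HZ = 60.0          # camera and screen rate
OBSERVED_DELAY_MS = 100.0  # typical camcorder-to-TV delay

period = frame_period_ms(REFRESH_HZ)
frames = delay_in_frames(OBSERVED_DELAY_MS, REFRESH_HZ)

print(f"One frame at {REFRESH_HZ:.0f} Hz lasts {period:.1f} ms")
print(f"{OBSERVED_DELAY_MS:.0f} ms of delay is about {frames:.1f} frame periods")
```

In other words, the observed delay is roughly six frames' worth of time, which is consistent with the 'more than 5x' figure.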

Looking Ahead
These observations support the idea that it could be better to optimize the technology that can turn us into highly realistic avatars, rather than improving on the state of the art in video conferencing. Instead of trying to capture the photons reflecting off your poorly-lit human face, we could capture sensor data that tells us about your movement, gaze, and facial cues at high resolution, and use that data to animate an avatar. Once we have you as an avatar, of course, we can do any lighting we’d like. We can also put you on a beach or in a huge boardroom filled with monitors, but that’s a story for another day.

Barbara Auntis on May 10, 2013

If it’s really true, and if eye contact really works at a high level, it opens a new era of video conferencing and it’s genius!


Kevin peterson on May 22, 2013

I believe using quality video conferencing service providers such as Polycom, Accutel, Vidyotel, RHUB web video conferencing appliances, etc. eliminates barriers to video conferencing such as video latency, audio, and lighting.


Julie Steinhaus on February 6, 2014

I am a teacher and I am desperate for the new platform, faster please! Seriously, after my positive experience with online learning in my own education, a lot of my daydreaming time has been spent imagining a free online university with avatars and audio instead of posting and email. I await the new world you are creating avidly…


Adam Crow on August 16, 2014

I imagined that a future video system would find participants’ eyes and change them live with a calculated fake overlay, so that they appear to be looking at the other parties.


Howie on August 3, 2015

I used to work for a video conferencing company, and we dealt with these guys, who did this back in the mid-2000s. It was a Tandberg codec modified to fit in a neat pop-up terminal with two-way mirror. The effect was pretty good, but this was pre-HD, and dedicated codecs weren’t cheap!




Philip Rosedale
Lifelong entrepreneur and technology innovator. CTO at RealNetworks, founder and CEO Linden Lab, creator of virtual world Second Life. Cofounder High Fidelity, Inc.