Drag & Drop with Kinect for Windows

With the inclusion of the InteractionStream and the ability to detect a Grip gesture in the Kinect for Windows SDK Update 1.7 it’s now possible to grab UI elements on a screen and move them around. This blog post shows a possible implementation in a WPF application. Please notice that I’m using the following nuGet packages

Code-Behind: MainWindow.cs

// this code can be called after initialization of the MainWindow

// Get a kinect instance with started SkeletonStream and DepthStream
var kinect = await KinectConnector.GetKinect();
kinect.KickStart();

// instantiate an object that implements IInteractionClient
var interactionClient = new InteractionClient();

// GetUserInfoObservable() method is available through Kinect.Reactive
kinect.GetUserInfoObservable(interactionClient)
      .SelectMany(_ => _.Select(__ => __.HandPointers.Where(___ => ___.IsActive)))
      .Where(_ => _.FirstOrDefault() != null)
      .Select(_ => _.First())
      .ObserveOnDispatcher()
      .Subscribe(_ =>
      {
            var region = this.kinectRegion;
            var p = new Point(_.X * region.ActualWidth, _.Y * region.ActualHeight);
            if (_.HandEventType == InteractionHandEventType.Grip)
            {
                  var elem = this.kinectRegion.InputHitTest(p) as TextBlock));
                  if (elem != null)
                  {
                        this.lastTouched = elem;
                  }									  
            }
            else if(_.HandEventType == InteractionHandEventType.GripRelease)
            {
                  this.lastTouched = null;
            }
            else
            {
                  if (this.lastTouched == null) return;
                  
                  Canvas.SetLeft(this.lastTouched, p.X - this.lastTouched.ActualWidth / 2);
                  Canvas.SetTop(this.lastTouched, p.Y - this.lastTouched.ActualHeight / 2);
            }
	});

XAML: MainWindow.xaml


<Window x:Class="DragAndDrop.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:dd="clr-namespace:DragAndDrop"
        xmlns:k="clr-namespace:Microsoft.Kinect.Toolkit.Controls;assembly=Microsoft.Kinect.Toolkit.Controls"
        Title="MainWindow" WindowState="Maximized">
    <Window.Resources>
        <Style TargetType="TextBlock">
            <Setter Property="Height" Value="200" />
            <Setter Property="Width" Value="200" />
            <Setter Property="Foreground" Value="White" />
            <Setter Property="FontWeight" Value="ExtraBold" />
            <Setter Property="FontSize" Value="35" />
            <Setter Property="Text" Value="Drag Me" />
            <Setter Property="TextAlignment" Value="Center" />
            <Setter Property="Background" Value="Black" />
        </Style>
    </Window.Resources>
    <k:KinectRegion x:Name="kinectRegion" KinectSensor="{Binding Kinect}">
        <Grid>
            <Grid.RowDefinitions>
                <RowDefinition Height="100" />
                <RowDefinition Height="*" />
            </Grid.RowDefinitions>
            <Grid Grid.Row="0">
                <k:KinectUserViewer x:Name="userViewer" />
            </Grid>
            <Canvas Grid.Row="1">
                <TextBlock Canvas.Left="50" Canvas.Top="50" />
                <TextBlock Canvas.Left="260" Canvas.Top="50" />
                <TextBlock Canvas.Left="470" Canvas.Top="50" />
                <TextBlock Canvas.Left="680" Canvas.Top="50" />
                <TextBlock Canvas.Left="890" Canvas.Top="50" />
            </Canvas>
        </Grid>
    </k:KinectRegion>
</Window>

Framework soup

Over the last months I’ve spent quite some time checking out the Kinect for Windows SDK and it’s really fun to work with this awesome little piece of hard and software.

While playing around with the Kinect for Windows SDK it soon became obvious that I’ll need some sort of framework that handles all the tough event handling stuff for me. Thats were the ReactiveExtensions framework came into play. Since I’ve first read about it I wanted to use it somewhere but didn’t get the chance so far. Now that the Kinect came around the corner with its event driven API Rx fitted perfectly to my requirements.

Another framework that impressed me a lot is SignalR, an abstraction layer for persistent connections over http.

So how do they all come together? In an application I called ‘KickerNotifier‘!
We have a ‘Kicker’ (thats german, in english foosball/tabletop football/tabletop soccer) to play with in the cellar of our office.

Problem is that the table stands in the cellar and you never know whether it is busy.

So the idea is the following: Set up a Kinect that is able to track the number of people in the room and push the amount to a website so that all the colleagues are able to see how many people are playing.

The technology stack:

The SignalR hub class
public class KickerNotifyHub : Hub
{
    private Int32 playerCount = -1;
    public void SetPlayerCount(Int32 count)
    {
        if (this.playerCount != count)
        {
            this.playerCount = count;
            Clients.setCurrentPlayerCount(this.playerCount);
        }
    }
}
Webpage hub connection
<script type="text/javascript" src="Scripts/jquery-1.6.4.js" />
<script type="text/javascript" src="Scripts/jquery.signalR-0.5.3.js" />
<script src="signalr/hubs" type="text/javascript" />
$(function () {
    var hub = $.connection.kickerNotifyHub;

    hub.setCurrentPlayerCount = function (count) {
        $('#currentPlayerCount').text(count); // updates <div id="currentPlayerCount" />
    };

    $.connection.hub.start();
});
Client console application
using System.Reactive.Disposables;
using System.Reactive.Linq;
using SignalR.Client;
using SignalR.Client.Hubs;

// some method

var personNotification = new PersonNotification();
var connection = new HubConnection("http://???????");
var hub = connection.CreateProxy("KickerNotifyHub");
var setPlayerCountSubscription = Disposable.Empty;
connection.Start().ContinueWith(task =>
{
   if (task.IsFaulted)
   {
      Console.WriteLine("Failed to start: {0}", task.Exception.GetBaseException());
   }
   else
   {
      Console.WriteLine("Success! Connected with client connection id {0}", connection.ConnectionId);

      setPlayerCountSubscription = Observable.Interval(TimeSpan.FromSeconds(1))
                                             .Subscribe(l => hub.Invoke("SetPlayerCount", personNotification.PersonCount));
   }
});

Console.ReadLine();
setPlayerCountSubscription.Dispose();
personNotification.Dispose();
PersonNotification.cs
public class PersonNotification
{
   private readonly KinectSensor kinect;
   private readonly IDisposable newSkeletonDataEvent;

   public PersonNotification() : IDisposable
   {
      this.kinect = KinectSensor.KinectSensors
                                .FirstOrDefault(s => s.Status == KinectStatus.Connected);
      if (this.kinect == null) throw new InvalidOperationException("No Kinect connected.");

      this.newSkeletonDataEvent = Observable.FromEventPattern(this.kinect, "SkeletonFrameReady")
                                            .Select(e => e.EventArgs)
                                            .Subscribe(NewSkeletonData);

      this.kinect.SkeletonStream.TrackingMode = SkeletonTrackingMode.Seated;
      this.kinect.SkeletonStream.Enable();
      this.kinect.Start();
   }

   private void NewSkeletonData(SkeletonFrameReadyEventArgs skeletonDataFrame)
   {
      using (var frame = skeletonDataFrame.OpenSkeletonFrame())
      {
         if (frame == null) return;

         var skeletons = new Skeleton[frame.SkeletonArrayLength];
         frame.CopySkeletonDataTo(skeletons);

         var personCount = skeletons.Count(s => s.TrackingState == SkeletonTrackingState.PositionOnly ||
                                                s.TrackingState == SkeletonTrackingState.Tracked);
         if (this.PersonCount != personCount)
         {
            this.PersonCount = personCount;
         }
      }
   }

   public Int32 PersonCount { get; private set; }

   public void Dispose()
   {
      this.newSkeletonDataEvent.Dispose();
   }
}

Kinect.Reactive

Events in the .NET programming model really don’t have anything in common with what the Object Oriented paradigm has taught us. Personally I dislike events because they are not first class objects but some kind of background compiler voodoo.

When it comes to writing against an event driven API like the Kinect for Windows SDK you don’t have much of a choice but programming against those events. But wait, there is this wonderful ReactiveExtensions library that comes to the rescue. These classes and extension methods give you the possibility to easily wrap an event in an object and furthermore the library provides you with tons of handy methods that you don’t have to code yourself.

So I decided to write my own IObservable extension methods to extend the Kinect API with the ReactiveExtensions programming model.
Here are two methods to give you an idea of what I’m talking about:

public static IObservable<AllFramesReadyEventArgs> GetAllFramesReadyObservable(this KinectSensor kinectSensor)
{
   if(kinectSensor == null) throw new ArgumentNullException("kinectSensor");

   return Observable.FromEventPattern<AllFramesReadyEventArgs>(h => kinectSensor.AllFramesReady += h,
                                                               h => kinectSensor.AllFramesReady -= h)
                    .Select(e => e.EventArgs);
}

public static IObservable<ColorImageFrameReadyEventArgs> GetColorFrameReadyObservable(this KinectSensor kinectSensor)
{
   if (kinectSensor == null) throw new ArgumentNullException("kinectSensor");

   return Observable.FromEventPattern<ColorImageFrameReadyEventArgs>(
                                                           h => kinectSensor.ColorFrameReady += h,
                                                           h => kinectSensor.ColorFrameReady -= h)
                    .Select(e => e.EventArgs);
}
[...]

And so on…