Custom IKinectController – Drag and Drop

Drag and Drop sample with IKinectManipulatableController

This sample shows how you can implement a custom KinectControl, for example to move things around in a Canvas.

To hook into the KinectRegion machinery, you can write your own UserControl that implements IKinectControl.
As you move your hand, KinectRegion tracks the movement and constantly checks whether a Kinect-enabled control sits at the current hand pointer’s position.
It finds such controls by looking for IKinectControl implementations.

The IKinectControl interface forces you to implement the method IKinectController CreateController(IInputModel inputModel, KinectRegion kinectRegion).
It also requires the properties IsManipulatable and IsPressable.
With these you specify which gestures your control reacts to.
Because this sample shows how to move controls around, the Draggable control’s IsManipulatable property returns true.

UserControl Draggable.cs implements IKinectControl

public sealed partial class Draggable : UserControl, IKinectControl
{
    public Draggable()
    {
        this.InitializeComponent();
    }

    public IKinectController CreateController(IInputModel inputModel, KinectRegion kinectRegion)
    {
        // Only one controller is instantiated for one Control
        var model = new ManipulatableModel(inputModel.GestureRecognizer.GestureSettings, this);
        return new DragAndDropController(this, model, kinectRegion);
    }

    public bool IsManipulatable { get { return true; } }

    public bool IsPressable { get { return false; } }
}

You’ll find two interfaces in the SDK’s *.Controls namespace that derive from IKinectController; as you’d expect, they correspond to the two properties you have to implement.

Those interfaces are:
– IKinectPressableController
– IKinectManipulatableController

Since IsManipulatable returns true, CreateController should return an instance of IKinectManipulatableController.
For an example of how to implement IKinectPressableController, see here.

Controller class DragAndDropController

public class DragAndDropController : IKinectManipulatableController, IDisposable

The DragAndDropController’s constructor receives a reference to the manipulatable control, a ManipulatableModel and a reference to the KinectRegion.

public DragAndDropController(FrameworkElement element, ManipulatableModel model, KinectRegion kinectRegion)
    this.element = new WeakReference(element);
    this.kinectRegion = kinectRegion;
    this.inputModel = model;

    if (this.inputModel == null)

The ManipulatableModel provides four events you can subscribe to in order to react to user input.
This sample uses the NuGet package Kinect.ReactiveV2.Input, which provides Rx extension methods for subscribing to these events.

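    this.eventSubscriptions = new CompositeDisposable
    {
        this.inputModel.ManipulationStartedObservable()
                       .Subscribe(_ => VisualStateManager.GoToState(this.Control, "Focused", true)),

        this.inputModel.ManipulationInertiaStartingObservable()
                       .Subscribe(_ => Debug.WriteLine(string.Format("ManipulationInertiaStarting: {0}, ", DateTime.Now))),

        this.inputModel.ManipulationUpdatedObservable()
                       .Subscribe(_ => OnManipulationUpdated(_)),

        this.inputModel.ManipulationCompletedObservable()
                       .Subscribe(_ => VisualStateManager.GoToState(this.Control, "Unfocused", true)),
    };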

All subscriptions are composed into one CompositeDisposable that is disposed in the controller’s Dispose() method.

  • ManipulationStartedObservable –> fires when the user closes their hand
  • ManipulationInertiaStartingObservable –> ???
  • ManipulationUpdatedObservable –> fires when the user moves their hand while keeping it closed
  • ManipulationCompletedObservable –> fires when the user opens their hand again

For this sample the most interesting observable is ManipulationUpdatedObservable. Every time this event fires, the method OnManipulationUpdated is called.

private void OnManipulationUpdated(KinectManipulationUpdatedEventArgs args)
{
    var dragableElement = this.Element;
    if (!(dragableElement.Parent is Canvas)) return;

    var delta = args.Delta.Translation;
    var translationPoint = new Point(delta.X, delta.Y);
    var translatedPoint = InputPointerManager.TransformInputPointerCoordinatesToWindowCoordinates(translationPoint, this.kinectRegion.Bounds);

    var offsetY = Canvas.GetTop(dragableElement);
    var offsetX = Canvas.GetLeft(dragableElement);

    if (double.IsNaN(offsetY)) offsetY = 0;
    if (double.IsNaN(offsetX)) offsetX = 0;

    Canvas.SetTop(dragableElement, offsetY + translatedPoint.Y);
    Canvas.SetLeft(dragableElement, offsetX + translatedPoint.X);
}

If this IKinectControl is placed inside a Canvas, its position is updated relative to the user’s hand.
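The position update boils down to a small piece of arithmetic. As a minimal sketch, assuming a hypothetical helper named CanvasOffset (it is not part of the sample or the SDK), the NaN guard and the addition of the translated delta could be isolated like this:

```csharp
using System;

// Hypothetical helper for illustration only; not part of the sample or the Kinect SDK.
public static class CanvasOffset
{
    // The sample treats a NaN offset (a position that was never set) as 0
    // before adding the translated hand delta.
    public static double Apply(double currentOffset, double delta)
    {
        var start = double.IsNaN(currentOffset) ? 0 : currentOffset;
        return start + delta;
    }
}
```

OnManipulationUpdated would then call it once for the top and once for the left offset, which keeps the guard in one place and makes it testable without a Canvas.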

The complete code is found here.
This article in markdown is found here.

HeiRes – MS Hackathon – PartyCrasher

On 23rd May 2014, HeiRes and Microsoft Germany organized a hackathon in my hometown, Dresden. The goal was to build an app in about 8 hours through the night.
I joined a team with Christian, and our idea was to build a ‘PartyCrasher’ app. When you leave your kids home alone over the weekend, they may get the idea to throw a party at your house, which you may not want because they made a mess the last time. So why not set up the PartyCrasher app in your living room? It consists of a Kinect for Windows sensor hooked up to a little PC. The Kinect watches the living room, and as soon as more than a specified number of people are in the scene (for example, more than 3) it starts taking pictures at an interval you can configure (maybe every minute).

So we started the hackathon with this little architecture in mind.

In this blog post I want to share how easy it was to program this little Kinect service (a console application) that is hooked up to the Kinect hardware. We used the Kinect for Windows SDK V2 Preview, Kinect.ReactiveV2 and Rx to take a photo at the configured interval as soon as there are more than the specified number of people in the scene.

using Kinect.ReactiveV2;
using Microsoft.Kinect;
using System;
using System.Linq;
using System.Reactive.Linq;

static void Main(string[] args)
{
  var kinect = KinectSensor.Default;

  var frameDescription = kinect.ColorFrameSource.CreateFrameDescription(ColorImageFormat.Rgba);
  var bytes = new byte[frameDescription.Width * frameDescription.Height * 4];

  var moreThanPeople = 3;
  var intervalInSeconds = 60;

  var reader = kinect.ColorFrameSource.OpenReader();
  var bodies = new Body[6];

  var subscription = kinect.BodyFrameArrivedObservable()
                           .Where(_ => _.Count() > moreThanPeople)
                           .Subscribe(bs =>
                           {
                             // Copy the latest color frame, then release it before uploading
                             using(var frame = reader.AcquireLatestFrame())
                             {
                               if(frame == null) return;
                               frame.CopyConvertedFrameDataToArray(bytes, ColorImageFormat.Rgba);
                             }

                             SaveInBlobStorage(frameDescription, bytes);
                           });

  Console.WriteLine("[ENTER] to stop");
  Console.ReadLine();
}
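The listing does not show where intervalInSeconds is applied. One way to combine the two conditions, sketched here without Rx and with a hypothetical class name (PhotoTrigger is not taken from the hackathon code), is to remember when the last picture was taken:

```csharp
using System;

// Hypothetical helper; names are illustrative and not taken from the hackathon code.
public sealed class PhotoTrigger
{
    private readonly int moreThanPeople;
    private readonly TimeSpan interval;
    private DateTime? lastShot;

    public PhotoTrigger(int moreThanPeople, TimeSpan interval)
    {
        this.moreThanPeople = moreThanPeople;
        this.interval = interval;
    }

    // Returns true when a picture should be taken for the current frame.
    public bool ShouldCapture(int peopleInScene, DateTime now)
    {
        // Not enough people in the scene: never capture.
        if (peopleInScene <= this.moreThanPeople) return false;

        // Enough people, but the previous picture is still too recent.
        if (this.lastShot.HasValue && now - this.lastShot.Value < this.interval) return false;

        this.lastShot = now;
        return true;
    }
}
```

Inside the Subscribe callback, a check such as trigger.ShouldCapture(count, DateTime.Now) could guard the call to SaveInBlobStorage. Passing the clock in as a parameter keeps the class testable.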


We continued the hackathon by implementing the bits that saved the pictures in Azure Blob Storage. The file references to the blobs were saved in a RavenDB database on RavenHQ. Later on we implemented an ASP.NET MVC service on Azure Websites that served the pictures to a Windows Store app. While implementing the picture download in the Windows Store app we unfortunately ran out of time. This was the app when we had to stop.

Anyway, the whole hackathon was really good fun. Big thanks to HeiRes and Microsoft, and maybe some time in the future we’ll finish the PartyCrasher app.

Kinect.ReactiveV2 – Rx-ing the Kinect for Windows SDK

A few weeks ago I was finally able to get my hands on the new Kinect for Windows V2 SDK. There are a few API changes compared to V1, so I started to port Kinect.Reactive to the new Kinect for Windows Dev Preview SDK, and Kinect.ReactiveV2 was born.

Kinect.ReactiveV2 is, like its older brother, a project that contains a bunch of extension methods that should ease development with the Kinect for Windows SDK. The project uses the Reactive Extensions (an open source framework built by Microsoft) to transform the various Kinect reader events into IObservable<T> sequences. This transformation enables you to use LINQ-style query operators on those events.

Here is an example of how to use the BodyIndexFrame data as an observable sequence.

using System.Linq;
using System.Reactive;
using Microsoft.Kinect;
using Kinect.ReactiveV2;

var sensor = KinectSensor.Default;

var bodyIndexFrameDescription = sensor.BodyIndexFrameSource.FrameDescription;
var bodyIndexData = new byte[bodyIndexFrameDescription.LengthInPixels];

      .Subscribe(data => someBitmap.WritePixels(rect, data, stride, 0));

You’ll also get an extension method called SceneChanges() on every KinectSensor instance, which notifies all its subscribers whenever a person enters or leaves the scene.

using System;
using System.Linq;
using System.Reactive;
using Microsoft.Kinect;
using Kinect.ReactiveV2;

var sensor = KinectSensor.Default;

sensor.SceneChanges()
      .Subscribe(_ =>
      {
            if (_.SceneChangedType is PersonEnteredScene)
                  Console.WriteLine("Person {0} entered scene", _.SceneChangedType.TrackingId);
            else if (_.SceneChangedType is PersonLeftScene)
                  Console.WriteLine("Person {0} left scene", _.SceneChangedType.TrackingId);
      });
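How such enter and leave notifications can be derived is not shown in this post. A plausible approach, sketched under the assumption that each frame yields the set of currently tracked ids, is to compare consecutive sets. SceneDiff is a hypothetical name, and this is not the Kinect.ReactiveV2 implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the idea; not the Kinect.ReactiveV2 implementation.
public static class SceneDiff
{
    // Tracking ids that are present now but were absent in the previous frame.
    public static ulong[] Entered(IEnumerable<ulong> previous, IEnumerable<ulong> current)
    {
        return current.Except(previous).OrderBy(id => id).ToArray();
    }

    // Tracking ids that were present in the previous frame but are gone now.
    public static ulong[] Left(IEnumerable<ulong> previous, IEnumerable<ulong> current)
    {
        return previous.Except(current).OrderBy(id => id).ToArray();
    }
}
```

With Rx, consecutive frames could be paired up (for example with Buffer(2, 1)) and each pair passed through these two functions.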

So far, extension methods are included for the BodyFrame, BodyIndexFrame, ColorFrame, DepthFrame, InfraredFrame and MultiSourceFrame.

The source code is available here.
Download the NuGet package from here, or type Install-Package Kinect.ReactiveV2 directly into the package manager console.

Please be aware that “This is preliminary software and/or hardware and APIs are preliminary and subject to change”.

ContinousGrippedState in Kinect.Reactive

For a while now I have been wondering why the Kinect’s InteractionStream sends only one InteractionHandEventType.Grip when the user closes their hand. While the user keeps the hand closed, the SDK fires events that have a HandEventType of None. This confused me from the very beginning. With mouse events, by comparison, you get continuous mousedown events as long as the user does not release the mouse button.

So I thought about a way to get the same functionality when using the Kinect for Windows SDK 1.x InteractionStream.
This extension method solved my problem and is now part of Kinect.Reactive:

/// <summary>
/// Returns a sequence with continuous GrippedState HandEventType until GripRelease.
/// </summary>
/// <param name="source">The source observable.</param>
/// <returns>The observable.</returns>
public static IObservable<UserInfo[]> ContinousGrippedState(this IObservable<UserInfo[]> source)
{
  if (source == null) throw new ArgumentNullException("source");

  var memory = new Dictionary<Tuple<int, InteractionHandType>, object>();
  var propInfo = typeof(InteractionHandPointer).GetProperty("HandEventType");
  var handEventTypeSetter = new Action<InteractionHandPointer>(o => propInfo.SetValue(o, InteractionHandEventType.Grip));

  return source.Select(_ =>
  {
    _.ForEach(u => u.HandPointers.ForEach(h =>
    {
     if (h.HandEventType == InteractionHandEventType.Grip)
        memory.Add(Tuple.Create(u.SkeletonTrackingId, h.HandType), null);
     else if (h.HandEventType == InteractionHandEventType.GripRelease)
        memory.Remove(Tuple.Create(u.SkeletonTrackingId, h.HandType));
     else if (memory.ContainsKey(Tuple.Create(u.SkeletonTrackingId, h.HandType)))
        handEventTypeSetter(h); // hand is still closed: report Grip again
    }));

   return _;
  });
}

Use the extension method this way and you’ll get continuous events with e.HandEventType == InteractionHandEventType.Grip until you release your hand.

IDisposable subscription = null;

KinectConnector.GetKinect().ContinueWith(k =>
{
  var disp = k.Result.KickStart(true)
              .GetUserInfoObservable(new InteractionClientConsole())
              .SelectMany(_ => _.Select(__ => __.HandPointers.Where(CheckForRightGripAndGripRelease)))
              .Subscribe(_ => _.ForEach(__ => Console.WriteLine(String.Format("Active: {0}, HandEventType: {1}", __.HandType, __.HandEventType))));
  subscription = disp;
});
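The idea behind the extension method, remembering which hands are currently gripped and reporting Grip until a GripRelease arrives, can be tried without the Kinect SDK. The following is a minimal sketch; HandEvent and GripLatch are hypothetical stand-ins, not types from the SDK or from Kinect.Reactive:

```csharp
using System;
using System.Collections.Generic;

// Stand-in for InteractionHandEventType so the sketch runs without the Kinect SDK.
public enum HandEvent { None, Grip, GripRelease }

// Hypothetical helper; it mirrors the idea of the extension method, not its code.
public sealed class GripLatch
{
    // One entry per (user, hand) pair that is currently closed.
    private readonly HashSet<Tuple<int, string>> gripped = new HashSet<Tuple<int, string>>();

    // Returns the event a subscriber should see for one hand of one user.
    public HandEvent Next(int trackingId, string hand, HandEvent incoming)
    {
        var key = Tuple.Create(trackingId, hand);

        if (incoming == HandEvent.Grip)
        {
            this.gripped.Add(key);
            return HandEvent.Grip;
        }

        if (incoming == HandEvent.GripRelease)
        {
            this.gripped.Remove(key);
            return HandEvent.GripRelease;
        }

        // A None event while the hand is remembered as closed is reported as Grip.
        return this.gripped.Contains(key) ? HandEvent.Grip : HandEvent.None;
    }
}
```

Unlike the Dictionary used above, HashSet.Add does not throw when the same key is added twice, so a repeated Grip event is harmless in this variant.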


Drag & Drop with Kinect for Windows

With the inclusion of the InteractionStream and the ability to detect a grip gesture in the Kinect for Windows SDK Update 1.7, it’s now possible to grab UI elements on the screen and move them around. This blog post shows a possible implementation in a WPF application. Please note that I’m using the following NuGet packages

Code-Behind: MainWindow.cs

// this code can be called after initialization of the MainWindow

// Get a kinect instance with started SkeletonStream and DepthStream
var kinect = await KinectConnector.GetKinect();

// instantiate an object that implements IInteractionClient
var interactionClient = new InteractionClient();

// GetUserInfoObservable() method is available through Kinect.Reactive
kinect.GetUserInfoObservable(interactionClient)
      .SelectMany(_ => _.Select(__ => __.HandPointers.Where(___ => ___.IsActive)))
      .Where(_ => _.FirstOrDefault() != null)
      .Select(_ => _.First())
      .Subscribe(_ =>
      {
            var region = this.kinectRegion;
            var p = new Point(_.X * region.ActualWidth, _.Y * region.ActualHeight);
            if (_.HandEventType == InteractionHandEventType.Grip)
            {
                  // remember the element under the hand when the grip starts
                  var elem = this.kinectRegion.InputHitTest(p) as TextBlock;
                  if (elem != null)
                        this.lastTouched = elem;
            }
            else if(_.HandEventType == InteractionHandEventType.GripRelease)
            {
                  this.lastTouched = null;
            }
            else
            {
                  // while gripped, keep the element centred on the hand
                  if (this.lastTouched == null) return;
                  Canvas.SetLeft(this.lastTouched, p.X - this.lastTouched.ActualWidth / 2);
                  Canvas.SetTop(this.lastTouched, p.Y - this.lastTouched.ActualHeight / 2);
            }
      });
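The placement arithmetic in the code-behind can be separated from the UI code. This sketch assumes, as the multiplication by ActualWidth and ActualHeight suggests, that the hand pointer’s X and Y are normalized to the region. HandToCanvas is a hypothetical helper, not part of Kinect.Reactive or the SDK:

```csharp
using System;

// Hypothetical helper for illustration; not part of Kinect.Reactive or the SDK.
public static class HandToCanvas
{
    // Scales a normalized hand coordinate to pixels and shifts it by half
    // the element size, so that the element ends up centred on the hand.
    public static double TopLeft(double normalized, double regionSize, double elementSize)
    {
        return normalized * regionSize - elementSize / 2;
    }
}
```

For a hand in the middle of a 1000 pixel wide region and a 200 pixel wide TextBlock, this gives a Canvas.Left of 400.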

XAML: MainWindow.xaml

<Window x:Class="DragAndDrop.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:k="http://schemas.microsoft.com/kinect/2013"
        Title="MainWindow" WindowState="Maximized">
    <Window.Resources>
        <Style TargetType="TextBlock">
            <Setter Property="Height" Value="200" />
            <Setter Property="Width" Value="200" />
            <Setter Property="Foreground" Value="White" />
            <Setter Property="FontWeight" Value="ExtraBold" />
            <Setter Property="FontSize" Value="35" />
            <Setter Property="Text" Value="Drag Me" />
            <Setter Property="TextAlignment" Value="Center" />
            <Setter Property="Background" Value="Black" />
        </Style>
    </Window.Resources>
    <k:KinectRegion x:Name="kinectRegion" KinectSensor="{Binding Kinect}">
        <Grid>
            <Grid.RowDefinitions>
                <RowDefinition Height="100" />
                <RowDefinition Height="*" />
            </Grid.RowDefinitions>
            <Grid Grid.Row="0">
                <k:KinectUserViewer x:Name="userViewer" />
            </Grid>
            <Canvas Grid.Row="1">
                <TextBlock Canvas.Left="50" Canvas.Top="50" />
                <TextBlock Canvas.Left="260" Canvas.Top="50" />
                <TextBlock Canvas.Left="470" Canvas.Top="50" />
                <TextBlock Canvas.Left="680" Canvas.Top="50" />
                <TextBlock Canvas.Left="890" Canvas.Top="50" />
            </Canvas>
        </Grid>
    </k:KinectRegion>
</Window>

Subscribing to the InteractionStream the Rx way

The most exciting feature in the Kinect for Windows SDK Update 1.7 was probably the InteractionStream. The InteractionStream is based on the SkeletonStream and the DepthStream and it enables you to detect basic interactions like a grip gesture or a button push in a Kinect for Windows application.
Since the InteractionStream needs Skeleton- and DepthData for its calculations you have to provide the InteractionStream with depth and skeleton data whenever new frames are available.
A straightforward approach to do that could be the following:

var skeletonData = // initialize array
var depthData = // initialize array
var kinect = // somehow get a Kinect sensor instance
IInteractionClient interactionClient = // a class that implements IInteractionClient

var interactionStream = new InteractionStream(kinect, interactionClient);

kinect.AllFramesReady += (s, e) =>
{
    long skeletonTimestamp = 0;
    long depthTimestamp = 0;
    var accelerometerReading = kinect.AccelerometerGetCurrentReading();

    using (var depthImageFrame = e.OpenDepthImageFrame())
    using (var skeletonFrame = e.OpenSkeletonFrame())
    {
      if (depthImageFrame == null || skeletonFrame == null) return;

      // copy the frame data out before the frames are disposed
      skeletonFrame.CopySkeletonDataTo(skeletonData);
      skeletonTimestamp = skeletonFrame.Timestamp;
      depthData = depthImageFrame.GetRawPixelData();
      depthTimestamp = depthImageFrame.Timestamp;
    }

    interactionStream.ProcessDepth(depthData, depthTimestamp);
    interactionStream.ProcessSkeleton(skeletonData, accelerometerReading, skeletonTimestamp);
};

interactionStream.InteractionFrameReady += OnInteractionFrameReady;

// The method that handles the InteractionFrameReady events
private void OnInteractionFrameReady(object sender, InteractionFrameReadyEventArgs e)
{
    UserInfo[] userInfos = // initialize array
    using (var interactionFrame = e.OpenInteractionFrame())
    {
      if (interactionFrame != null)
        interactionFrame.CopyInteractionDataTo(userInfos);
    }

    // do something with the UserInfos array
}

Since I am a huge fan of the Reactive Extensions framework, I tried to find a way to encapsulate this code in one method that produces an IObservable. I wanted to be able to subscribe to the InteractionStream in the same ‘Rx way’ that I am used to when subscribing to the SkeletonStream, for example. So the goal was to have something like this:

kinect.InteractionStreamObservable().Subscribe(userInfos =>
{
    // do something useful with the userInfos
});

This is the solution I have found and it is already included in the Kinect.Reactive NuGet package.

public static IObservable<UserInfo[]> GetUserInfoObservable(this KinectSensor kinectSensor, IInteractionClient interactionClient)
    // null checks and checks if streams are enabled

    return Observable.Create<UserInfo[]>(obs =>
      var stream = new InteractionStream(kinectSensor, interactionClient);
      var allFramesSub = 
                    .SelectStreams((_, __) => Tuple.Create(_.Timestamp, __.Timestamp))
	                .Subscribe(_ =>
                          var accelerometer = kinectSensor.AccelerometerGetCurrentReading();
                          stream.ProcessSkeleton(_.Item3, accelerometer, _.Item4.Item1);
                          stream.ProcessDepth(_.Item2, _.Item4.Item2);

	        .Subscribe(_ => obs.OnNext(_));

      return new Action(() =>

Subscribing to the InteractionStream is now very easy, and you get all the benefits of Rx on top.

kinect.GetUserInfoObservable(new InteractionClient())
      .SelectMany(_ => _.Where(userInfo => userInfo.SkeletonTrackingId == 1))
      .SelectMany(_ => _.HandPointers.Where(handPointer => handPointer.HandType == InteractionHandType.Right))
      // and so on…

Framework soup

Over the last months I’ve spent quite some time checking out the Kinect for Windows SDK, and it’s really fun to work with this awesome little piece of hardware and software.

While playing around with the Kinect for Windows SDK it soon became obvious that I’d need some sort of framework that handles all the tough event handling stuff for me. That’s where the Reactive Extensions framework came into play. Ever since I first read about it I had wanted to use it somewhere but hadn’t had the chance. Now that the Kinect came around the corner with its event-driven API, Rx fitted my requirements perfectly.

Another framework that impressed me a lot is SignalR, an abstraction layer for persistent connections over HTTP.

So how do they all come together? In an application I called ‘KickerNotifier’!
We have a ‘Kicker’ (that’s German; in English: foosball, table football or table soccer) to play with in the cellar of our office.

The problem is that the table stands in the cellar and you never know whether it is busy.

So the idea is the following: set up a Kinect that tracks the number of people in the room and pushes that number to a website, so that all colleagues can see how many people are playing.

The technology stack:

The SignalR hub class
public class KickerNotifyHub : Hub
{
    private Int32 playerCount = -1;
    public void SetPlayerCount(Int32 count)
    {
        if (this.playerCount != count)
            this.playerCount = count;
    }
}
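The hub’s check, which reacts only when the count differs from the previous one, can be tried in isolation. This is a minimal sketch with a hypothetical class name (ChangeOnlyNotifier). The publish callback stands in for whatever pushes the value to the connected browsers, which this listing does not show:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; it only illustrates the "notify on change" check.
public sealed class ChangeOnlyNotifier
{
    private readonly Action<int> publish;
    private int? last; // null until the first value has been forwarded

    public ChangeOnlyNotifier(Action<int> publish)
    {
        if (publish == null) throw new ArgumentNullException("publish");
        this.publish = publish;
    }

    // Forwards the value only if it differs from the previously forwarded one.
    public void Set(int count)
    {
        if (this.last == count) return;

        this.last = count;
        this.publish(count);
    }
}
```

Since the console client calls SetPlayerCount once per second, a check like this keeps the web page from being updated when nothing has changed.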
Webpage hub connection
<script type="text/javascript" src="Scripts/jquery-1.6.4.js"></script>
<script type="text/javascript" src="Scripts/jquery.signalR-0.5.3.js"></script>
<script src="signalr/hubs" type="text/javascript"></script>
<script type="text/javascript">
$(function () {
    var hub = $.connection.kickerNotifyHub;

    hub.setCurrentPlayerCount = function (count) {
        $('#currentPlayerCount').text(count); // updates <div id="currentPlayerCount" />
    };

    $.connection.hub.start();
});
</script>

Client console application
using System.Reactive.Disposables;
using System.Reactive.Linq;
using SignalR.Client;
using SignalR.Client.Hubs;

// some method

var personNotification = new PersonNotification();
var connection = new HubConnection("http://???????");
var hub = connection.CreateProxy("KickerNotifyHub");
var setPlayerCountSubscription = Disposable.Empty;
connection.Start().ContinueWith(task =>
{
   if (task.IsFaulted)
   {
      Console.WriteLine("Failed to start: {0}", task.Exception.GetBaseException());
   }
   else
   {
      Console.WriteLine("Success! Connected with client connection id {0}", connection.ConnectionId);

      setPlayerCountSubscription = Observable.Interval(TimeSpan.FromSeconds(1))
                                             .Subscribe(l => hub.Invoke("SetPlayerCount", personNotification.PersonCount));
   }
});

public class PersonNotification : IDisposable
{
   private readonly KinectSensor kinect;
   private readonly IDisposable newSkeletonDataEvent;

   public PersonNotification()
   {
      this.kinect = KinectSensor.KinectSensors
                                .FirstOrDefault(s => s.Status == KinectStatus.Connected);
      if (this.kinect == null) throw new InvalidOperationException("No Kinect connected.");

      this.newSkeletonDataEvent = Observable.FromEventPattern<SkeletonFrameReadyEventArgs>(this.kinect, "SkeletonFrameReady")
                                            .Select(e => e.EventArgs)
                                            .Subscribe(this.NewSkeletonData);

      this.kinect.SkeletonStream.TrackingMode = SkeletonTrackingMode.Seated;
      this.kinect.SkeletonStream.Enable();
      this.kinect.Start();
   }

   private void NewSkeletonData(SkeletonFrameReadyEventArgs skeletonDataFrame)
   {
      using (var frame = skeletonDataFrame.OpenSkeletonFrame())
      {
         if (frame == null) return;

         var skeletons = new Skeleton[frame.SkeletonArrayLength];
         frame.CopySkeletonDataTo(skeletons);

         // count every skeleton the sensor currently sees, fully tracked or not
         var personCount = skeletons.Count(s => s.TrackingState == SkeletonTrackingState.PositionOnly ||
                                                s.TrackingState == SkeletonTrackingState.Tracked);
         if (this.PersonCount != personCount)
            this.PersonCount = personCount;
      }
   }

   public Int32 PersonCount { get; private set; }

   public void Dispose()
   {
      this.newSkeletonDataEvent.Dispose();
   }
}