Custom IKinectController – Drag and Drop

Drag and Drop sample with IKinectManipulatableController

This sample shows how to implement a custom IKinectControl, for example to move elements around in a Canvas.

To hook into the KinectRegion machinery you can implement your own UserControl that implements IKinectControl.
When you move your hand, the KinectRegion tracks the movement and constantly checks whether there is a Kinect-enabled control at the current hand pointer’s position.
To find such a control, the KinectRegion looks for IKinectControls.

The IKinectControl interface requires you to implement the method IKinectController CreateController(IInputModel inputModel, KinectRegion kinectRegion).
An IKinectControl is also required to implement IsManipulatable and IsPressable.
This way you specify which gestures your control reacts to.
Because this sample moves controls around, the Draggable control’s IsManipulatable property returns true.

UserControl Draggable.cs implements IKinectControl

public sealed partial class Draggable : UserControl, IKinectControl
{
    public Draggable()
    {
        this.InitializeComponent();
    }

    public IKinectController CreateController(IInputModel inputModel, KinectRegion kinectRegion)
    {
        // Only one controller is instantiated per control
        var model = new ManipulatableModel(inputModel.GestureRecognizer.GestureSettings, this);
        return new DragAndDropController(this, model, kinectRegion);
    }

    public bool IsManipulatable { get { return true; } }

    public bool IsPressable { get { return false; } }
}

You’ll find two interfaces in the SDK’s *.Controls namespace that derive from IKinectController, and they correspond directly to the two properties you have to implement.

Those interfaces are:
– IKinectPressableController
– IKinectManipulatableController

Since IsManipulatable returns true, CreateController should return an instance of IKinectManipulatableController.
For an example of how to implement IKinectPressableController, see here.
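For illustration only, a pressable control would simply flip the two properties and return a pressable controller. This is a rough sketch, not the linked sample’s actual code; PressableController is a hypothetical user-defined class, and the PressableModel constructor is assumed to be analogous to ManipulatableModel’s:

```csharp
// Hypothetical sketch of a pressable counterpart to Draggable.
// PressableController is an assumed user-defined class implementing
// IKinectPressableController; signatures are assumed, not verified.
public sealed partial class Pressable : UserControl, IKinectControl
{
    public bool IsManipulatable { get { return false; } }
    public bool IsPressable { get { return true; } }

    public IKinectController CreateController(IInputModel inputModel, KinectRegion kinectRegion)
    {
        // assumed analogous to the ManipulatableModel construction above
        var model = new PressableModel(inputModel.GestureRecognizer.GestureSettings, this);
        return new PressableController(this, model, kinectRegion);
    }
}
```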

Controller class DragAndDropController

public class DragAndDropController : IKinectManipulatableController, IDisposable

The DragAndDropController’s constructor receives a reference to the manipulatable control, an IInputModel, and a reference to the KinectRegion.

public DragAndDropController(FrameworkElement element, ManipulatableModel model, KinectRegion kinectRegion)
{
    this.element = new WeakReference(element);
    this.kinectRegion = kinectRegion;
    this.inputModel = model;

    if (this.inputModel == null)
        throw new ArgumentNullException("model");

The ManipulatableModel provides four events you can subscribe to in order to react to user input.
This sample uses the NuGet package Kinect.ReactiveV2.Input, which provides Rx extension methods to subscribe to these events.

    // observable extension methods come from Kinect.ReactiveV2.Input
    this.eventSubscriptions = new CompositeDisposable
    {
        this.inputModel.ManipulationStartedObservable()
                       .Subscribe(_ => VisualStateManager.GoToState(this.Control, "Focused", true)),

        this.inputModel.ManipulationInertiaStartingObservable()
                       .Subscribe(_ => Debug.WriteLine(string.Format("ManipulationInertiaStarting: {0}", DateTime.Now))),

        this.inputModel.ManipulationUpdatedObservable()
                       .Subscribe(_ => OnManipulationUpdated(_)),

        this.inputModel.ManipulationCompletedObservable()
                       .Subscribe(_ => VisualStateManager.GoToState(this.Control, "Unfocused", true)),
    };

All subscriptions are composed into one CompositeDisposable that is disposed in the controller’s Dispose() method.
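The Dispose method itself then only needs to dispose that one container; a minimal sketch, assuming the eventSubscriptions field shown above:

```csharp
// Disposing the CompositeDisposable disposes every subscription it contains.
public void Dispose()
{
    this.eventSubscriptions.Dispose();
}
```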

  • ManipulationStartedObservable –> fired when the user closes their hand
  • ManipulationInertiaStartingObservable –> fired when the manipulation transitions into its inertia phase
  • ManipulationUpdatedObservable –> fired when the user moves their hand while keeping it closed
  • ManipulationCompletedObservable –> fired when the user opens their hand again

For this sample the most interesting observable is ManipulationUpdatedObservable. Every time this event fires, the method OnManipulationUpdated is called.

private void OnManipulationUpdated(KinectManipulationUpdatedEventArgs args)
{
    var dragableElement = this.Element;
    if (!(dragableElement.Parent is Canvas)) return;

    var delta = args.Delta.Translation;
    var translationPoint = new Point(delta.X, delta.Y);
    var translatedPoint = InputPointerManager.TransformInputPointerCoordinatesToWindowCoordinates(
                              translationPoint, this.kinectRegion.Bounds);

    var offsetY = Canvas.GetTop(dragableElement);
    var offsetX = Canvas.GetLeft(dragableElement);

    // Canvas.GetTop/GetLeft return NaN when the attached property was never set
    if (double.IsNaN(offsetY)) offsetY = 0;
    if (double.IsNaN(offsetX)) offsetX = 0;

    Canvas.SetTop(dragableElement, offsetY + translatedPoint.Y);
    Canvas.SetLeft(dragableElement, offsetX + translatedPoint.X);
}

If this IKinectControl is placed inside a Canvas, its position is updated relative to the user’s hand.

The complete code is found here.
This article in markdown is found here.


ContinousGrippedState in Kinect.Reactive

For a while now I have been wondering why the Kinect’s InteractionStream sends only a single InteractionHandEventType.Grip when the user closes their hand. While the user keeps the hand closed, the SDK fires events with a HandEventType of None. This confused me from the very beginning. With mouse events, by comparison, you get continuous mousedown events as long as the user does not release the mouse button.

So I thought about a way to get the same behaviour when using the Kinect for Windows SDK 1.x InteractionStream.
This extension method solved my problem and is now part of Kinect.Reactive:

/// <summary>
/// Returns a sequence with continuous GrippedState HandEventType until GripRelease.
/// </summary>
/// <param name="source">The source observable.</param>
/// <returns>The observable.</returns>
public static IObservable<UserInfo[]> ContinousGrippedState(this IObservable<UserInfo[]> source)
{
  if (source == null) throw new ArgumentNullException("source");

  var memory = new Dictionary<Tuple<int, InteractionHandType>, object>();
  var propInfo = typeof(InteractionHandPointer).GetProperty("HandEventType");
  var handEventTypeSetter = new Action<InteractionHandPointer>(o => propInfo.SetValue(o, InteractionHandEventType.Grip));

  return source.Select(_ =>
  {
     _.ForEach(u => u.HandPointers.ForEach(h =>
     {
        var key = Tuple.Create(u.SkeletonTrackingId, h.HandType);

        if (h.HandEventType == InteractionHandEventType.Grip)
           memory.Add(key, null);
        else if (h.HandEventType == InteractionHandEventType.GripRelease)
           memory.Remove(key);
        else if (memory.ContainsKey(key))
           handEventTypeSetter(h); // rewrite None to Grip while the hand is still closed

     }));

     return _;
  });
}

Use the extension method this way and you’ll continuously get events with e.HandEventType == InteractionHandEventType.Grip until you release your hand.

IDisposable subscription = null;

KinectConnector.GetKinect().ContinueWith(k =>
{
  var disp = k.Result.KickStart(true)
              .GetUserInfoObservable(new InteractionClientConsole())
              .ContinousGrippedState()
              .SelectMany(_ => _.Select(__ => __.HandPointers.Where(CheckForRightGripAndGripRelease)))
              .Subscribe(_ => _.ForEach(__ => Console.WriteLine(
                  String.Format("Active: {0}, HandEventType: {1}", __.HandType, __.HandEventType))));

  subscription = disp;
});


Drag & Drop with Kinect for Windows

With the inclusion of the InteractionStream and the ability to detect a grip gesture in the Kinect for Windows SDK Update 1.7, it is now possible to grab UI elements on the screen and move them around. This blog post shows a possible implementation in a WPF application. Please note that I’m using the Kinect.Reactive and FluentKinect NuGet packages.

Code-Behind: MainWindow.cs

// this code can be called after initialization of the MainWindow

// Get a kinect instance with started SkeletonStream and DepthStream
var kinect = await KinectConnector.GetKinect();

// instantiate an object that implements IInteractionClient
var interactionClient = new InteractionClient();

// GetUserInfoObservable() method is available through Kinect.Reactive
kinect.GetUserInfoObservable(interactionClient)
      .SelectMany(_ => _.Select(__ => __.HandPointers.Where(___ => ___.IsActive)))
      .Where(_ => _.FirstOrDefault() != null)
      .Select(_ => _.First())
      .Subscribe(_ =>
      {
            var region = this.kinectRegion;
            var p = new Point(_.X * region.ActualWidth, _.Y * region.ActualHeight);

            if (_.HandEventType == InteractionHandEventType.Grip)
            {
                  var elem = this.kinectRegion.InputHitTest(p) as TextBlock;
                  if (elem != null)
                        this.lastTouched = elem;
            }
            else if (_.HandEventType == InteractionHandEventType.GripRelease)
            {
                  if (this.lastTouched == null) return;

                  // drop the element centered under the hand pointer
                  Canvas.SetLeft(this.lastTouched, p.X - this.lastTouched.ActualWidth / 2);
                  Canvas.SetTop(this.lastTouched, p.Y - this.lastTouched.ActualHeight / 2);
                  this.lastTouched = null;
            }
      });

XAML: MainWindow.xaml

<Window x:Class="DragAndDrop.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:k="http://schemas.microsoft.com/kinect/2013"
        Title="MainWindow" WindowState="Maximized">
    <Window.Resources>
        <Style TargetType="TextBlock">
            <Setter Property="Height" Value="200" />
            <Setter Property="Width" Value="200" />
            <Setter Property="Foreground" Value="White" />
            <Setter Property="FontWeight" Value="ExtraBold" />
            <Setter Property="FontSize" Value="35" />
            <Setter Property="Text" Value="Drag Me" />
            <Setter Property="TextAlignment" Value="Center" />
            <Setter Property="Background" Value="Black" />
        </Style>
    </Window.Resources>
    <k:KinectRegion x:Name="kinectRegion" KinectSensor="{Binding Kinect}">
        <Grid>
            <Grid.RowDefinitions>
                <RowDefinition Height="100" />
                <RowDefinition Height="*" />
            </Grid.RowDefinitions>
            <Grid Grid.Row="0">
                <k:KinectUserViewer x:Name="userViewer" />
            </Grid>
            <Canvas Grid.Row="1">
                <TextBlock Canvas.Left="50" Canvas.Top="50" />
                <TextBlock Canvas.Left="260" Canvas.Top="50" />
                <TextBlock Canvas.Left="470" Canvas.Top="50" />
                <TextBlock Canvas.Left="680" Canvas.Top="50" />
                <TextBlock Canvas.Left="890" Canvas.Top="50" />
            </Canvas>
        </Grid>
    </k:KinectRegion>
</Window>

await GetKinect()

In the first blog post about FluentKinect I mentioned that I’m not very happy with the process of getting a KinectSensor instance from the KinectSensors collection.

FluentKinect has now been updated and the KinectConnector’s static method GetKinect is now awaitable.

If you call KinectConnector.GetKinect() and no KinectSensor is connected to your PC, a new Task is started that listens for StatusChanged events of the KinectSensors collection. If you later plug in a Kinect, the Task returns the connected KinectSensor instance.

KinectConnector’s GetKinect method:

public static Task<KinectSensor> GetKinect()
{
    return Task.Factory.StartNew<KinectSensor>(() =>
    {
        if (kinectSensor != null) return kinectSensor;

        var kinect = KinectSensor.KinectSensors
                                 .FirstOrDefault(_ => _.Status == KinectStatus.Connected);
        if (kinect != null)
        {
            kinectSensor = kinect;
            return kinectSensor;
        }

        using (var signal = new ManualResetEventSlim())
        {
            KinectSensor.KinectSensors.StatusChanged += (s, e) =>
            {
                if (e.Status == KinectStatus.Connected)
                {
                    kinectSensor = e.Sensor;
                    coordinateMapper = new CoordinateMapper(kinectSensor);
                    signal.Set(); // wake up the waiting task
                }
            };

            signal.Wait(); // block this task until a sensor is connected
        }

        return kinectSensor;
    });
}

How to use it:

var kinect = await KinectConnector.GetKinect();

It’s not thread-safe at the moment, but in my opinion there are a few improvements:

    • You can start and debug your program without a Kinect connected, because no exception is thrown anymore. This helps with Kinect programming on a plane, for example. 😉
    • Your program starts faster because GetKinect returns immediately.
    • Since GetKinect returns a Task, you get the ability to await the result.
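One conceivable way to keep the awaitable behaviour without blocking a worker thread on the ManualResetEventSlim would be a TaskCompletionSource. This is only a sketch, not the current FluentKinect implementation:

```csharp
// Sketch: complete a TaskCompletionSource from the StatusChanged event instead
// of blocking a task until a sensor shows up. Not the actual FluentKinect code.
public static Task<KinectSensor> GetKinect()
{
    var tcs = new TaskCompletionSource<KinectSensor>();

    var kinect = KinectSensor.KinectSensors
                             .FirstOrDefault(_ => _.Status == KinectStatus.Connected);
    if (kinect != null)
    {
        tcs.SetResult(kinect); // a sensor is already connected
    }
    else
    {
        KinectSensor.KinectSensors.StatusChanged += (s, e) =>
        {
            if (e.Status == KinectStatus.Connected)
                tcs.TrySetResult(e.Sensor); // TrySetResult guards against double completion
        };
    }

    return tcs.Task;
}
```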

The code was pushed to GitHub and the NuGet package FluentKinect has been updated as well.

Looking forward to new improvements!

Fluent Kinect

Since I started playing around with the Kinect for Windows SDK I’ve created a lot of little projects and samples to try things out. The starting point was always something like this:

var sensor = KinectSensor.KinectSensors
                         .FirstOrDefault(_ => _.Status == KinectStatus.Connected);
if (sensor == null) throw new InvalidOperationException("No kinect connected");

sensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
sensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
sensor.SkeletonStream.Enable();
sensor.SkeletonStream.EnableTrackingInNearRange = true;
sensor.SkeletonStream.TrackingMode = SkeletonTrackingMode.Seated;
sensor.Start();

A lot of code just to set up a Kinect sensor, isn’t it?

Why not use a fluent style with less and cleaner code to set up a Kinect sensor? So I came up with the idea of FluentKinect, a project with a few extension methods. Now I can set up my Kinect sensor this way:

// fluent extension methods from FluentKinect; exact names may differ in the package
var sensor = KinectSensor.KinectSensors
                         .FirstOrDefault(_ => _.Status == KinectStatus.Connected)
                         .EnableColorStream()
                         .EnableDepthStream()
                         .EnableSkeletonStream();


Because I most often use the 640×480 option anyway, the format is an optional parameter when enabling the streams and defaults to *640x480Fps30.
I’ve extracted the two little lines that get the first connected Kinect sensor into a class called KinectConnector. At the moment an exception is thrown when no Kinect unit is connected. This is not a very good way of handling this scenario and will be changed in the future.
Now the code is even cleaner:

var sensor = KinectConnector.GetKinect();

For an even shorter and quicker setup I’ve implemented the method ‘KickStart’, which enables the three streams and calls Start() on the KinectSensor object.
For future ‘try out’ samples, this is all I’ll have to write:

var sensor = KinectConnector.GetKinect().KickStart();

Framework soup

Over the last months I’ve spent quite some time exploring the Kinect for Windows SDK, and it’s really fun to work with this awesome little piece of hardware and software.

While playing around with the Kinect for Windows SDK it soon became obvious that I’d need some sort of framework to handle all the tough event handling for me. That’s where the Reactive Extensions (Rx) framework came into play. Ever since I first read about it I had wanted to use it somewhere, but hadn’t gotten the chance. Now that the Kinect came around the corner with its event-driven API, Rx fit my requirements perfectly.

Another framework that impressed me a lot is SignalR, an abstraction layer for persistent connections over HTTP.

So how do they all come together? In an application I call ‘KickerNotifier’!
We have a ‘Kicker’ (that’s German; in English: foosball, or table football) to play with in the cellar of our office.

The problem is that the table stands in the cellar, so you never know whether it is busy.

So the idea is the following: set up a Kinect that tracks the number of people in the room and pushes that count to a website, so that all colleagues can see how many people are currently playing.

The technology stack: the Kinect for Windows SDK, Reactive Extensions, and SignalR.

The SignalR hub class

public class KickerNotifyHub : Hub
{
    private Int32 playerCount = -1;

    public void SetPlayerCount(Int32 count)
    {
        if (this.playerCount != count)
        {
            this.playerCount = count;
            Clients.setCurrentPlayerCount(count); // broadcast the new count to all clients
        }
    }
}
Webpage hub connection

<script type="text/javascript" src="Scripts/jquery-1.6.4.js"></script>
<script type="text/javascript" src="Scripts/jquery.signalR-0.5.3.js"></script>
<script src="signalr/hubs" type="text/javascript"></script>
<script type="text/javascript">
$(function () {
    var hub = $.connection.kickerNotifyHub;

    hub.setCurrentPlayerCount = function (count) {
        $('#currentPlayerCount').text(count); // updates <div id="currentPlayerCount" />
    };

    $.connection.hub.start();
});
</script>

Client console application

using System.Reactive.Disposables;
using System.Reactive.Linq;
using SignalR.Client;
using SignalR.Client.Hubs;

// some method

var personNotification = new PersonNotification();
var connection = new HubConnection("http://???????");
var hub = connection.CreateProxy("KickerNotifyHub");
var setPlayerCountSubscription = Disposable.Empty;

connection.Start().ContinueWith(task =>
{
   if (task.IsFaulted)
   {
      Console.WriteLine("Failed to start: {0}", task.Exception.GetBaseException());
   }
   else
   {
      Console.WriteLine("Success! Connected with client connection id {0}", connection.ConnectionId);

      // push the current player count to the hub once per second
      setPlayerCountSubscription = Observable.Interval(TimeSpan.FromSeconds(1))
                                             .Subscribe(l => hub.Invoke("SetPlayerCount", personNotification.PersonCount));
   }
});

public class PersonNotification : IDisposable
{
   private readonly KinectSensor kinect;
   private readonly IDisposable newSkeletonDataEvent;

   public PersonNotification()
   {
      this.kinect = KinectSensor.KinectSensors
                                .FirstOrDefault(s => s.Status == KinectStatus.Connected);
      if (this.kinect == null) throw new InvalidOperationException("No Kinect connected.");

      this.newSkeletonDataEvent = Observable.FromEventPattern<SkeletonFrameReadyEventArgs>(this.kinect, "SkeletonFrameReady")
                                            .Select(e => e.EventArgs)
                                            .Subscribe(NewSkeletonData);

      this.kinect.SkeletonStream.TrackingMode = SkeletonTrackingMode.Seated;
      this.kinect.SkeletonStream.Enable();
      this.kinect.Start();
   }

   private void NewSkeletonData(SkeletonFrameReadyEventArgs skeletonDataFrame)
   {
      using (var frame = skeletonDataFrame.OpenSkeletonFrame())
      {
         if (frame == null) return;

         var skeletons = new Skeleton[frame.SkeletonArrayLength];
         frame.CopySkeletonDataTo(skeletons); // without this the array stays empty

         var personCount = skeletons.Count(s => s.TrackingState == SkeletonTrackingState.PositionOnly ||
                                                s.TrackingState == SkeletonTrackingState.Tracked);
         if (this.PersonCount != personCount)
            this.PersonCount = personCount;
      }
   }

   public Int32 PersonCount { get; private set; }

   public void Dispose()
   {
      this.newSkeletonDataEvent.Dispose();
      this.kinect.Stop();
   }
}

Events in the .NET programming model don’t really have anything in common with what the object-oriented paradigm has taught us. Personally, I dislike events because they are not first-class objects but some kind of background compiler voodoo.

When it comes to writing against an event-driven API like the Kinect for Windows SDK, you don’t have much of a choice but to program against those events. But then the wonderful Reactive Extensions library comes to the rescue. Its classes and extension methods make it easy to wrap an event in a first-class object, and the library provides tons of handy operators you don’t have to code yourself.

So I decided to write my own IObservable extension methods that extend the Kinect API with the Reactive Extensions programming model.
Here are two methods to give you an idea of what I’m talking about:

public static IObservable<AllFramesReadyEventArgs> GetAllFramesReadyObservable(this KinectSensor kinectSensor)
{
   if (kinectSensor == null) throw new ArgumentNullException("kinectSensor");

   return Observable.FromEventPattern<AllFramesReadyEventArgs>(h => kinectSensor.AllFramesReady += h,
                                                               h => kinectSensor.AllFramesReady -= h)
                    .Select(e => e.EventArgs);
}

public static IObservable<ColorImageFrameReadyEventArgs> GetColorFrameReadyObservable(this KinectSensor kinectSensor)
{
   if (kinectSensor == null) throw new ArgumentNullException("kinectSensor");

   return Observable.FromEventPattern<ColorImageFrameReadyEventArgs>(
                                                           h => kinectSensor.ColorFrameReady += h,
                                                           h => kinectSensor.ColorFrameReady -= h)
                    .Select(e => e.EventArgs);
}
And so on…
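A hypothetical usage example of such an extension method might look like this, assuming a connected and started sensor with its ColorStream enabled:

```csharp
// Subscribe to color frames via the Rx wrapper instead of the raw event.
var subscription = kinect.GetColorFrameReadyObservable()
                         .Subscribe(e =>
                         {
                             using (var frame = e.OpenColorImageFrame())
                             {
                                 if (frame == null) return;
                                 // process the color frame here
                             }
                         });

// Later: disposing the subscription detaches the underlying event handler.
subscription.Dispose();
```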