Getting Started with Kinect for Windows SDK Development (4): Depth Data Processing, Part 1

Date: 2018-12-12 11:00:01

Keywords: Kinect, getting started, embedded development, data processing, depth

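One of the Kinect sensor's most important capabilities is producing three-dimensional data, which makes some very cool applications possible. Before developing Kinect applications, it helps to understand the Kinect hardware. The Kinect infrared sensor can detect people as well as non-human objects such as chairs or coffee cups. Many commercial organizations and laboratories are researching the use of depth data for object detection.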

This article covers the Kinect infrared sensor, the depth data format, acquiring and displaying depth images, and enhancing depth images.

1. The Kinect Sensor

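Unlike many input devices, Kinect produces three-dimensional data; it has an infrared emitter and an infrared camera. Unlike other Kinect SDKs such as OpenNI or libfreenect, Microsoft's Kinect SDK does not provide a way to access the raw infrared stream. Instead, the Kinect SDK processes the infrared data captured by the infrared camera and produces depth image data. Depth image data comes from DepthImageFrame objects, which are provided by the DepthImageStream object.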

DepthImageStream is used much like ColorImageStream; both inherit from ImageStream. A depth image can be produced the same way an image is produced from ColorImageStream data. The steps needed to display depth data are listed below; they resemble the earlier steps for displaying color image data:

1. Create a new WPF project.

2. Add a reference to Microsoft.Kinect.dll.

3. Add an Image element to the UI and name it DepthImage.

4. Add the code needed to discover and release the KinectSensor object. See the previous article (http:///yangecnu/archive/2012/03/31/2427652.html).

5. Change the KinectSensor initialization code as follows:

// depthImageBitMap, depthImageBitmapRect and depthImageStride are fields of the window class.
private void InitializeKinectSensor(KinectSensor kinectSensor)
{
    if (kinectSensor != null)
    {
        DepthImageStream depthStream = kinectSensor.DepthStream;
        depthStream.Enable();

        // One 16-bit gray pixel per depth pixel.
        depthImageBitMap = new WriteableBitmap(depthStream.FrameWidth, depthStream.FrameHeight, 96, 96, PixelFormats.Gray16, null);
        depthImageBitmapRect = new Int32Rect(0, 0, depthStream.FrameWidth, depthStream.FrameHeight);
        // The stride is the number of bytes in one row of the image.
        depthImageStride = depthStream.FrameWidth * depthStream.FrameBytesPerPixel;
        DepthImage.Source = depthImageBitMap;

        kinectSensor.DepthFrameReady += kinectSensor_DepthFrameReady;
        kinectSensor.Start();
    }
}

6. Change the DepthFrameReady event handler as follows:

void kinectSensor_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    // The frame is disposed as soon as the using block ends.
    using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
    {
        // OpenDepthImageFrame can return null, so check before use.
        if (depthFrame != null)
        {
            short[] depthPixelDate = new short[depthFrame.PixelDataLength];
            depthFrame.CopyPixelDataTo(depthPixelDate);
            depthImageBitMap.WritePixels(depthImageBitmapRect, depthPixelDate, depthImageStride, 0);
        }
    }
}

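Run the program and you will see a result like the one below. I had to take the screenshot with one hand while standing in front of the Kinect, so the pose is a little awkward, but the outline of the figure is still visible. In the depth data, the closer an object is to the Kinect, the darker it appears; the farther away it is, the lighter it appears.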

2. How Kinect Measures Depth

Like any other camera, the near-infrared camera has a field of view. The Kinect camera's field of view is limited, as the figure below shows:

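As the figure shows, the infrared camera's field of view is shaped like a pyramid. Objects far from the camera span a larger cross-section of the field of view than nearby objects. This means that the image's height and width, for example 640x480, do not correspond one-to-one to physical positions in the field of view. The depth value of each pixel, however, does correspond to the distance between the camera and the object in the field of view. In the depth frame data each pixel occupies 16 bits, so the BytesPerPixel property is 2, that is, 2 bytes per pixel. The depth value uses only 13 of each pixel's 16 bits, as the figure below shows: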

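Getting each pixel's distance is easy, but a little bit manipulation is needed before the value can be used directly. (Bit operations may not come up often in everyday programming.) As the figure above shows, the depth value is stored in bits 3 through 15; to get a directly usable depth value, shift right to remove the player index bits. The importance of the player index bits is covered later. The code below briefly shows how to get a pixel's depth value. The depthPixelDate variable is the short array copied from the depth frame, and pixelIndex is computed from the position of the pixel of interest. The SDK defines a constant, PlayerIndexBitmaskWidth, on the DepthImageFrame class; it gives the number of bits to shift right to obtain the depth value. Use this constant instead of a hard-coded number, because as hardware and software improve, Kinect may be able to recognize more people at once, which would change the value of PlayerIndexBitmaskWidth.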

// p is the point of interest; frame is the current DepthImageFrame.
Int32 pixelIndex = (Int32)(p.X + ((Int32)p.Y * frame.Width));
Int32 depth = this.depthPixelDate[pixelIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;
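The bit layout described above can be checked without a sensor. The following is a minimal sketch, not SDK code: DepthPixel and its members are hypothetical names, and the constant 3 is an assumed stand-in for DepthImageFrame.PlayerIndexBitmaskWidth, taken from the 13-bit depth layout described in the text. It also shows that shifting a signed short whose top bit is set gives a negative number, which may be how readings such as -1 mm arise; that link is an assumption rather than something the SDK documentation is quoted on here.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Kinect SDK.
public static class DepthPixel
{
    // Assumed stand-in for DepthImageFrame.PlayerIndexBitmaskWidth so the
    // sketch compiles without the SDK (3 player index bits, 13 depth bits).
    public const int PlayerIndexBitmaskWidth = 3;

    // Mask covering the low player index bits: binary 111 when the width is 3.
    public const int PlayerIndexBitmask = (1 << PlayerIndexBitmaskWidth) - 1;

    // Depth computed with an arithmetic shift on a signed short, as in the
    // article's code. A raw value with its top bit set gives a negative result.
    public static int SignedDepth(short raw)
    {
        return raw >> PlayerIndexBitmaskWidth;
    }

    // Depth with the 16 bits read as unsigned, so the result is in 0..8191.
    public static int UnsignedDepth(short raw)
    {
        return unchecked((ushort)raw) >> PlayerIndexBitmaskWidth;
    }

    // The low bits that the shift discards.
    public static int PlayerIndex(short raw)
    {
        return raw & PlayerIndexBitmask;
    }
}
```

For example, a raw value of (1221 << 3) | 2 unpacks to a depth of 1221 mm and a player index of 2, while the bit pattern 0xFFF8 unpacks to -1 with the signed shift and to 8191 with the unsigned one.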

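The simplest way to show depth data is to print it. We will display a pixel's depth value in the UI: when the mouse is clicked, the depth value of the pixel at the click position is shown. The first step is to add a TextBlock to the main UI: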

<Window x:Class="KinectDepthImageDemo.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="KinectDepthImage" Height="600" Width="1280" WindowStartupLocation="CenterScreen">
    <Grid>
        <StackPanel Orientation="Horizontal">
            <TextBlock x:Name="PixelDepth" FontSize="48" HorizontalAlignment="Left" />
            <Image x:Name="DepthImage" Width="640" Height="480"></Image>
        </StackPanel>
    </Grid>
</Window>

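Next we handle the mouse click. Before adding the handler, first add a private field, lastDepthFrame, to hold the DepthFrame obtained each time the DepthFrameReady event fires. Because we keep a reference to the most recent DepthFrame, the event handler does not dispose of it immediately. Then register the MouseLeftButtonUp event of the DepthImage image control. When the user clicks the depth image, DepthImage_MouseLeftButtonUp fires and finds the right pixel from the mouse position. The last step is to show the depth value of that pixel in the UI. The code is as follows: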

// lastDepthFrame and depthPixelDate are fields so that the click handler can read them.
void kinectSensor_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    // Release the frame kept from the previous event before opening the new one.
    if (lastDepthFrame != null)
    {
        lastDepthFrame.Dispose();
        lastDepthFrame = null;
    }

    lastDepthFrame = e.OpenDepthImageFrame();
    if (lastDepthFrame != null)
    {
        depthPixelDate = new short[lastDepthFrame.PixelDataLength];
        lastDepthFrame.CopyPixelDataTo(depthPixelDate);
        depthImageBitMap.WritePixels(depthImageBitmapRect, depthPixelDate, depthImageStride, 0);
    }
}

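private void DepthImage_MouseLeftButtonUp(object sender, MouseButtonEventArgs e)
{
    Point p = e.GetPosition(DepthImage);
    // lastDepthFrame is null when the most recent frame could not be opened.
    if (lastDepthFrame != null && depthPixelDate != null && depthPixelDate.Length > 0)
    {
        Int32 pixelIndex = (Int32)(p.X + ((Int32)p.Y * this.lastDepthFrame.Width));
        // Ignore positions that fall outside the pixel array.
        if (pixelIndex < 0 || pixelIndex >= depthPixelDate.Length)
        {
            return;
        }
        Int32 depth = this.depthPixelDate[pixelIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;
        // Convert millimeters to feet and inches (1 mm = 0.0393700787 in).
        Int32 depthInches = (Int32)(depth * 0.0393700787);
        Int32 depthFt = depthInches / 12;
        depthInches = depthInches % 12;
        PixelDepth.Text = String.Format("{0}mm~{1}'{2}", depth, depthFt, depthInches);
    }
}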

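One thing is worth noting: the width and height of the Image control are hard-coded in the UI. If they were not set, the control would scale with its parent container (the window), and if the control's dimensions differ from those of the depth frame, a mouse click returns the wrong data and in some cases even throws an exception. The pixel array has a fixed size, determined by the DepthImageFormat argument passed to the Enable method of DepthImageStream. Without a fixed control size, extra computation is needed to convert the mouse position in the window to the dimensions of the depth frame. This kind of scaling and coordinate conversion is common and will be discussed in a later article; for now, to keep things simple, the size of the image control is hard-coded.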
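The coordinate conversion mentioned above can be sketched as a pure function. This is an illustration under stated assumptions, not the approach of the later article: DepthHitTest and ToPixelIndex are hypothetical names, and the sketch assumes the image is stretched to fill the whole control, so that a position scales linearly in each direction.

```csharp
using System;

// Hypothetical helper for illustration; not part of WPF or the Kinect SDK.
public static class DepthHitTest
{
    // Maps a position inside a control of size controlWidth x controlHeight
    // to an index into a frameWidth x frameHeight pixel array.
    // Returns -1 when the position does not land on a pixel.
    public static int ToPixelIndex(double x, double y,
                                   double controlWidth, double controlHeight,
                                   int frameWidth, int frameHeight)
    {
        if (controlWidth <= 0 || controlHeight <= 0)
        {
            return -1;
        }

        // Scale from control coordinates to frame coordinates.
        int px = (int)Math.Floor(x * frameWidth / controlWidth);
        int py = (int)Math.Floor(y * frameHeight / controlHeight);

        // Reject anything outside the frame instead of indexing out of range.
        if (px < 0 || px >= frameWidth || py < 0 || py >= frameHeight)
        {
            return -1;
        }

        // Rows are stored one after another, frameWidth pixels per row.
        return py * frameWidth + px;
    }
}
```

In the click handler, the control size would presumably come from DepthImage.ActualWidth and DepthImage.ActualHeight, and a return value of -1 would simply be ignored. With a 640x480 control and a 640x480 frame, the position (320, 240) maps to index 153920; the same pixel is reached from (640, 480) when the control is scaled to 1280x960.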

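The result is shown below. The mouse cursor cannot be captured in a screenshot, so a red dot marks the mouse position. In the leftmost picture the red dot is on the wall, 2.905 m from the Kinect. In the middle picture the dot is on my hand, which is 1.221 m from the Kinect. The actual distances are very close to these values, so the Kinect depth data is quite accurate.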

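The white dot in the rightmost picture above has a depth of -1 mm, which means Kinect could not determine the depth of that pixel. When processing the data, this is usually treated as a special value that can be ignored. A depth of -1 may be caused by the object being too close to the Kinect sensor.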

3. Depth Image Enhancement

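Before going further, the depth image needs some processing. In the leftmost picture below, the gray levels all fall in the dark range. To give the image better gray levels, we need to process the depth image, just as we processed the color stream image earlier.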

3.1 Enhancing the Gray Levels of the Depth Image

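The simplest way to enhance the depth image is to invert the bits of each pixel value. The shades in the image are based on the depth values, which start at 0. On a 16-bit grayscale, 0 is black and 65535 is white. This means that in the leftmost picture below most of the values fall in the dark part of the range. In addition, all pixels whose depth could not be determined are set to 0. Inverting the bits moves these values to the light part of the range. For comparison, add another Image control to the UI to display the processed values.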

<Window x:Class="KinectDepthImageDemo.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="KinectDepthImage" Height="600" Width="1280" WindowStartupLocation="CenterScreen">
    <Grid>
        <StackPanel Orientation="Horizontal">
            <Image x:Name="DepthImage" Width="640" Height="480"></Image>
            <Image x:Name="EnhancedDepthImage" Width="640" Height="480" />
        </StackPanel>
    </Grid>
</Window>

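The code below shows how to invert the bits of the depth data to get a better depth image. The method is called from the kinectSensor_DepthFrameReady event handler. The code first creates a new short array and then fills it with the inverted pixel values. Note that the code also filters out some points: readings that are too near or too far are both inaccurate, so values greater than 3.5 m or less than 0 m are filtered out and set to white.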

private void CreateLighterShadesOfGray(DepthImageFrame depthFrame, short[] pixelData)
{
    Int32 depth;
    Int32 loThreshold = 0;
    Int32 hiThreshold = 3500;
    short[] enhPixelData = new short[depthFrame.Width * depthFrame.Height];

    for (int i = 0; i < pixelData.Length; i++)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
        if (depth < loThreshold || depth > hiThreshold)
        {
            // Out of range: white (all 16 bits set in Gray16).
            enhPixelData[i] = unchecked((short)0xFFFF);
        }
        else
        {
            // Invert the bits so that near pixels become light instead of dark.
            enhPixelData[i] = (short)~pixelData[i];
        }
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height, 96, 96, PixelFormats.Gray16, null, enhPixelData, depthFrame.Width * depthFrame.BytesPerPixel);
}
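To see what the bit inversion does to a gray level, it helps to read the short as the unsigned 16-bit value that a Gray16 image displays. A minimal sketch follows; Gray16Math, Level, and Invert are hypothetical names introduced for illustration. Inverting all 16 bits is the same as subtracting the level from 65535, which is why small (dark) depth values end up near white.

```csharp
using System;

// Hypothetical helper for illustration; not part of WPF or the Kinect SDK.
public static class Gray16Math
{
    // Reads the bit pattern of a short as an unsigned gray level, 0..65535.
    public static int Level(short value)
    {
        return unchecked((ushort)value);
    }

    // Inverts all 16 bits of a raw pixel value.
    public static short Invert(short raw)
    {
        return unchecked((short)~raw);
    }
}
```

For a raw value of 9768 (a depth of 1221 mm shifted left by three bits), the displayed level is 9768 before inversion and 65535 - 9768 = 55767 afterwards.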

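After this processing the image (the middle picture above) is somewhat more expressive, but it would look better still if the 16-bit gray levels were rendered with 32-bit color. When the R, G, and B values are equal, the result is a shade of gray. The gray value ranges from 0 to 255: 0 is black, 255 is white, and the values in between are grays. Now display the gray values in RGB mode. The code is as follows: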

private void CreateBetterShadesOfGray(DepthImageFrame depthFrame, short[] pixelData)
{
    Int32 depth;
    Int32 gray;
    Int32 loThreshold = 0;
    Int32 hiThreshold = 3500;
    Int32 bytePerPixel = 4;
    byte[] enhPixelData = new byte[depthFrame.Width * depthFrame.Height * bytePerPixel];

    // i walks the depth pixels; j walks the 4-byte Bgr32 pixels.
    for (int i = 0, j = 0; i < pixelData.Length; i++, j += bytePerPixel)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
        if (depth < loThreshold || depth > hiThreshold)
        {
            gray = 0xFF;                    // out of range: white
        }
        else
        {
            gray = (255 * depth / 0xFFF);   // scale 0..4095 to 0..255
        }

        // Equal blue, green, and red values give a shade of gray; the fourth byte is unused.
        enhPixelData[j] = (byte)gray;
        enhPixelData[j + 1] = (byte)gray;
        enhPixelData[j + 2] = (byte)gray;
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height, 96, 96, PixelFormats.Bgr32, null, enhPixelData, depthFrame.Width * bytePerPixel);
}

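In the code above, the image format is changed to Bgr32, which means each pixel takes 32 bits (4 bytes). R, G, and B take 8 bits each, and the remaining 8 bits are unused. This format limits each RGB value to the range 0-255, so the depth value has to be converted into that range. We also set minimum and maximum detection limits as before; anything outside them is set to white. Dividing the depth value by 4095 (0xFFF, the maximum depth reading) and multiplying by 255 converts the depth data to a value between 0 and 255. The result is shown in the right picture above; displaying the gray levels in color mode shows more detail than displaying them in grayscale mode did.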
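The scaling arithmetic can be worked through for a few depths. The sketch below isolates the per-pixel calculation in a pure function; DepthGray and ToGray are hypothetical names, and the function assumes hiThreshold does not exceed 4095, so that the scaled value always fits in a byte. Like the method above, it multiplies before dividing so that integer division does not round every value down to 0.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Kinect SDK.
public static class DepthGray
{
    // Largest depth reading used for scaling, as in the article (0xFFF = 4095).
    public const int MaxDepth = 0xFFF;

    // Maps a depth in millimeters to a gray level in 0..255.
    // Depths outside [loThreshold, hiThreshold] become white (255).
    // Assumes hiThreshold <= MaxDepth.
    public static byte ToGray(int depthMm, int loThreshold, int hiThreshold)
    {
        if (depthMm < loThreshold || depthMm > hiThreshold)
        {
            return 0xFF;
        }

        // Multiply first: depthMm / MaxDepth alone would be 0 in integer math.
        return (byte)(255 * depthMm / MaxDepth);
    }
}
```

With thresholds of 0 and 3500, the hand at 1221 mm maps to gray level 76 and the wall at 2905 mm maps to 180, while a reading of 3501 mm falls outside the range and becomes 255.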

3.2 Color Rendering of Depth Data

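Converting the depth values to 0-255 and displaying them in RGB mode enhances the image and makes more depth detail visible at a glance. Another simple and effective method is to convert the depth value to hue and saturation and display that as an image. The code below shows the implementation: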

private void CreateColorDepthImage(DepthImageFrame depthFrame, short[] pixelData)
{
    Int32 depth;
    Double hue;
    Int32 loThreshold = 1200;
    Int32 hiThreshold = 3500;
    Int32 bytesPerPixel = 4;
    byte[] rgb = new byte[3];
    byte[] enhPixelData = new byte[depthFrame.Width * depthFrame.Height * bytesPerPixel];

    for (int i = 0, j = 0; i < pixelData.Length; i++, j += bytesPerPixel)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
        if (depth < loThreshold || depth > hiThreshold)
        {
            // Out of range: black.
            enhPixelData[j] = 0x00;
            enhPixelData[j + 1] = 0x00;
            enhPixelData[j + 2] = 0x00;
        }
        else
        {
            // Spread the depth over the hue circle; ConvertHslToRgb wraps the hue to 0..360.
            hue = ((360 * depth / 0xFFF) + loThreshold);
            ConvertHslToRgb(hue, 100, 100, rgb);
            enhPixelData[j] = rgb[2];      // Blue
            enhPixelData[j + 1] = rgb[1];  // Green
            enhPixelData[j + 2] = rgb[0];  // Red
        }
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height, 96, 96, PixelFormats.Bgr32, null, enhPixelData, depthFrame.Width * bytesPerPixel);
}

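The code above uses the ConvertHslToRgb function, which converts between two color spaces: from the HSL color space (H for hue, S for saturation, L for lightness) to the RGB color space. I studied remote-sensing image processing in the past, so I am fairly familiar with these two color spaces. The conversion code is as follows: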

public void ConvertHslToRgb(Double hue, Double saturation, Double lightness, byte[] rgb)
{
    Double red = 0.0;
    Double green = 0.0;
    Double blue = 0.0;

    // Normalize: hue to [0, 360), saturation and lightness from percent to [0, 1].
    hue = hue % 360.0;
    saturation = saturation / 100.0;
    lightness = lightness / 100.0;

    if (saturation == 0.0)
    {
        // No saturation: a pure gray.
        red = lightness;
        green = lightness;
        blue = lightness;
    }
    else
    {
        // Split the hue circle into six 60-degree sectors.
        Double huePrime = hue / 60.0;
        Int32 x = (Int32)huePrime;
        Double xPrime = huePrime - (Double)x;   // position within the sector, 0..1
        Double L0 = lightness * (1.0 - saturation);
        Double L1 = lightness * (1.0 - (saturation * xPrime));
        Double L2 = lightness * (1.0 - (saturation * (1.0 - xPrime)));

        switch (x)
        {
            case 0:
                red = lightness;
                green = L2;
                blue = L0;
                break;
            case 1:
                red = L1;
                green = lightness;
                blue = L0;
                break;
            case 2:
                red = L0;
                green = lightness;
                blue = L2;
                break;
            case 3:
                red = L0;
                green = L1;
                blue = lightness;
                break;
            case 4:
                red = L2;
                green = L0;
                blue = lightness;
                break;
            case 5:
                red = lightness;
                green = L0;
                blue = L1;
                break;
        }
    }

    // Scale the 0..1 channel values to bytes.
    rgb[0] = (byte)(255.0 * red);
    rgb[1] = (byte)(255.0 * green);
    rgb[2] = (byte)(255.0 * blue);
}

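Run the program to get the result shown in the right picture below (for comparison, the first picture on the left is the raw data and the second shows the depth data in RGB mode). In the rightmost picture, objects near the camera are blue; from near to far the color changes from blue to purple, and the farthest objects are red. In the picture I am holding the keyboard I used to take the screenshot. The bed is closest to the camera and is blue; the keyboard is closer to the camera than my body and is light blue; the parts of the body are also at different distances from the camera, with the chest, abdomen, and head closer. The wall behind is farthest from the camera and ranges from orange to red.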
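The colors described above can be checked by working through the hue formula for individual depths. The sketch below is an illustration with hypothetical names (DepthHue, HueDegrees, HueToRgb). HueDegrees repeats the integer arithmetic of CreateColorDepthImage followed by the wrap to 0..359, and HueToRgb is a cut-down version of the six-sector conversion that only handles the case used here, saturation and lightness both at 100.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Kinect SDK.
public static class DepthHue
{
    // Hue in degrees (0..359) for a depth in millimeters, using the same
    // integer arithmetic as the article: 360 * depth / 0xFFF + loThreshold.
    public static int HueDegrees(int depthMm, int loThreshold)
    {
        return ((360 * depthMm / 0xFFF) + loThreshold) % 360;
    }

    // Six-sector hue conversion for full saturation and full lightness only.
    // Expects a hue in 0..359 and returns { red, green, blue }.
    public static byte[] HueToRgb(int hueDegrees)
    {
        double huePrime = hueDegrees / 60.0;
        int sector = (int)huePrime;
        double rising = huePrime - sector;   // grows from 0 to 1 across the sector
        double falling = 1.0 - rising;       // shrinks from 1 to 0 across the sector
        double r = 0.0, g = 0.0, b = 0.0;

        switch (sector)
        {
            case 0: r = 1.0; g = rising; break;    // red to yellow
            case 1: r = falling; g = 1.0; break;   // yellow to green
            case 2: g = 1.0; b = rising; break;    // green to cyan
            case 3: g = falling; b = 1.0; break;   // cyan to blue
            case 4: r = rising; b = 1.0; break;    // blue to magenta
            default: r = 1.0; b = falling; break;  // magenta to red
        }

        return new byte[] { (byte)(255.0 * r), (byte)(255.0 * g), (byte)(255.0 * b) };
    }
}
```

With loThreshold at 1200, the nearest accepted depth of 1200 mm gives a hue of 225 degrees, which converts to RGB (0, 63, 255), a blue. The hue grows with depth and wraps past 360, so a wall at 2905 mm gives 15 degrees, RGB (255, 63, 0), an orange-red. This agrees with the near-blue, far-red ordering seen in the picture.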

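The program above runs very sluggishly; I had a hard time capturing this screenshot. The reason is that converting from the HSL space to RGB requires a calculation for each of 640*480 = 307200 pixels, and the calculation involves floating-point arithmetic and division. The computation runs on the UI thread and blocks it from updating the interface. A better approach is to do the computation on a background thread. Each time the KinectSensor fires a frame-ready event, the code stores the incoming frame in order. When the conversion is finished, the background thread uses the WPF Dispatcher to update the source of the Image object on the UI thread. The previous article already covered this issue. Asynchronous processing of this kind is very common in Kinect applications, because depth data arrives very frequently. If both acquiring and processing the data happen on the main UI thread, the program becomes slow or even unresponsive to user input, which degrades the user experience.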
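One possible shape for the background processing described above is a queue drained by a single worker task. The sketch below is an assumption-laden illustration, not the implementation from the previous article: FrameWorker, TryEnqueue, and the convert and publish callbacks are hypothetical names, and the queue is bounded so that frames are dropped rather than piling up when conversion is slower than the sensor.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical class for illustration; not part of WPF or the Kinect SDK.
public sealed class FrameWorker : IDisposable
{
    private readonly BlockingCollection<short[]> queue;
    private readonly Task worker;

    // capacity: maximum number of frames waiting to be converted.
    // convert:  the expensive per-frame computation, run on the background thread.
    // publish:  receives each result on the background thread, in arrival order.
    public FrameWorker(int capacity, Func<short[], byte[]> convert, Action<byte[]> publish)
    {
        queue = new BlockingCollection<short[]>(capacity);
        worker = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty; ends after CompleteAdding.
            foreach (short[] frame in queue.GetConsumingEnumerable())
            {
                publish(convert(frame));
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Intended to be called from the frame-ready handler with a copy of the
    // pixel data. Returns false when the queue is full and the frame is dropped.
    public bool TryEnqueue(short[] frame)
    {
        return queue.TryAdd(frame);
    }

    // Stops accepting frames and waits for the queued ones to be processed.
    public void Dispose()
    {
        queue.CompleteAdding();
        worker.Wait();
        queue.Dispose();
    }
}
```

In a WPF application the publish callback would not touch the Image control directly; it would presumably hand the converted bytes to the UI thread, for example through the control's Dispatcher.BeginInvoke, and build the bitmap source there. The frame-ready handler then only copies the pixel data and calls TryEnqueue, so it returns quickly.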

4. Conclusion

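This article introduced the depth image data stream produced by the Kinect infrared camera, how the KinectSensor measures depth, how to get the depth value of a pixel, how to visualize depth data, and some simple enhancement techniques.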

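For reasons of space, the next article will cover image processing of Kinect depth data and the player index bits, which were left out of the discussion of the depth data format in section 2 of this article. It will finish by showing how the KinectSensor infrared sensor uses the player index bits to obtain a person's spatial extent, including width and height. Stay tuned.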