Image Stitching a Video Stream

Overview


This example goes through some of the theory and implementation of image stitching a video stream in LabVIEW. The application was tested using a Creative VF0220 WebCam, but will function properly so long as the user has an IMAQdx-compatible webcam (make sure the camera name is cam1, or modify the Acquisition Express VI to select the appropriate camera). The Vision Development Module and the Vision Acquisition Software are required.

Image Stitching Theoretical to Implementation

Theoretical to Implementation Overview

    The following reference was used when this project began, in an attempt to understand still-frame image stitching: http://www.codeproject.com/KB/recipes/automatic_panoramas.aspx. As you can see, it clearly outlines steps which, although extremely useful as a starting point, do not transfer directly into LabVIEW with the level of efficiency expected by the user. Each of the steps in the process is outlined here, and the corresponding VIs are analyzed and explained. The primary file within the project is DEMO - Image Stitcher.vi, which is covered generally at the end of this document. For the sake of readability and simplicity, code and equations are not included in the text portion of this document; they may be viewed within the downloadable code attached to the document.

    The general process is as follows:

    1. Initialize
    2. Acquire Image
    3. Identify and Process Particles
    4. Set Up First and Second Frames
    5. For Each Frame:
      1. Correlate Particles
      2. Eliminate Outliers
      3. Overlay and Display
      4. Merge Images
      5. Return to Step 2
    6. Dispose Images 

Step 1: Application Variable Initialization

    The example program makes use of many application variables which are persistent between loop iterations. These variables are designed to allow the program to be expanded beyond its current capabilities into some of those described at the end of this document. The initialize VI should always be called at the beginning of the application (outside of any loops) and wired to a shift register on the primary application while loop. The application variables and their explanations are as follows:

Application State - Allows the application to track whether it is on the first image, second image, or any other image. This is required because certain processes must be executed to help the setup of the stitching during the first two images in the stream.

Compound Image - This Image reference will contain the current compounded image of all the images read in the stream so far.

Image Meta-Data - This array contains one cluster of metadata for each image added to the stream. This metadata includes:

    Origin Offset - The amount the current image is offset from the virtual origin point.

    Parent Node Offset - The amount the current image is offset from the frame from which its location was referenced.

    Number of Particles - The number of distinct particles in the frame.

    Particle Measurements - An array of measurements of the distinct particles in the frame.

Direction - The current direction limitation on the application (if the second frame is to the right, all other frames will be required to be to the right; if the second frame is to the left, all other frames will be to the left).

Origin Shift Out - The computed shift of the original origin from the (0,0) point of the compound image.

Measure Weights - Used in the correlation phase, the weight of each of the particle measurements when establishing a correlation (default weights all equally, but could be investigated as a way of optimizing object correlation).

error out - The error state of the application.

    The demo file also includes a reset button which allows the application variables to be reinitialized. These variables need not be disposed individually so long as IMAQ Dispose.vi is called at the end with the Dispose All boolean wired to true (as can be seen in the demo).
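
    In LabVIEW these variables travel together as a cluster on the main loop's shift register. Purely for illustration, a hypothetical Python equivalent of that cluster might look like the following; the names and types are assumptions, not the actual control names in the demo.

        from dataclasses import dataclass, field
        import numpy as np

        @dataclass
        class FrameMetaData:
            origin_offset: tuple        # offset of this frame from the virtual origin (x, y)
            parent_offset: tuple        # offset of this frame from its parent frame (x, y)
            num_particles: int          # number of distinct particles found in the frame
            measurements: object        # per-particle measurements for the frame

        @dataclass
        class ApplicationState:
            state: str = "first"                            # "first", "second", or any other image
            compound_image: np.ndarray = None               # running composite of all frames so far
            meta_data: list = field(default_factory=list)   # one FrameMetaData entry per frame
            direction: int = 0                              # +1 = right, -1 = left, 0 = not yet set
            origin_shift: tuple = (0, 0)                    # shift of the original origin in the composite
            measure_weights: np.ndarray = field(default_factory=lambda: np.ones(10))  # equal by default
            error: str = ""                                 # error state of the application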

Step 2: Acquire and Process

    The Acquire and Process.vi is the primary acquisition and processing VI, which prepares the image, produces the meta-data, and releases all of the iteration-dependent variables (the variables which are NOT transferred between iterations but ARE transferred between inter-iteration steps). Within this VI are two express VIs. The first express VI acquires the image (double-clicking this VI will allow you to configure it for your specific camera, or you may do so in MAX and name it cam1). The Vision Assistant express VI processes the image and acquires particle information. The acquired image is immediately cast to U8, and the output image from the processing step is a binary image.

    The following image processing steps are applied in the vision assistant:

        Smoothing - Local Average with a 7x7 kernel
        Smoothing - Gaussian with a 7x7 kernel
        Edge Detect - Diff
        Auto Threshold - Clustering, looking for Bright Objects
        Basic Morphology - Open Objects with a 7x7 kernel
        Particle Filter - remove particles whose %Area/Image Area is less than 0.01%
        Particle Analysis - analyzes the particles using the following metrics:
            Center of Mass X
            Center of Mass Y
            First Pixel X
            First Pixel Y
            Bounding Rect Left
            Bounding Rect Top
            Bounding Rect Right
            Bounding Rect Bottom
            Area
            %Area/Image Area

    The results are output in the form of the number of particles and the associated analysis of each metric on each particle. These particle measures can now be used to find the relative offset of two images to each other, which can give us the overall offset on the compounded image and allow us to stitch the images together.
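
    For readers who think in text-based tools, the following Python/OpenCV sketch approximates the same pipeline. It is an illustrative stand-in, not the NI implementation: Otsu thresholding and a morphological gradient are used in place of NI's clustering threshold and "Diff" edge detector, so its output will differ from the Vision Assistant script.

        import cv2
        import numpy as np

        def process_frame(gray_u8):
            """Roughly mirror the Vision Assistant steps on an 8-bit grayscale frame."""
            img = cv2.blur(gray_u8, (7, 7))                        # local-average smoothing, 7x7
            img = cv2.GaussianBlur(img, (7, 7), 0)                 # Gaussian smoothing, 7x7
            img = cv2.morphologyEx(img, cv2.MORPH_GRADIENT,        # crude stand-in for the
                                   np.ones((3, 3), np.uint8))      # "Diff" edge detector
            _, binary = cv2.threshold(img, 0, 255,                 # automatic threshold keeping
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # bright objects
            binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN,      # open objects, 7x7 kernel
                                      np.ones((7, 7), np.uint8))
            n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
            image_area = binary.shape[0] * binary.shape[1]
            particles = []
            for i in range(1, n):                                  # label 0 is the background
                area = int(stats[i, cv2.CC_STAT_AREA])
                if area / image_area < 0.0001:                     # particle filter: < 0.01% of image
                    continue
                particles.append({
                    "center_of_mass": tuple(centroids[i]),
                    "bounding_rect": (stats[i, cv2.CC_STAT_LEFT], stats[i, cv2.CC_STAT_TOP],
                                      stats[i, cv2.CC_STAT_LEFT] + stats[i, cv2.CC_STAT_WIDTH],
                                      stats[i, cv2.CC_STAT_TOP] + stats[i, cv2.CC_STAT_HEIGHT]),
                    "area": area,
                    "pct_area": area / image_area,
                })
            return binary, particles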

Step 3: Initial Frame Processing

    The first frame of the process requires a slightly different methodology to initialize the application. The function Initial Frame Processing.vi takes in the Application and Iteration variables, sets the origin offsets to (0,0), saves the particle metrics to the application variable array, sets itself as the compound image, and sets the application state to move on to the second call.
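
    As a sketch of the same behaviour, using the hypothetical ApplicationState and FrameMetaData types from the Step 1 example (again, an assumption for illustration, not the VI itself):

        def process_initial_frame(state, frame, particles):
            """First-frame setup: origin at (0,0), store metrics, frame becomes the compound image."""
            state.meta_data.append(FrameMetaData(origin_offset=(0, 0), parent_offset=(0, 0),
                                                 num_particles=len(particles),
                                                 measurements=particles))
            state.compound_image = frame        # the first frame is the initial compound image
            state.state = "second"              # the next call performs the second-frame setup
            return state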

Step 4: Second Image Frame Processing

    The second image frame must also be singled out from the remainder of the frames in order to establish a direction. The specifics of this VI will not be reviewed because of its close similarity to Step 5, but it is singled out in the code because of minor differences between its handling and the handling of all the remaining images.

Step 5: Other Image Frame Processing

    This is the primary processing step for the actual stitching. The functions within this block of code will be reviewed as individual steps, but described in overview in this section. The general steps are as follows:

    Correlate Regions - Compute a correlation score between two images to tell how similar their objects are.

    Eliminate Outliers - Eliminates the outlier correlations and computes an average offset between the two images.

    Overlay Regions - Create an overlay image for the user to see the difference between the two object sets.

    Stitch - This step also contains a few substeps, but is only executed if the direction of the image is correct and the maximal correlation threshold is met. The substeps are as follows:

        Compute Absolute Location - Takes the offset from the parent and computes the absolute offset from the virtual origin, then the absolute offset in the current compounded image.

        Get Image Size - Gets the size of the new compounded image.

        Merge - Combines the current frame into the compound image.

Step 6: Correlate Regions

    As was stated in Step 5, Step 6 and beyond are actually substeps within the other image frame processing step. However, for clarity, they will be treated as individual steps. The correlate regions step makes use of the particle measures from two images. For the sake of the demo, it only compares the current frame to the previous frame. However, as discussed later in this document, this could be replaced by recalculating and approximating correlations with all similarly located images.

    The Regional Correlation Measure.vi calculates the correlation measure and is called for every particle in the image. These correlations are then output along with an array of maximally correlated regions. These regions are the regions whose properties most closely match each other.
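
    The scoring itself can be sketched in Python as a weighted difference over the particle measurement vectors, keeping the best previous-frame match for each current-frame particle. The function names and the exact formula are assumptions for illustration, not the contents of Regional Correlation Measure.vi.

        import numpy as np

        def correlation_score(meas_a, meas_b, weights):
            """Lower score = more similar particles; measurements are equal-length vectors."""
            return float(np.sum(weights * np.abs(np.asarray(meas_a) - np.asarray(meas_b))))

        def correlate_particles(prev_particles, curr_particles, weights):
            """Return (prev_index, curr_index, score) for each current particle's best match."""
            matches = []
            for j, curr in enumerate(curr_particles):
                scores = [correlation_score(prev, curr, weights) for prev in prev_particles]
                best = int(np.argmin(scores))
                matches.append((best, j, scores[best]))
            return matches

    Presumably, measures that depend on absolute position (such as the centers of mass) would carry little weight in the score, since the positional offset between frames is exactly what the later steps recover.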

Step 7: Eliminate Outliers

    It is common in this step of the algorithm to implement some sort of RANSAC algorithm. However, for the sake of quick processing and accuracy, we simply take the ten best-correlated pairs and use them to calculate the offset between the two images. The offset is calculated as the average of the differences between the Centers of Mass of the particles in question.
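
    A minimal sketch of this averaging step, assuming the match list from the previous sketch and particles that carry a "center_of_mass" field:

        import numpy as np

        def estimate_offset(matches, prev_particles, curr_particles, top_n=10):
            """Average the center-of-mass differences of the ten best-correlated pairs."""
            best = sorted(matches, key=lambda m: m[2])[:top_n]     # lowest score = best correlated
            deltas = [np.subtract(curr_particles[j]["center_of_mass"],
                                  prev_particles[i]["center_of_mass"])
                      for i, j, _ in best]
            return tuple(np.mean(deltas, axis=0))                  # average (dx, dy) offset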

Step 8: Overlay Regions

    This step is purely optional and is provided as a visualization step for the user's benefit. It simply draws colored regions (red for the previous frame, green for the current frame) for each of the correlated regions as well as a vector line emanating from the center of the image to show the approximated offset.

Step 9: Stitch

    This step is no longer a processing step to determine the relative offsets of the pictures. It takes the relative offset computed in Step 7 and uses it, along with the virtual origin, to determine how the image should be overlaid on the primary compound image. The virtual origin is a mechanism that allows each image's location to be tracked relative to an absolute physical reference: the origin is the (0,0) point of the first image. This point is readjusted each iteration as that image is repositioned in the composite. The virtual origin point is used to calculate the absolute location, since each image keeps track of its location relative to its parent and the origin. The Compute Origin and Location function performs this calculation and passes the new origin out.

    The size of the image is also calculated within this step and is used to allocate a new memory location for the Compound Image to ensure that it has room for the extra data. 

    The images are then passed to the Merge function which simply places the current image on the compound image. This step could be improved by providing some sort of alpha blending, but unfortunately this would provide a VERY blurry image. In order to appropriately merge the images you would need to adjust for the perspective shifts which happen with the movement of the camera. Otherwise, every object which is viewed from multiple perspectives would appear as a ghost, weighted by the number of frames in which it was viewed from a given perspective. Although this is a very entertaining effect, it is hardly the point of image stitching. This will be discussed further later in the document.
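
    The bookkeeping for this whole step can be sketched in NumPy, assuming grayscale frames stored as 2D arrays and offsets rounded to whole pixels. The actual VIs operate on IMAQ image references, and Compute Origin and Location may handle the origin shift differently, so treat this only as an illustration of the idea.

        import numpy as np

        def stitch(compound, origin, frame, parent_abs_offset, parent_offset):
            """compound: composite so far; origin: (x, y) of the first frame inside it;
            parent_abs_offset: parent frame's offset from the virtual origin;
            parent_offset: this frame's offset from its parent (from Step 7)."""
            abs_off = (parent_abs_offset[0] + parent_offset[0],     # absolute offset from the
                       parent_abs_offset[1] + parent_offset[1])     # virtual origin
            x, y = origin[0] + abs_off[0], origin[1] + abs_off[1]   # position inside the compound
            h, w = frame.shape[:2]
            # Grow the canvas if the new frame falls outside it, shifting the origin when the
            # growth is toward negative coordinates.
            pad_left, pad_top = max(0, -x), max(0, -y)
            pad_right = max(0, x + w - compound.shape[1])
            pad_bottom = max(0, y + h - compound.shape[0])
            compound = np.pad(compound, ((pad_top, pad_bottom), (pad_left, pad_right)))
            origin = (origin[0] + pad_left, origin[1] + pad_top)
            x, y = x + pad_left, y + pad_top
            compound[y:y + h, x:x + w] = frame                      # merge: plain overwrite, no blending
            return compound, origin, abs_off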

Step 10: Display, Clean Up, Repeat

    The final step is more of a formality. The data is cleaned up, packaged in a format consistent with the Application Variables, and passed through to the next iteration. The pause and reset values are checked each iteration and do not affect the application variables (they need not be present, but provide a better environment for testing).

Conclusions

Overview

    Although the performance of the application varies with lighting and camera considerations, many hundreds of frames can be added to the image before any obvious distortion effects are introduced. The current implementation is a highly limited use case which does not fully utilize the architecture of the application. The primary example of this is the half-dimensional nature of the acquisition: you can only stitch in one direction along one axis. The architecture is designed to include a second axis and allow for any-direction stitching, but due to time limitations this implementation was not completed.

Further Work

    There are many possible improvements which could be made to the program to improve performance and add features. Some of the considered feature additions and performance improvements are listed here with small descriptions.

Feature Additions

    • Maximal Correlation and Approximation - Each frame could be computed to have a certain "velocity" which would allow for the approximation of the next frame's location. Once this approximation is made, the meta-data from all the frames within a certain area of that approximation, along with the previous frame (if excluded) could be used to correlate with the current image. In this way it would actually be possible to have images which wrap back on themselves and create complex higher-dimensional structures. This is required in order to do accurate multi-directional stitching.
    • Multi-Directional Stitching - Provide support for stitching in multiple directions. This would require correlating to more than just the previous frame image in order to get the accuracy desired. If you do not correlate to all the localized frames, you run a high probability of minor errors compounding, resulting in images which are very disjoint in the area of current acquisition. This is required in order to do accurate multi-axis stitching.
    • Multi-Axis Stitching - Although the capability exists within the architecture to do multi-axis stitching, there are many things which need to be considered. In particular, it would be wise to implement multi-directional stitching first so that the user does not have to pick a direction on multiple axes simultaneously. Once multi-directional stitching is implemented, multi-axis stitching would simply be a matter of including the extra variable(s).
    • Perspective Correction - When rotating the field of view of the camera, lines appear where the perspective shifts, so that everything looks a little squished in the compound image. This is due to shifts in perspective which could be corrected by applying interpolation steps to the correlated regions. Note that when the perspective shifts to the right, the correlated values on the left half of the image will be more widely spaced than those on the right. If a linear (or nonlinear) approximation of this distortion were created, one could warp the applied image and take full advantage of the resolution of the image. This would require a less strict application of outlier removal in order to get more data to perform the approximation. Required for alpha blending.
    • Alpha Blending - Alpha blending is a process by which two images can be applied on top of each other and blended in the regions where they overlap. It can be done by averaging each of the color planes in the regions in which the images overlap, but without perspective correction it produces a strong "ghosting" effect on each object in the image. In order to get an accurate alpha blend over a large set of images, it is important to keep track of how many times each pixel has been written to. Keeping a running average allows you to weight the importance of the most recent image relative to all the other images which have been applied to the same spot; in this way no one image can dominate an area simply because it is recent (a minimal sketch of such a running average follows this list). Once this step is applied, an interesting phenomenon appears: moving objects in a frequently stitched region will slowly be filtered out of the frame. This, of course, also opens the user up to problems with extreme compounding of errors over long run times.
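
    For illustration, a minimal sketch of such a running-average blend, assuming grayscale floating-point images, a separate per-pixel write-count array, and that perspective correction has already been applied:

        import numpy as np

        def blend_into(compound, counts, frame, x, y):
            """compound: float composite; counts: number of frames that have touched each pixel."""
            h, w = frame.shape[:2]
            region = compound[y:y + h, x:x + w]
            n = counts[y:y + h, x:x + w]
            # Running average: (old * n + new) / (n + 1), so no single frame can dominate
            # an area simply because it is the most recent.
            compound[y:y + h, x:x + w] = (region * n + frame) / (n + 1)
            counts[y:y + h, x:x + w] = n + 1
            return compound, counts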

Error Correction, Troubleshooting and Recommendations

    • Make sure the application Acquire and Process step is configured so it is acquiring from YOUR camera (or cam1 in MAX by default).
    • Try not to pick a camera which auto-adjusts brightness, or to perform stitching in an environment where the lighting is too dynamic. The algorithm operates on the identification of "objects" within the images, and changes in lighting can vastly affect the program's identification of particles, and thus the whole process. It would take fairly advanced methods to adjust for this dynamically.
    • Make sure you're calling Dispose on ALL IMAGES at the end of the application. The images are managed well, but they are not disposed at the end unless you call this function.
    • Keep in mind that the direction is determined based on the second image. Be liberal with the reset button for testing unless multi-directional functionality is added.
    • A LOT of images are being added, but the most recent image often dominates the frame, since the merge simply overwrites any overlapping area.
    • Don't be afraid to move the camera quickly. The application doesn't care if there is motion blur in the image and most cameras operate at an FPS high enough that you can move the camera VERY quickly.

 

Example code from the Example Code Exchange in the NI Community is licensed with the MIT license.
