Object Detection and Tracking using Modified Diamond Search Block Matching Motion Estimation Algorithm



Abstract
Object tracking is one of the main fields within computer vision. Among the various approaches to object detection and tracking, background subtraction makes detecting the object easier. The proposed block matching algorithm is then applied to the detected object to generate the motion vectors. The existing diamond search (DS) and cross diamond search (CDS) algorithms are studied, and experiments are carried out on various standard video data sets and user-defined data sets. Based on the study and analysis of these two algorithms, a modified diamond search (MDS) algorithm is proposed that uses a small diamond shape search pattern (SDSP) in the initial step and a large diamond shape search pattern (LDSP) in further steps for motion estimation. The initial search pattern consists of five points in a small diamond shape and grows into a large diamond shape pattern centered on the point with the minimum cost function. The algorithm ends with the small diamond shape pattern. The proposed MDS algorithm finds smaller motion vectors with fewer search points than the existing DS and CDS algorithms. Object detection is carried out using the background subtraction approach, and the MDS motion estimation algorithm is then used to track the object in color video sequences. The experiments use different video data sets, each containing a single object. The results are evaluated and compared using the average search points per frame and the average computational time per frame. The experimental results show that MDS performs better than DS and CDS on both average search points and average computation time.

I. Introduction
From the viewpoint of tracking in videos, a block matching algorithm can be used for motion estimation and object tracking. Detecting the object is typically the first step of the tracking process. Detection is carried out either in every frame of the sequence or when the object first appears in the video [1,2]. Background subtraction subtracts the second (current) frame from the first, background (reference) frame, which yields the dissimilarity between the two frames and the position of the moving object [3,4]. Object tracking is the process of locating the object of interest in a video sequence by continuously monitoring its motion. Each frame can be divided into two sets of objects, foreground and background [5]. An adaptive background subtraction technique can also be used for object detection [6]. The complete process of object tracking can be divided into three steps: object detection, object classification, and object tracking [7]. Many applications of object tracking have been implemented using different approaches and techniques; video surveillance is the most basic and widely used [8].

In this work, a block matching algorithm is used to track an object. Block matching algorithms for motion estimation locate similar blocks in a sequence of video frames in order to estimate motion vectors. Motion estimation is the process of finding the motion vectors from all frames in a video sequence [9]. A block matching algorithm aims to find, for a block in one frame, a matching block in another frame. It partitions the current frame into a number of macro blocks and compares each macro block with the corresponding macro block in the reference frame. A vector is created that maps the movement of a macro block from one location to another [10]. These motion vectors give the displacement of the block, which can be used for object tracking. Block matching also helps remove redundancy between frames. The process can also be carried out in a real-time environment with suitable hardware [11,12]. Matching is performed by minimizing a matching criterion, here the Mean of Absolute Difference (MAD); the motion vector points to the best-matched block, which gives the minimum block-matching distortion (BMD) [13].

There are different block matching algorithms (BMAs), some of which are advanced versions of the already existing BMAs [14]: Three Step Search (TSS); novel four step search [15]; New Three Step Search (NTSS) [16]; Simple and Efficient Search; Four Step Search (FSS) [17]; Diamond Search (DS) [18]; and Cross Diamond Search (CDS) [19]. In this paper, the modified diamond search (MDS) algorithm is presented, which combines the capabilities of DS and CDS. It starts with a small cross shape in the first step, which grows into the large diamond shape pattern (LDSP). Estimating the motion vector (MV) with smaller blocks can also make the tracking process faster [20]. Beyond tracking, block matching algorithms are used in video compression [21], in block matching motion estimation using automata theory for fractal coding [22,23], and in a modified Three Step Search (TSS) algorithm using fractal coding [24].

The rest of this paper is organized as follows. Section II gives the algorithm and analysis of DS and CDS. Section III describes the proposed MDS algorithm. Section IV reports the experimental results. Section V gives the conclusion and future scope.
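
The MAD matching criterion named in the introduction can be illustrated with a short sketch. It is not the authors' implementation (their experiments use MATLAB); the function name `mad`, the list-of-rows frame layout, and the choice to return infinity for candidate blocks that leave the frame are assumptions made for illustration.

```python
def mad(cur, ref, bx, by, dx, dy, n=16):
    """Mean of Absolute Difference between two n x n blocks.

    cur, ref : luminance frames as lists of rows (assumed layout).
    (bx, by) : top-left corner of the block in the current frame.
    (dx, dy) : candidate displacement of the block in the reference frame.
    Returns infinity when the displaced block leaves the reference frame
    (an assumption; other border policies are possible).
    """
    height, width = len(ref), len(ref[0])
    rx, ry = bx + dx, by + dy
    if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
        return float("inf")
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total / (n * n)
```

A block matching search evaluates this cost at a set of candidate displacements and keeps the displacement with the smallest value as the motion vector.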

II. DS and CDS Algorithm
Among the existing block matching algorithms, the behavior of diamond search and cross diamond search is studied. These two algorithms are selected because, according to the surveyed literature, they require fewer search points than the other algorithms. The steps of the DS and CDS algorithms are explained in detail below.

A. Diamond Search Algorithm
The steps of the DS algorithm, with its different cases, are explained below.
Step 1: Start at the center of the search window by placing the 9 search points of the LDSP. Check all 9 search points. If the minimum cost is found at the center of the search window, go to Step 3; otherwise go to Step 2.
Step 2: The minimum cost point found in the previous step becomes the new center, and a new LDSP is formed around it. If the new minimum cost point is located at the center position, go to Step 3; otherwise repeat Step 2.
Step 3: Switch the search pattern from LDSP to the Small Diamond Search Pattern (SDSP). The minimum cost point found in this step is the final solution for the motion vector, which points to the best matching block.
The diamond search pattern has different cases depending on where the minimum cost point lies; each location is processed differently, as shown in Fig. 1.
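
The three DS steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `diamond_search` and the callback `cost(dx, dy)` are hypothetical names, the cost callback stands in for a MAD evaluation of one candidate displacement, and already evaluated points are cached so that overlapping patterns are not counted twice.

```python
# Offsets of the large and small diamond patterns, center listed first so
# that ties are resolved in favor of the center.
LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]

def diamond_search(cost, w=7):
    """Return (motion_vector, number_of_distinct_points_checked).

    cost(dx, dy) is a hypothetical callback giving the matching cost of a
    displacement; w is the search range (-w to +w).
    """
    cache = {}

    def evaluate(point):
        if abs(point[0]) > w or abs(point[1]) > w:
            return float("inf")          # outside the search window
        if point not in cache:
            cache[point] = cost(point[0], point[1])
        return cache[point]

    def best_of(center, pattern):
        candidates = [(center[0] + ox, center[1] + oy) for ox, oy in pattern]
        return min(candidates, key=evaluate)

    center = (0, 0)
    while True:                          # Steps 1 and 2: repeat the LDSP
        best = best_of(center, LDSP)
        if best == center:
            break
        center = best
    return best_of(center, SDSP), len(cache)   # Step 3: final SDSP
```

For a stationary block the sketch checks the 9 LDSP points and 4 new SDSP points, which agrees with the best case of 13 search points reported for DS.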

B. Cross-diamond Search Algorithm
The cross diamond search (CDS) pattern places a cross search pattern (CSP) over the diamond search (DS) pattern, as shown in Fig. 2. The CDS algorithm works as follows:
Step 1: The cost of each of the 9 search points of the CSP is computed. If the minimum cost point is found at the center of the CSP, stop the search as shown in Fig. 3(a); otherwise go to Step 2.
Step 2: The 2 search points closest to the current minimum cost point of the central CSP are checked. If the minimum cost point from Step 1 is one of the four CSP points adjacent to the center, and the new minimum cost point found in this step coincides with it, stop the search as in Fig. 3(b); otherwise go to Step 3.
Step 3: The minimum cost point found in the previous step becomes the center of a new LDSP. If the new minimum cost point is still at the center of the newly formed LDSP, go to Step 4; otherwise repeat this step.
Step 4: A new SDSP is formed with the minimum cost point from the previous step as its center. The new minimum cost point found in this step is the final solution for the motion vector.

The experiments use the eight video sequences listed in Table I, among which the first four, "Traffic", "Pet", "Walk", and "Ant", are well-known sequences and the remaining four, "Bottle1", "Bottle2", "Walk1", and "Walk2", are user-defined video data sets. All these data sets contain only a single object moving throughout the frames, and the camera is stationary in all of them. The sequences "Traffic", "Walk", and "Walk1" involve higher motion than the other videos; the remaining sequences involve gentle or slower motion. The first 10 frames of each video sequence are considered for analysis. The DS and CDS algorithms are run on 16x16 macro blocks with a search range (w) of -7 to +7, and the motion vectors are found using MAD as the block matching criterion. The search points per frame and the computational time per frame required by DS and CDS on the eight video sequences are given in Table II and Table III.

The diamond search algorithm starts by initializing nine points at the center of the search window, arranged in a diamond shape pattern on a 5x5 grid so that the search points cover both directions (up and down). It then carries out the steps of the algorithm and finds the motion vectors. Diamond search requires 13 search points in the best case and 30 search points in the worst case, so on average it ranges between 13 and 30 search points. Table II lists the search points and computational time obtained with the diamond search algorithm on the video sequences.

The cross diamond search algorithm also initializes nine points in the first step, but it differs from diamond search in the shape of the pattern: the search points are arranged to form a cross-like structure (+). CDS requires 9 search points in the best case and 29 search points in the worst case, so on average it ranges between 9 and 29 search points. Table III lists the search points and computational time obtained with the cross diamond search algorithm on the video sequences.

The obtained results are studied and analyzed to guide the design of the modified diamond search algorithm. Fig. 4 and Fig. 5 plot the results for DS and CDS, respectively, in terms of search points and computational time. The main aim is to design an algorithm that matches or improves on the performance of DS and CDS in terms of the search points required and thereby speeds up the tracking process.

III. Proposed Work
The proposed work is concerned with designing an algorithm that matches or improves on the performance of DS and CDS in terms of the search points required, and thereby speeds up the tracking process. The tracking process becomes faster if the number of search points required by the algorithm is reduced.
The flowchart of the object tracking system in Fig. 6 helps visualize the stepwise working of the proposed approach. It represents the process for single object tracking using block matching motion estimation techniques.
• As a preprocessing step, the video is converted into the YUV color space, where only the "Y" component (luminance) is considered because the human eye is most sensitive to it.
• The video is then divided into frames. Only two frames are processed at a time: one is the current frame and the other is the reference frame.
• The background subtraction method for object detection is applied to all frames of the video sequence. In background subtraction, the current frame is subtracted from the reference frame. The result of the subtraction is the object in the foreground.
• After the object is detected using background subtraction, the Block Matching Algorithm (BMA) is applied.
• BMA divides the frame into a number of macroblocks.
• BMA calculates the motion on a block-by-block basis. For every block of the current frame, a matching block is sought in the previous frame according to a matching criterion (here MAD is used to find the best match).
• A vector is created that keeps track of the movement of a macroblock between locations. The motion vector gives the displacement of the block in terms of MVx and MVy.
• Using all these motion vectors (MVx, MVy), the object is tracked through the video.
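
The preprocessing and detection steps in the list above can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the luminance weights are the ITU-R BT.601 ones (the paper does not state which YUV conversion was used), the threshold value 25 is arbitrary, and the function names `luminance`, `foreground_mask`, and `bounding_box` are hypothetical.

```python
def luminance(rgb_frame):
    """Convert a frame of (R, G, B) tuples to its Y component.

    Uses BT.601 luma weights; this choice is an assumption.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_frame]

def foreground_mask(current, reference, threshold=25):
    """Background subtraction: mark pixels whose luminance differs from the
    reference frame by more than a threshold (threshold is an assumption)."""
    return [[abs(c - r) > threshold for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(current, reference)]

def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the foreground pixels,
    or None when no pixel is marked as foreground."""
    xs = [x for row in mask for x, is_fg in enumerate(row) if is_fg]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```

The bounding box delimits the region in which block matching is then applied; the resulting motion vectors move the box from frame to frame.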

A. Diamond Search Pattern
The diamond search algorithm uses a Large Diamond Shape Pattern (LDSP) and a Small Diamond Shape Pattern (SDSP), as shown in Fig. 7. The pattern in Fig. 7(b) is used as the initial step of the Modified Diamond Search (MDS) algorithm, followed by the pattern in Fig. 7(a).

1) The Modified Diamond Search Algorithm
The steps of the MDS algorithm are as follows:
Step 1: Start with the five points of the SDSP at the center of the search window and test these 5 checking points. If the minimum cost point is located at the center position, stop the search. Otherwise, go to Step 2.
Step 2: The minimum cost point found in Step 1 becomes the center point of a new LDSP. If the new minimum cost point is located at the center position, go to Step 3; otherwise repeat this step until the minimum point occurs at the center of the LDSP, and then go to Step 3.
Step 3: With the new minimum point as the center, switch the search pattern from LDSP to SDSP. The minimum cost point found in this step is the final solution for the motion vector, which points to the best matching block.

Fig. 9 shows the various cases of the MDS algorithm. MDS starts by checking only five points in a 3x3 window in the first step and nine points on the 5x5 search window in the second step, whereas DS and CDS check nine search points in their first step. This characteristic reduces the computation of MDS relative to DS and CDS. The design of MDS also helps in finding smaller motion vectors more efficiently. MDS behaves like DS only if the minimum BMD is not found at the center in the first step; it then moves between successive LDSPs, which adds three or five new search points each time. Thus the total number of search points is 5 in the best case and varies from 5 to (5 + 8) = 13 in the worst case; on average it lies between 5 and 13.
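
The MDS steps can be sketched in the same style. As before, this is a sketch under assumptions rather than the authors' implementation: `modified_diamond_search` and the callback `cost(dx, dy)` are hypothetical names, and evaluated points are cached so that overlapping patterns are counted once.

```python
# Pattern offsets, center listed first so that ties favor the center.
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]
LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]

def modified_diamond_search(cost, w=7):
    """Return (motion_vector, number_of_distinct_points_checked).

    cost(dx, dy) is a hypothetical callback giving the matching cost of a
    displacement; w is the search range (-w to +w).
    """
    cache = {}

    def evaluate(point):
        if abs(point[0]) > w or abs(point[1]) > w:
            return float("inf")          # outside the search window
        if point not in cache:
            cache[point] = cost(point[0], point[1])
        return cache[point]

    def best_of(center, pattern):
        candidates = [(center[0] + ox, center[1] + oy) for ox, oy in pattern]
        return min(candidates, key=evaluate)

    # Step 1: five-point SDSP at the window center; stop on a center minimum.
    center = (0, 0)
    best = best_of(center, SDSP)
    if best == center:
        return center, len(cache)

    # Step 2: re-center an LDSP on the minimum until it stays at the center.
    center = best
    while True:
        best = best_of(center, LDSP)
        if best == center:
            break
        center = best

    # Step 3: final SDSP around the last center gives the motion vector.
    return best_of(center, SDSP), len(cache)
```

For a stationary block the sketch stops after the 5 points of Step 1, which is the best case stated above; larger displacements fall through to the LDSP and final SDSP stages.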

B. Analysis of MDS Algorithm
The MDS algorithm differs from other fast BMAs in three ways: (i) the search patterns used in MDS have the minimum number of points; (ii) directional search patterns are used; and (iii) the switching strategy of the diamond search patterns is adopted in the last stage. MDS starts with a small search pattern, grows into a large shape, and shrinks back to the small shape, replacing the strategies used in the cross search pattern and the simple diamond pattern.

IV. Experimental Results

A. Experimental Results on MDS
The proposed MDS algorithm is simulated and tested on various standard video sequences and user-created video sequences containing different types of motion. For all video sequences, the Mean of Absolute Difference (MAD) is used as the block matching criterion, with a block size of 16x16 and a search window of size 7 (i.e. w = -7 to +7). Each video sequence used in the experiments contains only a single object moving throughout the frames, and the camera is stationary in all of them. The sequences "Traffic", "Walk", and "Walk1" involve higher motion than the other videos; the remaining sequences involve a slow-moving object. The results of the modified diamond search algorithm are listed in Table IV and represented graphically in Fig. 10. MDS shows a clear improvement over DS and CDS. The modified diamond search algorithm is compared against the existing diamond search and cross diamond search algorithms in terms of search points.

Table V compares the search points of the different block matching algorithms. It shows that MDS always uses fewer search points than DS and CDS. Compared to DS, CDS saves approximately 1.98 to 2.24 search points. Compared to CDS, MDS in turn saves 0.36 to 3.55 search points, so it uses fewer search points than either of the other two BMAs. The average search points per frame follow the order MDS < CDS < DS for the video sequences with w = -7 to +7. For the sequence "Traffic", MDS saves 0.36 and 3.02 search points per frame compared to CDS and DS respectively. For "Pet" it saves 0.37 to 3.44 search points, for "Walk" 3.55 to 3.62, for "Ant" 1.2 to 3.44, for "Bottle1" 0.79 to 2.89, for "Bottle2" 0.72 to 3.21, and for "Walk1" and "Walk2" 1.88 to 3.61 and 1.38 to 3.48 search points per frame.

These savings follow from the first step of MDS, which covers a 3x3 search area with 5 search points instead of the 9 search points used by DS and CDS. In this respect MDS outperforms the existing diamond search and cross diamond search, which use a larger search area and more search points in the first step. The smaller shape of MDS helps in finding small motion vectors (MVs) efficiently. MDS starts with a small search pattern, grows into a large shape, and shrinks back to the small shape, replacing the strategies used in the cross search pattern and the simple diamond pattern. Thus, MDS performs better than DS and CDS. Fig. 11 and Fig. 12 plot the search points used on each frame of the sequences "Traffic" and "Bottle1" respectively. These two figures show that MDS uses fewer search points than DS and CDS. Fig. 13 compares the performance of DS, CDS, and MDS in terms of average search points.

B. Experiment Results using MDS for Object Tracking
This research applies the Modified Diamond Search (MDS) block matching algorithm to track an object. The block matching motion estimation algorithm locates similar blocks in a sequence of video frames in order to estimate motion vectors. The purpose of a block matching algorithm is to find, for a block in one frame, a matching block in another frame. Block matching partitions the current frame into a number of macro blocks and compares each macro block with the corresponding block. A vector is created that maps the movement of a macro block from one location to another. These motion vectors give the displacement of the block, which can be used for object tracking. In preprocessing, the RGB video sequence is converted into YUV space, and all experiments use the luminance of the video sequence. Background subtraction is performed to detect the object, and the MDS algorithm is then applied to it: the motion vectors are calculated and the object is tracked. Fig. 14 shows the stepwise results of the tracking. The final tracked object is marked by the blue bounding box.

C. Results of Object Tracking
The experiments start with background subtraction and then apply the Modified Diamond Search algorithm to all the video sequences to obtain the results. The first 20 frames of the "Walk1" sequence are processed, and the stepwise output is shown in Fig. 15. The first image is the original video frame, followed by its YUV-converted frame. The third image shows the output after background subtraction, and the last image shows the result after object tracking.

D. Simulator
The proposed approach is implemented in MATLAB 8.1.0.604 (R2013a). The experiments are carried out on an Intel(R) Core(TM) 2 Duo T6570 processor at 2.10 GHz with 4 GB of RAM, running a 32-bit Windows 7 operating system.

V. Conclusion and Future Scope
In this paper, a modified diamond search (MDS) algorithm is proposed that combines the capabilities of the diamond search (DS) and cross diamond search (CDS) block matching algorithms for finding motion vectors between two macro blocks. These motion vectors are then used to track the object in a video sequence. The proposed algorithm uses a 3x3 area in the first step and the diamond shape in the further steps. The aim of the work is an algorithm that matches or improves on the performance of DS and CDS in terms of the search points required and thereby speeds up the tracking process, since tracking becomes faster when fewer search points are needed. The experimental results show that the proposed algorithm requires fewer search points than the DS and CDS algorithms. Hence, based on the experimental results, this research concludes that the Modified Diamond Search algorithm performs better than DS and CDS.