A Novel Hybrid Approach for Fast Block Based Motion Estimation

Abstract
The current work presents a novel hybrid approach for motion estimation of video sequences, with the aim of speeding up the entire process without affecting accuracy. The method integrates the dynamic zero motion pre-judgment (ZMP) technique with initial search center (ISC) prediction, half way search termination and the small diamond search pattern. Unlike most previous approaches, the calculation of the initial search centers is performed after zero motion pre-judgment, so that search centers need not be identified for stationary blocks. Proper identification of the ISC removes the need for a fast block matching algorithm (BMA) to find the motion vectors (MV); a fixed search pattern such as the small diamond search pattern is sufficient. Half way search termination is also incorporated into the algorithm. It decides whether the predicted ISC is the actual MV, which further reduces the number of computations. Simulation results of the complete hybrid approach are compared with other standard methods in the field. The method presented in the manuscript ensures better video quality with fewer computations.

3 Professor, GGSIP University, Dwarka, Sector 16 C, New Delhi (India)

I. Introduction
Rapid growth in the use of video based applications in broadcast and entertainment media has led to an overwhelming need to compress video data. As a result, many approaches to video compression have emerged. Block based motion estimation is the most prevalent among the various techniques for motion estimation (ME). Because of the computational complexity of the ME process, extensive research has been conducted in the field over the last two decades. The popularity of block based ME can be attributed to the simplicity of these algorithms and their ease of hardware implementation. As a result, these algorithms have been used in many video coding standards, including MPEG4 and H.264.
Block based motion estimation reduces the temporal redundancy across frames by matching the blocks in the current frame to the blocks in the reference frame within a specified search window. The brute force approach, known as full search (FS) [1], matches all possible candidates in the search window. This approach provides the optimum results but increases the computational overhead. Research then shifted to finding the best match with a limited number of checking points in the search window. Three step search (TSS) [2], new three step search (NTSS) [3], four step search (4SS) [4] and diamond search (DS) [5] are well known algorithms that achieve high PSNR values with fewer computations. The main problem with all these approaches is quality degradation caused by the search process becoming trapped in local minima, since they use a fixed search pattern. Adaptive rood pattern search (ARPS) [6] addressed this problem by choosing the search pattern in accordance with the estimated behavior of the current block. Thereafter many algorithms have been proposed in this category, which decreased the number of search points via dynamic search paths. In all block matching algorithms, motion is estimated by locating the best match for the current block. The matching criterion used most widely, and in the current work, is the sum of absolute differences (SAD), which is minimized in order to maximize the performance, measured as peak signal to noise ratio (PSNR).
The development of various fixed and adaptive search pattern based algorithms has reduced the computational burden, but at the cost of video quality. The main aim of any motion estimation algorithm is to reduce the number of computations without deteriorating the video quality. Zero motion prejudgment (ZMP) and initial search center (ISC) prediction have proven beneficial in accelerating motion estimation. ZMP identifies the stationary blocks before the actual motion vector is calculated and thus saves the computations needed for the motion vectors of stationary blocks. ISC prediction, on the other hand, finds an initial location in the search window so that a refined search is carried out around this point instead of the center of the search window.
The block matching algorithms discussed above use the center of the search window as the starting point for finding the best matching block. It has been observed that spatial as well as temporal coherence exists between adjacent neighboring blocks, and hence the motion vector of the current block can be predicted from the motion information of the neighboring blocks. This predicted location is expected to be in the region of the global minimum, which reduces the number of search steps needed to attain the global minimum and thus the number of computations for motion estimation. Further, accurate determination of ZMP and ISC improves the accuracy of the motion vectors, which in turn improves the quality of the regenerated frame at the receiver end. Hence the bit coding error, which refers to the difference between the actual frame and the regenerated frame at the receiver, should be reduced.
In the current work we have used a hybrid approach for fast block matching motion estimation. The idea is to firstly identify the stationary blocks and stop the search process for these blocks. For this purpose the dynamic threshold prediction technique as given in [7] has been used. The technique is not only simple but is also efficient in identifying the number of stationary blocks and thus helps to reduce the decision error.
After the identification of stationary blocks, initial search centers have been predicted with an approach as given in [8]. The advantage of this approach lies in its precise and accurate prediction of initial search centers which aids in speeding up the entire process of motion estimation.
The novelty of the proposed hybrid scheme is established by the increased PSNR, SSIM and search efficiency in comparison with various state-of-the-art algorithms in the field of fast block matching motion estimation. The number of computations is also reduced when compared with the other standard methods.
This paper is organized as follows. Section 2 describes the concept of zero motion pre-judgment (ZMP) and the dynamic threshold estimation technique for ZMP. The general concept of initial search center (ISC) prediction, along with the method of predicting the ISC, is given in Section 3. Brief introductions to half way search termination and the small diamond search algorithm are given in Sections 4 and 5 respectively. The detailed hybrid algorithm is presented in Section 6. Simulation results, along with the analysis and comparisons, are shown in Section 7. Section 8 concludes the presented work.

II. Zero Motion Pre-judgment
Zero motion pre-judgment has been used extensively in the literature to identify stationary blocks early in video sequences so as to save unnecessary computations. It has been established in [6] that the block distortion of stationary blocks is much lower than that of moving blocks, which plays a key role in identifying stationary blocks. The SAD between the current block and its collocated block in the reference frame represents the block distortion, and this SAD value is compared with a predetermined threshold to detect stationary blocks. Different approaches in the literature have used different thresholds for ZMP.
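The threshold test described above can be illustrated with a minimal Python sketch. The function names, the frame layout (lists of rows of luma values) and the toy data are assumptions made for illustration only; the default threshold of 512 is the fixed value attributed to [6] in this section.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def is_stationary_fixed(cur, ref, bx, by, n, threshold=512):
    """Fixed-threshold ZMP: the block is taken as stationary when the SAD
    against its collocated block (zero displacement) is below the threshold."""
    return sad(cur, ref, bx, by, 0, 0, n) < threshold


# Tiny 4x4 frames (hypothetical data): every reference pixel differs by 1.
cur = [[10, 10, 10, 10] for _ in range(4)]
ref = [[11, 11, 11, 11] for _ in range(4)]
print(sad(cur, ref, 0, 0, 0, 0, 4))            # 16
print(is_stationary_fixed(cur, ref, 0, 0, 4))  # True
```

No bounds checking is done here, so the caller is assumed to keep the displaced block inside the reference frame.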
The concept of fixed threshold based zero motion pre-judgment was first used by Nie and Ma [6] in their adaptive rood pattern search algorithm. This approach uses a fixed threshold of 512, but with this threshold a large number of moving blocks can be detected as stationary, especially in slow motion sequences. A fixed threshold has also been used by Luo et al. [9], along with a search priority assigned to each point. The disadvantages of a fixed threshold have led to the use of dynamic thresholds. Ahmed et al. [10] used an adaptive threshold determined by finding the highest or lowest of the SAD values of the adjacent macroblocks (MBs), based on specified conditions. Ismail et al. [11] used three level thresholds on the basis of three categories of SAD values. A dynamic early stop termination technique is also proposed in [11], which dynamically updates the threshold by using the following equation [11]: (1) where SAD_{0,0avg} is the average SAD of all the previous stationary blocks, λ is used to slow down or accelerate the ME process and ε is empirically taken as zero.
Two static thresholds based on motion content have been given by Lin et al. [12], determined from experimental results.
In yet another advancement in predicting the threshold adaptively, Ismail et al. [13,14] have given a formula based on the average SAD of all the stationary blocks. The threshold value T_s is given as [13,14]: (2) where the parameters are α = 0.75 and β = 128.
But these thresholds do not guarantee accurate results. A further refined dynamic threshold estimation technique given in [7] is based on the following observations:
1. A block whose SAD is below a particular threshold is not necessarily a stationary block.
2. The SAD value of a stationary block with respect to its collocated block is the least when compared with the SAD values of the stationary block with respect to its vertical and horizontal neighbors taken in the reference frame.
These drawbacks have been alleviated by the two level threshold estimation technique given in [7]. We have used only a single level of the technique and incorporated it in the proposed hybrid ME technique. The reason is that [7] addresses only ZMP, whereas here other techniques are used along with ZMP to speed up motion estimation. Using both levels adds considerable complexity in terms of the number of computations, and a single level has been found sufficient when combined with the other techniques. Here T_1 is determined by modifying equation (2) defined above: 256 is taken instead of 512 in the max operator, so that moving blocks with small distortions in slow motion sequences can be identified correctly. SAD_c is the SAD between the current block and its collocated block in the reference frame; SAD_l, SAD_r, SAD_t and SAD_b are the SADs between the current block and its left, right, top and bottom neighboring blocks in the reference frame. SAD_a represents the average distortion, which is initially given a value of 512, following the fixed threshold results in [6], so as to find the first stationary block. This value is then assigned the SAD value of the first stationary block encountered and updated thereafter. The max operator helps to track changes in SAD_a. SAD_a is updated based on the difference between SAD_c and T_1: if this difference is greater than α, SAD_c is not considered for updating the average distortion SAD_a. As a consequence, very large or very small distortion values of the current stationary block do not affect the average variation of the threshold. Parameter α is taken as 0.75 and β as 128.
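The single-level stationary test built on the two observations above can be sketched as follows. This is an illustration under stated assumptions: the function name and the sample SAD values are hypothetical, and the threshold T_1 is passed in as a plain number because its update rule is not reproduced here.

```python
def is_stationary(sad_c, sad_l, sad_r, sad_t, sad_b, t1):
    """Single-level ZMP test: the collocated SAD must be below the dynamic
    threshold T_1 and must also be the least of the five SAD values."""
    return sad_c < t1 and sad_c == min(sad_c, sad_l, sad_r, sad_t, sad_b)


# Hypothetical SAD values with T_1 = 256.
print(is_stationary(100, 400, 380, 420, 390, t1=256))  # True
print(is_stationary(200, 150, 380, 420, 390, t1=256))  # False: left neighbor matches better
print(is_stationary(300, 400, 380, 420, 390, t1=256))  # False: above the threshold
```

The second call shows why the neighbor comparison matters: a block with a SAD below the threshold is still rejected when a neighboring position in the reference frame gives a lower distortion.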

III. Initial Search Center (ISC) Prediction
In most of the recent approaches using ZMP and ISC for fast motion estimation, initial search centers are identified before the stationary blocks. But using ZMP as a post processing step to ISC leads to the identification of search centers even for the stationary blocks. For this reason, ISC prediction is performed after ZMP in the current work.
Initial search center prediction helps in attaining the actual MV faster. The ISC is predicted on the notion that there are many similarities between neighboring video frames. These similarities may be spatial or temporal. Figure 1 shows the temporal and spatial neighboring relations.
Thus the current block exhibits motion similar to that of its surrounding blocks, and its motion can be predicted from the motion of the neighboring blocks in the current frame and in the temporally adjacent frame. Various methods have been proposed in the literature for finding the ISC. A tabulated summary of these methods is given in [8].
The method used for ISC prediction in the current approach is the one given in [8]. This method has the following advantages over the previous methods:
1. The method makes use of the future points from the reference frame to account for the fact that motion of an object is possible in any of the neighboring directions. No method in the literature has used this concept.
2. The method works in two stages. The first stage finds the suitable MVs, whereas the second stage finds the best among the previously found MVs.
The procedure used in [8] for finding the ISC is as follows:
(1) Find an initial estimate of the motion vector, denoted by MPISC, as the median of the neighboring MVs:
$$MPISC = \mathrm{median}(c_1, c_2, c_3, c_4, r_0, r_5, r_6, r_7, r_8)$$
Find the variation of MPISC with respect to each of the neighboring MVs (9 blocks):
$$V_i = |MPISC_x - MV_{i,x}| + |MPISC_y - MV_{i,y}|$$
where MV_i is the i-th neighboring MV (c_i or r_i). The blocks for which V_i > T_2 are the suitable blocks for further processing, where T_2 = 2. These candidate blocks are denoted by CISC_i (candidates for ISC).
(2) Find the SAD of the current block with MPISC and with all the CISC_i. The minimum SAD indicates the most probable direction of movement. Therefore the ISC is assigned the MV of the macroblock with the least SAD.
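The two-stage procedure above can be sketched in Python as follows. This is a minimal illustration, not the implementation of [8]: the function name `predict_isc`, the callable `sad_at(dx, dy)` returning the SAD of the current block at a displacement, and the toy data are all assumptions, and the median is taken separately over the x and y components, which is also an assumption.

```python
from statistics import median


def predict_isc(neighbour_mvs, sad_at, t2=2):
    """Predict the initial search center from nine neighbouring MVs.

    neighbour_mvs: list of (x, y) motion vectors of the neighbouring blocks.
    sad_at:        hypothetical callable giving the SAD at displacement (x, y).
    """
    # Stage 1: median predictor (component-wise, an assumption).
    mp = (median(mv[0] for mv in neighbour_mvs),
          median(mv[1] for mv in neighbour_mvs))
    # Neighbouring MVs far from the median (V_i > T_2) become candidates.
    candidates = [mv for mv in neighbour_mvs
                  if abs(mp[0] - mv[0]) + abs(mp[1] - mv[1]) > t2]
    # Stage 2: the ISC is the point with the least SAD among MPISC and CISCs.
    return min([mp] + candidates, key=lambda mv: sad_at(mv[0], mv[1]))


# Toy example: eight neighbours move very little, one moves by (4, 3),
# and the (hypothetical) SAD surface has its minimum at (4, 3).
mvs = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 0), (1, 0), (4, 3), (0, 0), (1, 1)]
print(predict_isc(mvs, lambda dx, dy: abs(dx - 4) + abs(dy - 3)))  # (4, 3)
```

In this example the median predictor is (1, 0), and only (4, 3) lies far enough from it to become a candidate; the SAD comparison then selects it as the ISC.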

IV. Half-way Early Search Termination
The predicted ISC can itself be the position of the actual MV. If this is detected early, the search can be terminated early. To do so, the SAD value at the predicted ISC is checked. If it is below a predefined threshold T_d, the current block is assumed to have high correlation with that particular neighboring block. The MV of the best matched neighboring block is then declared as the MV of the current block and the search is terminated, which saves a large number of computations. The threshold T_d in the proposed approach is taken to be the same as T_1 calculated above.

V. Small Diamond Search (SDS) Algorithm
Once the initial search center is predicted with the proposed technique, there is a high probability that it lies near the global minimum. The actual MV can therefore be obtained by using a small fixed search pattern to perform a refined search, rather than a fast BMA. Two types of small fixed search patterns have been defined in the literature: the four point pattern used in small diamond search (SDS) [15] and the eight point square pattern used in block based gradient descent search (BBGDS) [16]. We have used SDS rather than BBGDS to perform the refined search for the MV. This choice is based on the comparative analysis of SDS and BBGDS given by Nie and Ma [6], which clearly indicates that the performance of both algorithms, in terms of PSNR, is almost the same, whereas BBGDS incurs 40-80% more complexity in terms of the number of calculations. Figure 2 shows the two fixed search patterns. With the ISC and SDS, the minimum distortion point (MDP) is obtained, which is then taken as the new search center. This recursive procedure continues until the MDP is the center of the fixed SDS pattern or the search window boundary is met.
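The recursive refinement described above can be sketched as follows. The function name, the callable `sad_at(dx, dy)` and the quadratic toy cost are assumptions for illustration; the stopping rule follows the text (MDP at the pattern center, or search window boundary reached). The sketch also counts the checked points, revisited points included, since the number of search points is the complexity measure used later.

```python
def small_diamond_search(sad_at, start, search_range=7):
    """Refine `start` with the four-point small diamond pattern.

    Returns (motion_vector, best_sad, points_checked).
    """
    cx, cy = start
    best = sad_at(cx, cy)
    checked = 1
    while True:
        nxt = (cx, cy)
        for dx, dy in ((0, -1), (-1, 0), (1, 0), (0, 1)):
            x, y = cx + dx, cy + dy
            if abs(x) > search_range or abs(y) > search_range:
                continue  # never test points outside the search window
            checked += 1
            cost = sad_at(x, y)
            if cost < best:
                best, nxt = cost, (x, y)
        on_boundary = search_range in (abs(nxt[0]), abs(nxt[1]))
        # Stop when the MDP is the pattern center or the boundary is met.
        if nxt == (cx, cy) or on_boundary:
            return nxt, best, checked
        cx, cy = nxt


# Hypothetical smooth cost surface with its minimum at (3, -2).
print(small_diamond_search(lambda x, y: (x - 3) ** 2 + (y + 2) ** 2, (1, 0)))
# ((3, -2), 0, 21)
```

Starting two or three pixels away from the minimum already costs 21 evaluations here, which illustrates why an accurate ISC matters: starting at the minimum itself needs only the center plus four neighbors.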

VI. Proposed Hybrid ME Algorithm
The proposed hybrid algorithm, based on ZMP, ISC, half way search termination and SDS, works in the following steps:
1. Find SAD_c between the current block and its collocated block in the reference frame. If SAD_c < T_1 AND SAD_c is equal to min(SAD_c, SAD_l, SAD_r, SAD_t, SAD_b), the block is declared a stationary block, the search is terminated and control goes to step 7; otherwise go to step 2.
2. Find MPISC = median(c_1, c_2, c_3, c_4, r_0, r_5, r_6, r_7, r_8) and identify the points that are distant from MPISC using V_i = abs(MPISC_x - MV_ix) + abs(MPISC_y - MV_iy), where MV_i is the neighboring MV c_i or r_i. The points for which V_i is above a threshold are the candidate points for ISC (CISC) prediction.
3. Compute the SAD of the current block C_0 with MPISC and the CISCs and find the minimum SAD. Declare the ISC as the point corresponding to the minimum SAD.
4. Check whether the ISC could be the location of the actual MV by comparing its SAD with the predicted dynamic threshold. If true, declare the position of the ISC as the MV for the current block and go to step 7; otherwise go to the next step.
The remaining steps apply the SDS refinement of Section V around the ISC, taking each MDP as the new search center until the MDP is the center of the SDS pattern or the search window boundary is met, and then exit (step 7).
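The steps above can be put together for a single block as in the following sketch. It is an illustration under stated assumptions rather than the authors' implementation: `estimate_mv`, the callable `sad_at(dx, dy)`, the tuple of four neighbor SADs and the returned stage labels are hypothetical, T_1 is passed in as a number, the median is taken component-wise, and the diamond refinement is written from the description in Section V.

```python
from statistics import median

DIAMOND = ((0, -1), (-1, 0), (1, 0), (0, 1))  # four-point SDS pattern


def estimate_mv(sad_at, neighbour_sads, neighbour_mvs, t1, t2=2, search_range=7):
    """Return (motion_vector, stage) for one block.

    sad_at(dx, dy) -> SAD of the current block at displacement (dx, dy)
    neighbour_sads -> (SAD_l, SAD_r, SAD_t, SAD_b) from the reference frame
    neighbour_mvs  -> the nine neighbouring MVs used for the median
    """
    # Step 1: zero motion pre-judgment.
    sad_c = sad_at(0, 0)
    if sad_c < t1 and sad_c == min(sad_c, *neighbour_sads):
        return (0, 0), "stationary"

    # Step 2: median predictor and distant candidates (V_i > t2).
    mp = (median(mv[0] for mv in neighbour_mvs),
          median(mv[1] for mv in neighbour_mvs))
    cisc = [mv for mv in neighbour_mvs
            if abs(mp[0] - mv[0]) + abs(mp[1] - mv[1]) > t2]

    # Step 3: the ISC is the candidate with the least SAD.
    isc = min([mp] + cisc, key=lambda mv: sad_at(*mv))
    best = sad_at(*isc)

    # Step 4: half way termination, with T_d taken equal to T_1.
    if best < t1:
        return isc, "early termination"

    # Remaining steps: small diamond search around the ISC.
    cx, cy = isc
    while True:
        nxt = (cx, cy)
        for dx, dy in DIAMOND:
            x, y = cx + dx, cy + dy
            if abs(x) > search_range or abs(y) > search_range:
                continue
            cost = sad_at(x, y)
            if cost < best:
                best, nxt = cost, (x, y)
        on_boundary = search_range in (abs(nxt[0]), abs(nxt[1]))
        if nxt == (cx, cy) or on_boundary:
            return nxt, "diamond search"
        cx, cy = nxt


far = (900, 950, 870, 910)  # hypothetical neighbour SADs
print(estimate_mv(lambda x, y: 50 + 100 * (abs(x) + abs(y)), far, [(0, 0)] * 9, 256))
print(estimate_mv(lambda x, y: 100 + 200 * (abs(x - 3) + abs(y + 2)), far, [(3, -2)] * 9, 256))
print(estimate_mv(lambda x, y: 300 + 200 * (abs(x - 3) + abs(y + 2)), far,
                  [(2, -2)] * 5 + [(3, -1)] * 4, 256))
```

The three calls exercise the three exits in turn: a stationary block, a block whose ISC is accepted directly, and a block that needs one diamond step from the ISC (2, -2) to reach (3, -2).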
The steps followed in the proposed algorithm are depicted graphically in Figure 3.

Fig. 3. Block diagram of the proposed algorithm.

VII. Performance Analysis and Comparison Results
The main goal of any algorithm is to lower the computational complexity while maintaining the video quality of the FS algorithm. The performance of the proposed algorithm has been evaluated by simulations on various standard YUV test video sequences with different motion characteristics, listed in Table 1. The experimental setup for the simulations uses a frame rate of 15 fps, a search window of ±7 and a block size of 16×16. The proposed algorithm is compared with fixed pattern algorithms like FS, TSS, NTSS, 4SS and DS, with motion vector prediction based adaptive search pattern algorithms like ARPS and DPS, and with the recently proposed ASPS and FPS algorithms.
To measure the performance of the proposed ME algorithm, the following parameters are evaluated: computational complexity and search efficiency; video quality in terms of average PSNR per frame and structural similarity index measurement (SSIM) per frame; the average number of bits required per pixel to represent the residual frame (the difference between the actual and the motion compensated frame); and the distance between the actual and the predicted motion vector.
Computational complexity of a ME algorithm can be evaluated in terms of average number of search points required per block to estimate the MVs.
Search efficiency can be evaluated by finding the distance between the actual MV obtained from the FS algorithm and the MV estimated using a fast BMA: (6) where (MV_x, MV_y) and (MV_fx, MV_fy) represent the MVs of the FS algorithm and the fast BMA respectively, and NB represents the total number of blocks in a frame.
Video quality is measured in terms of PSNR, given as:
$$PSNR = 10 \log_{10}\left(\frac{Max^2}{MSE}\right) \quad (7)$$
where the value of Max is taken as 255, representing the maximum possible pixel value in a video frame, and MSE is the mean square error between the original frame and the motion compensated frame. The FS algorithm gives the best MVs, and hence the best video quality and the maximum PSNR. It therefore provides the reference PSNR with which the PSNR values obtained from other BMAs are compared.
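As a worked illustration of the PSNR measure, the sketch below computes it for two small frames. The function name, the list-of-rows frame layout and the sample pixel values are assumptions; the formula is the standard PSNR definition with Max = 255.

```python
import math


def psnr(original, compensated, max_value=255):
    """PSNR in dB between two equally sized frames given as lists of rows."""
    squared_error = 0
    count = 0
    for row_o, row_c in zip(original, compensated):
        for o, c in zip(row_o, row_c):
            squared_error += (o - c) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 frames: two pixels are off by one, so MSE = 2 / 4 = 0.5.
original = [[100, 100], [100, 100]]
compensated = [[101, 99], [100, 100]]
print(round(psnr(original, compensated), 2))
```

Identical frames give an infinite PSNR, so the sketch returns `math.inf` in that case instead of dividing by zero.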
Structural similarity index measurement (SSIM) is also a means of measuring the similarity between two images. The SSIM between two blocks C (block in the original frame) and R (block in the motion compensated frame) is given as: (8)
It has been observed that fixed search pattern algorithms like TSS use a fixed number of search points to attain the actual MV. Early search termination and a center-biased search were added in NTSS and 4SS to reduce the search points, especially in slow motion sequences. The DS algorithm incorporated these features along with a special search pattern, which further lowered the search points and made it the most favored algorithm in various standards. ARPS, with zero motion prejudgment, and DPS modified the large diamond search pattern of DS and were able to reduce the search points. The recently proposed ASPS and FPS algorithms incorporated ISC and ZMP techniques to reduce the search points. In these algorithms the ISC point is found first and ZMP is applied afterwards. However, in slow motion sequences most of the blocks are stationary. Finding the ISC before determining whether a block is stationary therefore involves unnecessary ISC computations for stationary blocks. In the current approach ZMP is implemented first, and the ISC is calculated only for the blocks which are not stationary.
The simulation results are shown in Tables 2 and 3 for QCIF and CIF sequences respectively. Comparisons have been made on the basis of four parameters: computations, PSNR, SSIM and the distance between the actual and predicted MV. The results show a 9-11 times reduction in search locations with the proposed approach, compared with the DS algorithm, for very slow motion QCIF sequences with a stationary background such as "Akiyo", "Claire" and "Miss America". The reduction for CIF sequences such as "News" and "Mother-Daughter" is 6-7 times. Such a high reduction in computations is possible because the accurately predicted ISC lies in the region of the global minimum, so that only a small number of search points is needed to reach the position of the actual motion vector. The computations are slightly higher than those of the recently proposed FPS algorithm because the proposed ZMP technique checks four additional neighboring points. This small overhead increases the accuracy with which slow moving blocks and stationary blocks are distinguished; otherwise, very slow moving blocks whose distortion is less than the dynamic threshold have a high probability of being identified as stationary blocks. This effect can be observed as an improvement in video quality in terms of PSNR and SSIM. For sequences like "Miss America" and "Mother-Daughter" there is an appreciable improvement in PSNR over the recently proposed FPS and ASPS algorithms. This improvement is possible because of the accuracy in determining the actual motion vectors and the reduced trapping in local minima. Further, the proposed algorithm shows appreciable improvement in search speed and video quality, especially for fast motion sequences.

VIII. Conclusion
In this paper a hybrid technique for fast motion estimation is proposed. The technique is based on an improved dynamic technique for the determination of zero motion blocks, improved accuracy in the prediction of the initial search center, early search termination and the small diamond search pattern. The proposed technique enhances the video quality in terms of PSNR and SSIM. Further, it increases the search efficiency and reduces the number of computations required to estimate the motion vectors. Simulation results show the superiority of the proposed technique over the existing techniques in the literature.