A Review on Block Matching Motion Estimation and Automata Theory based Approaches for Fractal Coding

Fractal compression is a lossy compression technique for gray/color images and video. It offers a high compression ratio, good image quality and fast decoding, but reducing the encoding time remains a challenge. This review paper analyzes the most significant existing approaches to fractal-based gray/color image and video compression, the block matching motion estimation approaches used to find the motion vectors in a frame under inter-frame coding and intra-frame coding (i.e. individual frame coding), and the automata theory based coding approaches used to represent an image or a sequence of images. Although other review papers on fractal coding exist, this paper differs from them in several respects. It shows how one can develop new shape patterns for motion estimation and combine existing block matching motion estimation with automata coding in order to explore fractal compression, with a specific focus on reducing the encoding time and achieving better image/video reconstruction quality. The paper is intended to be useful for beginners in the domain of video compression.

With computer animation and multimedia technology among the most challenging application areas, data compression remains a key issue because of the cost of storage space and transmission time. Fractal compression, which is based on the Iterated Function System (IFS), was first promoted by M. Barnsley [1] [2]. It exploits the self-similarity present in a given image. Although fractal coding is advantageous with respect to compression ratio and image reconstruction quality, its main drawback is the long time needed for the similarity search. It was first applied to gray-level image compression, and later some new techniques were developed for color image/video compression. The collage theorem [3] is the basis of the fractal transform. According to the collage theorem, for an input image $I$ a new set $W(I)$ is computed as the union of $n$ sub-images, each of which is formed by applying a contractive affine transformation $w_i$ to $I$, as given in (1):

$$W(I) = \bigcup_{i=1}^{n} w_i(I) \qquad (1)$$

Fractal compression was made practical by Jacquin with the partitioned IFS (PIFS) [3].

Video compression deals with the compression of a sequence of images. In coding, the correlation between adjacent image frames can be exploited in the design of the compression mechanism. Generally, adjacent image frames do not differ much. The probable difference lies in the displacement of an object in the given image frame with respect to the previous image frame. Gray and color videos (i.e. sequences of gray or color image frames) are the inputs to video compression. Different color spaces [4] can be used to process the video images. With the very fast development of multimedia communication with moving video pictures, the processing of color images plays a very important role. A color image is represented by 24 bits/pixel in the RGB color space format, with each color component represented by 8 bits.
Block matching motion estimation is a popular technique in many motion-compensated video coding standards. The basic video coding standards, named after the organizations that issued them, are ITU-T Rec. H.261, ITU-T Rec. H.263, ISO/IEC MPEG-1, ISO/IEC MPEG-2 and ISO/IEC MPEG-4; a more recent one is H.264/AVC. In a sequence of images there are spatial, temporal and statistical data redundancies between frames. Motion estimation and compensation are used to reduce the temporal redundancy between successive frames. Motion estimation computes the movement of an object in a given image, and motion compensation uses the knowledge of that motion to achieve data compression in the image sequence. Different search techniques are available to compute the motion between frames.
A finite automaton is a mathematical model used in the theoretical foundations of computer science to decide whether a particular string is accepted or rejected. A transition diagram, i.e. a graph, is a convenient way of designing a finite automaton. Any image can be represented by a finite automaton. Because of its simple mathematical structure, the finite automaton is used in fractal image/video compression. Image coding with finite automata is based on the self-similarity present within the picture itself. Image compression with finite automata can also be applied to digital video sequences, which are typically represented by a series of frames or digital images [5]. The concept of finite automata has since been generalized to weighted finite automata.
The remainder of this paper is organized as follows. An overview of fractal image compression is given in Section 2. Section 3 describes the related work on fractal image coding. An overview of fractal video coding is given in Section 4. Section 5 describes related work on automata theory based coding. Section 6 focuses on work carried out on fractal video coding using block matching motion estimation and elaborates the different motion estimation approaches. Section 7 discusses the quality measures associated with video processing. Section 8 summarizes the conclusions drawn from the existing approaches studied. Finally, the paper ends with the future scope in Section 9, followed by the references.

II. Overview of Fractal Image Coding
In general, there are two types of compression: lossless and lossy. In lossy data compression techniques, some amount of the original data is lost during the compression process. Lossy techniques are used on the World Wide Web for fast transmission of images across the internet. Examples of lossy techniques are JPEG, wavelet, fractal and DCT-based coding. In lossless data compression techniques no data is lost, so the original can be reconstructed exactly. Examples of lossless formats are GIF (whose LZW coding is lossless, although the image is limited to a 256-color palette), TIFF, CCD RAW, etc.
Fractal coding techniques are generally applied to gray-level images. For color image compression, each of the red, green and blue components is compressed individually using a gray-image fractal coding algorithm. Fractal coding is a block-based lossy technique with a long encoding time but a short decoding time. In basic fractal image coding, the original image is divided into small non-overlapping range blocks (R blocks) of fixed size, each a square group of pixels. For each R block, the encoder searches the overlapping domain blocks (D blocks) of fixed size, which are also square groups of pixels and are generally two or four times the size of the range block. For instance, if an image of size $I = 2^N \times 2^N$ is divided into non-overlapping range blocks $R_i$, $i = 0, 1, 2, \ldots, m$, of size $2^n \times 2^n$, then the best match is searched among the overlapping domain blocks $D_j$, $j = 0, 1, 2, \ldots, n$, of size $2^{n+1} \times 2^{n+1}$, i.e. double the size of the range block. The number of range blocks for a single plane is

$$R_m = \frac{2^N \times 2^N}{2^n \times 2^n}$$

and, with a domain block taken at every pixel position, the number of domain blocks for a single plane is

$$D_n = \left(2^N - 2^{n+1} + 1\right)^2 .$$

The domain pool $D_P$ consists of all the domain blocks after they have been transformed and under-sampled to match the size of the range block. A block schematic of the basic fractal coding encoder is shown in Fig. 1. The steps of the basic fractal encoding algorithm [6] are as follows. For each domain block $D_j^t$ in the domain pool $D_P$, the contrast factor $s$ and the brightness factor $o$ are computed from $R_i$ and $D_j^t$ using the least-squares regression method [3] given in equations (2) and (3). Given a range block $R_i$ and a transformed and shrunk domain block $D_j^t$ in $D_P$, each of $n$ pixels, with intensities $r_1, r_2, \ldots, r_n$ and $d_1, d_2, \ldots, d_n$, the aim is to minimize the quantity

$$R = \sum_{i=1}^{n} \left(s\, d_i + o - r_i\right)^2 ,$$

where

$$s = \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2} \qquad (2)$$

and

$$o = \frac{1}{n}\left(\sum_{i=1}^{n} r_i - s \sum_{i=1}^{n} d_i\right) . \qquad (3)$$

The decoding of a compressed image is easily achieved by starting from an initial image $B^{(0)}$ and iterating relation (5):

$$B^{(q)} = W\left(B^{(q-1)}\right) \qquad (5)$$

where $W$ is the fractal transformation applied to each domain block, $q$ is the number of transformations (i.e. the eight affine/linear transformations) and $\{B^{(0)}, B^{(1)}, B^{(2)}, \ldots, B^{(q-1)}, B^{(q)}\}$ is the resulting sequence of transformed images. The peak signal-to-noise ratio (PSNR) describes the quality of the decoded image and is computed using equation (6).

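$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{A^2}{\mathrm{MSE}}\right) \qquad (6)$$

where $A$ is the amplitude of the signal, $A = 2^8 - 1 = 255$ for an 8-bit gray image, and, for an image of $2^N \times 2^N$ pixels, the MSE is computed as

$$\mathrm{MSE} = \frac{1}{2^N \times 2^N} \sum_{i=1}^{2^N} \sum_{j=1}^{2^N} \left(X_{i,j} - Y_{i,j}\right)^2 \qquad (7)$$

where $X_{i,j}$ and $Y_{i,j}$ are the intensities of pixel $(i, j)$ in the original and decoded images, respectively.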
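As an illustration of the quality measure, the following minimal Python sketch computes the MSE and PSNR of a decoded image, assuming the usual definition PSNR = 10 log10(A² / MSE) with A = 255 for 8-bit images. The function names `mse` and `psnr` and the list-of-rows image representation are illustrative choices, not part of any cited method.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equally sized gray images (lists of rows)."""
    rows, cols = len(original), len(original[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            diff = original[i][j] - decoded[i][j]
            total += diff * diff
    return total / (rows * cols)


def psnr(original, decoded, amplitude=255):
    """PSNR in dB; amplitude defaults to 2**8 - 1 for an 8-bit gray image."""
    err = mse(original, decoded)
    if err == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 10 * math.log10(amplitude ** 2 / err)
```

A higher PSNR indicates a decoded image closer to the original; identical images give an infinite value because the MSE is zero.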
When each range block is approximated in turn by a suitable domain block, the major drawback is the exhaustive search required to find the best matching domain block, which increases the encoding time. Several approaches have been proposed for reducing the encoding time. In this review paper, we present various approaches for reducing the encoding time of fractal image coding as well as fractal video coding. A classification of fractal image and video coding approaches is shown as a block schematic in Fig. 2.
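The cost of the exhaustive search can be seen in a minimal Python sketch of the matching step. It assumes that the quantity being minimized is the sum of squared differences between s·d_i + o and r_i, and that blocks are given as flat lists of pixel intensities already shrunk to the range-block size. The names `fit_contrast_brightness` and `best_domain` are hypothetical, and the quantization of s and o is omitted.

```python
def fit_contrast_brightness(r, d):
    """Least-squares fit r ~ s*d + o.

    Returns (s, o, err), where err is the sum of squared residuals.
    """
    n = len(r)
    if n == 0 or n != len(d):
        raise ValueError("blocks must be non-empty and of equal size")
    sum_d = sum(d)
    sum_r = sum(r)
    sum_dd = sum(x * x for x in d)
    sum_dr = sum(x * y for x, y in zip(d, r))
    denom = n * sum_dd - sum_d ** 2
    if denom == 0:
        # Flat domain block: contrast is undefined, so use s = 0
        # and the mean of the range block as brightness.
        s = 0.0
        o = sum_r / n
    else:
        s = (n * sum_dr - sum_d * sum_r) / denom
        o = (sum_r - s * sum_d) / n
    err = sum((s * x + o - y) ** 2 for x, y in zip(d, r))
    return s, o, err


def best_domain(r, domain_pool):
    """Exhaustively search the pool; returns (index, s, o, err) of the best match."""
    best = None
    for j, d in enumerate(domain_pool):
        s, o, err = fit_contrast_brightness(r, d)
        if best is None or err < best[3]:
            best = (j, s, o, err)
    return best
```

Every range block triggers one fit per domain block, so the work grows with the product of the two block counts. This is why most of the approaches surveyed in the next section try to shrink or classify the domain pool.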

III. Related Work on Fractal Image Coding
The literature on fractal gray image compression is very large, whereas only a few works report on color image compression. The approximate nearest neighbor search formulated in [7] is based on orthogonal projection and pre-quantization of the fractal transform parameters. Ghosh et al. [8] searched domain blocks randomly for every range block to minimize the encoding time. Truong et al. [9] optimized the range block pool and the domain block pool using spatial correlation to reduce the search space and the search time. The encoding algorithm in [10] is based on the law of cosines. Fan and Liu [11] presented a matching algorithm based on the standard deviation (STD) between range blocks and domain blocks. Wang et al. [12] proposed a correlation information feature to find the nearest neighbor domain block for each range block.
Ghosh et al. [13] proposed an approach to fractal image coding based on the concept of relative fractal coding, which was found to be suitable for coding multi-band satellite images. Conci and Aquino [14] used fractal dimensions to classify image parts: the fractal dimension complexity of each image block is evaluated first, and then only parts within the same range of complexity are tested for the best self-affine pairs. Li et al. [15] presented kernel function clustering based on an ant colony algorithm, which classifies the domain blocks automatically. Hartenstein et al. [16] presented a bottom-up region merging approach in which regions are merged on the basis of the collage error. Belloulata and Konrad [17] proposed an approach to fractal image coding that permits region-based functionalities, where images are coded region by region according to a previously computed segmentation map. The method in [18] is particularly well suited to highly irregular image partitions, for which most traditional (lossy) acceleration schemes lose a large part of their efficiency. Franco and Mala [19] presented an algorithm for adaptive image partitioning that achieves designated rates under a computational complexity constraint. Wang [20] proposed a graph-based image segmentation algorithm that divides the image into different logical areas, each of which is coded with an adaptive threshold quadtree partitioning approach.
Hassaballah et al. [21] reduced the domain pool size on the basis of the entropy value of each domain block. The domain pool reduction is parameterized and non-adaptive: an adjustable number of domains is excluded from the domain pool based on the entropy value of the domain block. He et al. [22] used the one-norm of a normalized block to reduce the domain pool search space, so that the search process can be terminated early and the remaining domain blocks safely discarded. Rowshanbin et al. [23] used a special characteristic vector to classify the domain blocks and reduce the search time. The approach in [24] uses the minimum distortion and the variance difference between the range block and the domain block to reduce the domain pool and the search time. To speed up the encoding, classification features are used in [25] to classify the image blocks. Xing et al. [26] used hierarchical partitioning to classify the domain pool. A fuzzy pattern classifier is used in [27] to classify the original image blocks. Qin et al. [28] sorted the domain pool by the number of hops and the variance of consecutive positive and negative pixels in the given block. The approach in [29] reduces the memory requirement and speeds up the reconstruction. Stochastic image coding based on the fractal theory of iterated contractive transformations is discussed in [30]. Fractal image compression based on the theory of iterated function systems (IFS) with probabilities is explained in [31]. Gungor and Ozturk [32] discussed a hash function based image classification technique. No-search fractal image compression in the DCT domain is discussed in [33]. Eugene and Ong [34] discussed a two-pass encoding scheme to speed up the encoding process. The approach in [35] is based on computing the gray-level difference of domain and range blocks. Peng et al. [36] obtained the range blocks by partitioning the image using adaptive quadtrees.
A direct allocating method to predict the desired transformation for the similarity measure is discussed in [37]. Liu et al. [38] presented an entropy-based encoding algorithm. Melnikov [39] presented a method for accelerating fractal image coding. Distasi et al. [40] proposed an approximation error based approach to classifying the blocks. The fundamental idea of this approach is to defer range and domain comparisons, based on feature vectors, by using a preset block as a temporary replacement with which the range and domain blocks are compared.
There are many color spaces [4] available to represent color images. To display an image on a monitor, the RGB color space is generally used. For color image compression, the RGB model is reported to be best suited because it provides the highest correlation [41]. For color image compression on square architecture, fractal coding is applied to the different planes of a color image independently by treating each plane as a gray-level image. This approach is called Straight Fractal Coding or Separated Fractal Coding (SFC) [42]. The following works on fractal coding of color images are reported in the literature. Hurtgen et al. discussed fractal transform coding of color images in [43]. To exploit the spectral redundancy in the RGB components, the root mean square error (RMSE) measure in gray-scale space is extended to 3-dimensional color space for fractal-based color image coding in [44], where it is claimed that a compression ratio improvement of 1.5 can be obtained using the vector distortion measure in fractal coding with a fixed image partition, as compared to separate fractal coding of RGB images. A comparative study of fractal color image compression in the L*a*b* color space and Jacquin's iterated transform technique for 3-dimensional color is presented in [45], where it is claimed that the use of a uniform color space yields compressed images with less noticeable color distortion than other methods. Li et al. [46] presented a fractal color image compression scheme based on the correlation between the three planes of the RGB color space. Giusto et al. presented an approach to color image coding based on the joint use of the L*a*b* color space and the Earth Mover's Distance in [47]. Thakur et al. [6] discussed a fractal compression technique, which is essentially a search for self-similarity within an image, and elaborated the basic steps required for fractal coding, i.e. partitioning the image into range/domain blocks, comparing each range block with all domain blocks, and storing the values of the best transformations. Thakur and Kakde [51] proposed a modified fractal coding approach on spiral architecture that optimizes the domain blocks using local search. One of the major challenges in fractal coding is to reduce the size of the domain pool, since reducing the number of domain blocks in the pool reduces the encoding time.

IV. Overview of Fractal Video Coding
There are two methods for fractal video coding: frame-based and cube-based. In frame-based coding, each frame is partitioned into square range and domain blocks, as already described in section 2, and then each frame is coded using inter-frame and intra-frame coding, which use fractal transformations to reduce the redundant data within a frame and between two frames. Inter-frame and intra-frame coding are discussed in section 4.1. Every range block of the current frame is encoded by a domain block from the previous frame, i.e. every range block is encoded using inter-frame similarity, as shown in Fig. 3. The major drawbacks of frame-based video coding are that errors occur in later frames because the domain blocks come from the previous frame, and that a delay occurs between two successive frames during decoding. The method has the advantage of achieving a high compression ratio, and thus it is used for video transmission over the internet.
In cube-based coding, the video sequence to be coded is first combined into groups of frames (GoF), and then each GoF is partitioned into a set of range and domain cubes as shown in Fig. 4. For instance, a GoF can be viewed as a 3D (cubic) digital image of size $I = 2^N \times 2^N \times 2^N$, which is divided into non-overlapping range blocks $R_i$, $i = 0, 1, 2, \ldots, m$, of size $2^n \times 2^n \times 2^n$; the best match is then searched among the overlapping domain blocks $D_j$, $j = 0, 1, 2, \ldots, n$, of size $2^{n+1} \times 2^{n+1} \times 2^{n+1}$, i.e. double the size of the range block. The number of range blocks is

$$R_m = \frac{2^N \times 2^N \times 2^N}{2^n \times 2^n \times 2^n}$$

and, with a domain block taken at every pixel position, the number of domain blocks is

$$D_n = \left(2^N - 2^{n+1} + 1\right)^3 .$$

The domain pool $D_P$ consists of all the domain blocks after they have been transformed and under-sampled to match the size of the range blocks, as shown in Fig. 5. If $N = 2$ and $n = 1$, then for a cubic digital image of size $I = 4 \times 4 \times 4$ with 64 pixels, the number of range blocks is $R_m = 8$, each containing 8 pixels, and the number of domain blocks is $D_n = 1$, containing 64 pixels. This method has the advantage of a high-quality decoded image at the receiving end but the disadvantage of a low compression ratio.

The basic fractal encoding algorithm for still images given in section 2 is easily extended to fractal video coding. The steps of frame-based fractal video coding are as follows.

7. For each domain block $D_j^t$ in the domain pool $D_P$, compute the contrast factor $s$ and the brightness factor $o$ from $R_i$ and $D_j^t$ using the least-squares regression method [3] given in equations (2) and (3). Given a range block $R_i$ and a transformed and shrunk domain block $D_j^t$ in $D_P$, each of $n$ pixels, with intensities $r_1, r_2, \ldots, r_n$ and $d_1, d_2, \ldots, d_n$, the aim is to minimize the quantity $R$.
8. Compute the error $E(R_i, D_j^t)$ using equation (4) and quantize the factors $s$ and $o$ using a uniform quantizer.
9. Search all domain blocks $D_j^t$ in the domain pool $D_P$ for the range block $R_i$ to be encoded and find the most suitable block $D_j^t$, i.e. the one with minimal error.

Similarly, the basic fractal encoding algorithm for still images is easily extended to cube-based fractal video coding. The steps are as follows.

7. For each domain block $D_j^t$ in the domain pool $D_P$, compute the contrast factor $s$ and the brightness factor $o$ from $R_i$ and $D_j^t$ using the least-squares regression method [3] given in equations (2) and (3). Given a range block $R_i$ and a transformed and shrunk domain block $D_j^t$ in $D_P$, each of $n$ pixels, with intensities $r_1, r_2, \ldots, r_n$ and $d_1, d_2, \ldots, d_n$, the aim is to minimize the quantity $R$.
8. Compute the error $E(R_i, D_j^t)$ using equation (4) and quantize the factors $s$ and $o$ using a uniform quantizer.
9. Search all domain blocks $D_j^t$ in the domain pool $D_P$ for the range block $R_i$ to be encoded and find the most suitable block $D_j^t$ with minimal error, $E(R_i, D_j^t) = \min E(R_i, D_P)$.
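The block counts used in this section can be checked with a short Python sketch. It assumes that domain blocks are taken at every position of a sliding window, which gives (side − block + 1) positions per axis for a step of one pixel; the function names and the `step` parameter are illustrative additions.

```python
def range_block_count(N, n, dims=2):
    """Number of non-overlapping range blocks of side 2**n in an image of side 2**N."""
    return (2 ** N // 2 ** n) ** dims


def domain_block_count(N, n, dims=2, step=1):
    """Number of overlapping domain blocks of side 2**(n+1), sliding by `step` pixels."""
    side = 2 ** N
    block = 2 ** (n + 1)
    if block > side:
        return 0
    per_axis = (side - block) // step + 1
    return per_axis ** dims


# Worked example from the text: N = 2, n = 1, cubic image of 4 x 4 x 4 pixels.
print(range_block_count(2, 1, dims=3))   # 8 range cubes of 8 pixels each
print(domain_block_count(2, 1, dims=3))  # 1 domain cube of 64 pixels
```

For a 256 × 256 still image with 8 × 8 range blocks (N = 8, n = 3), the same functions give 1024 range blocks and 58081 candidate domain blocks, which illustrates why the exhaustive search dominates the encoding time.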

A. Intra-frame and Inter-frame Coding
In video coding there are two ways to reduce the data present in the frames. The first is spatial redundancy elimination, called intra-frame coding, in which the frames are coded individually as in the JPEG compression technique. Within an individually coded frame, similar parts of the data can be coded with fewer bits per pixel than the original data, so reducing the bits per pixel causes only a minor loss in the visual quality of the frame. The second is temporal redundancy elimination, called inter-frame coding, in which the redundant data between frames are eliminated as in the MPEG compression technique. Inter-frame coding finds the difference between the previous frame and the current frame and stores only the difference, i.e. the displacement of pixels, instead of complete frames. The displacements of pixels are estimated using a well-known technique called block matching motion estimation. Block matching motion estimation techniques find the motion vectors in a frame, and the displacement of each pixel block identified by motion estimation is then coded. The current frame is then treated as the previous frame and the next frame in the sequence as the current frame for finding the difference. This coding process is repeated for all the remaining frames in the video sequence. Fig. 6 shows the block schematic of the intra-frame and inter-frame coding techniques.

Table I.

Block matching motion estimation algorithms play an important role in the design of video coding standards. Video coding standards consist of a motion estimation algorithm, an encoding mechanism and a decoding mechanism to eliminate redundant data.

C. Preliminaries of Automata Theory
In general, a finite automaton is a simple mathematical model used to recognize the strings described by a regular expression. In image/video compression, finite automata are used to represent the entire image, and the address of each subimage of the entire image is a string from the regular set $\Sigma^* = \{0, 1, 2, 3\}^*$, where 0, 1, 2 and 3 are the addresses of the four subimages (quadrants). When the finite automata model is extended to the weighted finite automata (WFA) model, the coding process becomes similar to the fractal coding process. A WFA is an extended version of a finite automaton in which a weight is associated with each transition between two states.
The extended version of the weighted finite automaton is called the extended weighted finite automaton (EWFA). It is similar to a WFA, except that all the subimages in an EWFA are transformed using a transformation function such as scaling or rotation. Like a WFA, an EWFA is used to store and compress data represented as an image/matrix. An EWFA needs fewer states than a WFA to store the subimage/matrix data, and therefore less memory is required to store the states.
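The quadrant addressing over the alphabet {0, 1, 2, 3} can be illustrated with a minimal Python sketch that maps an address string to the sub-square it names. The digit-to-quadrant convention used here (0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right) is an assumption made for illustration; the cited works may number the quadrants differently. The function name `subimage_bounds` is hypothetical.

```python
def subimage_bounds(address, size):
    """Map a quadrant address over {0,1,2,3} to (row, col, side) in a size x size image.

    Assumed convention: 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
    The empty address denotes the whole image.
    """
    row, col, side = 0, 0, size
    for digit in address:
        if digit not in "0123":
            raise ValueError("address symbols must be 0, 1, 2 or 3")
        if side < 2:
            raise ValueError("address is deeper than the image resolution")
        side //= 2
        quadrant = int(digit)
        row += (quadrant // 2) * side  # quadrants 2 and 3 lie in the lower half
        col += (quadrant % 2) * side   # quadrants 1 and 3 lie in the right half
    return row, col, side


print(subimage_bounds("", 8))    # (0, 0, 8): the entire image
print(subimage_bounds("3", 8))   # (4, 4, 4): bottom-right quadrant
print(subimage_bounds("12", 8))  # (2, 4, 2): bottom-left quarter of the top-right quadrant
```

Each additional symbol halves the side of the addressed square, so an address of length k names a subimage at resolution level k. This is the hierarchy that a WFA state graph and the quadtree partitioning both rely on.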

V. Related Work on Automata Theory
The weighted finite automaton (WFA), proposed by Culik and Kari [49][50], generalizes the finite automaton by attaching real numbers as weights to states and transitions. WFA provide a powerful tool for image generation and compression. The inference algorithm for WFA subdivides an image into a set of non-overlapping range images and then separately approximates each one with a linear combination of domain images. A predictive video-coding technique is proposed in [50] that uses fractal image compression for intra-frame coding and second-order geometric transformations for motion compensation in inter-frame predictive coding. The second-order geometric transformations used for motion compensation compensate for translation, rotation, zooming, uneven stretching, and any combination of these. The decoded images can be displayed at arbitrary resolution without blockiness, which, together with the very high compression ratio achievable, is the most important advantage of fractal-based image coding over standard discrete cosine transform (DCT) based image coding [50]. WFA is one of the techniques that have been used to compress digital images [51][52]. It represents an image in terms of a weighted finite automaton with a very good compression ratio. WFA is based on the fractal idea that an image is self-similar. In this case, the self-similarity is sought in the symmetry of the image, so the encoding algorithm divides an image into multiple levels of quadtree segmentation and creates an automaton from the sub-images [53]. With the development of fractal image compression, the fractal coding method has also been applied to video sequence compression [54], for instance in the well-known hybrid of circular prediction mapping and non-contractive inter-frame mapping [55].
Circular prediction mapping and non-contractive inter-frame mapping combine the fractal coding algorithm with the well-known motion estimation and compensation algorithm, which exploits the high temporal correlation between adjacent frames. The WFA codec can be modified so that it compresses video, i.e. a sequence of frames (images), at low bit rates. Here hierarchical motion compensation (MC) [56] and a bin-tree based WFA codec are integrated to exploit the correlation between successive frames [57]. The advantage is that WFA encoding can replace other transform coding in video applications. At low bit rates, WFA-encoded images are typically half the size of comparable JPEG images. A video coding scheme based on bit-plane modeling and generalized finite automata (GFA) representation is used to exploit the inter-frame, inter-bit-plane and inter-level similarities present in the video. To form a compact GFA-based representation of the video sequence, the GFA modeling takes advantage of the binary fractal similarity of the video sequence in the wavelet domain. Exploiting these similarities yields a compact generalized finite automata representation of the video [58][59]. Such schemes are reported to significantly outperform the H.26X [60] series of coding schemes in rate-distortion performance while retaining an acceptable perceptual quality of the video.
A generalized finite automaton [61] is used to encode and decode images automatically. When the GFA is combined with the wavelet transform, it is constructed in such a way that each state represents one wavelet function, so the method combines the advantages of both the classical wavelet compression method and the GFA. While encoding an image, the GFA does not have to solve equations, so encoding may take significantly less time. Decoding of images can also be done more quickly. The GFA method allows any combination of rotations, flips, and complementation of the quadrant images [62]. Quadtree-based EWFA coding is used in a fractal color video compression technique [66]. The quadtree partitioning scheme is used to specify the address of each sub-image, i.e. a complete quadtree represents the pyramidal structure of the entire image that is required for the EWFA encoding and decoding process. Intra-frame coding, i.e. individual frame coding, is used to encode the frames of a video. In quadtree-based EWFA coding, each color frame is converted into the YCbCr color space and then into gray scale [67,68]. Each gray-scale frame is divided into fixed-size blocks ($2^n \times 2^n$) by quadtree partitioning in order to represent the address of each pixel.

VI. Related Work on Motion Estimation
In circular prediction mapping and non-contractive inter-frame mapping, each range block is motion compensated by a domain block in the previous frame that is of the same size as the range block, even though the domain block is always larger than the range block in a conventional fractal image codec. The main difference between the two is that circular prediction mapping must be contractive for the iterative decoding process to converge, while non-contractive inter-frame mapping need not be contractive, since the decoding depends on the already decoded frames and is non-iterative [66]. Wang [67] proposed a hybrid fractal video compression algorithm that merges the advantages of a cube-based fractal compression method and a frame-based fractal compression method; in addition, an adaptive partition instead of a fixed-size partition is discussed. The adaptive partition [68] and the hybrid compression algorithm exhibit a relatively high compression ratio for images [68] and for video conference sequences [67]. In conclusion, a fractal image codec offers a very fast decoding process as well as the promise of potentially good compression [69][70][71][72][73]. At present, however, the fractal codec is not standardized because of its large amount of computation and slow coding speed. To alleviate these difficulties, block-matching motion estimation [74,75] can be used to improve the encoding speed and the compression quality [76].
Block-matching motion estimation is a vital process in many motion-compensated video coding standards. Motion estimation can be very computationally intensive and can consume 60% to 80% of the computational power of the encoding process [77], so research on efficient and fast motion estimation algorithms is significant. Block matching algorithms are widely used because they are simple and easy to apply. In the last two decades, many block matching algorithms have been proposed to alleviate the heavy computation of the brute-force full search algorithm, which has the best prediction accuracy. Examples are the new three-step search [78], the four-step search [79], the block-based gradient descent search [80], the diamond search [81] and the cross-diamond search [82]. All these searches employ rectangular search patterns of different sizes to fit the center-biased distribution of motion vectors [83][84]. Hexagon-based search employs a hexagon-shaped pattern and results in fewer search points with similar distortion [85]. A block-matching algorithm called the Novel Cross-Hexagon Search algorithm is proposed in [75]. It uses small cross-shaped search patterns in the first two steps before the hexagon-based search, together with the halfway stop technique [74], which results in a higher motion estimation speed when searching stationary and quasi-stationary blocks. The traditional algorithms use all the pixels of the block to calculate the distortion, which results in heavy computation. The Modified Partial Distortion Criterion [86], which uses only certain pixels of the block, reduces the computation with similar distortion and can be used instead. The New Cross-Hexagon Search algorithm (NHEXS) proposed in [87][88] consists of two cross search patterns and hexagon search patterns, similar to those in [75], for fast block matching motion estimation. This search technique is used in frame-based fractal video compression, where it helps to reduce the encoding time and increase the compression quality of fractal coding.

A. Motion Estimation Algorithms
Motion estimation is used in video applications such as video segmentation, object/video tracking, and video compression. Motion estimation determines the displacement of pixel positions from one frame to another, which gives the best motion vector. For estimating motion in a video sequence, block matching algorithms (BMA) are widely used in most of the video coding standards. In a BMA, a frame is divided into non-overlapping blocks and a motion vector is estimated for each block. The motion vector is computed by finding the best matching block between the previous frame $f$ and the next frame $f+1$, as shown in Fig. 7, where the motion vector for the block $B_f(x, y)$ is computed as $(+1, -1)$, i.e. $MV\text{-}B_f(x, y) = (+1, -1)$. Research on efficient and fast block matching motion estimation is still going on. Different types of block matching motion estimation algorithms are described below.

1) Full Search Motion Estimation
This algorithm is also called the exhaustive search algorithm because the motion vector is computed only after a complete window of size (2w+1)×(2w+1) has been exhaustively searched for the best match of a block of n×n pixels, as shown in Fig. 8(a). Because of the exhaustive search, this algorithm gives the best accuracy in finding the best matching block. Its major drawback is the large number of computations required to find the best matching block. The computational complexity is very high when the window size is large and the block size is small.
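The exhaustive search can be illustrated with a minimal Python sketch. The function names `sad` and `full_search`, the synthetic 12×12 frames and the choice of the sum of absolute differences as the block distortion measure are assumptions made for illustration; they are not taken from the cited works. The motion vector is expressed as the displacement (dx, dy) of the matching block in the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, w):
    """Test every displacement in [-w, w] x [-w, w] and return
    (best_mv, best_cost, points_checked)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, checked = (0, 0), None, 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            # Skip candidates whose block would fall outside the frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            checked += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, checked


# Synthetic frames (illustrative only): `cur` is `ref` shifted so that the
# content of each block is found one pixel right and one pixel up in `ref`.
SIZE = 12
ref = [[x * y + 3 * x + 5 * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[(y - 1) % SIZE][(x + 1) % SIZE] for x in range(SIZE)] for y in range(SIZE)]

mv, cost, checked = full_search(cur, ref, bx=4, by=4, n=4, w=2)
print(mv, cost, checked)  # motion vector, its SAD, and (2w+1)^2 = 25 points
```

With w = 2, all (2w+1)×(2w+1) = 25 candidate positions are evaluated for a single block, which shows why the cost grows quickly with the window size.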

2) Three Step Search
Koga et al. [89] proposed a robust and very simple three step search (TSS) motion estimation algorithm. It is one of the most popular algorithms for low bit rate applications because of its efficient performance, and its computational cost is low compared to the full search algorithm. First, a search window size is defined for searching the best match. In the first step, nine points are checked in the search window, spaced at equal distances of one step size around the center. The point with the minimum block distortion measure (BDM) among these nine points becomes the center for the next step, and the step size is divided by 2. This process is repeated until the step size is smaller than one. The complete process is shown in Fig. 8(b). For a maximum displacement of s=7 (initial step size 4), TSS requires 25 checking points according to the equation given below. Number of checking points = 1 + 8 log2(s+1), where s is the maximum displacement of the search window.
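The three step search can be sketched in a few lines of Python. In this sketch the callable `cost(dx, dy)` is a hypothetical stand-in for the block distortion measure of a displacement, and the quadratic test cost is an illustrative assumption. The search starts from a step size of 4, which covers displacements up to ±7.

```python
def three_step_search(cost, step=4):
    """Three step search. `cost(dx, dy)` returns the block distortion
    measure for a displacement. Returns (best_mv, points_checked)."""
    cx, cy = 0, 0
    best = cost(cx, cy)
    checked = 1
    while step >= 1:
        best_x, best_y = cx, cy
        # Check the 8 neighbours at the current step size around the centre.
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue
                c = cost(cx + dx, cy + dy)
                checked += 1
                if c < best:
                    best, best_x, best_y = c, cx + dx, cy + dy
        # The minimum BDM point becomes the new centre; halve the step size.
        cx, cy = best_x, best_y
        step //= 2
    return (cx, cy), checked


# Illustrative distortion surface with its minimum at displacement (3, -2).
mv, checked = three_step_search(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2)
print(mv, checked)
```

The three stages (step sizes 4, 2 and 1) each add 8 points to the initial center, which reproduces the 1 + 8·3 = 25 checking points stated above.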

3) 2D Logarithmic Search
This algorithm was originally proposed by Jain and Jain [90]. In this search strategy, an initial search window is defined in the central area of the image. Instead of checking every position in the complete search window, as the full search algorithm does, the search checks five points: the center and the points to its north, south, east and west, as shown in Fig. 9(a). The point with the minimum BDM among these five becomes the center of the search window for the next step, in which the search window is halved. This process is repeated until the search window is reduced to a 3x3 window. Finally, all nine points in the 3x3 search window are checked, and the minimum BDM point gives the best matching position, i.e. the motion vector.
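A minimal Python sketch of this strategy follows. It implements the simplified description given in the text, in which the step size is halved after every five-point stage; the function name `log_search_2d`, the hypothetical `cost(dx, dy)` callable and the quadratic test cost are assumptions for illustration. North is taken as decreasing y, as in image coordinates.

```python
def log_search_2d(cost, step=4):
    """2D logarithmic search as described in the text.
    Returns (best_mv, number_of_distinct_points_checked)."""
    seen = {}

    def evaluate(p):
        # Cache distortions so that a point is never counted twice.
        if p not in seen:
            seen[p] = cost(*p)
        return seen[p]

    centre = (0, 0)
    while step > 1:
        cx, cy = centre
        # Centre plus its north, south, east and west neighbours.
        plus = [centre, (cx, cy - step), (cx, cy + step),
                (cx + step, cy), (cx - step, cy)]
        centre = min(plus, key=evaluate)
        step //= 2
    # Final stage: all nine points of the 3 x 3 window around the centre.
    cx, cy = centre
    window = [centre] + [(cx + dx, cy + dy)
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                         if (dx, dy) != (0, 0)]
    return min(window, key=evaluate), len(seen)


# Illustrative distortion surface with its minimum at displacement (3, -2).
mv, checked = log_search_2d(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2)
print(mv, checked)
```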

4) Orthogonal Search
A. Puri et al. [91] introduced an orthogonal search algorithm (OSA) that combines the three step search (TSS) and the 2D logarithmic search algorithm. The algorithm first performs a horizontal search with 3 checking points and then a vertical search with 3 checking points, centered on the minimum BDM point of the preceding horizontal search. The step size is then divided by 2, and the process is repeated until the step size is one, as shown in Fig. 9(b).
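The alternation of horizontal and vertical stages can be sketched as follows. The function name `orthogonal_search`, the hypothetical `cost(dx, dy)` callable and the initial step size of 4 are assumptions made for illustration.

```python
def orthogonal_search(cost, step=4):
    """Orthogonal search: at each step size, a horizontal stage followed by
    a vertical stage. `cost(dx, dy)` returns the block distortion measure.
    Returns (best_mv, points_checked)."""
    cx, cy = 0, 0
    best = cost(cx, cy)
    checked = 1
    while step >= 1:
        for ax, ay in ((1, 0), (0, 1)):  # horizontal stage, then vertical stage
            ox, oy = cx, cy  # centre of this stage, already evaluated
            for sign in (-1, 1):
                x, y = ox + sign * step * ax, oy + sign * step * ay
                c = cost(x, y)
                checked += 1
                if c < best:
                    best, cx, cy = c, x, y
        step //= 2
    return (cx, cy), checked


# Illustrative distortion surface with its minimum at displacement (3, -2).
mv, checked = orthogonal_search(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2)
print(mv, checked)
```

Because the center of each stage has already been evaluated, each stage adds only two new points, so this sketch checks 1 + 4 points per step size.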

5) Cross search
M. Ghanbari [92] proposed the cross search algorithm (CSA), which uses 5 checking points placed in a cross-shaped pattern (×). In each step, the minimum BDM point is found and taken as the new center, the step size is divided by 2, and 4 points are placed in a cross-shaped pattern around the center. In the final step, when the step size is reduced to one, a (+) cross pattern is used if the minimum BDM point of the previous step was at the center, the upper left corner or the lower right corner; otherwise a (×) cross pattern is used, as shown in Fig. 10(a). For s=7, the cross search algorithm requires 5+4+4+4=17 checking points, which agrees with the equation given below when log2(s) is rounded up to 3. Number of checking points = 5 + 4 log2(s), where s is the step size of the search window.

6) Binary Search
The binary search (BS) algorithm is used in MPEG-Tool [93] for block matching motion estimation, as shown in Fig. 10(b). The algorithm first checks 9 points that divide the search window into 9 smaller search windows. It then performs a search operation on the small search window that corresponds to the minimum BDM point. If the minimum BDM point is found at the center, at a corner or at the middle of a side of the search window, the number of checking points required for BS is 24+9=33 (worst case), 8+9=17 (best case) and 14+9=23 (average case), respectively. The pixels on the blue lines in Fig. 10(b), i.e. the pixels between the small search windows, are not considered for searching.

7) New Three Step Search
R. Li et al. [78] proposed a new three-step search (NTSS) algorithm that adds a center-biased checking pattern and a halfway-stop technique to the step-size halving of the three step search algorithm, as shown in Fig. 11. The algorithm initially defines a search window for searching the best match and checks 17 points in it: 9 points on the 3x3 grid in the central area of the search window and the remaining 8 points on the 9x9 grid in the outer area. If the minimum BDM is found at the center point of the search window, the search stops; otherwise it goes to step 2. In step 2, if one of the central neighboring points on the 3x3 grid is the minimum BDM point, go to step 3; otherwise go to step 4. In step 3, if the minimum BDM point is a corner point of the 3x3 grid or a middle point on its horizontal or vertical axis, this point is taken as the center and 5 or 3 additional points, respectively, are checked. In step 4, the minimum BDM point lies at a corner or at a middle point between two corners of the 9x9 grid in the outer area of the search window; the step size is divided by 2, and this process is repeated until the step size is smaller than one. NTSS requires 17 checking points in the best case, 20 or 22 in the average case and 33 in the worst case.

8) Four Step Search
Lai-Man Po et al. [79] proposed a novel four-step search (FSS) algorithm for block matching motion estimation. Its performance is better than that of TSS and similar to that of NTSS. Instead of the 9x9 search window in NTSS, this algorithm uses a center-biased search pattern with 9 checking points on a 5x5 search window. If the minimum BDM point is found at the center of the 5x5 search window, the search window is reduced to 3x3, which is the final step of FSS; otherwise the search goes to the next step. In the next step, if the minimum BDM point is found at one of the four corners of the search window or at a midpoint between two corners (on the horizontal or vertical axis), 5 or 3 additional points, respectively, are checked. In the final step, the 5x5 search window is reduced to 3x3 and the minimum BDM point gives the motion vector, as shown in Fig. 12(a). If the step size of the final step is greater than one, another FSS is performed, with the final step of the previous FSS as its first step. The number of checking points required for FSS is 9+8=17 in the best case (minimum BDM point at the center in the first step), 9+5+5+8=27 in the worst case (corner point in the second and third steps) and 9+3+3+8=23 in the average case (midpoint in the second and third steps).

9) Block-Based Gradient Descent Search
L. K. Liu et al. [80] proposed a block-based gradient descent search (BBGDS) algorithm based on a center-biased search pattern of 9 checking points. The algorithm performs an unrestricted search within the search window, with a step size of one in each step. If the minimum BDM point is found at a corner of the pattern or at a midpoint between two corners (on the horizontal or vertical axis), 5 or 3 additional points, respectively, are checked. The search stops when the minimum BDM point is found at the center of the current search pattern or when the search pattern reaches the search window boundary. This algorithm performs better for small motions. The search pattern of the block-based gradient descent search algorithm is shown in Fig. 12(b).

10) Diamond Search
S. Zhu et al. [81] proposed the diamond search (DS) algorithm, which uses two diamond-shaped search patterns: the large diamond shape search pattern (LDSP) with nine checking points and the small diamond shape search pattern (SDSP) with five checking points, each including the center point. The search starts with the 9 checking points of the LDSP and keeps forming a new LDSP around the current minimum BDM point until the minimum BDM point is found at the center of the LDSP. The pattern then switches from the LDSP to the SDSP. In this final step, the minimum BDM point found in the SDSP gives the final motion vector, as shown in Fig. 13(a). The number of checking points required for the diamond search algorithm is 13 in the best case. In terms of checking points, the diamond search algorithm performs close to the efficient three-step search algorithm (ETSS) and better than the three step search (TSS), the four-step search (FSS) and the block-based gradient descent search algorithm.
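The two-pattern procedure can be sketched in Python as follows. The pattern offsets in `LDSP` and `SDSP` are the commonly used diamond layouts and are an assumption here, since the text gives only the number of points; the `cost(dx, dy)` callable and the search range `w` are likewise hypothetical.

```python
# Offsets (dx, dy) relative to the pattern centre (assumed layout).
LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]


def diamond_search(cost, w=7):
    """Diamond search within displacements of +/- w.
    Returns (best_mv, number_of_distinct_points_checked)."""
    seen = {}

    def evaluate(x, y):
        # Cache distortions so that overlapping pattern points count once.
        if (x, y) not in seen:
            seen[(x, y)] = cost(x, y)
        return seen[(x, y)]

    def best_of(pattern, cx, cy):
        candidates = [(cx + dx, cy + dy) for dx, dy in pattern
                      if abs(cx + dx) <= w and abs(cy + dy) <= w]
        # The centre is listed first, so ties are resolved in its favour.
        return min(candidates, key=lambda p: evaluate(*p))

    cx, cy = 0, 0
    while True:
        nx, ny = best_of(LDSP, cx, cy)
        if (nx, ny) == (cx, cy):
            break  # minimum at the centre of the LDSP: switch to the SDSP
        cx, cy = nx, ny
    return best_of(SDSP, cx, cy), len(seen)


# Best case: a stationary block, minimum at the origin.
still = diamond_search(lambda dx, dy: dx * dx + dy * dy)
# Illustrative distortion surface with its minimum at displacement (3, -2).
moving = diamond_search(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2)
print(still, moving)
```

For the stationary block the sketch checks the 9 LDSP points plus 4 new SDSP points, which matches the best case of 13 checking points stated above.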

11) Cross Diamond Search
C. H. Cheung et al. [82] proposed a cross-diamond search algorithm based on the halfway-stop technique for block matching motion estimation. The search starts with a 9-point cross-shaped (+) search pattern instead of the diamond-shaped search pattern of the diamond search algorithm, and it requires fewer checking points than the diamond search algorithm. In the first step, if the minimum BDM point is found at the center of the large cross shape search pattern (LCSP), i.e. 9 checking points, the search stops. In the second step, the search is performed on the small cross shape search pattern (SCSP), i.e. 5 checking points. If the minimum BDM point is found at the center of the SCSP, the search stops; otherwise, in step three, two extra points nearest to the minimum BDM point found at a corner point of the SCSP are checked. If the minimum BDM point is found at the center of the newly created SCSP, the search stops; otherwise a diamond search is performed in the final step, as shown in Fig. 13(b).

12) Hexagon Search
Ce Zhu et al. [85] introduced a hexagonal search algorithm for motion estimation. It consists of two hexagon search patterns, the Large Hexagon Search Pattern (LHSP) with 7 checking points and the Small Hexagon Search Pattern (SHSP) with 5 checking points, each including the center point of the hexagon, as shown in Fig. 14(a). The search starts with the LHSP. If the minimum BDM point is found at the center of the LHSP, the pattern switches to the SHSP, and the minimum BDM point found in the SHSP gives the final motion vector. Otherwise, the minimum BDM point becomes the center of a new LHSP, for which three more points are checked. This step is repeated until the minimum BDM point is found at the center of the newly generated LHSP, after which the pattern switches to the SHSP. The number of checking points required for the hexagonal search is 11 in the best case. The search pattern of the hexagon search algorithm is shown in Fig. 14(b).
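The point counts quoted for the hexagon search can be checked with a short Python sketch. The offsets in `LHSP` and `SHSP` are the commonly used hexagon layout and are an assumption, since the text gives only the number of points in each pattern.

```python
# Offsets (dx, dy) relative to the pattern centre (assumed layout).
LHSP = [(0, 0), (2, 0), (1, 2), (-1, 2), (-2, 0), (-1, -2), (1, -2)]
SHSP = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]


def new_points_after_move(move):
    """Points of the LHSP re-centred on `move` that were not already part
    of the LHSP centred on (0, 0)."""
    mx, my = move
    shifted = {(mx + dx, my + dy) for dx, dy in LHSP}
    return shifted - set(LHSP)


# Moving the centre to any of the six hexagon vertices adds three new points.
extra_per_move = [len(new_points_after_move(v)) for v in LHSP[1:]]

# Best case: minimum at the centre of the first LHSP, followed by one SHSP.
best_case_points = len(set(LHSP) | set(SHSP))

print(extra_per_move, best_case_points)
```

Under this layout each re-centering of the large hexagon costs three new checking points, and the best case is 7 + 4 = 11 points, in agreement with the figures given above.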

13) Efficient Three Step Search
Xuan Jing et al. [94] proposed a modified version of the three-step search algorithm for block matching motion estimation. It adds an unrestricted small diamond search pattern that searches the central area of the defined search window, and it is used in a wide range of video applications such as movies and sports. The algorithm initially defines a search window and checks 13 points in it. In step 1, the outer 9 points and the 4 points of the small diamond search pattern (13 points in total) are checked, i.e. 4 points more than TSS and 4 points fewer than NTSS. If the minimum BDM point is at the center of the search window, the search stops; otherwise go to step 2. In step 2, if the minimum BDM point is one of the outer 8 points, the TSS algorithm is used to continue the search; otherwise go to step 3. In step 3, if the minimum BDM point is one of the four points of the small diamond pattern, this point is taken as the center and another 3 points are checked. This process is repeated until the small diamond search pattern reaches the search window boundary. The Efficient Three Step Search (ETSS) requires 13 checking points in the best case and 29 or more in the worst case, because of the unrestricted small diamond search pattern in the central part of the search window, as shown in Fig. 15.

[Fig. 15 panels: search pattern used in the first step; small diamond search pattern]

14) New Cross Hexagonal Search
Kamel Belloulata et al. [88] introduced a new fast fractal cross-hexagon block matching motion estimation search algorithm. It uses two cross search patterns, a Small Cross Shape Pattern (SCSP) and a Large Cross Shape Pattern (LCSP), in the first few steps of the search, and two hexagon search patterns, a large hexagon search pattern (LHSP) and a small hexagon search pattern (SHSP), in the subsequent steps, as shown in Fig. 16. The algorithm starts with a small cross shape pattern of 5 points located at the center of the search window. If the minimum BDM point is found at the center of the small cross shape pattern, the search stops. The new cross hexagonal search therefore requires only 5 checking points in the best case, fewer than the best case of the other algorithms described in this section. Otherwise, the minimum BDM point becomes the center of a newly formed small cross shape pattern. If the minimum BDM point is found at the center of this new small cross shape pattern, the search stops, which gives a second best case of 8 checking points. Otherwise, the other 3 unchecked points of the large cross pattern and 2 unchecked points of the center-biased square are checked to indicate the best direction for the hexagonal search. A new large hexagonal pattern is then formed around the minimum BDM point found in the cross pattern search. If the minimum BDM point is found at the center of the large hexagonal pattern, the pattern switches to the small hexagonal pattern, in which the best motion vector is found. Otherwise, a new large hexagon pattern is formed, and this is repeated until the minimum BDM point is found at the center of the large hexagonal pattern.

15) Modified Three Step Search
S. D. Kamble et al. [95] proposed another extended version of the three-step search algorithm. It uses two cross search patterns, i.e. a small and a large cross search pattern, and two cross hexagon search patterns, i.e. a large and a small cross hexagon search pattern, in the central part of the search window to exploit the center-biased characteristic of motion vectors in video sequences. Fig. 17 shows the search pattern used in the modified three-step search. In the first step, a total of 9+4 points out of 17 checking points are checked. If the minimum BDM point is found at the center of the 9+4 points, the search stops; otherwise go to the second step. In the second step, if the minimum BDM point is found in the outer part of the search window, the search proceeds as in TSS; otherwise go to the third step.

[Fig. 17 panels: search pattern used in the first step; two search patterns]
In the third step, if the minimum BDM point is found at one of the 4 outer points of the small cross search pattern, the search proceeds as in the cross hexagon search. The search in the central part of the window is unrestricted: it continues until the minimum BDM point is found at the center of the large cross hexagon pattern or the large cross hexagon search pattern reaches the outer boundary of the search window. This unrestricted search in the central part of the window increases the probability of finding the true motion vector within the central area of the window.
Some other approaches have also been proposed by Acharjee et al. [96][97][98]. A motion estimation approach based on a smaller block size for video compression is proposed in [96]. An approach based on low and high motion zones, determined by the movement in the video, is presented in [97], while the scope for parallel processing in motion vector estimation is explored in [98]. The generation of the motion vector and of the motion compensated prediction error plays a key role in the video encoding process; the motion vector therefore dominates the quality of the reconstructed frames/video. Y.-G. Wu and G.-F. Huang [99] proposed motion vector generation using grey theory [100], which was proposed in 1982. Fig. 18 shows the video compression process flow using a motion compensation WFA encoder.

VII. Quality Measure
Inter-frame and intra-frame coding are used to eliminate the large amount of temporal and spatial redundancy that exists in video sequences, and therefore help in compressing them. The match between a macro block of the current frame and a macro block of the previous frame is judged by the output of a matching criterion. The previous-frame macro block that gives the minimum value is the one that matches the current macro block most closely. The popular matching criteria used for block matching motion estimation are the mean of absolute difference (MAD), the mean squared error (MSE) and the sum of absolute difference (SAD), given by equations (8), (9) and (10), respectively.
$$\mathrm{MAD} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| C_{ij} - R_{ij} \right| \qquad (8)$$
$$\mathrm{MSE} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left( C_{ij} - R_{ij} \right)^2 \qquad (9)$$
$$\mathrm{SAD} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| C_{ij} - R_{ij} \right| \qquad (10)$$
where N is the number of rows and of columns of the macro block, and C_ij and R_ij are the pixel values compared in the current macro block and the previous macro block, respectively. In block matching algorithms, the size of the macro block is an important parameter for motion estimation. A smaller macro block size results in more macro blocks, and hence more motion vectors, per frame, which improves the quality of the motion compensated prediction error (MCPE). Most video coding standards use macro blocks of size 16×16 and 8×8. A single best motion vector is computed for each macro block in the reference frame. The total number of search points needed to find the motion vectors per frame is another key parameter of a block matching motion estimation algorithm.
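The three matching criteria can be computed together, as in the following minimal Python sketch. The function name `block_errors` and the 2×2 example blocks are illustrative assumptions; the formulas are the standard definitions of MAD, MSE and SAD for an N×N block.

```python
def block_errors(cur_block, ref_block):
    """Return (MAD, MSE, SAD) for two equally sized N x N blocks."""
    n = len(cur_block)
    diffs = [cur_block[i][j] - ref_block[i][j]
             for i in range(n) for j in range(n)]
    sad = sum(abs(d) for d in diffs)           # sum of absolute differences
    mad = sad / (n * n)                        # mean of absolute differences
    mse = sum(d * d for d in diffs) / (n * n)  # mean squared error
    return mad, mse, sad


# Illustrative 2 x 2 blocks; the pixel differences are -1, 0, 2 and -3.
cur = [[10, 12], [14, 16]]
ref = [[11, 12], [12, 19]]
mad, mse, sad = block_errors(cur, ref)
print(mad, mse, sad)  # 1.5 3.5 6
```

SAD avoids the division needed by MAD and the multiplications needed by MSE, while MAD and SAD always select the same best block because they differ only by the constant factor N².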
The performance of video coding is measured in terms of compression ratio, quality of the video, encoding time and decoding time. The compression ratio is given by equation (11).
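Compression ratio (CR) in % = ( ) (12)
When measuring the quality of the compressed video, the peak signal-to-noise ratio (PSNR) is used. Sometimes the mean squared error (mse) is also used, which is given by equation (13).
$$\mathrm{mse} = \frac{1}{HW} \sum_{i=0}^{H-1} \sum_{j=0}^{W-1} \left( f(i,j) - \hat{f}(i,j) \right)^2 \qquad (13)$$
where f and \hat{f} denote the original and the reconstructed frame, each of H×W pixels. From this, the PSNR for an 8-bit grayscale image is defined by equation (14).
$$\mathrm{PSNR\,(dB)} = 10 \log_{10} \left( \frac{255^2}{\mathrm{mse}} \right) \qquad (14)$$
where 255 is the maximum value an 8-bit pixel can assume.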
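The quality measures can be computed as in the following minimal Python sketch, which assumes the standard definition PSNR = 10 log10(255²/mse) for 8-bit images. The function names `frame_mse` and `psnr` and the 2×2 example frames are illustrative assumptions.

```python
import math


def frame_mse(original, reconstructed):
    """Mean squared error between two frames of equal size."""
    rows, cols = len(original), len(original[0])
    total = sum((original[i][j] - reconstructed[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; `peak` is 255 for 8-bit pixels."""
    err = frame_mse(original, reconstructed)
    if err == 0:
        return math.inf  # identical frames: no noise
    return 10 * math.log10(peak * peak / err)


# Illustrative 2 x 2 frames: two pixels are off by 5 grey levels.
orig_frame = [[100, 100], [100, 100]]
recon_frame = [[105, 95], [100, 100]]
err = frame_mse(orig_frame, recon_frame)
quality = psnr(orig_frame, recon_frame)
print(err, round(quality, 2))
```

A higher PSNR indicates a reconstruction closer to the original; identical frames give an infinite PSNR, which the sketch returns explicitly instead of dividing by zero.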

VIII. Discussion and Conclusion
The studied literature shows that there is scope for improvement on the challenging issues of fractal coding, such as searching for the best domain block, domain pool size reduction, the partitioning scheme, domain pool classification and the use of parallel computing architectures for fractal compression. The major challenge in fractal coding is reducing the encoding time, which is caused by the large number of computations involved. In the existing automata theory based compression approaches, any image or sequence of images is represented by a finite automaton. The automata theory based coding technique is similar to the fractal coding technique in that it searches for the self-similar parts present in the image or sequence of images, and regular expression notation is used to specify the address of each subimage. According to the experiments reported by the authors of automata theory based coding approaches on different existing databases, this approach achieves a high compression ratio, good image/video reconstruction quality, fast decoding and a reduction in encoding time. The existing block matching motion estimation approaches for finding the motion vectors in a frame are widely accepted by the video compression research community and are used efficiently in video compression standards, namely H.261 to H.263 and MPEG-1 to MPEG-4. Our observations from the studied literature on different block matching motion estimation algorithms are given in Table II, which compares the existing approaches discussed in this paper by the number of search points per block. Table III gives a comparative performance analysis based on step size, number of steps required and performance for small/large motions.

IX. Future Scope
There is scope for developing a new shape pattern, or for modifying existing shape pattern approaches by combining two different algorithms, i.e. developing a hybrid approach for finding the motion vectors for fractal and other compression techniques, with a specific focus on reducing the encoding time and improving the reconstruction quality of the video.
The block motion estimation approach can be extended in future by combining it with the extended version of the finite automata to explore fractal video compression. These block matching motion estimation approaches can also be used in single/multiple object tracking applications to track the objects in a video.