MSA for Optimal Reconfiguration and Capacitor Allocation in Radial / Ring Distribution Networks

I. Introduction
Most electrical distribution networks feed inductive loads at low voltage levels. This leads to higher currents and power losses accompanied by voltage drop; about 13% of the total generated power is dissipated as line losses [1]. Therefore, these losses must be reduced to improve power system stability and reliability, power factor and voltage profile. Connecting shunt capacitors is one of the basic methods used in distribution systems to solve such problems [2, 3]. However, placing capacitors at random locations can cause more voltage drop and higher power losses. Moreover, the capacitor allocation problem has a combinatorial nature because capacitor locations and sizes are discrete variables [4]. Therefore, several optimization algorithms have been proposed in recent years to solve the optimal shunt capacitor placement and sizing problem in radial and ring distribution systems and to maximize its benefits. These include the flower pollination algorithm (FPA) [5], particle swarm optimization (PSO) [6, 7], discrete particle swarm optimization (DPSO) [8], genetic algorithm (GA) [9], teaching-learning-based optimization (TLBO) [10], artificial bee colony (ABC) [11], cuckoo search algorithm (CSA) [12], gravitational search algorithm (GSA) [13], modified monkey search (MMS) [14], whale optimization algorithm (WOA) [15], improved harmony algorithm (IHA) [16], fuzzy-GA [17], direct search algorithm (DSA) [18], differential evolution algorithm (DEA) [19], simulated annealing (SA) [20], plant growth simulation algorithm (PGSA) [21], fuzzy reasoning (FRB) [22], analytical IP [23], improved binary particle swarm optimization (IBPSO) [24], mixed-integer nonlinear programming (MINLP) [25] and fuzzy real coded genetic algorithm (FRCGA) [26]. However, some of these algorithms are not highly effective, as the power losses remain high [8, 9]. Other algorithms appear to be effective, but they may not achieve the optimal cost value [5, 10].
Al-Attar et al. [27] proposed a new optimization technique called the moth swarm algorithm (MSA), which is inspired by the orientation of moths towards moonlight. The algorithm builds on the conventional moth-flame algorithm and enhances its exploitation and exploration by applying adaptive crossover with Lévy mutation and an associative learning mechanism. The literature review shows that the MSA technique has not been applied to the problem of optimal capacitor location in the radial distribution network (RDN). Hence, the authors propose to use the MSA method for the mentioned problem. In this paper, MSA is presented to minimize the system power losses, decrease the total cost and maintain the voltage profile for various electrical distribution systems. It is tested on multiple IEEE standard distribution systems (33-bus and 69-bus). Furthermore, it is tested on mesh distribution systems, which provide two paths between generation and consumers; such systems are more complicated to design and require complex protection schemes, which entail higher investment than the RDN. In addition, the results obtained with the proposed approach are compared with those obtained from other algorithms to confirm its superiority. The rest of this work is organized as follows. Section 2 provides the objective function formulation. The MSA algorithm is presented in Section 3. Section 4 presents the implementation of the MSA algorithm for solving the capacitor allocation problem. Section 5 shows the numerical results of the proposed technique applied to multiple IEEE standard systems. The last section summarizes the results and advantages of the proposed method.

A. Load Flow Calculation
The RDN presents some unfavorable conditions, such as radial or meshed structure, unbalanced operation, high R/X ratios and distributed generation. Because of these conditions, the Newton-Raphson, Gauss-Seidel and other conventional load flow algorithms are not effective for solving the load flow of distribution systems [28]. Therefore, the backward/forward sweep algorithm [28] is used in this work to analyze the power flow in the tested IEEE distribution systems. The line current $I_k$ is calculated from (1). The active power flow $P_{k+1}$ and reactive power flow $Q_{k+1}$ in the RDN are calculated by (2) and (3), derived from the single-line diagram shown in Fig. 1, where $k$ is the sending end and $k+1$ is the receiving end.
The line voltages and the real power loss in each line are calculated from (4), (5) and (6), and the total system loss is obtained by summing all line losses in the system as shown in (7).
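The sweep procedure can be illustrated with a minimal Python sketch. It assumes a single unbranched feeder (bus 0 is the source), constant-power loads and per-unit quantities; the function name `backward_forward_sweep` and the example data are hypothetical and are not taken from the paper's MATLAB implementation.

```python
def backward_forward_sweep(z, s_load, v_source=1.0 + 0j, tol=1e-8, max_iter=100):
    """Solve a chain feeder: bus 0 (source) -> bus 1 -> ... -> bus n.

    z[k]      : series impedance of the branch from bus k to bus k+1 (p.u.)
    s_load[k] : complex constant-power load P + jQ at bus k+1 (p.u.)
    """
    n = len(z)
    v = [v_source] * (n + 1)      # flat start: every bus at the source voltage
    i_branch = [0j] * n
    for _ in range(max_iter):
        # Backward sweep: accumulate load currents from the feeder end to the source.
        downstream = 0j
        for k in range(n - 1, -1, -1):
            downstream += (s_load[k] / v[k + 1]).conjugate()
            i_branch[k] = downstream
        # Forward sweep: update bus voltages from the source to the feeder end.
        max_change = 0.0
        for k in range(n):
            v_new = v[k] - z[k] * i_branch[k]
            max_change = max(max_change, abs(v_new - v[k + 1]))
            v[k + 1] = v_new
        if max_change < tol:
            break
    # Real power loss of each line is |I|^2 * R; the system loss is their sum.
    p_loss = sum(abs(i) ** 2 * zk.real for i, zk in zip(i_branch, z))
    return v, i_branch, p_loss
```

For a branched feeder the backward sweep would sum the currents of all child branches at each node; the chain layout is used here only to keep the sketch short.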

B. Objective Functions
The objective of the optimal capacitor placement problem is to minimize the total annual cost by reducing the real power losses and the cost of installing capacitors, subject to voltage and reactive power limits. This paper uses the weighted sum method to evaluate the benefits of the optimal allocation and rating of shunt capacitors. The weighted sum method casts the multi-objective problem as a single-objective optimization problem that yields only one solution, and it has a lower computational cost (CPU time). These advantages make it well suited to real-world problems. Hence, the multi-objective function is formulated in terms of two components, $F_1$ and $F_2$, and an associated cost function.

C. Constraint Conditions
The objective function is subjected to:

1) Voltage Constraint
The bus voltages are the inequality constraints. The voltage magnitude of each bus must be maintained within the following range:
$$V_{\min} \le |V_k| \le V_{\max} \qquad (10)$$
where $V_{\max}$ and $V_{\min}$ are the maximum and minimum values of the voltage at bus $k$. The lower and upper limits are taken as 0.9 and 1.05 p.u., respectively.

2) Total Reactive Power Constraint
The total injected reactive power, which represents the equality constraint, must be limited by (11) and must satisfy the reactive power balance:
$$Q_{sys} + Q_{cap} = Q_{d} + Q_{Tloss} \qquad (12)$$
The power-flow equations, i.e., the equality restrictions (2) and (3), are satisfied during the power-flow calculation. During encoding, the restrictions (10)-(12) are satisfied by adding a penalty function to the objective function so that any violation of the constraints is penalized. Consequently, the constrained optimization problem is converted into an unconstrained form.
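As a hedged illustration of the penalty approach, the sketch below adds a voltage-violation penalty to an annual cost made of a loss term and a capacitor term. The function name, the coefficients `k_loss` and `k_cap` and the penalty factor are placeholder assumptions, not the paper's values; `k_loss = 168` $/kW per year is chosen only because it approximately reproduces the base-case cost reported later for the 33-bus system.

```python
def penalized_cost(p_loss_kw, cap_kvar, v_pu,
                   k_loss=168.0, k_cap=3.0,
                   v_min=0.90, v_max=1.05, penalty=1e6):
    """Annual cost plus a penalty for bus voltages outside [v_min, v_max].

    p_loss_kw : total real power loss (kW)
    cap_kvar  : list of installed capacitor sizes (kVAR)
    v_pu      : list of bus voltage magnitudes (p.u.)
    k_loss, k_cap, penalty : hypothetical coefficients (assumptions)
    """
    energy_cost = k_loss * p_loss_kw
    capacitor_cost = k_cap * sum(cap_kvar)
    # Sum of the amounts by which each bus voltage leaves the allowed band.
    violation = sum(max(0.0, v_min - v) + max(0.0, v - v_max) for v in v_pu)
    return energy_cost + capacitor_cost + penalty * violation
```

A feasible candidate pays only the two cost terms, while an infeasible one is pushed far above any feasible cost, so an unconstrained optimizer can rank both kinds with a single number.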

III. Overview of MSA
The moth swarm algorithm was presented in 2017 by Al-Attar et al. [27]. It is inspired by the orientation of moths towards moonlight. In MSA, a candidate solution of the optimization problem is represented by the position of a light source, and its fitness is the luminescence intensity of that light source. The swarm consists of three main groups. The first group, the pathfinders, is a small group of moths ($n_p$) spread over the available optimization space. The main task of this group is to guide the movement of the main swarm by identifying the best positions as light sources. The second group, the prospectors, tends to wander along a non-uniform spiral path in the neighborhood of the light sources determined by the pathfinders. The last group, the onlookers, moves directly towards the global solution acquired by the prospectors.
The steps of the MSA technique are discussed as follows:

A. Initialization
Initially, the positions of the moths are randomly created for a $d$-dimensional problem and a population of $n$ moths according to (13), where $x_j^{\max}$ and $x_j^{\min}$ are the upper and lower limits, respectively.
Afterwards, the type of each moth is selected based on its fitness. The best moths are chosen as light sources (pathfinders), and the next best and the worst groups of moths are treated as prospectors and onlookers, respectively.
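The initialization and grouping steps can be sketched as follows. This is an illustrative assumption rather than the authors' code: positions are drawn uniformly between the limits, and the swarm is split by ranking objective values for a minimization problem. The function names and the group sizes passed in are hypothetical.

```python
import random

def initialize_swarm(n, lower, upper, rng):
    """Create n moths uniformly at random inside [lower_j, upper_j] for each variable j."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(n)]

def split_groups(swarm, objective, n_p, n_f):
    """Rank moths by objective value (lower is better) and split them into
    pathfinders (best n_p), prospectors (next n_f) and onlookers (the rest)."""
    ranked = sorted(swarm, key=objective)
    return ranked[:n_p], ranked[n_p:n_p + n_f], ranked[n_p + n_f:]
```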

B. Reconnaissance Phase
The moths may concentrate in regions that appear to perform well. As a result, the reconnaissance quality of the swarm may decrease during the optimization process, which may lead to stagnation. To avoid early convergence and enhance solution diversity, a part of the swarm is compelled to explore the less congested areas. The moths that perform this role update their positions by interacting with each other.
A new diversity-based strategy is presented to choose the crossover points. Firstly, the normalized dispersal degree $\sigma_j^t$ of the individuals is measured by (14).
Then, the coefficient of variation $\mu^t$, which measures the relative dispersion, is calculated by (15). Any element of the pathfinder moths that exhibits a low dispersal degree is included in the group of crossover points $c_p$. To complete the full trial solution, each host vector (i.e., pathfinder solution) updates its position through the crossover operation by integrating the modified variables of the sub-trial solution into the corresponding variables of the host. The full trial solution $V_{pj}$ is defined by (16).
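One plausible reading of this diversity measure is sketched below: the dispersal degree of variable $j$ is taken as the population standard deviation of that variable over the pathfinders divided by its mean, $\mu$ is the average of these degrees, and a variable becomes a crossover point when its degree does not exceed $\mu$. These definitions, the use of the absolute mean and the function name are assumptions made for illustration; the sketch also assumes that no variable has a zero mean.

```python
import math

def crossover_points(pathfinders):
    """Return (indices with low dispersal, dispersal degrees, their mean)."""
    n_p = len(pathfinders)
    d = len(pathfinders[0])
    sigma = []
    for j in range(d):
        column = [x[j] for x in pathfinders]
        mean = sum(column) / n_p
        std = math.sqrt(sum((c - mean) ** 2 for c in column) / n_p)
        sigma.append(std / abs(mean))   # normalized dispersal (assumes mean != 0)
    mu = sum(sigma) / d                 # average relative dispersion
    points = [j for j in range(d) if sigma[j] <= mu]
    return points, sigma, mu
```

Variables on which the pathfinders already agree are the ones selected for perturbation, which is how the strategy restores diversity where it has been lost.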

C. Lévy Flights
Lévy flights (or Lévy motions) are random processes based on the $\alpha$-stable distribution, with the ability to travel over large distances using steps of different sizes. The Lévy $\alpha$-stable distribution is strongly linked with heavy-tailed probability density functions (PDFs), fractal statistics and anomalous diffusion. The PDF of the individual jumps, $\lambda(q) \approx |q|^{-1-\alpha}$, decays for large values of the generated variable $q$. The stability (tail) index $\alpha \in [0, 2]$, also called the characteristic exponent, describes the shape of the distribution taper [27]. A few special cases of the general Lévy distribution have a closed-form density, including the Gaussian (normal) distribution and a simple version of the Lévy distribution. Mantegna's algorithm [27] is used to emulate the $\alpha$-stable distribution by generating random samples $L_i$ that behave like Lévy flights, as in (20), where step is the scaling size related to the scale of the problem of interest and $\oplus$ denotes entrywise multiplication.
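A minimal sketch of Mantegna's algorithm in its commonly used form is given below: each sample is step times u divided by |y| raised to 1/alpha, with u and y drawn from zero-mean normal distributions. The function names, the default alpha = 1.5 and the unit variance chosen for y are assumptions for illustration.

```python
import math
import random

def mantegna_sigma_u(alpha):
    """Standard deviation of the numerator variable u in Mantegna's algorithm."""
    num = math.gamma(1 + alpha) * math.sin(math.pi * alpha / 2)
    den = math.gamma((1 + alpha) / 2) * alpha * 2 ** ((alpha - 1) / 2)
    return (num / den) ** (1 / alpha)

def levy_steps(d, alpha=1.5, step=1.0, rng=None):
    """Draw d heavy-tailed step lengths L_i = step * u / |y|^(1/alpha)."""
    rng = rng or random.Random()
    sigma_u = mantegna_sigma_u(alpha)
    steps = []
    for _ in range(d):
        u = rng.gauss(0.0, sigma_u)
        y = rng.gauss(0.0, 1.0)
        steps.append(step * u / abs(y) ** (1 / alpha))
    return steps
```

Most samples are small, but occasional very large ones occur whenever y falls close to zero, which is what gives the search its long jumps.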

D. Difference Vectors Lévy-Mutation
At the crossover points, the proposed algorithm creates the sub-trial vector by perturbing the selected components of the host vector with the related components of the donor vectors. The mutation strategy used to synthesize such a sub-trial vector is given in (21), where $L_{p1}$ and $L_{p2}$ are two independent, identically distributed variables used as mutation scaling factors and generated by heavy-tailed Lévy flights using $L_p \sim random(n_c) \oplus Levy(\alpha)$. The mutually exclusive indices ($r_1$, $r_2$, $r_3$, $r_4$, $r_5$ and $p$) are selected from the pathfinder solutions.
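The sketch below shows one way the mutation and crossover could fit together, assuming the familiar two-difference-vector form suggested by the five donor indices: a base donor plus two scaled differences of donor pairs. The exact formula, the use of scalar scaling factors instead of Lévy-distributed vectors and the function name are assumptions, not the published equation.

```python
import random

def levy_mutation_trial(pathfinders, p, c_p, l_p1, l_p2, rng):
    """Build the full trial solution for host p.

    Components listed in c_p come from the mutated sub-trial vector,
    all other components are copied from the host unchanged.
    Needs at least six pathfinders so five distinct donors exist.
    """
    donors = [i for i in range(len(pathfinders)) if i != p]
    r1, r2, r3, r4, r5 = rng.sample(donors, 5)   # mutually exclusive, all != p
    x = pathfinders
    trial = list(x[p])
    for j in c_p:
        trial[j] = (x[r1][j]
                    + l_p1 * (x[r2][j] - x[r3][j])
                    + l_p2 * (x[r4][j] - x[r5][j]))
    return trial
```

In a fuller implementation `l_p1` and `l_p2` would be drawn from a Lévy step generator for each crossover point rather than passed in as fixed scalars.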

E. Selection Strategy
The fitness value of the full trial solution is determined after the previous step and then compared with that of its corresponding host solution. The fitter solution survives to the next generation, which for minimization problems is expressed by (22). The probability $P_p$, which is proportional to the luminescence intensity $fit_p$, is calculated from (23), and $fit_p$ is obtained from the objective function value $f_p$ for minimization problems using (24).
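A hedged sketch of the selection step follows. The greedy comparison matches the description above; the mapping from objective value to luminescence (1/(1+f) for non-negative f, 1+|f| otherwise) is a transformation widely used in swarm algorithms for minimization and is assumed here rather than taken from the text, as are the function names.

```python
def select(host, trial, objective):
    """Keep the trial only if it improves the objective (minimization)."""
    return trial if objective(trial) < objective(host) else host

def luminescence(f_value):
    """Map an objective value to a positive intensity; lower f gives higher intensity."""
    return 1.0 / (1.0 + f_value) if f_value >= 0 else 1.0 + abs(f_value)

def selection_probabilities(f_values):
    """Probability of each light source, proportional to its luminescence."""
    fit = [luminescence(v) for v in f_values]
    total = sum(fit)
    return [v / total for v in fit]
```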

F. Transverse Orientation
The prospector moths are the group with the next best luminescence intensity. The number of prospectors $n_f$ decreases over the course of the $T$ iterations according to (25). After the pathfinders have finished their search, the information about luminescence intensity is shared with the prospectors, which attempt to update their positions in order to discover new light sources. Each prospector moth $x_i$ soars along a logarithmic spiral path, as shown in Fig. 2(a), to search deeply around the artificial light source $x_p$, which is chosen on the basis of the probability $P_p$ using (23). The new position of the $i$th prospector moth is expressed mathematically by (26), where $\theta \in [r, 1]$ is a random number that defines the spiral shape and $r = -1 - t/T$. Although the same formula has been used in the moth-flame optimization (MFO) algorithm [27], the MSA treats each variable as an integrated unit. In the MSA model, the moth types change dynamically: any prospector moth is promoted to a pathfinder if it discovers a solution more luminescent than the existing light sources. Consequently, the new light sources and the moonlight are available at the end of this stage.
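The following sketch illustrates the prospector phase under two assumptions: the number of prospectors shrinks linearly with the iteration counter, and the position update is the logarithmic spiral known from moth-flame optimization, with a single $\theta$ drawn per moth. Neither form is confirmed by the text, and the function names are hypothetical.

```python
import math
import random

def n_prospectors(n, n_p, t, T):
    """Assumed linear decrease of the prospector count over iterations 0..T."""
    return round((n - n_p) * (1 - t / T))

def spiral_update(x_i, x_p, t, T, rng):
    """Move prospector x_i on a logarithmic spiral around light source x_p."""
    r = -1 - t / T
    theta = rng.uniform(r, 1)   # one theta for the whole moth (integrated unit)
    return [abs(a - b) * math.exp(theta) * math.cos(2 * math.pi * theta) + b
            for a, b in zip(x_i, x_p)]
```

Because the distance to the light source multiplies the spiral term, a prospector that already sits on its light source stays there, while distant ones sweep a wide arc around it.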

G. Empyreal Navigation
As the number of prospectors diminishes during the optimization process, the number of onlookers increases ($n_o = n - n_f - n_p$). This may speed up the convergence of MSA towards the global solution. The onlookers are the moths with the lowest luminescence in the swarm. They aim to travel directly towards the most luminous solution (the moon), as shown in Fig. 2(b). In the MSA, the onlookers are forced to search the hot spots of the prospectors more effectively. The onlookers are divided into two parts. The first part, of size $n_G$, walks according to the Gaussian distribution introduced in Section III-C. Each onlooker moth in this subgroup moves with a series of Gaussian walk steps, in which $\varepsilon_1$ is a random number generated from a Gaussian distribution, $\varepsilon_2$ and $\varepsilon_3$ are random samples drawn from a uniform distribution within the interval [0, 1], and $best_g$ is the global best solution (moonlight) obtained in the transverse orientation phase. Many optimization algorithms keep a memory to transfer information from the current generation to the next. In the real world, however, moths may fall into the fire because they lack an evolutionary memory. The performance of moths is strongly affected by short-term memory and associative learning [27]. Associative learning plays an important role in the communication among moths. Therefore, the second part of the onlooker moths, of size $n_A = n_o - n_G$, drifts towards the moonlight according to associative learning operators with an instantaneous memory, imitating the actual behavior of moths in nature. The instantaneous memory is initialized from the continuous uniform Gaussian distribution over a bounded range.
The updating equation for this type is given in (29), where $r_1$ and $r_2$ are random numbers within the interval [0, 1], $2g/G$ is the social factor, $1 - g/G$ is the cognitive factor and $best_p$ is a light source selected from the modified swarm based on the probability $p_i$. It is worth mentioning that the constraints are checked and satisfied after each fitness evaluation in the flowchart of MSA (see Fig. 3).
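The associative-learning step can be sketched with the two factors named above. The sketch omits the instantaneous-memory term for brevity and draws $r_1$ and $r_2$ once per moth; both choices, like the function name, are simplifying assumptions and not the full update (29).

```python
import random

def associative_learning_update(x_i, best_p, best_g, g, G, rng):
    """Pull an onlooker towards a chosen light source and the global best.

    The cognitive factor 1 - g/G shrinks and the social factor 2g/G grows
    as the iteration counter g approaches the maximum G.
    """
    cognitive = 1 - g / G
    social = 2 * g / G
    r1, r2 = rng.random(), rng.random()
    return [xi + cognitive * r1 * (bp - xi) + social * r2 * (bg - xi)
            for xi, bp, bg in zip(x_i, best_p, best_g)]
```

Early in the run the moth mostly follows the selected light source; later the pull towards the global best dominates, which mirrors the shift from exploration to exploitation.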

IV. Numerical Experiments of MSA
To tune the parameters of the proposed MSA and evaluate its performance in terms of exploitation, exploration, convergence behavior and solution quality, a set of 23 benchmark functions commonly used in the literature was tested. The details of these functions are given in [29]. In this section, a swarm of 50 moths with seven pathfinders is employed over 50 independent runs, with a maximum of 1000 function evaluations for $f_1$-$f_{13}$ and 500 iterations for $f_{14}$-$f_{23}$. MSA is compared with four metaheuristic algorithms: MPSO [30], modified differential evolution (MDE) [31], MFO [32] and the flower pollination algorithm (FPA) [5]. To keep the comparison consistent, these algorithms are tested with a population size of 50 under the same conditions, using their standard control parameter settings as given in Table I. The mean and the standard deviation are used to assess the robustness of the algorithms under study.

A. Determination Control Parameters in MSA
In nature, light can be dangerous, and a large number of artificial lights decreases the flight activity of moths. A statistical study was used to determine the required number of pathfinders, and the results obtained for a swarm of 50 moths at different values of $n_p$ are given in Table II in Appendix A. Table II shows that the best number of pathfinders is approximately 13% of the total population.

B. Exploitation Analysis Based on Unimodal Benchmark Functions
The first set of experiments benchmarks the exploitation ability of the proposed MSA. The unimodal functions ($f_1$-$f_7$) are designed to compare the convergence rate of the search algorithms. In the MSA, pathfinders and prospectors primarily carry out the exploration (global search). The mean and the standard deviation (denoted StDev) are reported in Table III in Appendix A. According to the overall rank, although both MSA and MPSO satisfy the convergence rate condition and are significantly better than the other metaheuristic algorithms, the MSA is stronger than MPSO in fine tuning around the global optimum owing to its better global search ability. On the other hand, MFO mainly searches in a small local neighborhood. In addition, the wide step of the FPA does not guarantee a high rank.

C. Exploration Analysis Based on Multimodal Benchmark Functions
A test suite is employed to compare the performance of MSA with the other algorithms on the high-dimensional multimodal functions ($f_8$-$f_{13}$), and the final results are summarized in Table IV in Appendix A. MSA and MDE clearly escape from poor local optima; the MSA approaches the neighborhood of the global optimum for $f_8$ and hits the exact optimum every time for $f_9$-$f_{11}$. On the other hand, the Lévy-flight updating strategy of FPA provides only a small protection against premature convergence, whereas the MFO has a low probability of making such long jumps, which may be the reason for its poor average best fitness.
The experimental study for the low-dimensional multimodal functions ($f_{14}$-$f_{23}$), given in Table V in Appendix A, shows that the MSA and MDE have the best results compared to the rest of the algorithms, while MPSO has difficulties with functions of this kind. Although $f_{18}$ is an easy problem, the MSA has failed to find the global optimum solution as other algorithms. In the three Shekel functions ($f_{21}$-$f_{23}$), FPA obtains a better average performance than the other optimizers. In sum, the algorithms achieve a similar performance ranking for both multimodal categories, where MSA is ranked first, followed by MDE, MPSO, MFO and FPA, respectively. To validate the comparative study, the pairwise Wilcoxon rank-sum test, a nonparametric statistical test, is carried out at the 0.05 significance level to judge whether the results of the MSA differ from those of the other algorithms in a statistically significant way. The p-values of the Wilcoxon rank-sum test, based on the outcomes of Tables III-V in Appendix A, are displayed in Table VI in Appendix A. In this table, p-values below 0.05 provide sufficient evidence against the null hypothesis.
To verify the solution quality and further assess the robustness of the proposed algorithm, a graphical analysis of variance (ANOVA) test is used for functions $f_4$, $f_8$ and $f_{21}$, as depicted in Fig. 4. The boxplots confirm that MSA achieves, on average, better results than the rest of the algorithms.

D. Analysis of the Convergence Behavior
The algorithms under study were executed over 50 independent runs in order to assess their robustness through the mean and the standard deviation. To investigate the convergence behavior, the best evolution curves of the methods are shown in Fig. 5. Generally, MSA and MPSO have smooth curves with a faster convergence rate than the other algorithms. However, the MPSO suffers from premature convergence, caused by particles stagnating around local optima, when handling nonlinear functions. MFO has linear characteristics but suffers from an excessively slow rate, as in $f_{10}$. On the other hand, the MDE and FPA have non-smooth convergence characteristics. In short, the developed algorithm searches more deeply than PSO, converges faster than MFO and converges more linearly than MDE and FPA. Seven benchmark functions are used to assess the convergence speed of the MSA against the four techniques under study and the well-known PSO [33], group search optimizer (GSO) [33] and modified group search optimizer (MGSO) [34], as shown in Table VII in Appendix A. MFO clearly has a lower computational time than the other algorithms. This is because MFO has a single updating equation, even though it is applied to each component of the variables. Although the MSA contains a number of strategies, each of them applies only to a certain group of the population. Except for the small group of pathfinders, all moths treat each variable as an integrated unit. In addition, there is no need to store the velocities and personal best solution for each onlooker moth, as in the basic PSO. These properties give the MSA an acceptable computational cost. The MPSO has a higher CPU time than the other methods, which may be attributed to the application of its two modifications to all particles. It is important to point out that the MPSO and MSA have the highest convergence speed and therefore provide the quickest answer to the problem at hand.

V. Results and Discussion of Radial Distribution System
To evaluate the efficiency of the proposed MSA method in minimizing power loss and energy cost, the IEEE 33-bus and 69-bus radial distribution systems are used in this simulation. MATLAB is used to implement the MSA technique for the optimal capacitor placement problem.
This study includes the annual cost of the real power loss and of the capacitor banks. The results obtained over 50 independent runs are compared with those of other conventional algorithms, as described below.

A. IEEE 33-Bus Test System
To evaluate the impact of the proposed MSA on a medium-scale distribution system, the IEEE 33-bus system is tested. Fig. 6 shows the single-line diagram of this system. The system rated voltage is 12.66 kV. The load and line data are given in [13]. Before compensation, the load flow calculation gives a minimum bus voltage of 0.9036 p.u. at bus 18 and a total active power loss of 210.98 kW, with an annual energy loss cost of 35442.96 $. Using the proposed MSA method, only three capacitors are allocated, at the optimal locations of buses 12, 24 and 33, with sizes of 450, 600 and 900 kVAR, respectively. As a result, the real power loss is reduced to 137.227 kW, a reduction of 35.02% relative to the base case. In addition, the system voltage profile is improved and the worst bus voltage is raised to 0.9329 p.u., as shown in Fig. 7. These results validate the performance and effectiveness of the proposed MSA method. Furthermore, the minimization of the active power loss and total cost stabilizes with fast and smooth convergence, as shown in Figs. 8 and 9. The proposed MSA is thus more effective than the conventional algorithms on the medium-scale distribution system.

B. IEEE 69-Bus Test System
The proposed MSA is further applied to the IEEE 69-bus system, which consists of 69 buses and 68 branches as shown in Fig. 10. The rated line voltage is 12.66 kV and the total system load is (1.896 MW + j1.347 MVAR). The detailed load and line data are reported in [13]. After running the power flow calculation and before placing the capacitor banks in the RDN, the power loss is 224.975 kW, the lowest bus voltage is 0.9092 p.u. at bus 65 and the total energy loss cost is 37800 $ per year. When the proposed method is applied to this RDN, the best active power loss is 145.404 kW, which corresponds to a loss reduction of about 35%. This is the largest reduction among the algorithms compared in Table IX in Appendix A. Only three capacitor banks, with optimum ratings of 450, 150 and 1200 kVAR, are installed at buses 12, 21 and 61, respectively. Furthermore, the MSA reduces the total cost per year to 24820.84 $ from 37800 $ before compensation. The annual net saving increases to 34.34%, as shown in Table IX in Appendix A. This table also displays the statistical performance of the proposed MSA, with the best, worst and average values of the total cost over 50 independent runs. Moreover, Fig. 11 confirms the effectiveness of the proposed technique by showing the improvement in system voltages. The minimum voltage is improved to 0.9324 p.u., which is compatible with the voltage constraints. In addition, the fast and effective response of the MSA appears in the convergence curves of the total real power loss and total cost in Figs. 12 and 13. The best results obtained from the 50 independent runs for the radial distribution systems are shown in Figs. 8, 9, 12 and 13.
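The percentages quoted for the 69-bus system can be checked directly from the reported figures. The short sketch below uses only the numbers stated above; the helper name `percent_reduction` is hypothetical.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# Reported 69-bus values: losses in kW, annual cost in $.
loss_reduction = percent_reduction(224.975, 145.404)
net_saving = percent_reduction(37800.0, 24820.84)

print(f"loss reduction: {loss_reduction:.2f} %")
print(f"net saving:     {net_saving:.2f} %")
```

The net saving evaluates to 34.34%, matching the reported value, and the loss reduction evaluates to about 35.4%.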

VI. Results for Ring Distribution Systems
In this section, the proposed MSA optimization method is tested on more complex power systems, namely ring distribution systems, which are more sensitive to variations and uncertainties. The ring main system is considered more complex than the radial system in terms of load flow and protection coordination problems. The ring distribution systems are built by modifying the standard IEEE 33-bus and 69-bus systems.

A. IEEE 33-Bus System
The radial IEEE 33-bus system is reconfigured into a ring main system as shown in Fig. 14. It consists of sectionalizing switches 1 to 32 and tie switches 33 to 37. When the system is converted from radial to ring using the tie lines (33 to 37), the power loss is reduced to 202.68 kW and the minimum bus voltage is 0.913 p.u. The optimal reconfiguration (7-11-14-32-37) reduces the active power loss by 41.2% and improves the minimum voltage to 0.938 p.u. In addition, the MSA method is applied to the ring main system to determine the optimal locations and sizes of capacitor banks so as to minimize the power loss under the optimal reconfiguration. The MSA-based result shows good performance, as only three locations are selected, at buses 6, 24 and 33, with a total reactive power of 1500 kVAR. Furthermore, the total active power loss is reduced by 69.1% from the base case, which is better than the other methods, as seen in Table X in Appendix A. This table gives a detailed comparison between the MSA, BGSA [35], HSFLA [36], PSO [36], IPSO [36] and ACO [36] in terms of active power loss, minimum voltage and reduction percentage. Moreover, all bus voltages obtained with the MSA method are maintained within desirable values, above 0.964 p.u., as shown in Fig. 15. In addition, Fig. 16 shows the effective performance of MSA, as the total power loss converges smoothly to its minimum value without fluctuations.

B. IEEE 69-Bus System
The MSA method is further implemented on the large IEEE 69-bus test system after it is converted to a ring main system as shown in Fig. 17. It consists of sectionalizing switches 1 to 68 (normally closed) and tie switches 69 to 73 (normally open). In the base case with tie switches (69-70-71-72-73), the total power loss and the minimum bus voltage are 224.97 kW and 0.909 p.u., respectively. The optimal reconfiguration reduces the active power loss by 59.17% and increases the minimum bus voltage to 0.9877 p.u., as shown in Fig. 18. Installing capacitor banks in the ring main system and using the MSA technique to optimize their locations and sizes gives very good results. Only three locations, at buses 11, 50 and 61, are selected to install capacitors, with a total reactive power of 1500 kVAR. In addition, the total active power loss is reduced by 93.98% from the base case. This result is the best in comparison with other techniques such as BFO [37], TSA [38], BA [39] and WOA [40], as seen in Table XI in Appendix A. Moreover, the minimum bus voltage increases to 0.99 p.u., which is a very good value, as seen in Fig. 18. Furthermore, the effectiveness of the proposed MSA is seen in Fig. 19, which shows the fast convergence of the total active power loss. The best results obtained from the 50 independent runs for the ring distribution systems are shown in Fig. 16.

VII. Conclusion
In this article, a novel MSA paradigm has been presented with two new optimization operators: adaptive crossover based on population diversity, and an associative learning mechanism with immediate memory. These operators may be suitable for hybridization with other algorithms in the future. Twenty-three commonly used benchmarks under different statistical metrics are employed to verify the effectiveness of proper hybridization in terms of convergence, local optima avoidance, robustness, computational cost, exploration and exploitation. From the obtained results, the final algorithm can be considered a hybrid of the PSO, DE, MFO and FPA algorithms in line with the natural characteristics of the moth swarm, and suitable for solving complex problems. The comparative study with several metaheuristic search techniques confirms the primacy of the proposed paradigm and its potential to find accurate, fast and robust solutions. The MSA approach has been successfully applied to small, medium and large scale electrical distribution systems to solve the capacitor allocation problem by minimizing the real power losses and annual energy cost, which is an attractive economic issue. The superiority of MSA is demonstrated by testing it on radial and ring IEEE distribution networks (33-bus and 69-bus systems). Furthermore, the proposed MSA can improve the voltage profile at each bus in the systems. Moreover, the overall numerical results obtained with the proposed MSA method, such as minimum voltage, active power loss, power loss cost, capacitor cost, annual energy cost, net saving cost and CPU time, have been compared with those of other algorithms. The MSA method presents a desirable and superior performance with stable convergence compared to the other techniques.
Applying the proposed MSA method to network reconfiguration and to protection coordination in the presence of capacitor banks and distributed generation during grid faults is the future scope of this work.