TELKOMNIKA Telecommunication Computing Electronics and Control

Received Aug 25, 2021 Revised Feb 25, 2022 Accepted Mar 05, 2022

This paper proposes two attenuation factors (AFs) to improve the performance of the soft-output Viterbi algorithm (SOVA) and to enhance the early termination mechanism in turbo decoding. The first AF is estimated from the mean square difference between the systematic bipolar coded symbols and the a-posteriori information. The second AF is computed online at each iteration from the correlation coefficient between the extrinsic and a-priori information, rather than the intrinsic information customarily used in the literature. The second factor is also used in the early termination (ET) scheme, which stops the iterations when no significant improvement is being achieved. In addition, a method is provided for offline computation of AFs that cover a specific range of signal-to-noise power ratio, which reduces the utilization and latency estimates with only a slight degradation in performance. The results show that the proposed scheme outperforms previous related works by about 0.2 dB at a bit error rate (BER) of $10^{-5}$ for an interleaver depth of 512, and reduces the average number of iterations (ANI) by about 3 iterations.


INTRODUCTION
In the last five decades, coding theorists have been looking for codes capable of approaching the Shannon limit. Unprecedented results, very close to the Shannon limit (within 0.7 dB), are presented in [1]. The authors of that article introduced a structure called the turbo code, in which the ingredients previously adopted in concatenated codes (multiple encoders and an interleaver) are organized as a parallel or serial concatenated convolutional code (PCCC, SCCC). The Bahl, Cocke, Jelinek, and Raviv (BCJR) algorithm [2], also known as the maximum a-posteriori probability (MAP) algorithm, is optimal for estimating the sequence of a-posteriori probabilities (APP) for each data bit given the received sequence. The numerical representation of probabilities, the non-linear functions, and the mixed multiplications and additions required by MAP make this algorithm difficult to implement. As a result, alternatives such as the Log-MAP and Max-Log-MAP algorithms were proposed by Robertson et al. [3].
TELKOMNIKA Telecommun Comput El Control: Efficient SOVA decoding and enhanced early termination mechanism based on … (Ahmed A. Hamad)

A modified version of the Viterbi algorithm called SOVA was first introduced by Hagenauer and Hoeher [4]. It provides a reliability measure (soft output) together with each decoded bit. The soft output is needed by the component decoders in the decoding of concatenated convolutional codes such as turbo codes. SOVA has an implementation advantage over MAP and Log-MAP due to its simple structure, at the cost of a small degradation in performance [5]. The sub-optimal performance of the soft-output Viterbi algorithm (SOVA) decoder is attributed to the optimistic estimation of the a-posteriori log-likelihood values exchanged between the constituent decoders that comprise the turbo scheme. Furthermore, the correlation between the extrinsic and intrinsic information traded between the constituent SOVA decoders is the key factor in the degradation of its performance [6], [7]. The common remedy for this deterioration of SOVA, as well as of Max-Log-MAP, is to multiply the exchanged reliability information by an attenuation factor that reduces its exaggerated values.
Many research efforts have separately attempted to develop or improve early-stopping and attenuation-factor algorithms (some literature refers to the latter as a scaling or correction factor). In practice, it is more efficient, in terms of computational complexity, to use an algorithm that produces one operator that accomplishes the two tasks together. Lin et al. [8] generated two scaling factors based on the sign difference ratio (SDR) scheme proposed by [9]. The two factors are used to scale the extrinsic information produced by the two soft-input soft-output (SISO) component decoders as well as the soft channel inputs. A coding gain of about 0.1 dB to 0.2 dB is achieved, with a reduction in the average number of iterations (ANI) and data storage requirements. To enhance the performance of turbo decoding, Fowdur et al. [10] proposed adaptive scaling combined with ET schemes based on the SDR algorithm, with prioritized quadrature amplitude modulation (QAM) constellation mapping for joint source channel coding (JSCC). In Alberge [11], an optimal method with a closed-form expression is derived to generate two scaling factors based on the properties of the mutual information between the extrinsic log-likelihood ratios (LLR) at the output of the two constituent decoders. Mutual information is also used as a metric for the stopping criterion. The calculation of the attenuation factors (AF) based on Pearson's correlation coefficient (CC) is proposed by [12]. The CC is also used as a stopping scheme at high signal-to-noise power ratio (SNR), and the regression angle at low SNR. The scheme proposed by Fowdur et al. [12] shows improved performance compared to their earlier work in [10]. An efficient correction factor and iteration termination mechanism are also proposed by [13]. Aarthi et al. [14] proposed a method that applies two scaling factors, one for each constituent SOVA decoder, to reduce the correlation between extrinsic information. The estimation of these factors depends on the correlation coefficient, using the linear transformation method between intrinsic and extrinsic reliability values. The approach used in [14] to reduce the correlation effects is also adopted by [15], together with two other reduction factors estimated offline to reduce the overestimation of the a-posteriori information produced by each SOVA decoder. This is accomplished by minimizing the mean square difference (MSD) between the extrinsic information produced by the SOVA and the corresponding MAP decoder for the same information bits.

Most of the reviewed works use both intrinsic and extrinsic information to compute the AF. One of the components of the intrinsic information is the soft channel output, which is constant over each frame decoding session. This redundant information can therefore be avoided by using a-priori information only, which in turn reduces memory usage, latency, and utilized silicon area.

The objective of this paper is to:
− Improve the performance, in terms of bit error rate (BER), of sub-optimal turbo decoding by alleviating the effect of overestimation in the a-posteriori reliability values and the correlation between the intrinsic and extrinsic information traded between the constituent SOVA decoders. This can be accomplished by applying two pairs of AFs, one pair for each constituent decoder.

− Propose two new online algorithms that estimate the aforementioned pairs of AFs more accurately than the offline schemes proposed by related works, e.g., [14]-[16], so that more improvement can be achieved.
− Eliminate redundant iterations by employing, as a stopping mechanism, the value of the AF that is used to reduce the correlation between the intrinsic and extrinsic information for the second constituent SOVA decoder. This removes the need for one of the well-known early termination methods and thus reduces the complexity of the decoder.
− Present a new method for offline computation of AFs that cover a specific range of signal-to-noise power ratio, which reduces the utilization and latency estimates.

The rest of this paper is organized as follows. Section 2 introduces the principle of iterative SOVA decoding. Section 3.1 explains the methodology adopted in the estimation of the AFs for scaling the a-posteriori information. The computation of the AFs that are used to reduce the correlation effect is illustrated in section 3.2. In section 4, the performance of systems using various combinations of AFs and early termination schemes is analyzed and compared. Finally, our conclusions are presented in section 5.

SYSTEM MODEL AND PRELIMINARY
The degradation in performance of turbo decoders that use sub-optimal algorithms such as Max-Log-MAP or SOVA, compared to the optimal MAP decoder, is attributed to the following: the sub-optimal algorithms use approximate formulas to estimate the required reliability values, which results in an undesirable correlation between the intrinsic and extrinsic information exchanged between the constituent decoders [15]. Although the extrinsic values are expected to carry new information to the other constituent decoder, the correlated intrinsic and extrinsic reliabilities introduce redundant information that accumulates at each decoder. This tends to produce optimistic a-posteriori information at the output of the SOVA or Max-Log-MAP decoders. Various approaches have been proposed to mitigate these imperfections, and perhaps the most efficient are those that employ fixed or adaptive AFs [7]-[15].

Assume that the turbo decoder receives the channel output sequence $r_{kj}$, given by:

$$r_{kj} = x_{kj} + n_{kj}, \quad j = 1, 2, 3 \qquad (1)$$

where $x_{k1} \in \{-1, +1\}$ refers to the binary phase-shift keying (PSK) symbols associated with the information bits ($u_k \in \{0, 1\}$) at time sample $k$ ($k = 1$ to $N$, where $N$ is the interleaver depth). The symbols $x_{k2}$ and $x_{k3}$ correspond to the parity bits generated by the first recursive systematic convolutional encoder (RSC1) and the second (RSC2), respectively. The noise samples $n_{kj}$ are assumed to be Gaussian with zero mean and variance $\sigma^2 = N_0/2$, where $N_0/2$ is the two-sided power spectral density of the additive white Gaussian noise (AWGN) channel. At the receiver, each constituent SOVA decoder receives the systematic symbol together with the parity symbol generated by its associated encoder, as depicted in Figure 1. The SOVA decoder accepts a-priori information ($L_{a,c}^{i}$) and produces a-posteriori LLR values ($L_{c}^{i}$) for each information bit at constituent decoder $c$ ($c = 1$ or $2$) and the $i$-th iteration. The a-posteriori LLR $L_{1}^{i}$ produced by the first SOVA decoder is defined as [1]:

$$L_{1}^{i} = \Lambda_{1}^{i} + L_{e,1}^{i} \qquad (2)$$

where $\Lambda_{1}^{i}$ is the intrinsic information, given by:

$$\Lambda_{1}^{i} = y_{k,1} + L_{a,1}^{i} \qquad (3)$$

Here $y_{k,1}$ is the received systematic symbol scaled by the channel reliability $L_{ch}$ ($L_{ch} = 4 R_c E_b / N_0$), which can be set to 1 for SOVA [17], $R_c$ is the coding rate, and $E_b$ is the energy per information bit.

The term $L_{e,1}^{i}$ represents the extrinsic information produced by SOVA1 at the $i$-th iteration; it is passed to SOVA2 as a-priori information after its elements are permuted by the interleaver ($\pi$), as $L_{a,2}^{i} = \pi(L_{e,1}^{i})$. SOVA2 in turn produces its extrinsic information $L_{e,2}^{i}$ and sends it back as a-priori information to SOVA1 in the next iteration, $L_{a,1}^{i} = \pi^{-1}(L_{e,2}^{i-1})$, after de-interleaving ($\pi^{-1}$) its elements. The decoded information bits $\hat{u}_k$ are given by the hard decision on the de-interleaved values of $L_{2}^{i}$. To reduce the exaggeration in the reliability values $L_{c}^{i}$ and $L_{e,c}^{i}$, they are scaled by two attenuation factors, $A_{ac}^{i}$ and $B_{ac}^{i}$, respectively. Sections 3.1 and 3.2 provide a detailed description of how to estimate the values of these factors. The factor $B_{ac}^{i}$ is used to scale the extrinsic information before it is passed to the other constituent decoder, to reduce the correlation between the intrinsic and extrinsic information traded between the two constituent SOVA decoders. The mathematical formulas for estimating these factors are presented in the next two sub-sections.
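As a minimal sketch of the channel model described above (not the authors' MATLAB code), the following Python snippet maps bits to bipolar symbols and adds zero-mean Gaussian noise with variance $N_0/2$. It assumes unit symbol energy, so that $N_0 = 1/(R_c \cdot E_b/N_0)$; the mapping 0 → −1, 1 → +1 and all function names are illustrative assumptions.

```python
import math
import random


def noise_sigma(ebn0_db, code_rate):
    """Noise standard deviation for unit-energy bipolar symbols.

    Assumes Es = 1, hence Eb = 1 / code_rate and sigma^2 = N0 / 2.
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    n0 = 1.0 / (code_rate * ebn0)
    return math.sqrt(n0 / 2.0)


def bpsk(bits):
    """Map bits {0, 1} to bipolar symbols {-1, +1} (assumed mapping)."""
    return [2 * b - 1 for b in bits]


def awgn(symbols, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [x + rng.gauss(0.0, sigma) for x in symbols]


rng = random.Random(1)  # fixed seed so the run is repeatable
bits = [rng.randint(0, 1) for _ in range(8)]
received = awgn(bpsk(bits), noise_sigma(1.0, 1.0 / 3.0), rng)
```

In a turbo-coded link the same noise model would be applied to the systematic stream and to both parity streams.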

A-posteriori attenuation factor ($A_{ac}^{i}$)
To enhance the a-posteriori reliability values $L_{c}^{i}$ produced by each half of the turbo decoder, each of them is multiplied by its AF $A_{ac}^{i}$. The factor $A_{a1}^{i}$ for SOVA1 is calculated based on the MSD [18] between the systematically encoded symbols $x_{k1}$ and the sign of $L_{1}^{i}$ (i.e., sign[$L_{1}^{i}$]). This assumption can be justified as follows: the sign of the a-posteriori value $L_{1}^{i}$ is an increasingly accurate estimate of the systematic coded bits $x_{k1}$ at each iteration. Statistically, $\Delta(A_{a1}^{i})$ is a measure of how efficient the algorithm is when the attenuation factor $A_{a1}^{i}$ is used. The value of $\Delta(A_{a2}^{i})$ is calculated similarly. Minimizing $\Delta(A_{ac}^{i})$ results in better sub-optimal decoding. Therefore, $A_{ac}^{i}$ should be chosen to minimize $\Delta(A_{ac}^{i})$, which means that $d\Delta(A_{ac}^{i})/dA_{ac}^{i} = 0$. Hence $A_{ac}^{i}$ is easily found from (4) and (5).

It is worth mentioning that the estimation of $A_{ac}^{i}$ assumes the availability of $x_{k1}$, which in practice is not available at the decoder during online operation. Therefore, $A_{ac}^{i}$ can be computed offline. Online calculation of $A_{ac}^{i}$ with acceptable accuracy is also possible if the transmitter sends a few known frames to the decoder from time to time.
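The least-squares step described above can be illustrated with a short Python sketch. It assumes that the MSD has the form $\Delta(A) = \frac{1}{N}\sum_k (x_k - A\,s_k)^2$, where $s_k$ is the sign of the a-posteriori LLR; setting $d\Delta/dA = 0$ then gives $A = \sum_k x_k s_k / \sum_k s_k^2$. This form of $\Delta$, the helper names, and the sample values are assumptions made for illustration, not expressions taken from the paper. The known symbols $x_k$ are required, as in the offline or known-frame settings mentioned in the text.

```python
def ls_attenuation(x, ref):
    """Least-squares scale A that minimises sum((x_k - A * ref_k) ** 2)."""
    num = sum(xk * rk for xk, rk in zip(x, ref))
    den = sum(rk * rk for rk in ref)
    if den == 0:
        raise ValueError("reference sequence is all zeros")
    return num / den


def sign(value):
    """Return -1, 0 or +1 according to the sign of value."""
    return (value > 0) - (value < 0)


# Illustrative values only: known bipolar symbols and decoder LLRs.
x = [1, -1, 1, 1]
llr = [2.0, -0.5, -1.0, 3.0]

a_from_sign = ls_attenuation(x, [sign(v) for v in llr])
a_from_llr = ls_attenuation(x, llr)  # variant that uses the raw LLR values
```

With the sign-based reference, the estimate equals one minus twice the fraction of sign errors (here one error in four, so 0.5), which grows toward 1 as decoding improves.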

Extrinsic attenuation factor ($B_{ac}^{i}$)
To reduce the correlation between the intrinsic ($\Lambda_{c}^{i}$) and extrinsic ($L_{e,c}^{i}$) reliability values, many papers, for instance [14]-[16], proposed computing the AFs based on the CC between them, whereas in [12], [19] the calculation depends on the correlation between the extrinsic and a-posteriori information $L_{c}^{i}$.
Since $y_{k,1}$ is constant over all iterations, and to avoid the extra memory and computations needed to store and calculate the intrinsic information, the AFs are computed online at each iteration based on the CC between the extrinsic and a-priori information. The gradient of the regression line that best fits the relation between $L_{a,c}^{i}$ and $L_{e,c}^{i}$ is a good measure of the correlation tendency and is given by [20]. The reduction factor applied to this gradient has been set to 0.7, which was found to be the best value for improving the performance of SOVA over the specified range of $E_b/N_0$ and various frame lengths. In each half of the decoding process and at the $i$-th iteration, the a-posteriori LLR values of the modified system become:

$$\hat{L}_{c}^{i} = y_{k,1} + L_{a,c}^{i} + B_{ac}^{i} L_{e,c}^{i} \qquad (11)$$

where $B_{ac}^{i}$ equals one minus 0.7 times the gradient for decoder $c$ at iteration $i$. Since the gradient, and hence $B_{ac}^{i}$, is a measure of how strong the correlation between $L_{a,c}^{i}$ and $L_{e,c}^{i}$ is, it can be used as an ET criterion instead of well-known ET algorithms such as the cyclic redundancy check (CRC), cross-entropy (CE), sign change ratio (SCR), hard-decision-aided (HDA), and sign difference ratio (SDR) [9], [21]-[24].
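The computation described above can be sketched as follows. The sketch assumes that the gradient is the ordinary least-squares slope of the extrinsic values regressed on the a-priori values (the exact expression from [20] is not reproduced here), and then applies the relation "one minus 0.7 times the gradient" stated in the text. Function names and sample values are illustrative assumptions.

```python
def regression_gradient(a_priori, extrinsic):
    """Ordinary least-squares slope of extrinsic regressed on a_priori."""
    n = len(a_priori)
    mean_a = sum(a_priori) / n
    mean_e = sum(extrinsic) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(a_priori, extrinsic))
    var = sum((a - mean_a) ** 2 for a in a_priori)
    if var == 0:
        # Assumption: a constant a-priori input (for example all zeros in
        # the very first half iteration) is treated as uncorrelated.
        return 0.0
    return cov / var


def extrinsic_af(gradient, reduction=0.7):
    """Attenuation factor B = 1 - reduction * gradient."""
    return 1.0 - reduction * gradient


# Illustrative values only.
la = [1.0, 2.0, 3.0, 4.0]
le = [0.5, 1.0, 1.5, 2.0]
g = regression_gradient(la, le)
b = extrinsic_af(g)
```

Only the a-priori and extrinsic sequences are needed, so the constant channel term does not have to be stored or recomputed for this step.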

RESULTS AND DISCUSSIONS
In this work, the specification of the simulated turbo code is based on the long term evolution (LTE) standard [25], which consists primarily of two 8-state RSC encoders with a transfer function of $(g_0, g_1) = (13, 15)_8$ in octal form and a code rate of 1/2. The encoders are joined by a quadratic permutation polynomial (QPP) interleaver to achieve code diversity and maximize the minimum distance of the code [25]. All simulation trials are run over the AWGN channel with zero mean and noise variance $\sigma^2$ using the MATLAB R2020b platform. To ensure reliable results, a total of $10^6$ turbo coded frames are simulated. Intensive simulations are carried out to prove the effectiveness of using the proposed schemes instead of those adopted by [14]-[16] in the estimation of AFs.
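For illustration, a QPP interleaver of length $K$ maps index $i$ to $\pi(i) = (f_1 i + f_2 i^2) \bmod K$. The sketch below uses $K = 40$ with $f_1 = 3$ and $f_2 = 10$, the smallest block size in the LTE parameter table, rather than the lengths 512 and 5120 simulated in this paper; the function names are assumptions.

```python
def qpp_permutation(k, f1, f2):
    """Return pi(i) = (f1 * i + f2 * i^2) mod k for i = 0 .. k - 1."""
    return [(f1 * i + f2 * i * i) % k for i in range(k)]


def interleave(values, perm):
    """Output position i takes the input found at position perm[i]."""
    return [values[p] for p in perm]


def deinterleave(values, perm):
    """Inverse of interleave, valid when perm is a bijection."""
    out = [None] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out


perm = qpp_permutation(40, 3, 10)
data = list(range(100, 140))
restored = deinterleave(interleave(data, perm), perm)
```

Because the permutation is computed from two integers, no lookup table has to be stored, and de-interleaving restores the original order exactly.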

Simulation of fixed iteration decoding
The decoders in this section apply a constant number of iterations (eight) to each received frame. Figure 2(a) compares, in terms of BER against $E_b/N_0$ in dB, the proposed system (denoted $AB_a$), which uses the AFs $A_{ac}^{i}$ and $B_{ac}^{i}$ estimated in sections 3.1 and 3.2, with the one adopted by [15], which uses the corresponding intrinsic-based AFs and is denoted $AB_I$. The unmodified SOVA and Log-MAP (without attenuation factors) are also simulated as benchmark systems to reveal the gain that can be achieved by the different proposed modifications. The proposed scheme $AB_a$ outperforms $AB_I$ by 0.1 dB at a BER of $10^{-5}$. Figure 2(a) also shows that $AB_a$ improves on the original SOVA by about 0.7 dB at a BER of $10^{-5}$ (about 0.2 dB degradation relative to Log-MAP).
The systems $AB_a$ and $AB_I$ simulated in Figure 2(a) are re-simulated for the coding rate $R_c = 1/2$ and plotted in Figure 2(b). Again, $AB_a$ demonstrates superior performance compared to the original SOVA (up to 0.7 dB at BER = $10^{-5}$) and outperforms $AB_I$ by about 0.1 dB at BER = $10^{-5}$. To confirm the benefit of the proposed scheme, the CC for every half iteration and various $E_b/N_0$ values is computed and depicted in Figure 3. The curves reveal a reduction in correlation for the proposed system $AB_a$ compared to $AB_I$ of the previous related work in [15].

Simulation of early terminating decoding
For many decoding cycles, especially at high $E_b/N_0$, the decoder successfully finishes decoding before reaching the maximum number of iterations. Therefore, to avoid excessive iterations, an ET scheme should be used. In this paper, the value of the gradient for the second constituent decoder is used as the stopping mechanism. To determine the threshold at which the iterative process can be terminated, Figure 4 plots the average value of this gradient versus the number of iterations for system $AB_a$ over the range of interest of $E_b/N_0$ (0.6, 0.8, and 1 dB), with the Genie scheme employed for ET. The average of the three curves is also depicted. For high $E_b/N_0$, the gradient increases monotonically with the iterations and reaches a value of about 0.56, which can be taken as the threshold ($T_h$) for early termination of the decoding iterations. In the same way, $T_h$ for system $AB_I$ is found to be 0.68.

Simulation tests for both the $AB_a$ and $AB_I$ schemes are carried out with N = 5120 and depicted in Figure 5(a). The performance of the Genie and SDR ET techniques is also presented for comparison. In general, the performance of the tested systems may improve when the threshold level is raised, at the cost of more decoding iterations. The systems with these threshold values show BER performance comparable to their corresponding benchmark system (Genie). The proposed scheme $AB_a$ improves on $AB_I$ by about 0.1 dB at a BER of $10^{-5}$ and consumes about one iteration less at $E_b/N_0$ = 0.8 dB. The same schemes in Figure 5(a) are re-tested for an interleaver length of 512 with a coding rate of 1/3 and depicted in Figure 5(b). Again, $AB_a$ outperforms $AB_I$ in terms of both BER and ANI: an improvement of about 0.2 dB is achieved at a BER of $10^{-5}$, with about 3 fewer iterations at $E_b/N_0$ = 1.6 dB.

To clarify the difference in implementation complexity between the $AB_a$ and $AB_I$ schemes, Table 1 shows the utilization estimates and latency for both, using a Kintex-7 field programmable gate array (FPGA) device (Xilinx XC7K325T-2FFG900C as the core, with a 10 ns clock cycle). $AB_a$ consumes less hardware and latency than $AB_I$, as stated in the last column of Table 1.
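The threshold rule described above can be sketched as a decoding loop that stops once the gradient of the second decoder reaches the threshold. The `run_iteration` callable is a hypothetical stand-in for one full iteration of the two SOVA decoders, the gradient trace is a made-up illustration rather than measured data, and the stopping direction (gradient at or above the threshold) is inferred from the observation that the gradient rises with the iterations.

```python
def decode_with_early_termination(run_iteration, threshold=0.56,
                                  max_iterations=8):
    """Iterate until the decoder-2 gradient reaches the threshold.

    run_iteration(i) must perform iteration i (1-based) and return the
    regression gradient measured at the second constituent decoder.
    Returns (iterations_used, last_gradient).
    """
    gradient = 0.0
    for i in range(1, max_iterations + 1):
        gradient = run_iteration(i)
        if gradient >= threshold:
            return i, gradient
    return max_iterations, gradient


# Made-up gradient trace standing in for a real decoder.
trace = [0.10, 0.31, 0.48, 0.59, 0.63, 0.64, 0.64, 0.64]
used, last = decode_with_early_termination(lambda i: trace[i - 1])
```

With this trace the loop stops after four iterations; raising the threshold to 0.68 would let all eight iterations run, which mirrors the trade-off between threshold level and number of iterations noted in the text.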

Simulation of offline AF decoding
To reduce the computational complexity of the turbo decoding process, many papers have proposed offline methods to compute the AF for a given range of $E_b/N_0$. The effectiveness of these methods depends on how accurately the channel conditions are estimated at the receiver, a task that is generally intractable in digital communication systems. Therefore, there is always a tradeoff between performance and complexity. To overcome this problem, this paper proposes a method for offline computation of AFs that cover a selected range of $E_b/N_0$. Figure 6(a) and Figure 6(b) plot the averages of the attenuation factors $A_{ac}^{i}$ and $B_{ac}^{i}$, respectively, against the number of iterations at $E_b/N_0$ of 0.6, 0.8, and 1 dB for the $AB_a$ system, using N = 5120 and $R_c$ = 1/3. A sufficient number of frames is simulated in estimating these average values. The figures also show, for each iteration, the averages of the factors of the first and second decoders over the aforementioned range of $E_b/N_0$, abbreviated by $A_1$, $A_2$, $B_1$ and $B_2$ respectively. Finally, the averages of the equivalent factors generated by the two constituent decoders are depicted for each iteration.
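A minimal sketch of the offline averaging described above, assuming that per-iteration AF values have already been measured at each $E_b/N_0$ point. The numbers are made-up placeholders, not results from the paper, and the function name is an assumption.

```python
def average_over_snr(af_by_snr):
    """Average per-iteration AF curves across all Eb/N0 points.

    af_by_snr maps an Eb/N0 value in dB to a list holding one AF value
    per iteration. Returns one averaged AF value per iteration.
    """
    curves = list(af_by_snr.values())
    n_iter = len(curves[0])
    if any(len(curve) != n_iter for curve in curves):
        raise ValueError("all curves must cover the same iterations")
    return [sum(curve[i] for curve in curves) / len(curves)
            for i in range(n_iter)]


# Placeholder measurements for three Eb/N0 points and three iterations.
measured = {
    0.6: [0.60, 0.70, 0.80],
    0.8: [0.65, 0.75, 0.85],
    1.0: [0.70, 0.80, 0.90],
}
offline_af = average_over_snr(measured)
```

The resulting list can be stored as a small lookup table indexed by iteration number, so the decoder needs no channel estimate to select an AF.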
Figure 7 illustrates the performance of turbo SOVA decoding, in terms of BER and ANI, using offline computation of AFs for the $AB_a$ and $AB_I$ systems. In all tested systems, the offline averages $A_1$, $A_2$ and $B_1$ are used as constant AFs. The extrinsic AF of the second decoder is always computed online because it is the parameter used in the ET criterion. The online scheme is also depicted for comparison. When employing offline AFs, the BER performance of the two schemes $AB_a$ and $AB_I$ is fairly similar, as shown in Figure 7. The proposed system consumed less ANI by about 0.75. Table 2 shows the utilization and latency estimates for the $AB_a$ scheme using online and offline AFs. In terms of implementation, offline scaling clearly consumes less utilization and delay than online scaling.

CONCLUSIONS
In this paper, a modified SOVA turbo decoder is designed and simulated successfully. The decoder reduces the undesirable effects of correlation and overestimation of reliability values by using two appropriate AFs. It is shown that applying the AFs improves the conventional SOVA by about 0.7 dB at a BER of $10^{-5}$ for a turbo code with an interleaver length of 5120 bits. The MSD between the a-posteriori values and the corresponding coded information is used to calculate the optimal AFs that address the over-optimistic estimation of reliability values. These factors can be calculated offline, or online by transmitting predefined frames from time to time while the system is running. This makes the proposed scheme more practical than the one it is compared with in [15], which depends on the MSD between the equivalent extrinsic information produced by the SOVA and MAP decoders. The gradient of the regression line that best fits the a-priori and extrinsic information is used to estimate the AF that resolves the correlation issue. This method has advantages in terms of BER, ANI, complexity, and processing delay over the one that uses the intrinsic instead of the a-priori information to estimate the AF. A threshold on the gradient value of the second constituent decoder, estimated over a given range of $E_b/N_0$, is presented as an ET criterion. The coded systems that adopt the proposed early termination method give results equivalent to those that use the Genie or SDR criteria. Computational complexity and latency are reduced further by estimating the AFs offline for each iteration and removing their dependency on the channel condition. Finally, a hardware comparison in terms of utilization estimates is made using the FPGA device to confirm the advantage of using a-priori instead of intrinsic information in the computation of the AF.

Figure 1. Schematic diagram of modified SOVA turbo decoder

Figure 3. Correlation coefficient versus the number of iterations for $AB_a$ and $AB_I$ systems with different $E_b/N_0$, N = 5120, and $R_c$ = 1/2

Figure 5. BER performance and ANI of $AB_a$ and $AB_I$ schemes with (a) N = 5120 and (b) N = 512

Table 1. Utilization and latency estimates for $AB_a$ and $AB_I$ schemes

Figure 6. (a) $A_1$, $A_2$ and (b) $B_1$, $B_2$ versus no. of iterations for system $AB_a$ with $T_h$ = 0.56

Figure 7. BER and ANI versus $E_b/N_0$ of $AB_a$ and $AB_I$ systems applying online and offline computations of AF

Table 2. Utilization and latency estimates for $AB_a$ scheme using online and offline AF