Spatial and Temporal Quality of Brain Networks for Different Multi-Echo fMRI Combination Methods

The application of multi-echo functional magnetic resonance imaging (fMRI) studies has considerably increased in the last decade due to superior BOLD sensitivity compared to single-echo fMRI. Various methods have been developed that combine fMRI data derived at different echo times to improve data quality. Here, we evaluated five multi-echo combination schemes: ‘optimal combination’ (OC, <inline-formula> <tex-math notation="LaTeX">${\text {T}_{2}}^{\ast }$ </tex-math></inline-formula>-weighted), <inline-formula> <tex-math notation="LaTeX">${\text {T}_{2}}^{\ast }$ </tex-math></inline-formula>-FIT (<inline-formula> <tex-math notation="LaTeX">${\text {T}_{2}}^{\ast }$ </tex-math></inline-formula>-weighted, calculated per volume), average-weighted (Avg), temporal Signal-to-Noise Ratio (tSNR) weighted, and temporal Contrast-to-Noise Ratio weighted combination. The effect of these combinations, with and without additional postprocessing, on the quality of functional resting-state networks was assessed. Sixteen healthy volunteers were scanned during a 5-minutes resting-state fMRI session. After network extraction, several quality metrics in the temporal and spatial domain were calculated for their respective time-series and spatial maps. Our results showed that OC and <inline-formula> <tex-math notation="LaTeX">${\text {T}_{2}}^{\ast }$ </tex-math></inline-formula>-FIT outperformed the other methods in both domains. Whereas the OC and <inline-formula> <tex-math notation="LaTeX">${\text {T}_{2}}^{\ast }$ </tex-math></inline-formula>-FIT time-series were found to be the least associated with artifacts, OC resulted in the highest quality spatial maps. Furthermore, spatial smoothing, bandpass filtering and ICA-AROMA merely improved networks derived from the least performing combinations (Avg and tSNR). Because similar network quality was obtained following OC and <inline-formula> <tex-math notation="LaTeX">${\text {T}_{2}}^{\ast }$ </tex-math></inline-formula>-FIT without postprocessing, we recommend future studies to implement these combinations without these postprocessing steps. This minimizes the amount of image modifications and processing, potentially leading to enhanced BOLD contrast. The results highlight the benefits of <inline-formula> <tex-math notation="LaTeX">${\text {T}_{2}}^{\ast }$ </tex-math></inline-formula>-weighted multi-echo combinations on resting-state network quality and raise its potential value in dynamic fMRI analyses or for diagnosis and prognosis purposes of neuropsychiatric disorders.


I. INTRODUCTION
Over the last decades, functional magnetic resonance imaging (fMRI) has provided numerous novel insights into brain The associate editor coordinating the review of this manuscript and approving it for publication was Amin Zehtabian .activation patterns of healthy individuals and patients with neurologic disorders.The application of fMRI has proven to be useful in research domains such as classification of healthy versus diseased individuals [1], the prediction of optimal treatment [2], [3] and the identification of biologically-based subtypes of neurological disorders [4].
Brain networks, commonly called resting-state networks (RSNs), have been widely studied in the healthy population and in neurological disorders for these purposes.For example, the default mode network (DMN), which has been found to be active during self-reflection and inactive during attention-demanding tasks [5], can be robustly detected during resting-state fMRI and abnormal activity of this network has been linked with major depressive disorder [6], [7], [8].Other consistently identified RSNs are the executive control network (ECN) and the salience network (SN) [6], [8], [9], [10].
In standard (i.e.single-echo) fMRI, a single slice is acquired after each radiofrequency (RF) excitation pulse at a specific echo time (TE) [11].Despite its non-invasiveness and high spatial resolution, single-echo fMRI is prone to different noise sources that can be of physiological, motion, thermal or hardware-related origin [12].To increase the sensitivity of the acquired blood oxygenation level-dependent (BOLD) contrast in fMRI and to reduce signal dropout, an MRI technique called multi-echo (ME-)fMRI has been developed [13], [14].In ME-fMRI, slices are acquired at different TEs following each RF-pulse [11].Signals acquired at shorter TEs have higher signal intensity than longer TEs [11].However, the contrast between gray and white matter (GM; WM) and cerebral spinal fluid (CSF) has been shown to be higher at longer TEs [11].Thus, by combining the images that are acquired at different TEs, an optimal balance between signal intensity and tissue contrast can be obtained, thereby increasing the sensitivity of the BOLD contrast.Several of such multi-echo combination methods have been developed.Examples are the temporal Signal-to-Noise Ratio (tSNR) combination [15] and a T 2 * -weighted combination (also called 'optimally combined', OC) [13], the latter takes into account the variation of T 2 * -known to be dependent on the brain location and tissue type [16] -over the brain.
There is a scarcity of studies investigating the effect of the different ME combination methods on fMRI data [15], [17], [18], [19].For example, a previous study compared the BOLD sensitivity of multi-echo fMRI data after combination of a simple echo summation scheme, a temporal Contrast-to-Noise Ratio (tCNR) weighted combination or the OC combination to data acquired by single-echo fMRI [18].They found significant sensitivity increases of more than 7% and 11% for the OC and tCNR-weighted combination method, respectively.Larger increases were found in regions with a short or long T 2 * that are more susceptible to signal dropout in single-echo fMRI [18].A different study assessed the tSNR, functional connectivity and RSN correlation and size for a single-echo, and OC with or without denoising acquisitions.They found increased values for all measures in OC compared to a single-echo and in OC with denoising compared to OC without denoising [19].Another recent study showed that the T 2 * -weighted and the tCNR-weighted combination methods increased the tSNR of resting-state fMRI time-series by more than 30% over the whole brain compared to single-echo [17].Controversially, results from a dual-echo study indicated that there were no significant sensitivity advantages on a group-level by using tSNR-weighted or tCNR-weighted multi-echo combination over simple multi-echo averaging [15].
Thus, whereas the majority of studies in general show that tCNR and OC increase fMRI data quality compared to single-echo or multi-echo data combined using simple echo summation or tSNR weights, there are also discrepant findings.Whether tCNR or T 2 * -weighted combinations should be used in future studies is still up for debate and remains a rather arbitrary decision.In addition, the effect that echo combination schemes and denoising approaches have on RSN quality has not been studied before.Insights on this topic could support researchers in making more objective decisions regarding multi-echo fMRI processing prior to brain network analyses.
Hence, in the current paper, the effect of different echo combinations, with or without additional postprocessing, on resting-state networks is evaluated.In our approach, five different echo combination methods will be applied to multiecho resting-state fMRI data: Average (Avg), tSNR, tCNR, OC and T 2 * -FIT (T 2 * -weighted, weights calculated per volume) echo combination.The temporal and spatial quality measures between resting-state networks resulting from the different echo combinations will be compared between each other and between the second echo (SE) reference.Based on previous work we expect enhanced temporal and spatial quality of RSNs for the tCNR, OC and T 2 * -FIT combination compared to the SE and Avg reference.Additionally, we hypothesize that the T 2 * -weighted combinations (OC and T 2 * -FIT) improve the temporal or spatial properties of the RSNs compared to the other multi-echo methods as a result of the echo weight optimization based on the T 2 * estimation per voxel.
The contributions of this study are as follows.First, our study sheds light on the currently unknown effects of different echo combination methods on RSNs.This is essential because future ME studies could benefit from improved brain network maps and time-series.Second, based on our analysis regarding the effect of several pre-processing methods on RSNs, other researchers might opt to adjust their fMRI processing pipeline to further optimize the quality of networks.Consequently, the reliability of results from studies that assess the functioning of the brain enhances.Ultimately, this contributes to the understanding of brain disorders and development of more objective diagnosis and prognosis thereof.

II. MATERIALS AND METHODS
The following subsections describe the main steps from the acquisition of the MRI data up until the evaluation of the consistency metrics between different echo combination methods.Fig. 1 shows a schematic of these analysis steps required to obtain the final results.3) The three time-series are combined using the different combination methods, resulting in a single combined time-series.For the single-echo reference, the second echo time-series is taken.4) Group independent component analysis (ICA) results in groupwise spatial maps and time-series.5) Individual spatial maps and time-series are obtained from dual regression.6) The resting-state networks of all subjects and combination methods are evaluated for temporal and spatial differences.

A. PARTICIPANTS
Sixteen healthy subjects participated in this research and all of them gave informed consent.None of the subjects had a medical history of neurologic or psychiatric disorders and they were between 20-65 years old (age = 43.4± 12.7 years old, 11 females and 5 males).The study was approved by the Academic Center for Epilepsy Kempenhaeghe (Heeze, the Netherlands) based on METC approval N16.098.

B. DATA ACQUISITION
Scanning was performed on a Philips Achieva MRI scanner (3 Tesla).At first, a T1-weighted anatomical scan was recorded using a 3D spoiled gradient-echo sequence (repetition time (TR) = 8.3 ms, TE = 3.5 ms) resulting in a matrix size of 240 × 240 x 180 with isotropic voxels of 1 mm.ME (3 echoes) images (300 volumes per echo) were acquired using a gradient-echo EPI sequence (TR = 2000 ms, TE = 12 ms, echo spacing = 23 ms, flip angle = 90 • , acquisition bandwidth = 4298 Hz/pixel).In total, 26 slices with a slice thickness of 4.5 mm (no gap) were obtained with an in-plane resolution of 3.5 mm x 3.5 mm and a final in-plane resolution of 3.5 mm x 3.5 mm.A SENSE acceleration factor of 2.7 was applied in the read-out direction.

C. IMAGE PRE-PROCESSING
The fMRI images were slice timing corrected using the Statistical Parametric Mapping (SPM) toolbox, version 12 (https://www.fil.ion.ucl.ac.uk/spm/software/spm12) [20], implemented in MATLAB version R2019a (MathWorks).The toolbox MCFLIRT from the FMRIB Software Library (FSL) 6 package was used to estimate the realignment parameters of the volumes to the reference image [21].The reference image was chosen to be the first image of the second echo due to its echo time being closer to the brain's average T 2 * time.The estimated realignment parameters resulting from echo 2 were applied to echo 1 and 3 using FSL's FLIRT.After these minimal processing steps, the echoes were combined using the different combination schemes as described in the next subsection 'D.Multi-echo combination methods' in a custom MATLAB script (MATLAB version R2019a, MathWorks).Subsequently, the T1-weighted image was coregistered to the functional reference image using a 6 degrees-of-freedom transformation, estimated based on normalized mutual information.The T1-weighted images were then intensity non-uniformity corrected and segmented into CSF, WM and GM, followed by normalization of the coregistered anatomical and functional images to MNI space 114538 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
by applying a 12 degrees-of-freedom transformation using SPM12.
In standard research practice, additional denoising methods such as bandpass filtering or smoothing are applied to further clean the BOLD signal.However, it has been shown before that these steps may not be necessary or can even decrease effect sizes following ME combination and cleaning [22].To date, the effect of additional denoising on the quality of resting-state networks is still unknown.Therefore, the temporal and spatial quality measures of the resting-state networks were assessed for the following three cases: 1.No additional denoising.
2. Minimal smoothing with a Gaussian filter with a full width at half maximum of 5 mm [23] and bandpass filtering between 0.05 and 0.2 Hz.Conservative smoothing and bandpass filtering were applied to ensure these postprocessing steps did not completely obscure the effect of ME combination.The bandpass cutoff frequencies were chosen to remove the most common fMRI artifacts: the 0.2 -0.25 Hz range is linked with contamination of respiratory signals while the 0 -0.05 Hz is frequently obscured by scanner drift (0 -0.01 Hz) and respiration-induced CO2 fluctuations (0 -0.05 Hz) [12].
3. Smoothing as described above, followed by automated ICA-based cleaning using ICA-based Automatic Removal Of Motion Artifacts (ICA-AROMA) [24] to reduce motion artifacts.Subsequently, the time-series were bandpass filtered as described above.This order of denoising steps was recommended in ICA-AROMA's manual.

D. MULTI-ECHO COMBINATION METHODS
For each described case, the separate echo images were combined according to five different combination schemes below.In addition we used the Avg and SE as reference.

1) OPTIMALLY COMBINED (T 2
* -WEIGHTED) The OC combination method was developed by Posse et al. [13] and is the combination scheme implemented in the ME denoising method ME-ICA by Kundu et al. [25].The OC algorithm optimizes contrast by estimation of T 2 * for each voxel [13], leading to reduction of susceptibility artifacts and thermal noise [26].The weights for echo combination are calculated as follows: where TE is the echo time and T 2 * is the estimated T 2 * from each voxel, and the indices n and i relate to the echo number.After calculating the weights for all three echo images, the combined fMRI time-series of each voxel can be calculated by taking the weighted average using the optimally combined weights.
In the tSNR-weighted echo combination method, first the tSNR of every voxel's time-series of each echo image is calculated.The tSNR is defined as the mean time-series divided by its standard deviation.Subsequently, the following equation can be used to calculate the weight, w n , tSNR, of the image with echo n [15]: where tSNR n is the temporal signal-to-noise ratio of the image with echo n, and i the echo index used to sum over all three echo images.After calculating the weights for all three echo images, the combined fMRI time-series of each voxel can be calculated by taking the weighted average using the tSNR-based weights.

4) tCNR-WEIGHTED
The tCNR-weighted echo combination approach, introduced by Poser et al. [18], combines the TE and tSNR values for each echo image reflecting the temporal contrast-to-noise ratio of the images.The advantage of this method is that it does not make any assumptions about the signal and noise because it is measured from the data while it simultaneously incorporates the echo time simultaneously.The tCNR weights are calculated as follows: Again, the weighted average is calculated to retrieve the final tCNR-weighted combined fMRI image.

5) AVERAGE REFERENCE
To compare the RSNs of ME combinations to a simple multi-echo reference, the average of the three echoes was calculated, i.e. three echo weights equal to one third.

6) SECOND ECHO REFERENCE
The RSNs that are extracted from the ME combined scans are compared to those derived from the SE, i.e. single-echo, fMRI images.The TE of the SE images is equal to 35 ms.

E. NETWORK EXTRACTION AND SELECTION
RSNs were extracted by group-level ICA implemented from the Group independent component analysis (ICA) of fMRI Toolbox (GIFT, http://mialab.mrn.org/software/gift)[27].ICA is a widely applied method for blind source separation [28].By assuming that the data consist of linear mixtures of unknown independent variables and by maximizing their non-Gaussianity, ICA facilitates the identification of statistically independent components [29].In fMRI, these components are referred to as the RSNs.In group ICA, first the subjects' fMRI dimensionality was reduced based on a two-stage principal component analysis, followed by extraction of independent components (fast ICA algorithm).The number of components was set to 30 in order to obtain isolated RSNs with minimal contamination from other networks or subdivision into smaller components [30].This resulted in a spatial map (of z-scores) and time-series for each individual component.
RSNs were identified using a goodness-of-fit approach with the RSN atlas from Smith et al. as [31].From this atlas, a mask image of each RSN was compared to each of the 30 components.For each component, the network that scored the highest on this goodness-of-fit score was selected.Subsequently, a visual inspection was performed to check the quality of the match.Ultimately, the following eight RSNs were identified: the default mode network (DMN), right and left frontal parietal network (rFPN/lFPN), lateral and medial visual network (latVN/medVN), somatosensory network (SMN), auditory network (AN) and cerebellar network (CN).Fig. 2 shows the identified general resting-state networks [31].

F. QUALITY MEASURES
Multiple measures were calculated to compare the properties of the RSNs derived from the combination methods described above.These measures provide insights into the quality of the spatial, i.e. related to the RSN maps, and temporal, i.e. related to the RSN time-series, domain.

1) TEMPORAL MEASURES
The first temporal measure that is implemented is the Pearson correlation of RSN time-series with derived artifact-related regressors (from here on called artifact correlation), which was implemented to assess the networks' extent of temporal confounders such as motion, cardiac and respiratory components.The absolute value of the correlation was calculated to take into account negative and positive correlations.Two motion regressors were derived: framewise displacement (FD) and the derivative of root mean square variance over voxels (DVARS).FD is an indicator of the head movement between subsequent volumes [32] and was calculated from the realignment parameters derived during the realignment step.DVARS reflects the signal intensity changes between subsequent volumes [32].From the WM and CSF separately, the first five principal components were extracted as implemented in CompCor [33].Signals from these regions are often assumed to be of non-neural origin, mostly cardiacor respiratory-related [33].DVARS and the WM and CSF components were derived from the echo 1 scans as these time-series are not yet ME combined and have the highest signal intensity and therefore assumingly highest artifact content.
The second metric that is calculated is the dynamic range (DR) [12], [34].This feature has been used before to classify independent components as either BOLD or non-BOLD and to estimate the noise contribution in RSNs [34], [35].Components with higher DR have been associated with true BOLD signals and those with lower DR as noise.
The third temporal metric is the low to high frequency power ratio (LHFpow) reflecting the ratio of lower to higher frequencies of which the latter is more often linked to artifact-related power [34].Similar to DR, components with higher values of LHFpow are associated with RSNs whereas lower ones are more likely to be noise [34].The DR and LHFpow were both extracted by the GIFT toolbox.

2) SPATIAL MEASURES
The first spatial consistency metric is the Pearson correlation between the identified RSN spatial maps and corresponding RSN spatial maps from the Smith et al. atlas [31] and is referred to as atlas correlation.The second metric is the spatial extent, i.e. the total number of 'active' voxels.The number of 'active' voxels is defined as the amount of voxels that exceeds a z-score of 3 [36].In addition, the maximum z-score for each network is obtained and referred to as spatial stability [36].Furthermore, to assess the overall strength of the RSN and ability of group ICA to extract true RSN components, the mean z-score is calculated.The spatial extent, spatial stability and mean z-score are calculated for each component after masking each RSN with the corresponding binary RSN of the Smith et al. atlas to avoid the influence of strong noise-related voxels on these metrics [31].Finally, in order to estimate the ratio between noise and true neuronal signals within each component, the percentage of significant voxels within the CSF and WM to the total number of voxels of that RSN is calculated, referred to as percentage voxels in CSF & WM.
3) STATISTICAL ANALYSIS Distributions of spatial and temporal measures are statistically compared between each combination method and the reference methods (SE and Avg ME combination) using paired t-tests after testing for normality with the Shapiro-Wilk test [37].p-values (α level = 5%, 1% and 0.1%) are corrected for multiple comparisons using the Holm-Bonferroni procedure [38].Denoising improvements are compared for each method with respect to the mean improvement of all combination methods by one-sample t-tests.This is done with the aim to evaluate whether and which combination methods benefited significantly more or less from denoising compared to the others.Again, Holm-Bonferroni correction is applied to correct for the numbers of statistical tests.For all t-tests we also ran separate permutation tests to evaluate the validity of the paired t-tests.Permutation tests can be more accurate because there are no distributional assumptions that have to be met and they are an exact approximation of the type I error [39], [40].We ran a paired sample permutation test based on a t-statistic [41] using a Matlab package [42] and corrected p-values again with Holm-Bonferroni testing.The following settings were used: number of permutations = 5000, tail = two-tailed test and α = 0.05.Finally, the observed effect sizes and power for the temporal and spatial measures are calculated and a sensitivity analysis is conducted to identify the smallest detectable effect size with the current study design using the G * Power 3 software package [43].

III. RESULTS
The performance for the combination methods with or without additional denoising is described below.The section is divided into the temporal and spatial domains of the resting-state networks and also a section exploring the network-specific performance.Subsequently, the effect of denoising on the network quality was assessed.The results for the sensitivity analysis for the current study design can be found in Fig. S1.We do not report the permutation test results as the p-values were either highly similar or slightly lower than the paired t-tests, supporting the validity of the latter.

A. TEMPORAL PERFORMANCE
The temporal measures for each combination method are shown in Fig. 3. Improvements in temporal measures versus the reference methods can be found in Fig. S2.
Overall, the time-series of the RSNs from the OC and T 2 * -FIT methods obtained the highest temporal quality measures.Whereas the DR and LHFpow were the highest for T 2 * -FIT (not significant and p corr < 0.001 compared to SE), they also correlated significantly less with the artifact-related time-series compared to Avg (p corr < 0.001).OC resulted in significant improvement compared to Avg in all temporal measures (artifact correlation p corr < 0.01, DR p corr < 0.05 and LTHpow p corr < 0.05).The overall temporal mean observed effect sizes and power ranged from 0.141 to 0.278 and 18.4 to 54.8%, respectively, see Table S1.To evaluate which nuisance regressors (motion, CSF or WM) were reduced most optimally, the correlation with separate regressors was analyzed.Regarding the motion regressors FD and DVARS, the highest performance was achieved for the tCNR combination (mean r = 8.82 * 10 −2 ), see Fig. S3.For the tissue regressors, the T 2 * -FIT combination correlated the least (r = 1.18 * 10 −1 ).

B. SPATIAL PERFORMANCE
The performance of the combination methods in the spatial domain is shown in Fig. 4. OC combination scored significantly higher on minimal three out of the five spatial measures compared to both reference methods, namely the atlas correlation (p corr < 0.001 vs SE and Avg), spatial extent (p corr < 0.01 vs SE and p corr < 0.001 vs Avg) and mean z-score (p corr < 0.001 vs SE and Avg).Furthermore, compared to Avg, the spatial stability (p corr < 0.001) and percent voxels in CSF & WM (p corr < 0.05) improved significantly.T 2 * -FIT and tCNR also significantly enhanced the spatial quality, with tCNR obtaining the highest spatial stability and T 2 * -FIT the lowest percent voxels in CSF & WM.Overall spatial mean observed effect sizes and power ranged from 0.211 to 0.483 and 35.2 to 95.0%, respectively, see Table S1.
Concluding, OC performed the most optimal.Whereas OC was ranked first on three spatial measures (significantly improved on three spatial measures compared to SE and on all measures compared to Avg), the spatial performance of T 2 * -FIT and tCNR followed.

C. NETWORK-SPECIFIC PERFORMANCE
The improvement of RSN quality resulting from different ME combination methods was assessed at different brain locations: anterior (DMN, lFPN and rFPN), central (SMN and AN) and posterior (CN, medVN and latVN).Fig. 5 displays the improvements of overall RSN quality in these brain areas with regards to the reference methods.Fig. S2 separates the overall quality into temporal and spatial measures.In accordance with previous results, the overall RSN quality improved the most in T 2 * -FIT, followed by OC and tCNR.
As expected, the anterior networks gained the most in overall quality for the T 2 * -FIT and OC combination.This gain is illustrated best in the spatial domain where both methods improve in quality almost 10% and 30% compared to SE and Avg combination, respectively.The Avg and tSNR combination showed lower performance.For example, the coverage of the lateral prefrontal cortex of the rFPN in the Avg combination was found to be minimal (see Fig. 6), potentially explaining the lower spatial extent and atlas correlation.The tCNR also improved the anterior networks except for the DMN.The spatial map of this network showed that the medial prefrontal cortex region that belongs to the DMN was extended more superior, up until the supplementary motor area, which is not part of the DMN.Compared to SE, the spatial metrics of the anterior networks were less explicitly improved for the OC and T 2 * -FIT combination (9% and 10% on average, respectively).Interestingly, the ventromedial prefrontal cortex region in the DMN was smaller for SE and located more distant from the tissue-air boundaries, see Fig. 6.For the central (SMN and AN) networks, the difference in performance of the OC and T 2 * -FIT versus the reference methods was also substantially larger (on average enhanced by 28% and 26% for OC and 17% and 13% for T 2 * -FIT, for SE and Avg, respectively).The SMN improved most.The spatial map for the SE method showed that the coverage of the SMN was restricted to the superior part of the cortex whereas for the other methods the regions extended also towards the ventral directions along the cortex.The improvement for the posterior networks (CN, medVN, latVN) was less prominent (compared to Avg) or almost nihil (compared to SE).
In the temporal domain, the T 2 * -FIT, tCNR and OC combinations also improved the quality of the time-series compared to either the SE or the Avg combination, see Fig. S2.Over all measures and networks, the time-series of the T 2 * -FIT method gained the most quality, with an increase of 13% and 25% over the SE and Avg combination, respectively.The largest improvements were observed in the anterior and central brain networks.

D. THE EFFECT OF ADDITIONAL DENOISING
As can be observed from Fig. 7, two combination methods benefited significantly more from smoothing and bandpass filtering in the spatial domain: Avg and tSNR (both p corr < 0.001 compared to the mean improvement).Another remarkable finding is the fact that smoothing and bandpass filtering decreased the quality of the RSN maps of the other methods (p corr < 0.001, < 0.01 and < 0.001 for SE, tCNR and T 2 * -FIT, respectively).Other noteworthy effects were a general (i.e. for all combination methods) increasing trend in spatial extent and decreasing trend in spatial stability.These are likely caused by the smoothing procedure, which could have spread activation patterns and faded activation peaks.Moreover, the percentage of voxels in the CSF and WM was slightly lower for most methods, probably a result of the bandpass filter removing non-BOLD signals in CSF and WM.
Additional ICA-AROMA had only slight effects on the spatial quality.The spatial quality of the tSNR combination decreased whereas the performance of tCNR 'recovered'.
In the temporal domain, several general trends could be observed for both case 2 and 3.The nuisance correlation decreased by almost 70% for all combination methods, see Fig. S4.This could have been caused by the bandpass filter, potentially removing coherent patterns of artifacts.Because the signal was filtered with a bandpass filter, the dynamic range dropped by about 20 percent for all combination methods.Finally, the LTHpow also dropped by about 40-60%, likely caused by the bandpass filter removing relatively more low frequency power.The differences of decrease in spectral measures and artifact correlation between combination methods, however, was minimal.Nevertheless, a subtle but similar pattern as before was observed for case 2: tSNR benefited the most from smoothing and bandpass filtering (relatively most improvement in DR and LHFpow (p corr < 0.01).Application of additional ICA-AROMA only slightly improved the quality of the RSN time-series for most methods in the spectral domain when compared to no denoising.

IV. DISCUSSION
In this paper we evaluated the quality of the spatial and temporal properties of RSNs between different multi-echo combination methods.Overall, it was shown that the OC and T 2 * -FIT combination outperformed the other methods.The OC and T 2 * -FIT combination resulted in the highest quality network time-series.The OC combination allowed the extraction of the highest quality spatial maps, followed by T 2 * -FIT and tCNR.Compared to SE or the Avg multi-echo combination references, most improvement was observed in the anterior and central networks.Moreover, the effect of denoising on the resting-state networks was assessed.The least performing methods, i.e. the Avg and tSNR combination, benefited the most from additional bandpass filtering and smoothing.These results highlight the strength of T 2 only minimally between echoes compared to the mean of S, making the S/σ weight calculation primarily dependent on the signal S.This causes the combined echo time-series to be heavily weighted towards the early echoes.Moreover, in resting-state fMRI, the BOLD fluctuations are the signals of interest.As the BOLD fluctuations are included in the 'noise' σ , i.e. the standard deviation of the signal, in the tSNR formula, echoes with higher fluctuations in BOLD-related signals are penalized.Future studies might opt to recast the formula, e.g. by separating the BOLD fluctuations from the non-BOLD noise by filtering or by substituting the noise components with percent explained variance in the data by nuisance regressors [48].By using the BOLD fluctuations as signal S and non-BOLD noise as noise σ , the weight calculation takes into account the higher BOLD contrast in the longer TE images.Likewise, in the tCNR-weighted combination, the added TE-factor takes into account the superior BOLD contrast of the longer echoes.The tCNR-weighted combination thus calculates weights that are balanced between signal intensity and BOLD contrast.As our results indicate, networks extracted from tCNR have superior spatial quality compared to tSNR but still lower than the T 2 *weighted methods (OC and T2 * -FIT).Moreover, temporal quality metrics (dynamic range and low-to-high frequency power) suggest that the networks from the tSNR combination demonstrate less RSN-related power.This could be a result of the decreased BOLD sensitivity since echoes with a longer TE are weighted less.
Bandpass filtering and smoothing improved the network quality of the Avg and tSNR combinations.Interestingly, these methods showed the lowest performance before additional cleaning.Potentially, these combinations were less efficient at reducing physiological and motion artifacts.The lower performance in the temporal domain supports this hypothesis.For the Avg combination, the artifact correlation revealed the largest abundance of motion, WM and CSF sources in the time-series.tSNR scored the lowest on spectral quality.The fact that the tSNR-combined time-series had the lowest DR and LHFpow of all methods suggests a larger abundance of non-BOLD high-frequency signals.The 0.2 -0.25 Hz frequency range is associated with contamination of respiratory signals [12].Bandpass filtering could have reduced the impact of artifact-related signals in the time-series of the Avg-and tSNR-combined networks.Accordingly, the DR and LHFpow of the tSNR-combined time-series relatively improved the most in DR and LHFpow compared to the other methods.More structural removal of these artifacts, e.g. by RETROICOR [49] or ME-ICA [25], [26], or simulations could aid in the process of identifying the origin of this power reduction.
Remarkably, the spatial quality of T 2 * -FIT, tCNR and OC decreased following bandpass filtering and smoothing.Potentially, smoothing decreased the higher BOLD contrast for these combinations.Smoothing usually reduces the randomly distributed noise [11].However, smoothing of data with minimal artifacts could reduce the BOLD contrast instead.Moreover, it has already been shown before that smoothing can diminish or abolish ME-based denoising benefits in task-based fMRI [22].Alternatively, the lowpass cutoff frequency of the bandpass filter at 0.05 Hz could have removed relevant BOLD signals.There is evidence of slow oscillatory (0.01 -0.05 Hz) signals of neural origin [50], [51], located more significantly in the frontal, parietal and occipital cortices.If artifact-related signals in that frequency range obscured true BOLD signals in the other methods, it could have enhanced performance.
Additional ICA-AROMA did not improve the networks' spatial and spectral quality for any of the ME combinations in comparison to merely bandpass filtering and smoothing.Yet, there are other ICA-based methods for cleaning the BOLD signal which not have been evaluated here.For example, Tedana [52], a software package that classifies ICA components as BOLD or non-BOLD based on TE dependence regresses out the latter components from the signal to reduce the amount of non-neural contributions.Another common method is ICA-FIX [53], which calculates temporal and spatial measures of the components and classifies them as BOLD and non-BOLD after model training on a by FSLor user-provided dataset with manually provided labels.The evaluation of the effect of Tedana and ICA-FIX was, however, not performed in this study because Tedana requires multi-echo data and we aimed to also compare the results to SE. ICA-FIX requires the need of a trained dataset and the provided pre-trained dataset has shown inaccurate component classification results and loss of signal [54].Therefore, we studied the effect of ICA-AROMA, a flexible, easily implemented and well-validated denoising algorithm which has shown to preserve the signal of interest [24], to lead to robust RSN reproducibility while removing noise [54], [55] and increase tSNR [55].Future studies are required to evaluate whether other ICA-based cleaning methods could lead to further improvements in RSN network quality.
One of the limitations of this study is the small sample size.However, because samples are compared between ME combined datasets from the same participant, i.e. a within-subject design, statistical power increases [56], [57].The sensitivity analysis showed that true effects of small to medium effect sizes (0.375) could still be detected in this study design with a power of 80%.Observed effect sizes ranged from low to high [44], with the mean effect size of spatial measures being close to medium.The magnitude of the effect sizes for the temporal measures, however, was in the range of what is considered a low effect size [44].Therefore, a study with more participants is required to confirm these findings.Besides that, more females were scanned than males (approximately a ratio of 2:1).There is evidence of sex differences in RSN connectivity [58], [59].For example, with increasing age, the central autonomic network has been shown to be functionally altered in females compared to males, reflecting altered autonomic regulation [59].Another study demonstrated that the most important contributors to gender prediction based on functional connectivity were connections within the DMN, FPN and SMN [58].Nonetheless, we analyzed the effects of ME combinations within a participant, thereby reducing the bias of each comparison.Yet, we cannot rule out the existence of a sex * ME combination interaction effect.
In summary, the T 2 * -weighted combinations allowed the extraction of robust and denoised functional maps and time-series of RSNs.Spatial and temporal measures indicated an enhanced performance compared to the reference methods.Moreover, additional denoising, such as spatial smoothing, bandpass filtering and ICA-AROMA, may be unnecessary for the T 2 * -weighted combinations.Consequently, the functional maps remain as intact as possible, whereas smoothing could decrease the BOLD contrast and bandpass filtering could remove low oscillatory neural signals of interest.Studies related to the investigation of the temporal aspects of RSNs might opt to combine ME-fMRI data using the T 2 * -based algorithms.An example of such a study could be the evaluation of RSNs using the wavelet coherence analysis in which phase shifts are a key factor [60]. Artifacts, such as the respiratory artifacts, that are left in the RSN time-series could bias these analyses significantly as these are often emerging as repeating temporal patterns.T 2 *weighted combinations could decrease the noise correlation and cause a relative power reduction in the high frequency ranges of this physiological confounder.Finally, as bandpass filtering may not be required, potential ultra-low oscillations of neural origin will be left in the BOLD signal.

V. CONCLUSION
The application of multi-echo fMRI in future studies is warranted thanks to its significant increase in BOLD sensitivity when compared to conventional single-echo fMRI.Here, we evaluated the effect of different echo combination methods on the quality of resting-state networks.It was found that the OC and T 2 * -FIT combinations performed better than the second echo and simple multi-echo average weighting scheme.Analyses of network time-series demonstrated that the OC and T 2 * -FIT combination reduced the artifact-related signals most adequately.The OC method demonstrated the most optimal spatial quality measures.Nonetheless, the T 2 * -FIT functional maps still achieved robust and consistently high scores on spatial quality, including the lowest percentage of voxels overlapping with CSF and WM regions.The anterior networks gained the most in overall quality for the T 2 * -weighted echo combinations, potentially reflecting the reduced signal loss in regions that are prone to susceptibility artifacts.Furthermore, additional postprocessing methods to clean the BOLD signal, specifically spatial smoothing, bandpass filtering and ICA-AROMA, were unnecessary for OC and T 2 * -FIT.These T 2 * -weighted combination methods resulted in similar network quality as networks that were derived following other multi-echo combinations and postprocessing steps.Therefore, we recommend future resting-state network studies to apply the OC and T 2 * -FIT combination without these additional denoising steps.Minimizing the amount of filtering and rescaling of the fMRI images could be beneficial as the original BOLD contrast remains largely untouched.Limitations of the current study include the relatively high repetition time and low sample size.Future studies could examine the effect of T 2 * -based combinations on dynamic network interactions or assess its value in diagnosis or prognosis in populations of specific neuropsychiatric disorders for which fMRI data are still substantially affected by physiological or motion artifacts.

FIGURE 1 .
FIGURE 1. Schematic of the required analysis steps to obtain the consistency metrics between resting-state networks.1) The raw data are the multi-echo data, i.e. three time-series acquired at different echo times.2) Pre-processing steps are applied.3) The three time-series are combined using the different combination methods, resulting in a single combined time-series.For the single-echo reference, the second echo time-series is taken.4) Group independent component analysis (ICA) results in groupwise spatial maps and time-series.5) Individual spatial maps and time-series are obtained from dual regression.6) The resting-state networks of all subjects and combination methods are evaluated for temporal and spatial differences.

FIGURE 3 .
FIGURE 3. A summary of the temporal measures for each multi-echo combination method.T2*-FIT showed the highest performance considering all measures.Compared to the other methods, the correlation with artifactual time-series decreased, whereas the dynamic range and low-to-high frequency power increased the most.The mean ± 95% confidence interval is shown.Statistical significance between the distributions of all combination methods and the second echo or average multi-echo combination were made.Holm-Bonferroni corrected p-values < 0.05, 0.01 and 0.001 are indicated with *, ** and ***, respectively.Abbreviations: tSNR = temporal signal-to-noise ratio, tCNR = temporal contrast-to-noise ratio, LTH = low-to-high, ME = multi-echo.

FIGURE 4 .
FIGURE 4. A summary of the spatial measures for each multi-echo combination method.Optimal combination performed significantly better than the second echo and average combination on minimal three out of five measures.Moreover, tCNR resulted in the highest spatial stability and T2*-FIT in the lowest percent voxels in the CSF & WM.The mean ± 95% confidence interval is shown.Statistical significance between the distributions of all combination methods and the second echo or average multi-echo combination were made.The mean ± 95% confidence interval is shown.Statistical significance between the distributions of all combination methods and the second echo or average multi-echo combination were made.Holm-Bonferroni corrected p-values < 0.05, 0.01 and 0.001 are indicated with *, ** and ***, respectively.Abbreviations: tSNR = temporal signal-to-noise ratio, tCNR = temporal contrast-to-noise ratio, CSF = cerebral spinal fluid, WM = white matter, ME = multi-echo.

FIGURE 5 .
FIGURE 5. Overall quality improvement of anterior, posterior and central brain networks compared to networks from the second echo and average multi-echo combination.Improvements of overall (calculated over the spatial and temporal domain) quality measures of anterior (default mode network, left and right frontoparietal network), posterior (cerebellar network, primary and lateral visual network) and central (sensorimotor network and auditory network) networks are shown, as compared to networks from the second echo (A) and average multi-echo combination (B) references.Note that for the calculations of improvement in artifact correlation and percent voxels in CSF and WM, the sign of change was inverted (as lower artifact correlation and lower percent voxels in CSF and WM correspond to more improvement).Abbreviations: RSN = resting-state network, tSNR = temporal signal-to-noise ratio, tCNR = temporal contrast-to-noise ratio, ME = multi-echo.

FIGURE 6 .
FIGURE 6. Spatial group maps of three resting-state networks.The upper image shows the right frontoparietal network for the Avg, OC and T2*-FIT combination.For the Avg combination, the coverage of the lateral prefrontal cortex is minimal (brown arrows) and no activation can be seen in the left hemisphere (brown circle).The middle image shows the default mode network for the SE, Avg and tCNR combination.The activation coverage of the brain in the ventromedial prefrontal cortex is relatively small (red arrows) and more distant from tissue-air boundaries for SE.The tCNR combination (yellow arrows) shows limited posterior cingulate cortex activation.Moreover, the anterior activation is shifted towards the supplementary motor area instead of where it is expected to be (medial prefrontal cortex) according to resting-state network atlases.The lower image shows the somatosensory network for SE, OC and T2*-FIT.The SE network shows small coverage ventrally along the cortex as opposed to OC (blue arrows), and especially T2*-FIT (pink arrows).Abbreviations: SE = second echo, Avg = average, tCNR = temporal contrast-to-noise ratio, OC = optimally combined, rFPN = right frontoparietal network, DMN = default mode network, SMN = sensorimotor network.

FIGURE 7 .
FIGURE 7. The effect of two additional denoising methods on spatial and spectral quality.The upper images show spatial measures changes for each combination method when A) smoothing and bandpass filtering and B) additional ICA-AROMA are applied.The lower images show the spectral measures (dynamic range and low-to-high frequency power) changes when C) smoothing and bandpass filtering and D) additional ICA-AROMA are applied.Statistical testing was performed to test which combination methods improved more than the mean improvement.Holm-Bonferroni corrected p-values < 0.01 and 0.001 are indicated with ** and ***, respectively.The gray line indicates the average improvement.Abbreviations: tSNR = temporal signal-to-noise ratio, tCNR = temporal contrast-to-noise ratio, ME = multi-echo.