A Semantic-Oriented Federated Learning for Hybrid Ground–Aqua Computing Systems

An ambitious target of next-generation networks is the development of intelligent, overarching space–air–ground–aqua computing systems that provide a smart ecosystem able to carry out computation efficiently across heterogeneous domains. In this context, the underwater environment requires special attention, since it is recognized as the most challenging domain owing to channel impairments and adverse propagation conditions. This article proposes a self-intelligent system able to efficiently perform underwater environment monitoring or underwater surveys of critical infrastructure, resorting to the semantic communication paradigm to mitigate the impairments caused by underwater channel propagation conditions. In our case, images sent by underwater devices are collected by shore small base stations (SSBSs) to form the training data sets with which they take part in a federated learning (FL) process with a ground base station. Specifically, this article considers a semantic communication scheme based on a deep convolutional neural network encoder–decoder architecture to make efficient use of the data transmission from underwater devices to the linked SSBSs. A performance analysis shows that the proposed system behaves better than the conventional alternative that does not use the semantic communication approach. Finally, a dedicated performance evaluation investigates the convergence behavior of the proposed FL procedure in the cross ground–aqua system considered, in order to highlight its advantages with respect to a classical implementation.

Benedetta Picano, Member, IEEE, and Romano Fantacci, Life Fellow, IEEE

Index Terms: Federated learning (FL), semantic communications, underwater communications.

I. INTRODUCTION
THE EMERGENCE of next-generation networks is giving rise to a novel and wide class of applications that require an intelligent environment able to properly support disruptive new-generation services. During the last decade, the emergence of these challenging aspects, alongside the ever-increasing proliferation of mobile computing and Internet of Everything (IoE) devices, according to which every single object on Earth will be connected to the Internet for information exchange and communication, realizing intelligent and supervisory functions [1], has led to the need for seamless integration of systems operating in different domains, e.g., ground, air, space, and even underwater [1]. This has generated intense research efforts to push the artificial intelligence (AI) frontier to the network edge regardless of the domain in which it operates [2], e.g., to predict communication channel conditions [3]. This trend is in accordance with the emerging edge-intelligence (EI) paradigm, which involves the use of edge computing nodes (ENs), arranged in proximity to the source of data, to support data gathering and computation with the aim of hosting dedicated machine learning procedures that properly interpret and manipulate the data originating from the network devices [4]. Nevertheless, EI is still in its infancy, and the effective exploitation of AI techniques at the network edge still represents a crucial problem. Recently, the advent of 6G technology and the ambitious applications it brings with it has generated the imperative need to extend the unified AI-based network paradigm to space-air-ground-aqua integrated domains [1]. Such an innovative vision is rooted in the fact that about 71% of the Earth's surface is occupied by oceans, which host critical infrastructures, such as submarine pipelines and oil platforms, to name a few, while large ocean areas are still unexplored. However, the effective realization of computation in the underwater domain is bound to data sensing, transmission, and forwarding, which makes the transmission of large volumes of data costly in terms of both time and power. In fact, such a novel computation paradigm implies that information is extracted under the water using embedded processors via data mining and/or data compression [5]. The main objective here is to include AI capabilities at the edge of a new generation integrated network (NGIN), giving rise to an EI-NGIN capable of information acquisition, processing, interpretation, and transmission in different domains and in relation to the application purposes [1].

The authors are with the Department of Information Engineering, University of Florence, 50139 Firenze, Italy (e-mail: benedetta.picano@unifi.it; romano.fantacci@unifi.it).
Digital Object Identifier 10.1109/JIOT.2023.3325289
Although the interest in networks spanning the ground and underwater domains, to provide sustainable marine development and exploitation, is self-evident and represents one of the most sought-after goals among today's network challenges, numerous issues must still be addressed before it becomes a concrete reality. In particular, the limited communication conditions in the underwater environment force a choice between low transfer delay with short communication distance (expensive optical links) and longer communication delay with longer communication range (acoustic-based communications) [6]. The ambition of this article is to propose an AI-driven framework that exploits beyond-the-edge computation capabilities, promoting the functional integration between machine learning and the device-centric approach, according to which end-devices play an active role in computing and communication. In this sense, the device-centric perspective is devoted to implementing a novel AI-based semantic communication scheme that mitigates the drawbacks due to the hostile behavior of underwater environments and is then exploited to perform federated learning (FL) on the ground. This approach may enable important advancements in several application fields, such as environmental monitoring, conservation biology, and marine security. Note that, to the best of our knowledge, this is the first paper proposing a semantic-based communication framework to enable a hybrid ground-aqua FL paradigm, which is typically confined exclusively to the ground domain.
In this picture, this article considers the case of a ground-aqua EI-NGIN able to support the FL paradigm [7], [8], [9] to effectively train models on data (i.e., images) collected by underwater devices. In this context, the optimization of underwater data communications becomes crucial. Therefore, an AI-empowered semantic communication framework has been devised to extract and send only the bits related to the semantic information of each collected image, instead of the bits derived from the statistical knowledge of the source symbols. This yields a significant reduction in the amount of data to be transmitted and, consequently, a significant performance improvement in terms of transmission time and data delivery reliability. The contributions of this article can be summarized as follows.
1) The design and development of a ground-aqua framework, applied to perform data gathering from underwater devices, to create an image data set at the shore small base stations (SSBSs) that perform FL with the ground SBS and, subsequently, to accomplish intelligent tasks. Image transmission between underwater devices and SSBSs relies on the semantic communication paradigm, in order to send only semantic information through the underwater channel, which usually allows only slow transmission rates and is subject to severe communication impairments. It is important to highlight that existing frameworks devoted to collaborative federated analytics are typically confined exclusively to the ground domain and assume that the data set on which they act has already been gathered. In contrast, this article develops a cross-domain (i.e., aqua and ground) FL framework, considering and optimizing the underwater communication aspects involved in the data collection process.
2) The proposal of a semantic communication scheme to overcome the criticalities of the underwater environment, enabling efficient and effective transmission of the images originating from the underwater devices, in support of cross-domain FL. To this end, a machine learning module based on convolutional neural networks (CNNs) has been applied: the semantic encoder is realized as a stack of Conv2D and max-pooling layers, whereas the decoder consists of a stack of Conv2D and upsampling layers.
3) An in-depth performance analysis to test the proposed framework and to demonstrate the validity of the approach adopted, focusing especially on the performance of the underwater transmission, which represents the bottleneck of the whole system. Specifically, the results investigate the performance of the semantic encoder/decoder scheme, expressed in terms of loss function and underwater transmission time, and the performance of the whole FL-based framework, given in terms of convergence time and accuracy of the trained learning model.
The remainder of this article is organized as follows. In Section II, an in-depth review of the related literature is presented. Section III details the system model and the problem considered, while Section IV presents the proposed overarching framework. Performance evaluations are presented in Section V and, finally, our conclusions are drawn in Section VI.

II. RELATED WORKS
The potential of semantic communications in next-generation networks has been discussed in depth in [10], which surveys the application of machine learning techniques to transmission systems in order to extract and retrieve the meaningful information. Likewise, paper [10] investigates the applications of machine learning to semantic transmission, considering the human-to-human, human-to-machine, and machine-to-machine transmission modalities. The exploitation of EI to perform semantic transmission has been presented in [4], whereas joint semantics-noise coding has been analyzed in [11], where a reinforcement learning approach has been applied. Furthermore, [11] provides a critical discussion of the semantic approach, especially in relation to security and information overhead. Differently, the coexistence of heterogeneous task types has been the focus of [12], where an encoder architecture able to discern between classification and detection has been designed. Similarly, [13] outlines the development of a dynamic context-aware machine learning decoder that interprets the task type dynamically, with the aim of performing proper semantic extraction.
A deep learning scheme has been applied in [14] to build a semantic transmission system able to interpret and transmit natural language sentences. Moreover, in [14], the simultaneous maximization of both the cross-entropy and the Kullback-Leibler metric has been provided. Speech transmission has been the focus of [15], where a deep encoder-decoder neural network is developed by minimizing the mean-square error. In [16], a redundancy-pruning method is proposed to provide semantic communications under spectrum scarcity. In addition, fading channel effects have been considered, as well as channel state information, with the aim of mitigating the drawbacks of poor channel quality during transmission. The resource allocation problem has been analyzed in [17], where two novel semantic metrics have been proposed, namely the semantic rate and the semantic spectral efficiency.
In addition, paper [18] presents a novel semantic networking concept, in which the semantic training is performed through FL to support offloading strategies.
Similarly, the FL paradigm has been exploited in [19] to perform traffic classification, applying deep learning and the cross-silo horizontal technique to improve privacy and security. An FL scheme has been tested in [20], with the aim of evaluating the actual potential of the FL technology with privacy as the main focus. FL has also been exploited in [21] to enable both the reconfiguration of mobile intelligent surfaces and the users' power allocation, in order to maximize channel quality, spectrum efficiency, and user data rate. Sun et al. [22] focused on an edge computing landscape enabling task offloading. In that context, FL has been applied to predict the execution time of the edge server in an asymmetric information environment, with the aim of optimizing the delay-energy product metric through proper offloading policies.
Then, in [1], a hierarchical space-air-ground-underwater architecture is proposed, focusing on the protocol aspects of realizing flexible, cross-domain communications. Differently, a data-driven approach based on an echo-state network has been provided in [23] to model the underwater channel conditions. The underwater domain is also the subject of [6], where autonomous underwater vehicle (AUV)-assisted underwater wireless networks are devoted to monitoring and tracking underwater pollution. In more detail, the authors propose a software-defined AUV networking system based on artificial potential field theory, devoted to network control to properly track underwater pollution. Underwater target tracking, along with target detection, is also investigated in [24], in which the authors review several solutions to track unmanned underwater vehicles and describe different ray tracing models useful for tracking in underwater landscapes. A comprehensive survey of marine big data and the corresponding data processing techniques, considering the problems typical of underwater environments, is given in [25]. Paper [26] focuses on AUV hybrid wireless networks, where acoustic communications and magnetic induction coexist in the same network environment. The main contribution of that paper is the design and implementation of an alternating scheme to optimize AUV path planning and network data flow routing. The data collection problem within AUV sensor networks is also the focus of [27], in which a data collection scheme integrating the AUV mobility is designed. In particular, the mobility model taken into account in [27] includes direction and velocity to produce a realistic 3-D AUV mobility pattern. Exploiting AUVs as edge computing platforms, the authors propose a data collection and target node selection algorithm to effectively and efficiently visit all nodes in the network, balancing time and energy consumption. A cloud-based solution for real-time aquatic monitoring, involving underwater acoustic telemetry supported by edge computing paradigms, is proposed in [28]. The framework developed in [28] integrates a custom-designed, miniaturized printed circuit board to compress data and to increase transmission speed. A low-complexity algorithm to efficiently compute sound velocity, defined as the difference between the sound velocities at the transmitter and the receiver, for deep-sea Internet of Underwater Things (IoUT) networks, is developed in [29]. Paper [30] exploits stochastic geometry to optimize the densities of surface stations of a K-tier space-air-ground-sea underwater acoustic network, adopting realistic communication channel models, to maximize the coverage probability of the system. The design and realization of an efficient overarching space-air-ground-ocean network is also studied in [31], where a lightweight recurrent neural network is placed on board low-consumption AUVs to self-navigate within the aquatic environment, safeguarding the limited AUV resources. Paper [32] applies contrastive learning within underwater networks to compress machine-friendly features at low bitrates for underwater machine vision. Machine learning is also exploited in [33], whose main goal is to optimally set transmission parameters to avoid bandwidth loss in underwater acoustic communications when compressed images are sent. In particular, the decision-making selection strategy is obtained through reinforcement learning. Byun et al. [34] proposed a comparison among machine learning techniques, tested on real underwater measurements gathered near the Gulf of Incheon, South Korea, to predict the most adequate communication parameters to mitigate the high propagation loss and drastic channel fluctuation problems.
With respect to the existing literature summarized above, this article is, to the best of our knowledge, the first to propose the use of the semantic communication paradigm in an underwater environment, as well as a combined ground-aqua FL process.

A. System Model
The scenario considers the application of FL to a cross-domain network arranged to classify images originating from underwater devices. With reference to Fig. 1, a two-layer EI-network has been considered. The first layer represents the aqua network domain, whereas the second layer consists of the ground network level, in which an SBS $S$ coincides with a cloud/edge network node (ENN); the terms SBS and ENN will therefore be used interchangeably. In the aqua domain, we have a set $\mathcal{B} = \{1, \ldots, B\}$ of SSBSs, equipped with processing and storage capability and able to handle semantic communications with a set $\mathcal{M} = \{1, \ldots, M\}$ of underwater devices, which collect the images used at the linked SSBSs to locally train the FL model in cooperation with the ENN and the other SSBSs. After completion of the FL training process, the ENN broadcasts the trained model to each SSBS. Then, the images subsequently sent by the underwater devices to the SSBSs according to the semantic communication paradigm are used to take, whenever necessary, intelligent actions. In particular, in our case, each underwater device, after capturing a new image $s_u$, applies the semantic encoding process described in Section IV to extract the corresponding semantic information $\iota_u$ and then sends it to the linked SSBS.

B. Federated Learning Remarks
FL [35], [36] is a cooperative learning paradigm in which two types of agents, i.e., participants, are typically involved: 1) the low-level devices (generally end-devices) and 2) a more structured and powerful node, for example, a central unit or, more generally, an aggregator. End-devices individually perform a learning task and send only the local learning model to the aggregator instead of transferring the whole training database. FL is an iterative process consisting of global epochs, each of which is in turn split into three phases: 1) local computation; 2) model exchange; and 3) central computation. Owing to the very hostile underwater environment (propagation, absorption, and so on), FL in this framework is performed between the SSBSs and the ENN. More specifically, the SSBSs occupy the framework level typically attributed to end-devices, and the data on which the SSBSs train their models are gathered by collecting the images sent by the underwater devices belonging to $\mathcal{M}$. The central aggregator is represented here by the ENN, linked to the SSBSs through radio frequency channels.
In step 1), the SSBSs perform local training starting from the shared model previously downloaded from the ENN and using the data set created by collecting images from the devices belonging to the aqua layer, following the approach presented in Section IV. We denote by $D_b$ the data set of SSBS $b$, with $b \in \mathcal{B}$, composed of the images received from all the underwater devices linked to it. For each sample $j$ in $D_b$, the main goal is the identification of a model parameter $w$ that minimizes the loss function $L_j(w)$. Therefore, each SSBS $b$ solves the minimization problem [7]

[Eq. (1): expression not recoverable from the source text]

The corresponding learning model is represented by the minimization of the global loss function given by

$$\min_{w \in \mathbb{R}^e} L(w) \tag{2}$$

in which $e$ is the input size. Furthermore, during each local computation round $t$ of the FL framework, the SSBS $b$ solves the local problem

[Eq. (3): expression not recoverable from the source text]

Hence, the ENN aggregates the received information by performing the following computations:

[Eqs. (4) and (5): expressions not recoverable from the source text]

This procedure is iteratively repeated until the desired accuracy is achieved or a termination criterion, such as reaching the maximum number of iterations, is met.
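As a rough illustration of one global epoch (local computation, model exchange, central computation), the sketch below runs sample-weighted federated averaging on a toy one-parameter least-squares model. It assumes the standard FedAvg aggregation rule, in which the ENN averages the SSBS models weighted by data set size; the aggregation formulas actually used in this article are not reproduced here, and all function names, the learning rate, and the toy data are hypothetical.

```python
# Minimal sketch of sample-weighted federated averaging (FedAvg).
# Assumption: the ENN averages the SSBS models weighted by |D_b|.
# All names and the toy one-parameter model y = w * x are hypothetical.


def local_update(w, data, lr=0.05, steps=20):
    """Local computation at one SSBS: gradient descent on the mean squared error."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def aggregate(local_models, sizes):
    """Central computation at the ENN: average weighted by data set size."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(local_models, sizes)) / total


def federated_training(datasets, rounds=30, w0=0.0):
    """Repeat local computation, model exchange, and central computation."""
    w = w0
    for _ in range(rounds):
        # Phase 1: every SSBS trains locally, starting from the shared model.
        local_models = [local_update(w, d) for d in datasets]
        # Phase 2: models (and data set sizes) are sent to the ENN.
        sizes = [len(d) for d in datasets]
        # Phase 3: the ENN aggregates and redistributes the global model.
        w = aggregate(local_models, sizes)
    return w


# Three SSBSs holding noiseless samples of y = 3x.
datasets = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (3.0, 9.0), (2.5, 7.5)],
]
w_global = federated_training(datasets)
```

In this toy setting every local data set is consistent with the same slope, so the global parameter approaches 3. With heterogeneous data the weighted average would instead settle on a compromise between the local optima.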
With reference to the model architecture of the FL, we adopted the vanilla FL averaging, with a CNN designed as follows [37].
1) Two 5×5 convolution layers (with 32 and 64 channels, respectively, each followed by a 2×2 max-pooling layer).
2) One fully connected layer with 512 units and ReLU activation.
3) One softmax output layer.
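To make the size of the CNN listed above concrete, the following sketch propagates tensor shapes through the layers and counts trainable parameters. It assumes a 28×28 single-channel input and 10 output classes (consistent with the MNIST setup used later in the article) and stride-1 convolutions with "same" padding; the padding choice and all helper names are assumptions, not details stated in the text.

```python
# Shape and parameter bookkeeping for the CNN listed above.
# Assumptions (not stated in the text): 28x28x1 input, 10 classes,
# stride-1 convolutions with "same" padding. Helper names are hypothetical.


def conv_same(shape, kernel, out_channels):
    """Return (output shape, parameter count) of a 'same'-padded conv layer."""
    height, width, in_channels = shape
    params = kernel * kernel * in_channels * out_channels + out_channels
    return (height, width, out_channels), params


def max_pool(shape, size=2):
    """Return the output shape of a non-overlapping max-pooling layer."""
    height, width, channels = shape
    return (height // size, width // size, channels)


def dense(n_in, n_out):
    """Return the parameter count (weights plus biases) of a dense layer."""
    return n_in * n_out + n_out


def cnn_summary(input_shape=(28, 28, 1), num_classes=10):
    """Return (last feature-map shape, flattened size, total parameters)."""
    shape, p_conv1 = conv_same(input_shape, 5, 32)
    shape = max_pool(shape)
    shape, p_conv2 = conv_same(shape, 5, 64)
    shape = max_pool(shape)
    flat = shape[0] * shape[1] * shape[2]
    p_fc = dense(flat, 512)
    p_out = dense(512, num_classes)
    return shape, flat, p_conv1 + p_conv2 + p_fc + p_out


last_shape, flat_size, n_params = cnn_summary()
print(last_shape, flat_size, n_params)  # (7, 7, 64) 3136 1663370
```

Under these assumptions the fully connected layer dominates the parameter count, which is what determines the model size exchanged in each FL round.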

C. Channels and Computation Modeling
With reference to Fig. 1, we have to take into account that the underwater channel has different propagation conditions from the wireless channel used for communications between the SSBSs and the ENN. Hence, owing to the heterogeneous nature of the channels involved, the link capacity characterization has to be specific to the channel considered, i.e., underwater or wireless. In particular, the wireless connections between each SSBS and the ENN are established on individual noninterfering channels in the THz band, characterized by a channel capacity, i.e., maximum data rate $R_{b,S}$, given by

[Eq. (6): expression not recoverable from the source text]

in which $P$ is the transmission power, assumed equal for both the SSBSs and the ENN, $W_1$ represents the bandwidth of the communication link, $d_0$ is the distance between the source and its destination, and $N_0$, which accounts for both the molecular absorption noise and the Johnson-Nyquist noise at the receiving site, results in

[Eq. (7): expression not recoverable from the source text]

where $g_B$ is the Boltzmann constant, $T_0$ is the temperature in kelvin, $\zeta$ is the wavelength, $K(f)$ is the global absorption coefficient of the physical medium, and $A_0 = c^2/(16\pi^2 f^2)$ [39]. As regards the underwater acoustic channels, it is widely recognized that they are among the most hostile communication media. As a consequence, it is well established in the literature that acoustic underwater communications can be suitably exploited only at low frequencies, which usually allow a limited communication bandwidth. An in-depth discussion of the underwater acoustic channel behavior and the related modeling approaches is, however, out of the scope of this article. The interested reader can find more details on this issue in [40] and the references therein. Here, we limit our discussion to characterizing the underwater acoustic channel capacity according to [41] and [42] as

$$R_{m,b} = W_2 \log_2\left(1 + \frac{P_s}{P_n}\right) \tag{8}$$

where $W_2$ is the signal bandwidth, and $P_s$ and $P_n$ denote the mean received signal power and the overall noise power, respectively, within $W_2$ at the receiving side.
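The underwater capacity described above, a function of the bandwidth $W_2$ and of the signal and noise powers $P_s$ and $P_n$, can be evaluated with a few lines of code. The sketch below assumes the Shannon form $W_2 \log_2(1 + P_s/P_n)$ and uses it to estimate how long one uncompressed 28×28 image at 8 bits per pixel would occupy the acoustic link. The 4 kHz bandwidth and 10 dB SNR are hypothetical values chosen only for illustration, as are the function names.

```python
import math

# Sketch under assumptions: Shannon-type capacity W * log2(1 + Ps / Pn).
# The bandwidth and SNR below are illustrative, not taken from the article.


def shannon_capacity(bandwidth_hz, signal_power, noise_power):
    """Capacity in bit/s for a given bandwidth and signal/noise powers."""
    return bandwidth_hz * math.log2(1.0 + signal_power / noise_power)


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (value_db / 10.0)


def transmission_time(payload_bits, rate_bps):
    """Time in seconds needed to send payload_bits at rate_bps."""
    return payload_bits / rate_bps


raw_image_bits = 28 * 28 * 8                      # one uncompressed image
rate = shannon_capacity(4000.0, db_to_linear(10.0), 1.0)
t_raw = transmission_time(raw_image_bits, rate)
print(f"rate = {rate:.0f} bit/s, raw image time = {t_raw:.3f} s")
```

Because the transmission time scales linearly with the payload, any reduction in the number of bits sent per image translates directly into a proportional reduction in time on the underwater link.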
With reference to the computation model, each SSBS $b$ has on board a CPU with working frequency $f_b$, given in number of CPU cycles per unit time. Therefore, the time needed by SSBS $b$ to perform the local model computation is

[Eq. (9): expression not recoverable from the source text]

where $\log(1/\epsilon_b)$ is the number of local iterations needed to achieve the local accuracy $\epsilon_b$ [43], [44] with respect to problem (1). Letting $v_b$ be the local parameter size, expressed in bits, associated with SSBS $b$, each communication round exhibits a cost in time defined as

[Eq. (10): expression not recoverable from the source text]

Therefore, considering the $b$th SSBS and denoting by $N$ the number of communication rounds, the total amount of time spent, $T_b$, results in

[Eq. (11): expression not recoverable from the source text]

where $\gamma_b$ represents the time needed to collect data from the underwater devices. In more detail, defining $K$ as the total number of steps necessary to gather the images, the overall time spent by the SSBS to populate the whole data set can be defined recursively as follows:

[Eqs. (12) and (13): expressions not recoverable from the source text]

where $u_{\chi,m}$ is the size in bits of the data sent by underwater device $m$ toward the associated SSBS $b$ at step $\chi$. Note that the terms $\gamma_b$ and $N$ are not related to each other. Specifically, $\gamma_b$ refers to the time spent gathering data in the underwater domain, and it depends on factors such as the size of the semantic information to be transmitted, the rate of the underwater channel, the size of the data set we want to build, etc. In contrast, $N$ is the number of FL rounds, which is related to both the local accuracy $\epsilon_b$ and the convergence accuracy of the FL we wish to reach. Note that the number of SSBSs is determined by the number and position of the underwater devices to which the SSBSs have to provide coverage. In this respect, it is important to highlight that the SSBSs have a triple role.
1) To provide coverage to underwater devices. Therefore, the choice of the number of SSBSs deployed is typically made in a prior network design phase.
2) To gather the data coming from the underwater devices, in order to build the data set exploited by the FL framework.
3) To perform the FL with the ground SBS and the other SSBSs involved in the process. In this regard, note that the number of SSBSs does not directly affect the convergence time of the FL. In fact, such a time is the result of multiple factors, for example, data distribution, size of the data set, and so on.
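The timing model of this subsection can be sketched numerically as follows. The composition used here is an assumption consistent with the description, not the article's exact formulas: each of the $N$ rounds costs one local computation plus one model upload of $v_b$ bits over the SSBS-ENN link, and the data collection time $\gamma_b$ is added once. The collection time is further assumed to be the sum of the ratios $u_{\chi,m}/R_{m,b}$ over all steps and devices, i.e., devices transmit one after another. All function and argument names are hypothetical.

```python
# Sketch of the per-SSBS time budget under stated assumptions:
#   T_b = N * (t_local + v_b / R_bS) + gamma_b
#   gamma_b = sum over steps chi and devices m of u[chi][m] / R_mb[m]
# The sequential-transmission assumption and all names are hypothetical.


def communication_time(param_bits, rate_bps):
    """Time to upload the local model parameters once."""
    return param_bits / rate_bps


def collection_time(payload_bits, underwater_rates_bps):
    """Time to gather all images; payload_bits[chi][m] is in bits."""
    return sum(
        bits / rate
        for step in payload_bits
        for bits, rate in zip(step, underwater_rates_bps)
    )


def total_time(n_rounds, t_local, param_bits, rate_bps, gamma_b):
    """Overall time: FL rounds plus the underwater data collection time."""
    per_round = t_local + communication_time(param_bits, rate_bps)
    return n_rounds * per_round + gamma_b


# Two collection steps, two underwater devices with equal link rates.
gamma = collection_time([[100, 200], [100, 200]], [100.0, 100.0])
t_total = total_time(n_rounds=10, t_local=0.5, param_bits=1000,
                     rate_bps=2000.0, gamma_b=gamma)
print(gamma, t_total)  # 6.0 16.0
```

The example makes the independence of the two contributions visible: shrinking the payloads lowers the collection time without changing the number of rounds, and vice versa.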

D. Problem Formulation
With the aim of optimizing the FL framework for the hybrid aqua-ground domain, the minimization of the mean overall time, i.e., the time needed to receive data from the underwater layer plus the time spent actually training the model, is crucial. Therefore, the main objective of this article is to design an overarching framework able to address the following problem:

$$\min \frac{1}{B} \sum_{b=1}^{B} T_b. \tag{14}$$

The optimization problem given in (14) is very challenging, owing to the strong influence of the terms $\gamma_b$ on the $T_b$ values, with $1 \le b \le B$. Therefore, in what follows, we focus on the minimization of the terms $u_{\chi,m}/R_{m,b}$ by proposing a semantic communication framework that aims at the successful transmission of the semantic information extracted from an image, instead of a set of symbols or bits regardless of their meaning.

IV. UNDERWATER SEMANTIC COMMUNICATIONS
In semantic communications, the main goal is the transmission of the semantic meaning of the source data instead of the whole data. The key difference from classical systems is the use of a semantic encoder (Fig. 2) at the transmitter site that extracts the semantic features. As a consequence, only semantic features are sent out, and the data, once received at the destination, are processed by the receiver on a semantic basis only, instead of at the bit level as in traditional systems. As depicted in Fig. 2, the semantic communication framework consists of two cascaded submodules: 1) the semantic part and 2) the transmission part.
The former is responsible for semantic encoding and decoding, performing information processing and semantic extraction, whereas the transmission level focuses on correctly transmitting the semantic information over the physical channel. The underwater channel is notoriously afflicted by numerous impairments and poor propagation conditions. For this reason, many efforts have been made over the years to reduce such drawbacks by identifying suitable modulation procedures, in particular to alleviate the problems due to the low data rate. However, an in-depth discussion of underwater communication techniques is out of the scope of this article. As a consequence, in the following, we refer to the multicarrier binary frequency shift keying (MC-BFSK) scheme recently proposed in [41], just as an example of a promising scheme for handling underwater communications, without claiming to propose a definitive solution to this issue. To summarize the main features of the MC-BFSK scheme: according to [41], $M/2$ parallel and independent subcarriers from a single transducer are considered, resulting in a composite signal where each signaling instance carries $M/2$ channel-coded bits [41]. From the results reported in [41], it emerges that, with a suitable selection of the modulation scheme parameters, error-free data transmission may be guaranteed for signal-to-noise ratios (SNRs) greater than about 2 dB [41]. Consequently, the errors due to channel impairments have been assumed negligible in our paper. Before transmission, the picture $s$ is mapped into symbols $x_m$ in order to be sent over the physical channel, where it experiences transmission impairments due to the sea environment; the symbol received at the receiver, $y_{m,b}$, is therefore subject to wireless channel impairments [15], [16].
The framework of interest here focuses on image transmission and consists of an encoder-decoder architecture based on deep CNNs. In more detail, the semantic encoder is realized as a stack of Conv2D and max-pooling layers, whereas the decoder consists of a stack of Conv2D and upsampling layers; the architecture is described in Fig. 3. As depicted in Fig. 2, in accordance with the literature, the encoder part is formed by a semantic and a physical coding module. Let $\alpha_1$ and $\alpha_2$ be the neural network parameter sets of the channel and the semantic encoder modules, respectively. The encoded symbol $x_m$ can then be expressed as

$$x_m = C_{\alpha_1}\big(S_{\alpha_2}(s)\big) \tag{15}$$

where $C_{\alpha_1}(\cdot)$ and $S_{\alpha_2}(\cdot)$ express the channel and the semantic encoder functions, respectively. Then, $y_{m,b}$ is decoded by the receiver with the aim of retrieving the original $s$. As a consequence, the retrieved copy of $s$, $\hat{s}$, results in

$$\hat{s} = S^{-1}_{\beta_1}\big(C^{-1}_{\beta_2}(y_{m,b})\big) \tag{16}$$

in which $S^{-1}_{\beta_1}(\cdot)$ is the semantic decoder with parameter $\beta_1$, whereas $C^{-1}_{\beta_2}(\cdot)$ is the channel decoder with parameter $\beta_2$. Since the sea environment usually exhibits deep channel fluctuations and adverse propagation conditions, the primary objective here is to design an integrated channel-semantic coding able to maximize the similarity between $\hat{s}$ and $s$.
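The spatial effect of the encoder and decoder stacks described above can be illustrated without any learning. The sketch below implements only the 2×2 max-pooling and nearest-neighbor upsampling steps on a single-channel image stored as nested lists; the convolutions and their trained weights are deliberately omitted, and the choice of two pooling stages is an assumption, since the layer counts of Fig. 3 are not available here. Function names are hypothetical.

```python
# Sketch of the down/upsampling backbone of a conv encoder-decoder.
# Convolutions are omitted; two pooling stages are an assumption.


def max_pool_2x2(img):
    """Downsample by keeping the maximum of each non-overlapping 2x2 block."""
    return [
        [max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
         for c in range(0, len(img[0]) - 1, 2)]
        for r in range(0, len(img) - 1, 2)
    ]


def upsample_2x(img):
    """Nearest-neighbor upsampling: repeat every pixel in a 2x2 block."""
    out = []
    for row in img:
        wide = [pixel for pixel in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def shape(img):
    """Return (rows, columns) of a nested-list image."""
    return (len(img), len(img[0]))


# A synthetic 28x28 image with 8-bit pixel values.
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]

latent = max_pool_2x2(max_pool_2x2(image))           # encoder side
reconstruction = upsample_2x(upsample_2x(latent))    # decoder side
print(shape(image), shape(latent), shape(reconstruction))
```

With two pooling stages, each feature map transmitted over the channel has 16 times fewer entries than the input image. The actual payload also depends on the number of feature maps kept in the latent representation, which this sketch does not model.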
Referring to [14] and [4], a commonly adopted loss function is the binary cross-entropy, which, minimized with the Adam optimizer, gives insight into the similarity between $\hat{s}$ and $s$. In order to measure the goodness of the encoding-decoding procedure, for an underwater device $m$, with $m \in \mathcal{M}$, transmitting toward the SSBS $b$, with $b \in \mathcal{B}$, the cosine image similarity metric [46] has been exploited. Hence, the similarity between the image sent by underwater device $m$ toward the SSBS $b$ and its reconstruction results in

$$\zeta_{m,b} = \frac{s \cdot \hat{s}}{\|s\|_2 \, \|\hat{s}\|_2} \tag{17}$$

that is, the ratio between the dot product of the images expressed as vectors and the product of the L2-norms of the two vectors. The parameter $\zeta_{m,b}$ acts as a feedback metric that captures the level of validity and accuracy provided by the transmission system. In fact, a decrease in $\zeta_{m,b}$ due to harsh propagation conditions has a negative impact on the semantic communication quality. Note that the semantic framework exhibits a worst-case complexity driven by the convolution operation. Therefore, its computational complexity is mainly of the order of $O(U \cdot V \cdot u \cdot v)$, where $U \cdot V$ and $u \cdot v$ express the size of the original image and the size of the kernel applied during convolution, respectively.
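The cosine image similarity described above, i.e., the dot product of the two images seen as vectors divided by the product of their L2-norms, can be computed as in the following sketch. The flattening helper and the handling of all-zero images are implementation choices made here for illustration and are not taken from the article.

```python
import math

# Sketch of the cosine image similarity between an image and its
# reconstruction. Helper names and the zero-norm check are assumptions.


def flatten(img):
    """Turn a nested-list image into a flat vector of pixel values."""
    return [pixel for row in img for pixel in row]


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their L2-norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


original = [[0, 128], [255, 64]]
degraded = [[0, 120], [250, 70]]
zeta = cosine_similarity(flatten(original), flatten(degraded))
print(f"similarity = {zeta:.4f}")
```

Since pixel values are nonnegative, the metric lies between 0 and 1 for images, and it is insensitive to a uniform rescaling of brightness, so it captures structural agreement rather than absolute intensity.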

V. PERFORMANCE ANALYSIS
This section provides a performance analysis of the proposed system, whose focus is the functional integration of an FL process with an underwater semantic communication system. As stated in Section IV, the investigation of underwater data transmission schemes is outside the scope of this article; hence, without loss of generality, we refer here, as an example, to the MC-BFSK underwater data communication technique proposed in [41], adopting the system parameter values reported therein. In particular, the information data rate assumed in our computer simulations is derived in accordance with [41, eq. (6)], which depends on the modulation profile selected according to the procedure outlined in Section II of the same paper. Furthermore, in order to properly test the performance of the proposed framework, we have resorted to the MNIST handwritten digits data set [47], considering an image resolution of 28×28 pixels, where each pixel is assumed to consist of 8 bits. Of the data set, 70% is devoted to the training phase and the remaining 30% is used for testing. We have assumed that 15 SSBSs are involved in the FL process. Such an assumption is justified by the fact that, in our scenario, it represents the best tradeoff between the accuracy reached by the FL process and the number of SSBSs taking part in it (a greater number of SSBSs does not remarkably improve accuracy, whereas a lower number of SSBSs yields worse accuracy values).
Aiming at validating the proposed scheme (SC curve), Fig. 4 shows the loss function value as the number of learning epochs grows. As can be noted from this figure, the loss function decreases, i.e., the performance of the learning model improves, as the number of epochs increases. The behavior depicted in Fig. 4 is also confirmed by Fig. 5, which relates the encoder dimension to the achieved loss function performance. Also in this case, the loss trend improves as the number of neural units increases. This is clearly due to the fact that, as the encoder dimension grows, the learning capability of the considered deep-CNN network increases.
Then, Fig. 6 shows the ability of the SC approach to reduce the image size, in comparison with the JPEG2000 compression technique, as a function of the data rates reported in [41]. As is evident, the SC curve remarkably lowers the transmission time, representing a useful solution to reduce the impact of the underwater environment on the FL framework. Fig. 7, instead, illustrates the trend of the overall FL convergence time as a function of the data rate, comparing the scheme applying the SC with that in which the SC is not involved (WSC curve). When semantic communications are not applied (WSC curve), the image is sent in the conventional way, without extracting and sending exclusively the semantic information. As is evident from Fig. 7, the time required to send data when the semantic communication approach is used (SC curve) is significantly lower than the time spent with the conventional method. This has a significant impact on the overall FL completion time, which means an improvement in the practical applicability of the training framework. The meaning of the results provided in Fig. 7 becomes clearer when they are cross-analyzed with those given in Fig. 8, which deals with the accuracy gap between the considered FL scheme, i.e., the one performing the semantic extraction/reconstruction process, and the ideal case in which the original images are used (WSC curve). As is evident from this figure, the accuracy is not significantly affected by the semantic extraction/reconstruction process and, as a consequence, the FL process reaches almost the same accuracy values as the scheme exploiting the original images. Consequently, jointly considering Figs. 7 and 8, it is easy to note that the semantic extraction/reconstruction process does not significantly affect the accuracy of the overall framework, while it introduces significant benefits by lowering the amount of data to be transmitted, thus speeding up the FL convergence process. Hence, as a final remark, the results provided in this section highlight the concrete possibility of implementing an EI-ground-aqua network based on the exploitation of semantic communications to counteract the hostile behavior of the underwater communication channels, without a significant impact on the accuracy of the retrieved data and allowing a strong reduction of the FL convergence time, which is of paramount importance in several application cases.
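The relation between payload size, data rate, and transmission time underlying the comparison above can be sketched with simple arithmetic. The raw image size follows from the stated 28×28 resolution at 8 bits per pixel; everything else in the snippet is an assumption for illustration only, namely the size of the semantic representation (`LATENT_BITS`), the data rates in `RATES_BPS` (which are not the values of [41]), and the function names.

```python
def raw_image_bits(height=28, width=28, bits_per_pixel=8):
    """Size in bits of an uncompressed image (defaults match the MNIST setup)."""
    return height * width * bits_per_pixel


def transmission_time(payload_bits, rate_bps):
    """Time in seconds needed to send payload_bits at rate_bps (propagation delay ignored)."""
    if rate_bps <= 0:
        raise ValueError("the data rate must be positive")
    return payload_bits / rate_bps


RAW_BITS = raw_image_bits()          # 28 * 28 * 8 = 6272 bits
LATENT_BITS = 7 * 7 * 8              # hypothetical semantic payload, not from the paper
RATES_BPS = (500.0, 1000.0, 2000.0)  # hypothetical data rates, not those of [41]

for rate in RATES_BPS:
    t_raw = transmission_time(RAW_BITS, rate)
    t_sem = transmission_time(LATENT_BITS, rate)
    print(f"rate={rate:7.1f} bit/s  raw={t_raw:.3f} s  semantic={t_sem:.3f} s")
```

Since the transmission time scales linearly with the payload, any reduction of the amount of data sent per image translates directly into a proportional reduction of the per-image upload time, which is then accumulated over the data set collected by each SSBS.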

VI. CONCLUSION
This article has investigated the problem of enabling the execution of the FL paradigm in an integrated ground-aqua environment by exploiting data stemming from underwater devices. To this end, a ground-aqua network, mainly devoted to monitoring activities based on image transmissions, has been considered in order to properly perform the FL framework. In addition, the semantic communication approach has been introduced to cope with the hostile sea propagation environment and long data transmission delays. For this purpose, a deep-CNN architecture has been designed, and its performance has been validated by assuming a proper underwater channel model. Finally, a performance analysis has been provided to show the validity of the overarching framework proposed, especially in terms of the accuracy and the overall time needed by the considered FL process. Future work may include the integration of AI-based techniques to forecast the underwater channel behavior in order to adapt the selection of the most suitable data transmission scheme.

Manuscript received 7 February 2023; revised 30 July 2023; accepted 12 October 2023. Date of publication 19 October 2023; date of current version 7 March 2024. This work was supported in part by the European Union under the Italian National Recovery and Resilience Plan (NRRP) of NextGenerationEU, Partnership on "Telecommunications of the Future" (Program "RESTART") under Grant PE0000001. (Corresponding author: Benedetta Picano.)

$$w_b^t = \arg\min_{w_b \in \mathbb{R}^e} F_b\big(w_b \mid w^{(t-1)}, \nabla L^{(t-1)}\big) \tag{3}$$
in which $F_b$ represents the objective function of SSBS $b$, $w^{(t-1)}$ is the global parameter produced during the previous iteration, and $L^{(t-1)}$ is the global loss function at time $(t-1)$. Once each SSBS $b$ has completed the local model training, the SSBS $b$ uploads $w_b^t$ to the ENN during the second step, in which the ENN collects the weights received from the SSBSs belonging to $\mathcal{B}$. In turn, in step 3), the ENN improves the global model by performing the weighted average of the local updates $w_b^t$ previously uploaded by the SSBSs.
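The aggregation of step 3) can be sketched as a weighted average of the uploaded local parameter vectors. The text does not state how the weights are chosen; the sketch below assumes, as in the common federated averaging scheme, that each SSBS is weighted by the number of local training samples. The function name `aggregate` and the toy values are hypothetical.

```python
import numpy as np


def aggregate(local_weights, sample_counts):
    """Weighted average of local parameter vectors.

    Assumption: weights are proportional to the local sample counts,
    as in standard federated averaging.
    """
    stacked = np.stack([np.asarray(w, dtype=float) for w in local_weights])
    counts = np.asarray(sample_counts, dtype=float)
    if stacked.shape[0] != counts.shape[0]:
        raise ValueError("one sample count is required per local model")
    if counts.sum() <= 0:
        raise ValueError("the total number of samples must be positive")
    return np.average(stacked, axis=0, weights=counts)


# Toy example: three SSBSs, each uploading a 2-dimensional parameter vector.
uploads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
samples = [100, 100, 200]
global_model = aggregate(uploads, samples)
print(global_model)
```

The resulting vector plays the role of the global parameter distributed to the SSBSs at the start of the next iteration.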

Fig. 4. Loss function as a function of the number of epochs.

Fig. 6. Underwater transmission time as a function of the data rate.

Fig. 7. FL convergence time as a function of the data rate.

Fig. 8. Accuracy reached by the FL model as a function of the number of iterations.