Comparative Analysis of Decentralized Identity Approaches

Decentralization is essential when trust and performance must not depend on a single organization. Distributed Ledger Technologies (DLTs) and Distributed Hash Tables (DHTs) are examples: the DLT is useful for transactional events, and the DHT is useful for large-scale data storage. The combination of these two technologies can meet many challenges. The blockchain is a DLT with an immutable history protected by cryptographic signatures in data blocks. Identification is an essential issue traditionally provided by centralized trust anchors. Self-sovereign identities (SSIs) are proposed decentralized models where users can control and manage their identities with the help of a DHT. However, slowness is a challenge among decentralized identification systems because of the many connections and requests among participants. In this article, we focus on decentralized identification by DLT and DHT, where users can control their information and store biometrics. We survey some existing alternatives and address the performance challenge by comparing different decentralized identification technologies based on execution time and throughput. We show that the DHT and machine learning model (BioIPFS) performs better than other solutions such as uPort, ShoCard, and BBID.

Biometric recognition is an intelligent solution that uses machine learning algorithms for identification and authentication, such as face recognition, fingerprinting, and iris recognition, which are in the centralized identification category [8]. Security and immutability, as two properties of blockchain and DHT, can be used as backend technologies for identification systems. Alizadeh et al. [5] found that new decentralized applications combining blockchain, DHT, and ML can improve older systems in order to achieve better performance. Fersi et al. [9] showed that DHT-based systems could increase the system performance by adding a fast lookup in large-scale deployment among decentralized systems. Alizadeh et al. [2] explained that a blockchain-based system could increase system security, specifically for Internet of Things types of applications.

Navas and Beltrán [10] explained how different federated identification techniques use third parties with their identification solution. Cloud services with distributed centralized architectures, such as Facebook and Google, are well-known examples. Self-sovereign identity is a new-generation identification solution. Belchior et al. [11] presented digital identification systems that empower users to manage their identity data and that provide credibility of disclosed identity data and network-level anonymity. Users' privacy is one of these systems' properties. Kim et al. [12] systematically explored key components of DID systems and analyzed their possible vulnerabilities when deployed. Liu et al. [13] showed different self-sovereign open-source identity management systems provided to users, organizations, and other entities. For example, ShoCard is a self-sovereign digital identity system that protects consumer privacy. Additionally, the authors explained how an identity platform could be built on the blockchain by showing a driver's license and how it can be so secure that a bank can rely on it. Shuaib et al. [14] analyze and evaluate the existing SSI solutions and develop the best possible solution for a blockchain-based land registry system. Furthermore, the authors investigate each SSI solution and present its advantages and limitations. Alzahrani [15] combined the decentralized features and the "lookup by name" property with a secure mechanism for maintaining synchronized replicas of an item in multiple locations to achieve short lookup times.

III. TERMINOLOGIES AND DEFINITIONS

This section defines different terminologies related to identification technologies.

• A user or entity on the Internet is a person, organization, computer application, thing, or smart device digitally connected to a network. An entity recognized by a unique property can be authenticated and eventually authorized in case of requesting access to online resources. Therefore, each entity has its digital identity.
• An attribute is a characteristic of an entity. For example, attributes might be permanent (such as a person's birth date), temporary (such as an address), or long-term (e.g., social security number [...]

[...] individuals' privacy might be at risk, and their online activity could be connected and eventually tracked. Furthermore, an exceedingly fragmented landscape will emerge because the user will be required to create a separate identity for each service provider. Finally, from the service provider's standpoint, such an approach requires a significant investment of resources to store, preserve, and safeguard users' data.

A federated identity system establishes mutual trust between centralized systems. A federated identity is accomplished by distributing verification and trust components across all identification systems or by mutually accepting the standards used by each system. For example, international organizations or governments could agree to recognize each other's credentials. It is also possible for businesses to agree to accept each other's identity verification system. The owners of identification systems frequently use legal agreements and shared technological standards to build one-to-one trust. As a result, the network and its reputation rise as the number of trustworthy relationships grows.
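The one-to-one trust relationships described above can be pictured as a set of bilateral agreements. The following minimal Python sketch is illustrative only: the class name `Federation`, its methods, and the system names are hypothetical and are not taken from any system discussed in this article. It assumes that trust is symmetric and that it does not propagate transitively through intermediaries.

```python
class Federation:
    """Toy model of bilateral trust agreements between identity systems."""

    def __init__(self):
        # Each agreement is stored as an unordered pair of system names.
        self._agreements = set()

    def add_agreement(self, system_a, system_b):
        """Record a mutual (symmetric) trust agreement between two systems."""
        self._agreements.add(frozenset((system_a, system_b)))

    def accepts(self, verifier, issuer):
        """A verifier accepts its own credentials and those of direct partners."""
        if verifier == issuer:
            return True
        return frozenset((verifier, issuer)) in self._agreements

    def trust_relationships(self):
        """Number of one-to-one agreements currently in force."""
        return len(self._agreements)


federation = Federation()
federation.add_agreement("gov-A", "gov-B")
federation.add_agreement("gov-B", "bank-C")

print(federation.accepts("gov-A", "gov-B"))   # direct agreement exists
print(federation.accepts("gov-A", "bank-C"))  # no direct agreement
```

Storing each agreement as a `frozenset` makes the relationship order-independent, which mirrors the mutual recognition described in the text. A real federation would add legal and technical conditions to each agreement.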

Users frequently prefer the simplicity of federated identification while accessing numerous services on different platforms, resulting in the widespread use of federated systems. On the other hand, building trust between two or more system owners is not always simple. The same applies to centralized systems, where the degree of trust depends on the system owners, the degree of identity verification, and the data vetting process. Many web services offer identification with Google or Facebook accounts, so that these providers perform the user's identity verification.

Moreover, multi-factor authentication [21] is widely used as an extra security layer to verify that the people trying to gain access to an online account are who they claim to be. E-identification is another example; it is typically used by governments and is limited to a certain country or geographical region. For example, different e-services usually require that persons have a Swedish electronic identification. E-identification is equivalent to other standard forms of ID, such as a driving license and a national identity card. Moreover, it allows people to identify themselves or sign a document or transaction securely online [22]. These responsibilities, risk allocation, and the formation of technical standards add complexity for system owners. In addition, these issues may result in high implementation costs, which typically lead to a lack of the variety of services that consumers desire [23].

In summary, centralized and federated identification are referred to as classical systems since identity attributes are managed by a third party, such as an identity provider.

Decentralized identification is a technology that is handled with the help of all participants. It has a different architecture compared to centralized and federated identification services. There is no single organization inside to manage identification [24]. Usually, a decentralized identifier works in a peer-to-peer network, such as DLT and DHT. Decentralized identification systems are formed by many nodes that can be users, organizations, issuers, and validators. Self-sovereign identity systems are one kind of such a system. They operate over their digital identities in a decentralized manner [25].

[...]

• A credential holder is an entity that has a license, permission, certificate, or registration issued by the government or a board.

Additionally, a person who has a pending application for a credential for not more than one year from the date the application was filed to the department is referred to as a "credential holder". An entity can play a role by having one or more verified credentials and using them to create presentations. Credential repositories represent a place where holders save their credentials.

• The issuer is the entity that creates the credential.

[...]

• Decentralized Identifiers (DIDs) are identifiers for decentralized systems where users can have verified digital identities [3]. They were introduced with the concept of self-sovereign identity. A DID can identify any entity. These identifiers allow a DID controller to demonstrate control over them. They may be used without a centralized registry, identity provider, or certificate authority. DIDs are Uniform Resource Identifiers (URIs) that link a DID subject to a DID document. An example of a DID is did:example:123456abcdef.

• A Decentralized Identifier Document, also known as a DID document, is a document that is accessible through a verifiable data registry and contains information connected to a specific decentralized identifier, such as the associated repository and public key information [3].
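To make the relation between a DID and its DID document concrete, the following minimal Python sketch splits the example identifier did:example:123456abcdef into its parts and builds a toy document for it. The regular expression is a simplified assumption that covers only the basic `did:<method>:<identifier>` shape, not the full grammar of the DID specification. The function names and the `publicKey` field are hypothetical placeholders chosen for illustration.

```python
import re

# Simplified pattern: "did:" + lowercase method name + ":" + method-specific id.
# Assumption: this checks only the basic shape, not the complete DID grammar
# (for example, percent-encoding rules are not validated here).
DID_PATTERN = re.compile(r"^did:([a-z0-9]+):([A-Za-z0-9._:%-]+)$")


def parse_did(did):
    """Split a DID string into its method and method-specific identifier."""
    match = DID_PATTERN.match(did)
    if match is None:
        raise ValueError(f"not a well-formed DID: {did!r}")
    return {"method": match.group(1), "id": match.group(2)}


def make_did_document(did, public_key):
    """Build a toy DID document linking the DID subject to key material."""
    parse_did(did)  # reject malformed identifiers early
    return {
        "id": did,
        "verificationMethod": [
            # "publicKey" is a placeholder field name used for this sketch.
            {"id": f"{did}#key-1", "controller": did, "publicKey": public_key}
        ],
    }


parts = parse_did("did:example:123456abcdef")
document = make_did_document("did:example:123456abcdef", "placeholder-key")
print(parts["method"], parts["id"])
print(document["id"])
```

In a deployed system the document would be resolved from a registry rather than built locally, and the key material would follow one of the formats defined by the chosen DID method.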

One of the critical aspects of blockchain is storing a cryptographic signature of recorded data and events [28].

Blockchains also help to protect transactions' data from being changed. In decentralized systems, permissioned and permissionless blockchains are two forms of blockchain technology.

In permissioned blockchain systems, there is a limited number of known trusted participants carrying a copy of the blockchain's ledger.

A permissionless blockchain refers to a system that allows anyone to join or leave the network. A well-known example of a permissionless blockchain is Bitcoin. It manages decentralized digital money without relying on a central authority.

The blockchain is made up of blocks and data packages that represent the historical data of transactions. The main blockchain property is that each block has a unique timestamp and hash value. Each block is linked to the previous block, referred to as a parent block. It is possible to return to the first or genesis block by following the parents. A new block is approved by most network participants through their consensus process and is then added to the validated block list. The information will be disseminated to several or all connected parties. After the consensus procedure is completed, all nodes that received the data will replicate and save an exact copy of the transaction information. This information is maintained individually on each node, resulting in trust between them. Activities and transactions must be paid for with cryptocurrency or credits. These credits incentivize participants to reach an agreement, through mechanisms such as proof of stake or proof of work, and to receive money from the transaction's commission. Participants in the network are also encouraged to compete to win extra credits.

Ethereum is a permissionless blockchain that uses smart contracts to operate, where Ether is its currency [29]. A smart contract is a Solidity-based computer program or transaction mechanism. It can carry out legally relevant events and activities automatically following the provisions of a contract or agreement. Figure 2 shows the interaction among users in the smart contract schema. First, the owner publishes an accommodation with the rental fee on the ledger. Then, other users, such as the Renter, can see the different announcements on the web portal. Next, the smart contract will be executed when a Renter accepts the contract terms. The transaction includes information such as Owner, Renter, Signatures, and the addresses of the two sides. The timestamp will be stored and shared among the network participants as a copy of the ledger. Then, all network members keep a copy of the proof and know who rented it, when it was rented, and to whom the accommodation was rented. This asset will never be removed or changed during the network lifecycle.
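The parent-linked structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation of Bitcoin or Ethereum: it omits consensus, signatures, and networking, and the function names, field names, and rental values are hypothetical. It only shows how storing each parent's hash makes a later change to a recorded transaction detectable.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 over the block's contents (everything except its own hash)."""
    payload = {k: block[k] for k in ("index", "timestamp", "data", "parent_hash")}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain, data, timestamp):
    """Append a block that points to the hash of its parent block."""
    parent_hash = chain[-1]["hash"] if chain else "0" * 64  # genesis has no parent
    block = {
        "index": len(chain),
        "timestamp": timestamp,
        "data": data,
        "parent_hash": parent_hash,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain):
    """Check every stored hash and every parent link back to the genesis block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_parent = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["parent_hash"] != expected_parent:
            return False
    return True


ledger = []
append_block(ledger, {"event": "genesis"}, timestamp=0)
# Hypothetical rental record, loosely following the Owner/Renter example.
append_block(ledger, {"owner": "Owner", "renter": "Renter", "fee": 100}, timestamp=1)
print(is_valid(ledger))        # the untouched chain passes the check
ledger[1]["data"]["fee"] = 1   # tamper with a recorded transaction
print(is_valid(ledger))        # the stored hash no longer matches
```

Because every node keeps its own copy of such a chain, a tampered copy disagrees with the others, which is the basis of the immutability property discussed in the text.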

E. uPort

uPort is a way of registering identity with the help of the Ethereum blockchain. It enables users to identify themselves and send information to others in a clear, transparent way [30]. Figure 3 shows three scenarios with uPort. Figure 3 [...]

[...] as blockchain. Users can remain under complete control of their data. In contrast to centralized solutions, decentralized systems ensure that private data remain immutable and secure and can only be shared when selected users consent to provide information. Figure 4 shows a relation between response time and privacy in three different architectures for hosting identification: 1) Internet service providers as a centralized host; 2) the cloud as a distributed host; and 3) DHT as a decentralized host. The throughput typically increases when changing from centralized to decentralized. One solution is using IPFS as a server to compile identification applications. It supports static and serverless identification versions to compile.

J. PERFORMANCE

A performance study is the systematic explanation of the action or process of performing a task or function. The performance can depend on different parameters. For example, application performance is calculated based on user accessibility and response time. This article defines high performance as the models' output providing high accessibility, immutability, throughput, and trust. Most of these parameters are achieved by combining blockchain and DHT in a decentralized manner. The rate of successful message delivery through a communication channel such as Ethernet or the Internet is called network throughput. Less execution time and high throughput are two independent properties of high-performance models. A system with these properties can also normally manage larger network sizes and tasks (scalability).
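The two quantities named above, execution time and throughput, can be computed from the same run but measure different things. The short Python sketch below illustrates this under stated assumptions: the function name `summarize` and its parameters are hypothetical, throughput is taken as successful requests per second of total elapsed time, and the sample values are made up for illustration rather than taken from the measurements in this article.

```python
from statistics import mean


def summarize(durations_s, wall_clock_s):
    """Summarize one test run.

    durations_s  -- execution time of each successful request, in seconds
    wall_clock_s -- total elapsed time of the whole run, in seconds
    """
    if not durations_s or wall_clock_s <= 0:
        raise ValueError("need at least one request and a positive run time")
    return {
        "mean_execution_time_s": mean(durations_s),
        "throughput_rps": len(durations_s) / wall_clock_s,
    }


# Made-up sample values for illustration only, not results from this study.
run = summarize([0.2, 0.4, 0.3, 0.3], wall_clock_s=2.0)
print(run)
```

Since requests can overlap in time, throughput is not simply the inverse of the mean execution time, which is why the two are treated as independent properties.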

In this section, we define and illustrate different identification models based on different technologies. A classical model is an easy form of identification. The architecture is of the [...]

[...]tion among decentralized systems. IPFS' role is to compile the identification software as a web application and allocate resources such as memory and storage to a virtual server.

The main difference between this and the BioIPFS model is that blockchain is a record-keeping extension through its immutable transactional environment.

[...] the developer registration only for that application. Therefore, all applications have separate and different codes. In the second step, the user must scan the QR code provided by the developer on the web application. After receiving a notification on their phone that asks permission to access some part of the information, the user sends the agreement from the phone to the application through the uPort mobile application. The main differences between this model and BioIPFS and BBID are that uPort takes the place of biometric recognition, and the blockchain does not need to act as a record-keeping module.

In this model, the IPFS service helps to load the ShoCard application. IPFS' duty is to compile the ShoCard software as an application and allocate resources such as memory and storage to a virtual server. Here, developers must set up the connection scripts in their applications. The ShoCard website provides these scripts and QR algorithms to the developers. Then, users should have the ShoCard application installed on their smartphone and register in the mobile application with images of their face and driving license. After registration, users should log in to the application hosted on IPFS. In the next step, the user must scan the QR code provided by the developer on the application. After receiving a notification on their phone that requests permission to access user information, the user approves the connection through the ShoCard mobile application. The main difference between this and the previous model is that ShoCard takes the place of uPort, with a different structural background, biometric recognition, and no immediate need for the blockchain as a record-keeping module.

In this article, we divided the evaluations into two parts. The first part's main target is to compare three different hosts for a decentralized application and measure the performance between cloud-based and IPFS-based versions. Microsoft Azure VM was used as a cloud-based service and cf-ipfs as a gateway for IPFS. A Linux/amd64 Ubuntu Server 20.04 LTS Gen2 image was installed on the Azure VM, with Docker v20.10.7 deploying a Node.js application on port 80. We set up Azure service Standard-B1s with 1 CPU and 1 GB of memory and Azure service Standard-D4s-v3 with 4 CPUs and 16 GB of memory. Apache JMeter 5.4.1 was the application used to measure performance.

Tables 1 and 2 display different tests to show the hosts' performance in the cloud and IPFS. Table 1 shows load testing with 100 samples, while Table 2 shows load testing with 1,000 samples. The results present the hosts' strengths by deploying the same application on three hosts. We chose the fastest gateway among the IPFS gateways (cf-IPFS.com) to compare with the MS Azure cloud service. Additionally, we chose central Sweden as the main resource backend and storage for MS Azure and the closest MS Azure datacenter.

In the second part, we compare the execution time of the four discussed models. JavaScript (npm 6.14.16 and [...]

[...] We consider 100 and 1,000 as the number of virtual users per request in these tests. The ramp-up value is ten, which means that JMeter will take 10 seconds for all 100 and 1,000 threads to be up and running. We tested two different configurations of cloud VM services with IPFS. Additionally, we compared different parameters, such as error, which shows the percentage of failed requests, and standard deviation, which measures how much the sample response times deviate from their average value. Finally, throughput is the number of requests that were [...]

FIGURE 5. Different execution time for identifying users to enter the system.

Table 3 and Figure 5 show the different models mentioned earlier by repeating them ten times. The average of ten rep[...]

Identification is essential for providing user identity to applications that fulfill the standard requirements of decentralized systems. As a result, identification should meet minimum requirements to capture users' trust.

This article started with a definition of decentralization and decentralized identification. We also highlighted how the DHT's immutability and speed could be a helpful mechanism for managing identification. Then, we addressed how to provide decentralized application hosting and data sharing in a DHT-based architecture. The IPFS-decentralized web hosting solution is also an excellent way to keep all systems decentralized while avoiding memory constraints.

This part compared different types of identification in a decentralized environment. As one of the new approaches, we considered SSI technology and a solution based on a machine learning facial recognition system, tuned with public blockchain and DHT technologies for better performance. Additionally, we showed that a system deployed on IPFS has better throughput than a system deployed on cloud services.

Although the combination of DHT and SSI effectively assists identification in a decentralized manner, it consumes a lot of energy and time. Blockchain is an excellent solution for keeping the record of identities immutable but still has a significant problem. It is slow when it calls and decodes a query of many transactions' data in a large-scale system.

We plan to introduce additional decentralized techniques for strong identification and to tune the system by adding smart devices as entities to be identified by the system. We are also considering expanding our research to include device identification and human reaction and response time measurements when using these devices as handheld technologies.