Most of the data produced in recent years is stored by cloud providers. However, their infrastructure is still largely invisible to users and leaves many concerns unaddressed.
Currently, a small number of corporations own nearly all of the global data storage market. This is troublesome for two reasons. First, users have no option but to blindly trust the service providers to keep their data safe from exposure. Second, data is usually stored hundreds to thousands of kilometers away from where it is consumed.
Meanwhile, a significant amount of storage sits unused, from the spare space on our laptops to smart devices in IoT networks, owned by large corporations and small businesses alike. A much more efficient system would decouple raw hard-drive space, i.e. the act of storing, from the other steps a data management service has to provide. The question is: how can we build a truly functional, easy-to-use data storage and management ecosystem on top of existing infrastructure?
Edge Computing, a relatively new phenomenon that stands in opposition to the cloud, emphasizes keeping data as close as possible to where it is produced and consumed. This reduces data transfer, which is costly and wasteful with respect to bandwidth. It also prevents unwanted aggregation of data in a central location, which is a key source of privacy issues.
As fascinating as Edge Computing sounds, there are a number of challenges that need to be addressed as described below:
A. Edge devices usually have limited capacity and unstable availability. In many cases, neighboring devices have to collaborate to complete a task.
B. Data and services should be distributed across the network and yet easily managed by the owner.
C. Requirements and offers of the different participants of the network should be matched.
D. Current edge solutions do not treat the owner of the device as an active actor. It is crucial to have the owner's consent for all major activities of the device within the network (e.g. sharing and serving data).
E. Deletion of data upon owner request needs to be assured.
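To make challenge C more concrete, here is a minimal sketch of matching a storage requirement against the offers of nearby peers. The field names (`free_gb`, `uptime`, `distance_km`), the peer names, and the "closest eligible peer wins" rule are all assumptions made for illustration; they are not taken from EdgeDrive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageOffer:
    peer_id: str
    free_gb: float
    uptime: float        # fraction of time the device is reachable, 0.0 to 1.0
    distance_km: float   # distance from the data consumer


@dataclass(frozen=True)
class StorageRequest:
    size_gb: float
    min_uptime: float
    max_distance_km: float


def match_offer(request: StorageRequest, offers: list[StorageOffer]) -> Optional[StorageOffer]:
    """Return the closest offer that satisfies the request, or None if there is none."""
    eligible = [
        o for o in offers
        if o.free_gb >= request.size_gb
        and o.uptime >= request.min_uptime
        and o.distance_km <= request.max_distance_km
    ]
    return min(eligible, key=lambda o: o.distance_km, default=None)


# Hypothetical offers from three peers.
offers = [
    StorageOffer("laptop-1", free_gb=20, uptime=0.60, distance_km=1),
    StorageOffer("gateway-7", free_gb=200, uptime=0.99, distance_km=5),
    StorageOffer("server-3", free_gb=500, uptime=0.999, distance_km=800),
]
request = StorageRequest(size_gb=50, min_uptime=0.95, max_distance_km=50)

best = match_offer(request, offers)
print(best.peer_id if best else "no match")  # gateway-7
```

In this example the laptop is too small and too unreliable, and the remote server is too far away, so the nearby gateway is chosen. A real matching service would likely weigh more criteria, such as price or the owner's consent policy.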
As discussed here, to overcome A and B, decentralization seems to be the right solution. First, however, we need to understand the data storage life cycle and identify its building blocks. When we ask a system to store data, four main services work together to perform the task: storing, locating, sharing, and deleting.
Looking at the building blocks above, the key to building a decentralized ecosystem is to decouple these services from each other. In other words, different peers in the network can take on different responsibilities and offer different services, especially with respect to the storing and locating services. The actor who stores the data is potentially different from the actor who helps the user find it.
Sharing and deletion are logically tied to the storing service. This means that a peer that did not store the data cannot be the one that shares it or is responsible for deleting it.
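The separation described above can be sketched in a few lines. This is only an illustration under assumed names (`StoringPeer`, `LocatingPeer`, and their methods are hypothetical, not EdgeDrive interfaces): one peer holds the bytes and is therefore the one that shares and deletes them, while another peer only keeps track of where the data lives.

```python
from typing import Optional


class StoringPeer:
    """Holds the bytes, so it is also the one that shares and deletes them."""

    def __init__(self, peer_id: str) -> None:
        self.peer_id = peer_id
        self._blobs: dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def share(self, key: str) -> bytes:
        return self._blobs[key]

    def delete(self, key: str) -> bool:
        # True if something was actually removed.
        return self._blobs.pop(key, None) is not None


class LocatingPeer:
    """Knows only which peer holds a key; it never sees the data itself."""

    def __init__(self) -> None:
        self._index: dict[str, str] = {}

    def register(self, key: str, peer_id: str) -> None:
        self._index[key] = peer_id

    def locate(self, key: str) -> Optional[str]:
        return self._index.get(key)

    def unregister(self, key: str) -> None:
        self._index.pop(key, None)


storer = StoringPeer("gateway-7")
locator = LocatingPeer()

storer.store("report.pdf", b"quarterly numbers")
locator.register("report.pdf", storer.peer_id)

print(locator.locate("report.pdf"))   # gateway-7
print(storer.share("report.pdf"))     # b'quarterly numbers'

# Deletion happens where the data is stored; the index is then cleaned up.
storer.delete("report.pdf")
locator.unregister("report.pdf")
print(locator.locate("report.pdf"))   # None
```

Because the locating peer stores no payload, it can be run by a different party than the one that offers the disk space.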
Undeniably, decentralization is a move in the right direction. However, it poses a new set of challenges. Namely, distributing data and decentralizing services lead to an exponential increase in management and maintenance overhead and in security risk, especially in a multi-party network.
To overcome C, D, and E the system has to support this workflow:
NATIX has developed a data and storage management system that aims to distribute the services while providing the same experience that users get from any cloud-based competitor. By combining different building blocks, it draws a comprehensive picture of an ecosystem that enables data and storage sharing with the necessary built-in security measures.
To elaborate on the function of each block in the diagram, here are short descriptions of the technologies used.
Decentralized identifier (DID): A decentralized, verifiable digital identity that forms the underlying layer of EdgeDrive device management. It might seem obvious that a decentralized system needs a decentralized identity; nevertheless, we will discuss this topic extensively in a future post.
SLIX: A P2P certificate exchange protocol that provides live access management in a multi-party network.
IPFS: A peer-to-peer protocol that connects devices to store and share files, creating a decentralized file system.
DLT: Besides facilitating the decentralized public key infrastructure, a distributed ledger functions as the core of the system and the source of truth. It records the agreements between the parties of the network in the form of legitimate contracts and makes all the transactions of the system traceable. To leverage the potential of distributed systems across various underlying technologies (e.g. Sawtooth, Corda, Amazon Aurora, etc.), the current implementation of EdgeDrive is based on DAML.
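Two of the ideas above, files identified by their content and a ledger that makes transactions traceable, can be illustrated with a small sketch. It is a simplification under stated assumptions and not how EdgeDrive, IPFS, or DAML are implemented: real IPFS identifiers (CIDs) add encoding and chunking on top of the hash, and a real distributed ledger is replicated and validated by several parties. The names `content_address` and `Ledger`, and the record fields, are hypothetical.

```python
import hashlib
import json


def content_address(data: bytes) -> str:
    """Derive an identifier from the content itself (simplified stand-in for an IPFS CID)."""
    return hashlib.sha256(data).hexdigest()


class Ledger:
    """Append-only log in which every entry commits to the one before it."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _hash(prev_hash: str, record: dict) -> str:
        payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def append(self, record: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry_hash = self._hash(prev_hash, record)
        self.entries.append({"prev": prev_hash, "record": record, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edit to an earlier record breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if self._hash(prev_hash, entry["record"]) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


ledger = Ledger()
address = content_address(b"sensor readings")

# Hypothetical agreement records between an owner and a storing peer.
ledger.append({"action": "store", "owner": "alice", "peer": "gateway-7", "content": address})
ledger.append({"action": "delete", "owner": "alice", "peer": "gateway-7", "content": address})
print(ledger.verify())   # True

ledger.entries[0]["record"]["peer"] = "someone-else"   # tamper with history
print(ledger.verify())   # False
```

The point of the sketch is only that each record references data by its content hash and that history cannot be rewritten silently, which is what makes store and delete requests traceable.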
To realize the Edge Computing platform vision and build a product that is competitive with comparable cloud offerings, various challenges have to be addressed. As for data and storage management, the system should keep the data local, yet easily accessible. Moreover, increases in management load and security risk are unavoidable consequences of decentralization and resource sharing. This is where NATIX EdgeDrive shines, by offering a holistic solution that addresses all of these concerns at once.
Interested in experimenting with NATIX EdgeDrive in real-life scenarios? Then drop us an email at email@example.com, and our team will get back to you in no time.
DISCLAIMER: This post only reflects the author’s personal opinion, not any other organization’s. This is not official advice. The author is not responsible for any decisions that readers choose to make.