Built to Last: Data and Computing Power
A look at Matrix’s approach to data and computing power.
The three core pillars of artificial intelligence are data, computing power and AI models. Following the release of the Matrix 2.0 Green Paper, the team is publishing a series of articles delving deeper into key aspects of Matrix 2.0.
In part 1 of this series, the Matrix team introduced the relationship between data, computing power and AI models. In short, high-quality data and massive amounts of computing power are required to efficiently train AI models and algorithms. While simple at a glance, data and computing power issues are no small hurdles. This article serves to introduce a few prominent data and computing power issues as well as Matrix’s plan to solve them by building a blockchain-powered multi-dimensional big data platform — Matrix 2.0.
Much of the modern world’s data can be usefully divided into one of two categories: one-dimensional big data and multi-dimensional small data.
One-dimensional big data
One-dimensional big data refers to massive quantities of data with limited variety. For example, a large retailer might collect huge quantities of data, yet nearly all of it relates to purchasing behavior.
Multi-dimensional small data
Everything an individual does produces some kind of data. While multi-dimensional small data covers many facets of behavior, it all relates to a single individual. In other words, multi-dimensional small data refers to small quantities of data with high variety. This kind of data is usually insufficient to support machine learning.
Both one-dimensional big data and multi-dimensional small data have limitations when it comes to training AI models. Limitations, whether of breadth or of depth, are ultimately still limitations. Compounding the issue, large corporations are naturally averse to granting access to their hard-earned data, in part because rights management solutions and protections remain insufficient. As a result, so-called data islands emerge, further limiting and stagnating the development and training of artificial intelligence.
Clearly, the status quo is insufficient. Something new is needed. But how does one go about encouraging users to share their own data? How does one acquire high-quality data?
There are two major data acquisition problems that need to be solved. The first is that data providers often have no clear idea of the value of their data. Some may not even know that their data has any value at all! This leaves many people open to exploitation. Data providers constantly miss opportunities to use their data to generate value. Even worse, most data providers receive no fair compensation from the entities who use their generated data for financial gain. This is a Data Attribution problem. The second is that, after a long history of data privacy, security and abuse scandals, many data providers are wary of sharing their data. This is a Data Security and Privacy problem.
Data Attribution
The Matrix AI Network is building a blockchain-powered multi-dimensional big data platform. Data providers, whether individuals or enterprises, earn rewards for sharing their data. All data uploaded to the platform is encrypted, distributed and stored on the blockchain, which records ownership and tracks attribution. A platform that can guarantee and protect the rights of all parties goes a long way towards solving data attribution issues. The next problem begging for a solution is the Data Security and Privacy problem.
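To illustrate the attribution idea, here is a toy append-only ledger in Python: each entry binds a data fingerprint (a hash, not the data itself) to an owner and to the hash of the previous entry, so ownership records cannot be silently rewritten. The `AttributionLedger` class and its field names are hypothetical, a minimal sketch of the concept rather than the actual Matrix implementation.

```python
import hashlib
import json

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class AttributionLedger:
    """Toy append-only ledger: each entry links a data fingerprint to its owner
    and to the previous entry's hash, forming a tamper-evident chain."""

    def __init__(self):
        self.entries = []

    def register(self, owner: str, data: bytes) -> dict:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        entry = {
            "owner": owner,
            "data_hash": sha256(data),  # only the fingerprint is stored, never the data
            "prev_hash": prev_hash,
        }
        entry["entry_hash"] = sha256(json.dumps(entry, sort_keys=True).encode())
        self.entries.append(entry)
        return entry

    def verify_chain(self) -> bool:
        """Recompute every hash; any edit to an entry breaks the chain."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: e[k] for k in ("owner", "data_hash", "prev_hash")}
            if e["prev_hash"] != prev:
                return False
            if e["entry_hash"] != sha256(json.dumps(body, sort_keys=True).encode()):
                return False
            prev = e["entry_hash"]
        return True

ledger = AttributionLedger()
ledger.register("alice", b"sensor readings, 2020-01")
ledger.register("bob", b"purchase history, 2020-01")
print(ledger.verify_chain())  # True
```

Because each entry commits to its predecessor, retroactively reassigning ownership of any record invalidates every later entry, which is the property a real blockchain provides at network scale.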
Data Security and Privacy
The Matrix AI Network is relying on two core technologies to preserve data security and privacy: Federated Learning and Homomorphic Encryption.
While no single blogpost can introduce Federated Learning in full, suffice it to say that it is essentially distributed machine learning. With Federated Learning, each node trains a model on its own local portion of the dataset. Only the resulting model updates are then aggregated to produce a global model. Raw data never leaves its node, and no single party has access to the entirety of the dataset; this helps ensure data security and privacy.
Homomorphic encryption is a cryptographic technique that allows computation on ciphertext. It generates an encrypted result which, when decrypted, matches the result of the same operations performed on the plaintext. Using homomorphic encryption means data scientists never access plaintext during training, which also ensures data privacy.
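To make the homomorphic property concrete, here is a deliberately insecure textbook example: unpadded RSA with tiny primes is multiplicatively homomorphic, so multiplying two ciphertexts yields a valid ciphertext of the product. The article does not specify which scheme Matrix uses; practical systems would rely on schemes supporting richer operations than multiplication alone.

```python
# Textbook RSA with tiny primes: insecure, for illustration only.
p, q = 61, 53
n = p * q                  # public modulus, 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 6
# Multiply the ciphertexts only; neither plaintext is ever exposed here.
c = (encrypt(a) * encrypt(b)) % n
print(decrypt(c))  # 42, i.e. 7 * 6, computed entirely on ciphertext
```

The identity behind this is `(a^e * b^e) mod n = (a*b)^e mod n`: the multiplication performed on ciphertexts decrypts to the multiplication of the plaintexts.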
While Federated Learning and Homomorphic Encryption can solve data security and privacy problems, individually uploaded user data alone is unlikely to result in a vibrant blockchain-powered multi-dimensional big data platform. As such, the Matrix AI Network is developing access channels and interfaces to support data sharing between the Matrix AI Network and other data giants and chains.
On-Demand Computing Resources
A major component of Matrix 2.0 is on-demand computing resources. The Matrix AI Network blockchain is the foundation that enables connected nodes to grow into a flexible global supercomputing network. This massive computing network will provide a long-term impetus for the development of artificial intelligence.
It goes without saying that the formation of a global supercomputing network will not occur in the blink of an eye. Improving the Matrix AI Network blockchain to more effectively deliver, share and transform the network’s computing power is a long-term project and is arguably the most vital component of Matrix 2.0. Besides continuously improving the underlying blockchain, the Matrix AI Network is engaged in several projects aimed at driving the growth of the network’s computing power potential. One such project involves a new cloud computing architecture.
New Cloud Computing Architecture
Although cloud computing platforms currently exist and are growing in popularity, there are still many practical issues. For one, as centralized infrastructures, most cloud computing service providers have full access and control over user data. This poses a security and privacy risk to potential users.
While current cloud computing architectures can largely meet the aggregate computational needs of a blockchain-powered multi-dimensional big data platform, geographical factors and other constraints make real-time and parallel processing difficult, if not entirely untenable. This directly limits the demand for current cloud computing architectures, especially in areas such as industrial data and Industry 4.0. Another downside, and perhaps the most pertinent to users, is cost: current cloud computing architectures are incredibly expensive to use. There is clearly ample room for improvement.
Relying on its AI and technological expertise, the Matrix team created a high-performance decentralized blockchain platform that provides strong structural protections for user data, distributed storage and aggregate computing power. The team is also upgrading the Matrix AI Network blockchain to support a new cloud-fog-terminal computing architecture, which comprises three layers:
- Cloud Computing Layer: Responsible for intensive computing tasks without high response time requirements.
- Fog Computing Layer: Responsible for real-time tasks that require little computational power.
- Terminal Computing Layer: Smart terminals that serve specific applications.
Massive computing power is not only useful for training AI models. It is also needed to support industrial big data applications, AI-powered medical treatment, financial modeling and more. This three-layer computing architecture dynamically determines which tier of computing power a given task requires, thereby reducing waste and lowering the cost of computing for individuals and businesses.
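One way to picture this dynamic tier selection is a simple dispatch rule driven by a task’s latency budget and computational intensity. The `assign_layer` function and its thresholds below are purely illustrative assumptions; the article does not describe the actual scheduling criteria.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    max_latency_ms: float  # how quickly a result is needed
    compute_units: float   # rough measure of computational intensity

def assign_layer(task: Task) -> str:
    """Route a task to the cheapest tier that can still meet its requirements.
    Thresholds are hypothetical, chosen only to make the example concrete."""
    if task.max_latency_ms < 50 and task.compute_units < 10:
        return "terminal"  # latency-critical, trivial compute: run on the device
    if task.max_latency_ms < 500:
        return "fog"       # near-real-time, modest compute: nearby fog nodes
    return "cloud"         # heavy workloads with relaxed deadlines

print(assign_layer(Task("train-model", 60_000, 5_000)))  # cloud
print(assign_layer(Task("sensor-alert", 20, 1)))         # terminal
print(assign_layer(Task("video-analytics", 200, 50)))    # fog
```

The cost saving comes from the fallthrough order: a task is only escalated to a more expensive tier when a cheaper one cannot satisfy its latency or compute needs.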
In addition to the new cloud computing architecture, the Matrix team continues to optimize and improve the Matrix AI Network blockchain so that each and every node on the blockchain can participate in the training of AI models and in the completion of other tasks.
Matrix AI Network nodes will both initiate collaborative computing tasks and join tasks initiated by others. Each node receives routing, addressing and computational procedures and processes tasks locally. Using this approach, results can be computed across a distributed network of local devices.
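The initiate, route, compute-locally, merge pattern can be sketched in a few lines, with worker threads standing in for network nodes. This is a toy analogy of the approach, not the actual Matrix protocol; the chunking and merge logic are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def process_locally(node_id: int, chunk: list) -> int:
    """Each 'node' runs the same procedure on its assigned chunk of the task."""
    return sum(x * x for x in chunk)

def collaborative_compute(data: list, n_nodes: int = 4) -> int:
    """The initiating node splits the task, routes one chunk to each
    participating node, then merges the partial results."""
    chunks = [data[i::n_nodes] for i in range(n_nodes)]
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = pool.map(process_locally, range(n_nodes), chunks)
    return sum(partials)

result = collaborative_compute(list(range(1, 101)))
print(result)  # 338350, the sum of squares 1..100
```

The essential property mirrored here is that no node ever sees the whole input: each computes over its local slice, and only the small partial results travel back to be merged.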
In part 3 of this series, the Matrix team will delve deeper into the third core pillar of artificial intelligence — AI Models and algorithms.