Discussing the paradigm of AI adoption in Web3, with an award-winning project
Updated on July 25th, 2022:
Currently, I am working on a new version of PrivateAI. Fresh ideas are needed. If you are interested in this field, or have something in mind and want to discuss, please feel free to reach out. I will be more than happy to connect with you :)
In Web 2.0, machine learning algorithms run on centralized servers, compromising our privacy while providing a good UX. When it comes to Web 3.0, there should be a new paradigm for AI adoption.
In this article, I will share my ideas about how to onboard AI in Web 3.0, along with my first attempt, PrivateAI, an award-winning project at an ETHGlobal hackathon.
Disclaimer: This is one of the rare articles that discusses AI in Web 3.0 in depth, with practical technical solutions. Given how new the topic is, and that this area has not been fully explored, parts of it may be updated in the future. Please be generous and share your opinions/feedback to help with this process.
You can also read : Web3 Adoption Curve
To start, I want to scratch the surface of two fundamental concepts: (1) The Architecture of a Web 3.0 application, (2) The Purpose of AI algorithms. The following discussion will be based on these concepts.
(1) The Architecture of a Web 3.0 application
There are three elements: the Front-end, the Provider, and the Blockchain. The Front-end is no different from the Web 2.0 world, except that it connects to a Blockchain either directly or indirectly through a Provider.
From an architecture standpoint, AI can be adopted in all three components, and here is how. In the Front-end, TensorFlow.js enables machine learning models to run and be trained inside the browser. The Provider is more like a centralized Web 2.0 environment, so AI can be applied there seamlessly. Additionally, in theory, smart contracts running on the Blockchain can implement gradient descent or solve machine learning model parameters in closed form.
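To make the on-chain option concrete, here is a minimal sketch of the kind of gradient-descent loop a smart contract would have to execute, written in plain JavaScript rather than Solidity. The function name and data are illustrative only; the point is that on-chain, every one of these parameter updates would be a state-changing, gas-consuming operation.

```javascript
// Sketch: one-variable linear regression (y = w*x + b) trained by
// gradient descent on mean squared error. On-chain, each update of
// w and b below would be a storage write that costs gas.
function trainLinear(xs, ys, lr = 0.01, steps = 2000) {
  let w = 0, b = 0;
  const n = xs.length;
  for (let s = 0; s < steps; s++) {
    let gradW = 0, gradB = 0;
    for (let i = 0; i < n; i++) {
      const err = w * xs[i] + b - ys[i]; // prediction error on point i
      gradW += (2 / n) * err * xs[i];
      gradB += (2 / n) * err;
    }
    w -= lr * gradW; // parameter updates: cheap in a browser,
    b -= lr * gradB; // expensive as contract storage writes
  }
  return { w, b };
}

// Fit the exactly linear data y = 2x + 1 from a few points.
const { w, b } = trainLinear([0, 1, 2, 3], [1, 3, 5, 7]);
```

The same model could instead be solved in closed form (ordinary least squares) before deployment, which is the "train off-chain first" option discussed below.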
(2) The Purpose of AI algorithms
There can be different approaches to classifying AI algorithms. Here I am dividing AI into Customized Algorithms and General Purpose Algorithms. The former gives unique output based on the requester's identity (e.g. a YouTube feed), while the latter is identity-agnostic (e.g. spam detection).
AI in Web3 Paradigms
Based on the previous discussion, we can now plot a matrix (two algorithm types × three components) to define the 6 scenarios for AI adoption in Web 3.0. The goal of this section is to figure out the theoretical best practice among these scenarios. Let's dig into them.
(1) Web2-Alike (Scenarios 2 and 5)
First, I would like to rule out Scenarios 2 and 5. Running AI in the Provider is essentially Web 2.0, so they share the same attributes.
(2) Best Practice for Customized Algorithms (Scenario 1 vs 3)
Next, for Customized needs, we can either run AI in the Front-end or in smart contracts on the Blockchain. I think the former is a much better solution, based on the following comparison.
Cost-wise, running TensorFlow.js in the Front-end costs nearly nothing beyond minimal computing power. However, if we use smart contracts to run the AI algorithm, then a new smart contract has to be deployed and trained for each user. Deployment alone can cost a lot, but the training cost, which involves updating a potentially huge number of parameters inside a contract hundreds or thousands of times, can be unexpectedly high. It is true that one can train the model and find the optimal parameters before deploying the smart contract. However, that requires technical expertise, which is infeasible for a consumer-facing product, to say nothing of the still-high deployment cost. So, Scenario 1 is clearly the winner in terms of cost.
Privacy-wise, keeping all the data inside your browser is clearly the more secure solution. In Scenario 3, to train the smart contract, users might have to expose their personal data. Moreover, the trained models that live inside smart contracts are publicly readable, which also leads to privacy concerns. Therefore, Scenario 1 wins again.
Speed-wise, this is the part I am not so sure about. I think the speed of Scenario 1 is acceptable for normal applications, as demonstrated in the last section of this article. As for the speed of the EVM, I am not certain; I might run some experiments in the future. Please let me know if you have any ideas! In this article, I will simply assume they run at a similar speed, based on my experience calling view functions inside contracts.
To sum up, for Customized Algorithms, you might want to implement them inside the Front-end using TensorFlow.js. There is one known downside, though: by doing so, the front-end might request much more data than needed from the Blockchain. As a result, it can be memory-draining and hurt the performance of the browser.
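A rough sketch of that pattern, and of its downside, in plain JavaScript (no real Blockchain or TensorFlow.js calls; all names and data here are made up for illustration): the front-end pulls an unfiltered batch of items and the local model filters them client-side, which is exactly why more data crosses the wire than the user ever sees.

```javascript
// Stand-in for reading posts from a Blockchain/Provider. In a real
// dApp this would be contract calls or an indexer query.
function fetchRawFeed() {
  return [
    { id: 1, text: 'gm frens' },
    { id: 2, text: 'you are an idiot' },
    { id: 3, text: 'new governance proposal is live' },
    { id: 4, text: 'stupid project, total scam' },
  ];
}

// Stand-in for a local model's toxicity score (higher = more toxic).
// A real implementation would call an in-browser model instead.
function toxicityScore(text) {
  const toxicWords = ['idiot', 'stupid', 'scam'];
  return toxicWords.filter((w) => text.includes(w)).length;
}

function buildFeed() {
  const raw = fetchRawFeed(); // 4 items fetched from the chain...
  return raw.filter((p) => toxicityScore(p.text) === 0); // ...2 shown
}
```

The filtering never leaves the browser, which is the privacy win; the over-fetching of items that are immediately discarded is the memory/performance cost noted above.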
You can also read: Tokenizing Real Estate Assets On Blockchain
(3) Best Practice for General Purpose Algorithms (Scenario 4 vs 6)
Then, for General Purpose algorithms, things are different given there is no need to instantiate a smart contract per user.
Cost-wise, Scenario 4 is identical to Scenario 1: TensorFlow.js costs nearly nothing. For Scenario 6, the smart contract needs to be deployed only once and can serve everyone.
Privacy-wise, since general purpose tasks normally do not require users’ data, both solutions are safe.
Speed-wise, the comparison is also identical to the previous section. Without concrete evidence, I would assume they are similar.
To sum up, personally I would still use TensorFlow.js in the front-end, simply because I think it is easier to implement. However, I do not think there is a clear winner until I learn more about the speed difference.
PrivateAI (Demo fits Scenario 1)
I participated in the LFGrow Hackathon and won the Most Creative Use of ML award, as well as a Finalist award (Top 10 projects).
This project uses AI models running in the browser to enhance the user experience of a social network dApp. There are two models: (1) The first model detects malicious information and protects users against it. It is a pre-trained model based on roughly 2 million toxic online comments, provided by TensorFlow. (2) The second model recommends social media content based on the user's preferences. Whenever users like/dislike a piece of content, that signal is used for model training, which happens inside the browser. Over time, the model gets better at recommending/hiding posts according to the user's preferences. Since model training happens inside the browser, users' data never leaves it, which keeps their data secure.
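The second model's idea can be sketched as a tiny online learner. This is not PrivateAI's actual code, just an illustration in plain JavaScript of the principle: the weights live in browser memory and are nudged by one SGD step per like/dislike, so neither the training data nor the model ever leaves the client.

```javascript
// Hypothetical in-browser preference model: a logistic-regression-style
// score over bag-of-words features, updated online on every like/dislike.
const weights = new Map(); // word -> learned weight, kept client-side

function score(text) {
  // Sigmoid of the summed word weights -> probability the user likes it.
  const s = text.toLowerCase().split(/\s+/)
    .reduce((sum, w) => sum + (weights.get(w) || 0), 0);
  return 1 / (1 + Math.exp(-s));
}

function recordFeedback(text, liked, lr = 0.5) {
  // One SGD step on the log-loss for this single interaction.
  const err = (liked ? 1 : 0) - score(text);
  for (const w of text.toLowerCase().split(/\s+/)) {
    weights.set(w, (weights.get(w) || 0) + lr * err);
  }
}

// The user likes NFT posts and dislikes DeFi posts a few times...
for (let i = 0; i < 10; i++) {
  recordFeedback('new nft drop', true);
  recordFeedback('defi yield farming', false);
}
// ...and the model now ranks similar content accordingly.
```

PrivateAI itself uses TensorFlow.js models rather than this hand-rolled update, but the privacy property is the same: `weights` and the feedback history only ever exist in the user's browser.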
Here is the demo video:
There you have it! I will keep working on PrivateAI, potentially turning it into a developer tool that developers can use to add AI to their dApps. Let me know if you want to partner with me! I can do all the work myself, but I am passively looking to team up with and learn from great people. Together we can achieve something bigger.