In the information era, data has become a commodity whose value, some argue, will soon surpass that of gold. This is already evident from the high prices that personal data, especially the health records of high-profile individuals, fetches on the dark web (Schein & Trautman, 2020). However, such prices are negligible compared with the scale at which entities like Google, Facebook, Amazon, and governments collect, store, and use data despite ethical and privacy concerns. Under the guise of interconnectivity and tighter integration to aid the delivery of services, companies and governments are increasingly concentrating power in their own hands. With the exception of the General Data Protection Regulation (GDPR) in force in EU countries, the law and the legislative process in the rest of the world have yet to catch up with the disruptive pace of technology and protect the general public (Wachter et al., 2017). Therefore, it is crucial to understand the causes of these concerns and to propose solutions.
In this capstone project, we argue that the root cause of the aforementioned concerns, and of others that could emerge in the future, is the infrastructural design of the internet. As it is used today, the internet is largely centralized, which allows a few key stakeholders to exploit their position. After all, being centralized implies a single point of failure and, from a cybersecurity perspective, a vulnerability waiting to be exploited. Everyone should be concerned about this vulnerability, whether they use the internet or not, because history has demonstrated that such exploits can have real-world effects. For instance, in the Cambridge Analytica scandal, personal data harvested from Facebook was used for political targeting, subverting the natural course of democracy (O'Neil, 2016). Fake news is another serious concern, especially when it is targeted at specific individuals using data collected from social media platforms (Allcott & Gentzkow, 2017). The patch for the identified vulnerability is to decentralize the internet infrastructure.
The InterPlanetary File System (IPFS) is one viable patch. According to Benet (2014), IPFS is a peer-to-peer distributed file system inspired by numerous successful systems from academic research and the open-source community. Put modestly, IPFS has the potential to usher in the next era of information technology, one that gives users more privacy and more control over their data. Much like the algorithms that guarantee trust on blockchain networks, IPFS identifies content by its cryptographic hash, so users can trust the content they are consuming without necessarily trusting the source. The following is a brief explanation of the ideas IPFS borrows from existing systems.
As a decentralized file system, IPFS needs to record and quickly look up where all the files are stored, and to do so securely. It therefore uses Distributed Hash Tables (DHT), which store and maintain metadata about the entire peer-to-peer system across the peers themselves (Louati et al., 2018). For added security (and collision resistance), IPFS uses SHA-256, a proven hash algorithm that is a mainstay of cryptography and of blockchain networks like Bitcoin. Storage, however, is at a premium for most users. Therefore, IPFS borrows the idea of block exchange from the BitTorrent network to coordinate activity between peers who do not need to trust each other (Benet, 2014). In other words, nodes on the network store pieces of a particular file and share them when requested.
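The ideas in this paragraph can be illustrated with a minimal sketch. It is not the IPFS implementation: the names `Peer`, `ToyIndex`, `publish`, and `retrieve` are hypothetical, the block size is deliberately tiny, and a plain in-memory dictionary stands in for a real DHT. The sketch assumes only that each block is identified by the SHA-256 hash of its bytes, which lets a requester verify a block received from a peer it does not trust.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so a short message spans several blocks


def block_id(data: bytes) -> str:
    """Identify a block by the SHA-256 hash of its content."""
    return hashlib.sha256(data).hexdigest()


def split_into_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a file into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class Peer:
    """Hypothetical node that stores blocks keyed by their hash."""

    def __init__(self, name: str):
        self.name = name
        self.blocks: dict[str, bytes] = {}

    def store(self, data: bytes) -> str:
        key = block_id(data)
        self.blocks[key] = data
        return key

    def fetch(self, key: str) -> bytes:
        return self.blocks[key]


class ToyIndex:
    """Stand-in for a DHT: maps a block id to the peers that hold it."""

    def __init__(self):
        self.providers: dict[str, set[str]] = {}

    def announce(self, key: str, peer_name: str) -> None:
        self.providers.setdefault(key, set()).add(peer_name)

    def find_providers(self, key: str) -> set[str]:
        return self.providers.get(key, set())


def publish(data: bytes, peers: list[Peer], index: ToyIndex) -> list[str]:
    """Spread the blocks of a file over the peers and record who has what."""
    keys = []
    for i, block in enumerate(split_into_blocks(data)):
        peer = peers[i % len(peers)]  # simple round-robin placement
        key = peer.store(block)
        index.announce(key, peer.name)
        keys.append(key)
    return keys


def retrieve(keys: list[str], peers_by_name: dict[str, Peer],
             index: ToyIndex) -> bytes:
    """Fetch every block from some provider, rejecting tampered blocks."""
    pieces = []
    for key in keys:
        for name in sorted(index.find_providers(key)):
            block = peers_by_name[name].fetch(key)
            if block_id(block) == key:  # trust the hash, not the peer
                pieces.append(block)
                break
        else:
            raise LookupError(f"no provider returned a valid block for {key}")
    return b"".join(pieces)


if __name__ == "__main__":
    peers = [Peer("a"), Peer("b")]
    index = ToyIndex()
    keys = publish(b"hello ipfs", peers, index)
    by_name = {p.name: p for p in peers}
    print(retrieve(keys, by_name, index))
```

Because a block's identifier is derived from its content, a peer that returns altered bytes is detected immediately: the recomputed hash no longer matches the requested key, and `retrieve` refuses the block.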
However, in some applications the shared data changes and thus needs to be constantly updated. IPFS, therefore, borrows from version control systems like Git to distribute different versions of a file to the entire network efficiently (Benet, 2014). Finally, the average user will not be concerned with these infrastructural details. IPFS, therefore, borrows the naming scheme of the Self-Certifying File System (SFS) to give users stable, verifiable names that hide the implementation- and platform-dependent features of the network (Benet, 2014).
Proposal
Based on the findings of the literature review and background research on the subject matter, there is a need to understand the technology behind IPFS, with a special focus on identifying existing gaps, problems, and spaces that can serve as opportunities for future development. In particular, we noticed that few frameworks exist to help host a website on top of IPFS, and the few that were available were unreliable, hard to find, and difficult to understand. Therefore, this capstone project proposes a deep dive into IPFS as a distributed file system and into the problems it can solve. One of the artifacts of the project will be a website that documents all activities and findings from the research. A developer who wants to host a website on IPFS should find it a useful guide and reference.
References
Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Journal of Economic Perspectives, 31(2), 211-236.
Benet, J. (2014). IPFS - content addressed, versioned, P2P file system (DRAFT 3). arXiv preprint arXiv:1407.3561.
Louati, T., Abbes, H., Cérin, C., & Jemni, M. (2018). LXCloud-CR: Towards Linux containers distributed hash table based checkpoint-restart. Journal of Parallel and Distributed Computing, 111, 187-205.
O'Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown.
Schein, D. D., & Trautman, L. J. (2020). The dark web and employer liability. Colo. Tech. LJ, 18, 49.
Wachter, S., Mittelstadt, B., & Russell, C. (2017). Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech., 31, 841.