The Architecture & Algorithm of Solid PODs
Inrupt describes Solid Personal Online Datastores, or PODs, as “where you store your data… Once stored in a Pod, you control who can access your data” (Get a Pod · Solid, n.d.). PODs can store any kind of data, from text to videos to social networking data. You can think of PODs as “containers” for data, or even “digital twins” of users (Bader & Maleshkova, 2020). PODs are web-accessible storage servers that can be “hosted” (owned and operated) either by an individual or by a corporate POD Provider. Below, we outline the different parts of PODs (architecture) and how they store and regulate access to data (algorithm).
Architecture
POD Provider: Entities responsible for “hosting” PODs. POD Providers are similar to cloud storage services like Dropbox (Sambra et al., 2016). The POD Provider determines which data storage laws (depending on the geographic location of the Provider) and which third-party data access protocols apply to the user’s data. For the sake of this discussion, let’s assume that our POD is hosted by Inrupt Pod Spaces in the United States. Inrupt Pod Spaces provides the physical servers on which user data in PODs are stored.
Resource Storage: The POD’s storage dedicated to housing user data. Resource storage houses both non-Resource Description Framework (non-RDF) data, such as JPEG or PDF files, and RDF data, such as JSON-LD (a JSON-based format for linked data). Whether RDF or non-RDF, all data are stored as files, like on your laptop. This data comes from any Solid-compatible applications the user interacts with.
Linked Data Platform Support (LDP “Containers”): PODs contain code enabling them to work with LDPs. PODs contain both “private” and “public” LDP containers (Bader & Maleshkova, 2020), each with different “access control rules” established by the user that determine what data is shared with other parties and what data is kept private.
Access Control: The feature of PODs that permits users to designate who can access what data.
WebID: When a user starts using a POD to store data, they are given a WebID that then serves as their means of accessing their POD(s). WebID is Solid’s identification system, which is “decentralized and openly extensible” (Sambra et al., 2016, p. 5). Because access control rules name the WebIDs they apply to, a WebID is also a means of controlling access to your data.
Notification Support: Code within PODs that alerts users to changes in data contents or access, similar to how your phone likely includes “push notifications” that say when you have a new text or Instagram follower.
Patch Update: Patch update capabilities allow information in Solid PODs to be edited, re-coded, or otherwise manipulated by authorized parties.
Access Control List (ACL) Support: ACLs are documents that keep track of who has access to what data. ACL support enables PODs to keep track of who has been authorized to access data (Web Access Control (WAC), n.d.).
HTTP Web Standards: Like current web technologies, Solid uses the HTTP standard to transmit data securely over the web (and across different components within the PODs themselves) (Capadisli et al., 2021).
Representational State Transfer (REST) Service: RESTful data management is an HTTP service that enables RDF data to be moved and manipulated with HTTP (Capadisli et al., 2021).
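To make the Access Control and ACL components above more concrete, here is a minimal sketch in Python of how a POD might decide whether a request is allowed. It is an illustration under stated assumptions, not how any Solid server is actually implemented: real ACLs are RDF documents, while this sketch keeps the rules in a plain Python list. The access modes (Read, Write, Append, Control) are the ones named by Web Access Control; the WebIDs, resource paths, and the names Authorization and is_allowed are hypothetical.

```python
from dataclasses import dataclass, field

# Access modes named by Web Access Control.
READ, WRITE, APPEND, CONTROL = "Read", "Write", "Append", "Control"


@dataclass
class Authorization:
    """One simplified ACL rule: which agents get which modes on one resource."""
    resource: str                              # path of the resource in the POD
    agents: set = field(default_factory=set)   # WebIDs the rule applies to
    modes: set = field(default_factory=set)    # access modes the rule grants
    public: bool = False                       # True if the rule covers everyone


def is_allowed(acl, webid, resource, mode):
    """Return True if any rule grants `mode` on `resource` to `webid`."""
    for rule in acl:
        if rule.resource != resource:
            continue
        if mode in rule.modes and (rule.public or webid in rule.agents):
            return True
    return False


# Hypothetical WebIDs and resource paths, for illustration only.
ALICE = "https://alice.example/profile/card#me"
BOB = "https://bob.example/profile/card#me"

acl = [
    Authorization("/private/diary.ttl", {ALICE}, {READ, WRITE, CONTROL}),
    Authorization("/public/photo.jpg", {ALICE}, {READ, WRITE, CONTROL}),
    Authorization("/public/photo.jpg", modes={READ}, public=True),
]

print(is_allowed(acl, BOB, "/public/photo.jpg", READ))    # True
print(is_allowed(acl, BOB, "/private/diary.ttl", READ))   # False
```

In this sketch, anything not explicitly granted is denied, which mirrors the idea that the user decides what is shared and everything else stays private.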
Deblackboxing Solid
Algorithm
In the current version of the web, companies store their users’ data. So, a user may have data stored separately on servers owned by Google, Twitter, Facebook, and Amazon, just to name a few. As users continue to give these companies information about themselves or interact with these platforms, the data is constantly read and updated. As mentioned on the What is Solid’s Impact page, this current structure does not allow the user to revoke access to their data. Solid seeks to circumvent this problem by having users store their data in a POD, where they keep control of it. Once a user obtains a POD and purchases hosting services or hosts the POD themselves, they are ready to participate in the Solid ecosystem. But, in the decentralized web that Tim Berners-Lee envisions, how does an application access and update the data stored in a user’s POD?
Authentication with WebID-TLS
In the first part of this process, a user must verify their identity in order to approve access to their POD. Solid uses the WebID-TLS protocol to authenticate users. To begin, the user’s web browser creates a certificate that contains a field called Subject Alternative Name (Sambra et al. 2016). The certificate is used to perform public key authentication, which relies on a pair of keys: a message encrypted with the public key can only be decrypted with the matching private key. As their names suggest, public keys can be distributed while private keys should be kept secure (Public-key cryptography, 2021). Once the user’s WebID profile is located on the identity profile server, the public key is added to the user’s profile, and the certificate is updated so that its Subject Alternative Name field contains the location of the user’s WebID profile. Now that the certificate and profile are linked, a user can be verified by navigating to the WebID profile named in the certificate and checking that the public key listed there matches the public key in the certificate; the user’s browser proves that it holds the corresponding private key when the secure connection is set up (Inkster et al. 2014). While this seems like a very complicated series of steps, it is a simple authentication process for the user, who merely has to choose which certificate they would like to use. Unlike many current methods of authentication, this method does not require exposing login credentials, like passwords (Sambra et al. 2016).
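The matching step at the heart of this process can be sketched in a few lines of Python. This is a simplified illustration, not the WebID-TLS implementation: it assumes the TLS handshake has already shown that the user holds the private key for the certificate, it replaces fetching a profile over the web with a dictionary lookup, and it uses a toy key value. The names PublicKey, Certificate, PROFILES, and verify_webid, along with the WebID shown, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicKey:
    modulus: int
    exponent: int


@dataclass
class Certificate:
    subject_alternative_name: str   # the WebID URI the user claims
    public_key: PublicKey           # the key presented in the certificate


# Hypothetical WebID and toy key values, for illustration only.
ALICE = "https://alice.example/profile/card#me"
ALICE_KEY = PublicKey(modulus=0xC0FFEE, exponent=65537)

# Stand-in for fetching profile documents over the web:
# maps a WebID URI to the public keys listed in that profile.
PROFILES = {ALICE: [ALICE_KEY]}


def verify_webid(cert, profiles):
    """Return the WebID if its profile lists the certificate's key, else None.

    Assumes the TLS handshake already proved that the user holds the
    private key matching cert.public_key.
    """
    webid = cert.subject_alternative_name
    listed_keys = profiles.get(webid, [])
    return webid if cert.public_key in listed_keys else None


good_cert = Certificate(ALICE, ALICE_KEY)
forged_cert = Certificate(ALICE, PublicKey(modulus=0xBADBAD, exponent=65537))

print(verify_webid(good_cert, PROFILES))     # https://alice.example/profile/card#me
print(verify_webid(forged_cert, PROFILES))   # None
```

The forged certificate fails because anyone can write Alice’s WebID into a certificate, but only Alice can publish a key in her own profile.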
Modifying Data in a RESTful System
Once the user is authenticated, they can allow applications to read and write data stored in the PODs. This transfer of data is made possible because data is managed in a REST-compliant, or RESTful, way (Sambra et al. 2016). REST, which stands for Representational State Transfer, is a framework that provides standards which make communication between systems on the web easier (What is REST?, n.d.). RESTful systems streamline this transfer of information by separating the concerns of the server, where the data is stored, from those of the client, the application that communicates with the server. Rather than being run by commands, RESTful systems solely use resources that describe particular objects stored on the server. This enables the client to send requests to get, modify, or delete a resource on the server by using a series of HTTP verbs. The four basic HTTP verbs used are GET, which retrieves a specific resource or collection of resources; POST, which creates a new resource; PUT, which updates a specific resource; and DELETE, which removes a specific resource (What is REST?, n.d.). The server then sends a response which informs the client about the status of the request. So, for instance, if a user wanted to update an event they created with a calendar app, the client would send the server a PUT request containing the id of the event they wished to modify. After finding the event, the server would update the relevant information and send a response back to the client informing it that the request had been completed successfully.
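The calendar example can be sketched as a tiny in-memory “server” in Python. This is an illustration only: it does not use the network, it is not how a Solid server is built, and the EventStore class and the /events paths are hypothetical. It does show the pattern described above, in which each request pairs an HTTP verb with a resource and each response carries a standard HTTP status code.

```python
import itertools


class EventStore:
    """A toy RESTful server for calendar events, kept entirely in memory."""

    def __init__(self):
        self._events = {}               # event id -> event data
        self._ids = itertools.count(1)  # source of new event ids

    def handle(self, verb, path, body=None):
        """Return (status_code, payload) for one request."""
        parts = path.strip("/").split("/")
        if parts[0] != "events" or len(parts) > 2:
            return 404, None                      # Not Found
        if len(parts) == 1:                       # the collection: /events
            if verb == "GET":
                return 200, dict(self._events)    # OK
            if verb == "POST":
                event_id = str(next(self._ids))
                self._events[event_id] = body
                return 201, {"id": event_id}      # Created
            return 405, None                      # Method Not Allowed
        event_id = parts[1]                       # one resource: /events/<id>
        if event_id not in self._events:
            return 404, None
        if verb == "GET":
            return 200, self._events[event_id]
        if verb == "PUT":
            self._events[event_id] = body
            return 200, body
        if verb == "DELETE":
            del self._events[event_id]
            return 204, None                      # No Content
        return 405, None


store = EventStore()
status, created = store.handle("POST", "/events", {"title": "Lunch", "time": "12:00"})
print(status, created)                            # 201 {'id': '1'}
path = f"/events/{created['id']}"
print(store.handle("PUT", path, {"title": "Lunch", "time": "13:00"}))
# (200, {'title': 'Lunch', 'time': '13:00'})
print(store.handle("DELETE", path))               # (204, None)
print(store.handle("GET", path))                  # (404, None)
```

Note that the client never tells the server how to do its work; it only names a resource and a verb, and reads the status code that comes back.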
Storing Data in a POD
Because user data is kept separate from applications in the Solid ecosystem, the data must be formatted in the POD in a way that can be consistently interpreted by different apps. To accomplish this, app developers must use existing vocabularies, or create their own, to describe data. These vocabularies, also known as ontologies, enable data to be linked to demonstrate relationships between things and properties. As mentioned above, in RESTful systems like Solid, HTTP verbs modify resources on a user’s server, or POD. In the POD, a user’s data is organized in the same manner as the folders and files on your computer. But, instead of being opened like a Microsoft Word document, the files are identified by Internationalized Resource Identifiers (IRIs). IRIs are simply links that identify documents, much like the URL that you used to get to this webpage (http://solid.georgetown.domains/). When a request is made to retrieve a file, the link takes the client to a document that is hosted on the server. These files may be described within the Resource Description Framework (RDF), which links items identified by IRIs to properties and property values. For instance, a resource about me may be linked to the property “favorite color,” which has the property value “red.” By providing more information about resources, web information has a more precise meaning and can be interpreted by computers (XML RDF, n.d.). If you are interested in creating your own vocabulary, the Solid Project offers a great tutorial on how to do this.
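The “favorite color” example can be written out as RDF-style statements, each made of a resource, a property, and a property value. The Python sketch below stores such statements as plain tuples rather than using an RDF library, so it is a simplified model only. The IRIs for the person and for the favorite color property are hypothetical; the name property is borrowed from the existing FOAF vocabulary to show how an existing vocabulary can be reused.

```python
# Hypothetical IRIs, for illustration only.
ME = "https://alice.example/profile/card#me"
FAVORITE_COLOR = "https://vocab.example/terms#favoriteColor"

# A term from an existing vocabulary (FOAF) that many apps already understand.
NAME = "http://xmlns.com/foaf/0.1/name"

# Each statement is a (resource, property, property value) triple.
triples = {
    (ME, NAME, "Alice"),
    (ME, FAVORITE_COLOR, "red"),
}


def values_of(graph, subject, predicate):
    """Return every value that `graph` gives for `predicate` on `subject`."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


print(values_of(triples, ME, FAVORITE_COLOR))   # ['red']
print(values_of(triples, ME, NAME))             # ['Alice']
```

Because the property is itself an IRI, two different apps that agree on the same vocabulary can read this statement the same way without knowing anything about each other.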
It is clear that Solid is an immensely complex system. However, the processes that the Solid team are using to redesign the web are protocols that are already being used in our current version of the web. But, what are the sociotechnical implications of this new system?
Unintended Consequences
The team’s research of Solid PODs encouraged a rich and reflective discussion with Dr. Meg Leta Jones (available to listen here). Through this conversation, the team developed further curiosities about the true impact of this technology and the unintended consequences the system may have. Fundamentally, Solid PODs put user control and awareness of data management above all else. Though this level of autonomous data organization is unique among existing web services, our team is cautious about the burden that data ownership and management places on the user. Current systems place this burden on developers and applications. Ideally, engaging with Solid PODs should be an opportunity for empowerment, not an obligation to acquire education and experience in data protection. In addition, the team developed questions about the accountability guardrails surrounding third-party POD providers. How is the Solid team advancing their proposal while simultaneously combatting disenfranchisement through user data? The safety and regulation of data management is an ambitious yet necessary endeavor, but there are some structural issues that Solid must address.
References
Bader, S. R., & Maleshkova, M. (2020). SOLIOT—Decentralized Data Control and Interactions for IoT. Future Internet, 12(6), 105. https://doi.org/10.3390/fi12060105
Capadisli, S., Berners-Lee, T., Verborgh, R., Kjernsmo, K., Bingham, J., & Zagidulin, D. (2020). The Solid Protocol. Solid Project. https://solidproject.org/TR/protocol.
Get a Pod · Solid. (n.d.). Solid Project. Retrieved April 18, 2021, from https://solidproject.org/users/get-a-pod
Inkster, T., Story, H. & Harbulot, B. (2014, May 28). In Story, H., Corlosquet S. & Sambra, A (Eds.) WebID-TLS: W3C editor’s draft. Retrieved on May 1, 2021 from http://www.w3.org/2005/Incubator/webid/spec/drafts/ED-webid-20130206.
Public-key cryptography. (2021, April 20). In Wikipedia. Retrieved May 1, 2021, from https://en.wikipedia.org/w/index.php?title=Public-key_cryptography&oldid=1018892072.
Sambra, A.V., Mansour, E., Hawke, S., Zereba, M., Greco, N., Ghanem, A., Zagidulin, D., Aboulnaga, A., & Berners-Lee, T. (2016). Solid: A platform for decentralized social applications based on linked data. MIT CSAIL and Qatar Computing Research Institute. Retrieved from: http://emansour.com/research/meccano/solid_protocols.pdf.
Web Access Control (WAC). (n.d.). Web-Access-Control-Spec. Retrieved April 29, 2021, from http://solid.github.io/web-access-control-spec/
What is REST? (n.d.). Codecademy. Retrieved on May 1, 2021 from https://www.codecademy.com/articles/what-is-rest.
XML RDF. (n.d.). W3 Schools. Retrieved on May 3, 2021 from https://www.w3schools.com/xml/xml_rdf.asp.