Decentralized Index of Knowledge

Background

The Index of Knowledge project holds a special place in my heart since it’s always been a B@B initiative parallel to my own. When I helped teach the DeCal in Fall 2017, we had a lot of interest in compiling a master list of resources that we at B@B considered good learning materials: an opinionated, curated list of resources.

It seems intuitive to have a central repository of resources serving as a baseline truth from which to write our course material, train our members, administer certifications, etc., but every time it was revived, the project died within a semester. Difficulty of maintenance, centralization, and inaccessibility all contributed to its downfall. From then on it was colloquially deemed a “cursed” project that no one ever wanted to pick up, at least in my mind. Some more history of the project is in the linked powerpoint above.

Note

These are the observations of a “computer scientist” (actually I just hack projects together…). I use some terminology that might not be standard or accurate, but that makes some sense to me.

Analysis

Last semester, coinciding with the proposed redesign of the Blockchain Fundamentals DeCal and the revamp of the Blockchain for Developers DeCal, I gained an interest in using computer science for education. I pitched some ideas to Armando Fox of the ACE (Algorithms and Computing for Education) research lab in an attempt to inspire interest in the area as a potential grad school research focus, but none of the existing PhD students wanted to pursue an idea different from their own research focus. They only remarked that it was an interesting idea and that similar ideas have been explored before in the area of personal knowledge indices, or in using ontologies to generate “wrong” answers to multiple choice questions. Basically no one wanted to work on something different, so I decided to investigate in my own free time.

I have no real formal training in education, but I have been a student for many years, observing how I learn best and how good and bad teachers and professors do their jobs, so I decided to take an algorithmic perspective. I take the modern belief that personalized education is the future, and it seemed that from a fundamental (simplistic if you will) perspective, students differ in their initial pursuit of new knowledge. This boils down to how they explore the knowledge space given to them. A simple way to think about this (and perhaps a cool experiment to conduct) is: given a topic to investigate, a test to take, and the entirety of some linked data set (e.g. Wikipedia), how would they study? Would they attempt to understand the individual components first, or the big picture first, or a mix of both?

I remember reading some articles (or was it my mom and her friends on Line..) on the difference between Eastern and Western perspectives on academia, particularly mathematics. Of course it was a generalization, but the claim was that in the East, students learn from the bottom up, cranking through problem sets mechanically until they finally come up with the patterns or algorithms themselves, whereas in the West, teachers spend a great deal of time helping students understand “main concepts”, as is apparent from their constant pursuit of new ways of teaching math (à la Common Core).

If we lay out bits of knowledge as a directed graph, where each edge represents a prerequisite for learning, it seems that each student would benefit from identifying their own unique traversal of it.

The idea was to present knowledge as an unopinionated graph and to represent learning as a traversal of that graph.
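To make this concrete, here is a minimal sketch of a prerequisite graph and a check for whether a given study order respects it. The topics and the `is_valid_traversal` helper are hypothetical, made up for illustration and not taken from the IoK codebase. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+) to produce one valid order.

```python
from graphlib import TopologicalSorter

# Hypothetical toy graph: each topic maps to the set of topics it requires first.
prerequisites = {
    "hashing": set(),
    "networking": set(),
    "digital signatures": {"hashing"},
    "transactions": {"digital signatures"},
    "blocks": {"hashing", "transactions"},
    "consensus": {"blocks", "networking"},
}


def is_valid_traversal(order, prereqs):
    """Return True if every topic appears exactly once, after all its prerequisites."""
    seen = set()
    for topic in order:
        if topic in seen or not prereqs[topic] <= seen:
            return False
        seen.add(topic)
    return len(seen) == len(prereqs)


# One valid order among several; the sorter yields prerequisites before dependents.
one_order = list(TopologicalSorter(prerequisites).static_order())

# Two students, two different but equally valid traversals of the same graph.
student_a = ["hashing", "digital signatures", "transactions",
             "blocks", "networking", "consensus"]
student_b = ["networking", "hashing", "digital signatures",
             "transactions", "blocks", "consensus"]
```

The graph itself stays unopinionated: it only says what depends on what, and any order that passes the check is a legitimate way to learn the material.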

A parallel initiative last semester worked on writing a proof-of-concept blockchain textbook for a non-technical audience. They were, in comparison, highly opinionated in what, how, and in what order they presented information, but in my mind it was a good case study of pedagogical traversal.

It was more of a level-order traversal, in that it would first present the complete big picture of blockchain at low resolution, and then refine that big picture more and more. In other words, a BFS where each depth level peels back another layer of abstraction. The book, when completed, would still not traverse too deep, so as to appeal to the younger non-technical audience. However, it’s clear that if it were written for a more technical audience, the traversal would go deeper more quickly, especially for students with more knowledge in certain areas.
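The level-order idea can be sketched as a plain BFS that groups topics by depth. The outline below is a hypothetical example, not the textbook's actual table of contents, and `max_depth` is an assumed knob standing in for "how deep this audience wants to go".

```python
from collections import deque

# Hypothetical outline: each topic maps to the more detailed subtopics that refine it.
refines = {
    "blockchain": ["ledger", "consensus"],
    "ledger": ["transactions", "blocks"],
    "consensus": ["proof of work"],
    "transactions": [],
    "blocks": [],
    "proof of work": [],
}


def levels(graph, root, max_depth=None):
    """Group topics by BFS depth from root, optionally stopping at max_depth."""
    seen = {root}
    queue = deque([(root, 0)])
    result = []
    while queue:
        node, depth = queue.popleft()
        # BFS visits nodes in non-decreasing depth, so we can stop at the first overshoot.
        if max_depth is not None and depth > max_depth:
            break
        if depth == len(result):
            result.append([])
        result[depth].append(node)
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return result


full_book = levels(refines, "blockchain")
shallow_book = levels(refines, "blockchain", max_depth=1)
```

Each inner list is one "resolution" of the big picture. A book for a non-technical audience would stop at a small `max_depth`, while a more technical one would keep going.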

Method

The first reboot of the IoK I authored was a graphical PoC. It featured rich graphics by the cytoscape graph library, and users could add/edit nodes and edges with Google authentication. It had minimal integration with Google Docs, and could also generate markdown files, which could be rendered into pseudo-awesome-lists. The underlying “data-store” was a JSON file on GitHub.

I had plans to use GitHub’s public sharing and PR system as a “decentralization” layer, such that interested contributors could fork the repo, run a script to add nodes/edges to the IoK (and generate metadata), and make a PR in a standard format. Verified collaborators on the main IoK project could then accept or decline the PRs as a form of Proof-of-Authority. Users would of course always be welcome to fork the IoK to run their own.

Everything was clunky though, and the use of GitHub, running scripts, etc., seemed like poor UX, especially if I wanted to make the IoK a more generalized digital pedagogical tool.

Decentralize it

Previous iterations of the Index of Knowledge used an editing system similar to Proof of Authority. While it worked in concept, by the time it came to getting people to adopt it, it was difficult to maintain, especially since everything had to be audited by a core group of people. Like moving from testnet to prod, we needed to scale by decentralization. GitHub forks gave this to us for free, but I was looking for something less hacky and more decentralized. After all, everything hosted on GitHub is kind of cheating…

I had learned about and played with Blockstack at Hack Davis, where we used it for decentralized identity management and data streaming (via personal data lockers), and it seemed like a good fit for the IoK. It also didn’t hurt that I was freshly graduated (and unemployed) with nothing too exciting going on, aka every day was like a hackathon.

The idea was to move the graph data for the IoK from GitHub to individuals’ own data lockers on Blockstack. Users could then use that to define their own IoKs. Users can also view each others’ IoKs and fork them to make their own edits.

I’ve also written a small webscraper that visits an IoK (e.g. my own), downloads the graph data, and linearizes it into an awesome-list to be committed back to GitHub as a README.md file. Currently, this is automated in a Travis build job to save myself money, but ideally we could write some webhooks for this, or run it on an interval with cron.
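As a rough illustration of the linearization step, the sketch below turns a small graph into nested Markdown bullets with a depth-first walk. The JSON shape (`nodes` with `id`/`name`/`url`, `edges` with `source`/`target`) and the `to_awesome_list` function are assumptions for the example, not the IoK's actual schema or scraper code, and the URL is a placeholder.

```python
# Hypothetical graph data, in an assumed nodes/edges shape.
graph = {
    "nodes": [
        {"id": "blockchain", "name": "Blockchain"},
        {"id": "bitcoin-paper", "name": "Bitcoin whitepaper",
         "url": "https://example.com/bitcoin"},
        {"id": "ethereum", "name": "Ethereum"},
    ],
    "edges": [
        {"source": "blockchain", "target": "bitcoin-paper"},
        {"source": "blockchain", "target": "ethereum"},
    ],
}


def to_awesome_list(graph, root):
    """Linearize the graph into nested Markdown bullets via depth-first traversal."""
    nodes = {node["id"]: node for node in graph["nodes"]}
    children = {}
    for edge in graph["edges"]:
        children.setdefault(edge["source"], []).append(edge["target"])

    lines, seen = [], set()

    def visit(node_id, depth):
        if node_id in seen:  # list each node once, even if several paths reach it
            return
        seen.add(node_id)
        node = nodes[node_id]
        label = f"[{node['name']}]({node['url']})" if node.get("url") else node["name"]
        lines.append("  " * depth + "- " + label)
        for child in children.get(node_id, []):
            visit(child, depth + 1)

    visit(root, 0)
    return "\n".join(lines)


readme_body = to_awesome_list(graph, "blockchain")
```

Flattening a graph into a list necessarily imposes one particular order on it, so the README is best read as one opinionated view of the IoK rather than the IoK itself.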

Currently, we can fork IoKs, but can’t merge them back together. I’m writing a small opinionated webserver that performs simple graph join operations. Say another user has an IoK with a subgraph you really want to include in your own (e.g. a whole mini-IoK about MakerDAO). With this, you could borrow that and merge it with your own IoK.
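A minimal sketch of what such a join could look like: take everything reachable from a chosen node in someone else's IoK and union it into your own. The data shape (a dict of node labels plus a list of edge tuples), the function names, and the "my version wins on conflict" rule are all assumptions for illustration, not the webserver's actual behavior.

```python
def reachable(graph, root):
    """Return the set of node ids reachable from root, including root itself."""
    children = {}
    for source, target in graph["edges"]:
        children.setdefault(source, []).append(target)
    seen, stack = {root}, [root]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def merge_subgraph(mine, theirs, root):
    """Union the subgraph of `theirs` rooted at `root` into a copy of `mine`."""
    keep = reachable(theirs, root)
    nodes = dict(mine["nodes"])
    for node_id in keep:
        # Assumed conflict rule: if both IoKs define a node, keep my version.
        nodes.setdefault(node_id, theirs["nodes"][node_id])
    borrowed = {(s, t) for s, t in theirs["edges"] if s in keep and t in keep}
    edges = sorted(set(mine["edges"]) | borrowed)
    return {"nodes": nodes, "edges": edges}


# Hypothetical example: borrow a mini-IoK about MakerDAO from another user.
mine = {"nodes": {"ethereum": "Ethereum"}, "edges": []}
theirs = {
    "nodes": {"bitcoin": "Bitcoin", "makerdao": "MakerDAO", "dai": "Dai", "cdp": "CDP"},
    "edges": [("bitcoin", "makerdao"), ("makerdao", "dai"), ("makerdao", "cdp")],
}
merged = merge_subgraph(mine, theirs, "makerdao")
```

Note that the borrowed subgraph arrives disconnected from the rest of your IoK. Deciding where to attach it (for example, an edge from `ethereum` to `makerdao`) is left to the user, which is part of what makes any real merge tool opinionated.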

The roadmap

I’m currently working on this project in my spare time, but am always looking for help! It’s by far the biggest personal project I’ve started and maintained for this long, and I’d like to see it go somewhere.

The IoK can be found on GitHub. It’s still very hacky and in indefinite beta. Currently some B@B education members are contributing to it for their internal project requirement, and I think it’s a good way for them to get some hands-on programming experience.

Alternative architectures

I have a tendency to switch between architectures on a whim, esp. with this being a personal project that started off with little documentation and just my brain going wild. While this project still has very little documentation, here are some notes off the top of my head:

GitHub entirely vs blockchain-based systems:

UI: graph first vs text first:

Firebase suite vs Rust on cloud deploy/etc: