Temptations of the memex

Tagged: meta

Recently, I was having a chat with a friend who was expressing an interest in building a sort of database/ability to query their note-taking system. They use org mode, like me, and were starting to feel some limitations of their system. At first, I wondered, couldn't org-mode do everything you need already? And while I'm pretty sure you can bend org-mode and emacs to do nearly anything with text files, I started getting tempted down the path to building a custom system inspired by my friend's idea.

The memex

I heard of the idea of a "memex" sometime in 2016 or 2017 at a NodeSchool meetup, if I remember correctly. I filed it away as an interesting idea for later but never really explored it. Fast forward to 2020 and beyond, where many people's "digital gardens" and "second brains" have flourished on the internet. More and more people are interested in exploring systems for organizing their digital files. I love seeing this. It's fun to explore people's methods for organizing their data and see what they come up with. More people are building personal sites. Or, people are trying tools like Roam Research, Logseq, or Athens. The graph as a method for representing connections between ideas has become popular (again, perhaps). It's all very fascinating to me.

What do I need?

I spent a lot of time building Firn, and org-roam v1 still works for me. But at this rate, I'm wondering what my little wiki will be like in 10 years. In the end, because firn uses a parser that converts org-mode into a data structure, I've built something that, as a by-product, lets me interface with my wiki as data. I can get access to all kinds of metadata like dates and tags, and I could find a way to query them... but it might not be pretty.

Querying the past

When I think about my wiki in 10 years, I think I'd like to be able to ask things like:

  • show me what I was writing about five years ago

  • find that bookmark that I captured the other week related to elixir, but I can't remember what it was called

  • show me how I've spent my time over this week/month/year.

  • show me how I've connected concepts, ideas, and small notes in a zettelkasten type system, in a way that might even help me develop new ideas, connections, and concepts.

I can do the above with my current system, but it's a bit complex in that I'm relying on either org-roam's tags or ripgrep. I can't exactly access data by date or date ranges - unless I build that into firn somehow.

Capturing the present

One thing that most online documentation tools/graph notebooks don't have is a rigid structure for capturing data of a certain kind; you can put whatever you want in your notes. That's great for most people; I'm not sure that works for me.

What I've ended up creating over the past few years is a system of repeated archetypes that share some overlap, but remain distinct:

  • projects

  • research

  • blog posts

  • catalogue pages

I would love a system that would be slightly tailored for each archetype, but still enable bidirectional linking between related subjects. Specifically, the system would rigidly require me to input the required fields for each archetype (an album must be input with a year, the number of tracks, my rating, etc).

Let's explore each of the above a bit more.

Projects

My Projects page sums this up fairly well; a project is something that creates a finished artifact of some kind: it could be an album, a series of related paintings, a video game, etc.

Projects need the ability to be time-trackable, so I know how long it took to finish something (or how much time I spent before I gave up).

For projects, I want to have the following:

  • public pages that can display each project as a sort of portfolio representation.

  • the ability to query what projects got done in a time range.

  • to see what research might have been related to each project (ex: what books did I study about painting while I was working on a series, etc)

Research

Research is the exploration and note-taking that surrounds a practice or skill, but doesn't involve making something. Research would usually take the form of notes and reflections on a resource that helped me learn something (a YouTube video, a book, etc).

I also categorize the practicing of certain skills as research, such as learning a new language.

Blog posts

Blog posts are time-oriented posts that usually involve some reflection or sharing of things I've learned or am excited about. I've gotten into the habit of writing posts because it:

a) helps me clarify thoughts that are rattling around my head

b) by extension, helps me see how I arrived at certain thinking/solutions

c) can be of use to others

d) helps me improve my writing.

At the very least, blog posts follow the same schema as a project or research page - I want to be able to write them and query them easily. It should be possible to see related projects or research.

Beyond that, it would be nice to be able to engage readers; I don't know how I feel about comment systems, but the option would be nice.

Catalogue pages

I have accrued a few different catalogue pages that essentially amount to tables of data:

  • books I've read (their titles, number of pages, genre)

  • music (favourites I've found over the years)

  • quotes

  • movies

  • bookmarks

  • travel

Being able to capture these types of data in a structured way (for the express purpose of being able to more easily query for that information) would be helpful and generally just fun. I'd love to be able to quickly find out what albums I loved in 2018. Or to see, on a map, what places I went to in the summer of 2021.

These kinds of things are possible with my current situation, but the ability to query things extensively isn't very developed.

How would I do it?

So, how would I build a more queryable system?

The friend I was speaking with referred to his smallest piece of information as an "atom", which got me thinking. When you boil it down, 80% (or more) of my wiki is just plain text. Every piece of text shares these minimum pieces of metadata:

  • a title

  • the body content (optional)

  • the timestamp when it was created.

But then, how would I extend that to create more structured data? This is where the idea of a schema comes into play. If each atom could belong to or adopt a certain schema, that schema could require the user to fill in certain fields. If I'm creating an atom to document a book I just read, it needs additional fields: the author of the book, the number of pages, etc. So, when it came time to log this information, the 'create' view would essentially be a basic web form that, once told which schema to adopt, would change to show the additional form fields that are required. In a sense, it would be similar to the Todoist quick capture window, but more adaptive.

Instead, the popup would ask me for the specific metadata required by the schema of "book". The same goes for movies, bookmarks, blog posts, etc.
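
To make the schema idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `SCHEMAS` registry, the field names, and the helpers `form_fields` and `missing_fields` are hypothetical, not part of firn or any existing tool.

```python
# Hypothetical schema registry: each schema lists the extra fields
# an atom of that type must carry (names are illustrative only).
SCHEMAS = {
    "book": ["author", "pages"],
    "bookmark": ["url"],
    "task": ["is_done"],
    "post": [],
}

# Fields shared by every atom, regardless of schema.
BASE_FIELDS = ["title", "body"]


def form_fields(schema_type):
    """Fields the 'create' form would show once a schema is picked."""
    return BASE_FIELDS + SCHEMAS[schema_type]


def missing_fields(schema_type, metadata):
    """Required metadata fields that were not filled in."""
    return [name for name in SCHEMAS[schema_type] if name not in metadata]
```

With this, `form_fields("book")` would list title, body, author and pages, and `missing_fields("book", {"author": "..."})` would report that `pages` still needs filling in before the atom can be saved.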

All of the above would go into a single table. Maybe the table would be called "atoms" or "entities". The dynamic part of it - the part dictated by the schemas - would live in each row as a blob of JSON that fulfils the requirements of the schema the entity belongs to.

As a table, the data would look like this:

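Title              | schema_type | Body         | Metadata
-------------------+-------------+--------------+-------------------
buy milk           | task        | -            | {"is_done": false}
A blog post        | post        | hi there ... | {}
useful elixir post | bookmark    | --           | {url: https...}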
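
As a rough sketch of what that single table could look like, here is one possible version using SQLite from Python's standard library. The table name `atoms`, the column names, the `created_at` column, and the helper functions are all assumptions on my part; the metadata is simply stored as a JSON string.

```python
import json
import sqlite3
from datetime import datetime, timezone


def make_db():
    """Create an in-memory database with a single 'atoms' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE atoms (
            id          INTEGER PRIMARY KEY,
            title       TEXT NOT NULL,
            schema_type TEXT NOT NULL,
            body        TEXT,
            metadata    TEXT NOT NULL DEFAULT '{}',  -- JSON blob
            created_at  TEXT NOT NULL                -- ISO 8601 timestamp
        )
        """
    )
    return conn


def add_atom(conn, title, schema_type, body=None, metadata=None):
    """Insert one atom; the schema-specific fields go into the JSON blob."""
    created_at = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO atoms (title, schema_type, body, metadata, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, schema_type, body, json.dumps(metadata or {}), created_at),
    )
    return cur.lastrowid


def atoms_of_type(conn, schema_type):
    """Return (title, metadata dict) pairs for one schema type."""
    rows = conn.execute(
        "SELECT title, metadata FROM atoms WHERE schema_type = ? ORDER BY id",
        (schema_type,),
    )
    return [(title, json.loads(meta)) for title, meta in rows]
```

Usage would be something like `add_atom(conn, "buy milk", "task", metadata={"is_done": False})`, after which `atoms_of_type(conn, "task")` hands the metadata back as a plain dictionary.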

For example, if I were to capture an "atom" and apply the "todo task" schema to it, it would carry a blob of metadata with fields such as is_completed, due_date, has_reminder, etc.

This would change how and where that item is rendered.

This also would mean that I could query a single day and see what I did, what websites I bookmarked to help me achieve a task, and how much time I spent on project X, etc.
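
A query like that could be sketched as follows, again with SQLite and Python. The `atoms` table, its `created_at` column (an ISO 8601 string), the sample rows, and the helper `atoms_between` are all hypothetical; a single day is just a range whose start and end are the same date.

```python
import sqlite3


def atoms_between(conn, start_day, end_day):
    """Return (schema_type, title) for atoms created between two
    dates (inclusive), given as 'YYYY-MM-DD' strings."""
    rows = conn.execute(
        "SELECT schema_type, title FROM atoms "
        "WHERE date(created_at) BETWEEN ? AND ? "
        "ORDER BY created_at",
        (start_day, end_day),
    )
    return rows.fetchall()


# Illustrative data only: a tiny table with made-up timestamps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atoms (title TEXT, schema_type TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO atoms VALUES (?, ?, ?)",
    [
        ("buy milk", "task", "2021-07-01T09:00:00"),
        ("useful elixir post", "bookmark", "2021-07-01T14:30:00"),
        ("A blog post", "post", "2021-07-02T10:00:00"),
    ],
)

# Everything captured on a single day.
one_day = atoms_between(conn, "2021-07-01", "2021-07-01")
```

Time spent on a project would need its own fields (start and end times, for instance), which this sketch leaves out.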

Be careful...

All of this sounds pretty appealing. It's quite funny to me, because so many people got into static sites precisely because they were tired of having to run databases, the LAMP stack, etc. I also spent so long building firn that it seems pretty silly (or, typical of a project-loving programmer?) to refactor and migrate to something entirely different.

On the other hand, it's possible to justify these kinds of projects, because as my friend put it, it's an investment in yourself to catalogue and capture this kind of information - the information that has helped you learn, grow, and build what you want to build.

I guess it's a nice curse to have.