Linking this wiki together

Tagged: meta firn rust org

Links have been the difficult part of this wiki. They are what has consistently broken - everything else has largely been fine. Let's talk about why.

Laying some groundwork

First off, this wiki is built on org-mode. Org-mode can link to many different kinds of content: other org-mode files, images (which can be displayed inline in Emacs), source code... and several others I know very little about.

What else?

Well, linking goes in one of three directions:

Type      Example
Up        file:../my/file.org
Straight  file:sibling_file.org
Down      file:descending/down/to/my_file.org
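The three directions in the table above can be told apart by looking at the path components of the link. This is only a minimal sketch: `Direction` and `link_direction` are hypothetical names for illustration, not something Firn defines.

```rust
use std::path::{Component, Path};

/// Hypothetical classification of a `file:` link (not part of Firn).
#[derive(Debug, PartialEq)]
enum Direction {
    Up,
    Straight,
    Down,
}

/// Classify an org `file:` link by where it points relative to the current file.
fn link_direction(org_link: &str) -> Direction {
    // Drop the `file:` prefix if it is there.
    let path = org_link.strip_prefix("file:").unwrap_or(org_link);

    let mut goes_up = false;
    let mut normal_segments = 0;
    for comp in Path::new(path).components() {
        match comp {
            Component::ParentDir => goes_up = true,
            Component::Normal(_) => normal_segments += 1,
            _ => {}
        }
    }

    if goes_up {
        Direction::Up
    } else if normal_segments > 1 {
        // At least one directory before the file name.
        Direction::Down
    } else {
        Direction::Straight
    }
}
```

With this sketch, `file:../my/file.org` classifies as `Up`, `file:sibling_file.org` as `Straight`, and `file:descending/down/to/my_file.org` as `Down`.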

To transform an org "file" link into valid html we have to look at two things:

Component       Example
Source of link  /Users/weakty/wiki/notes/misc/originating_file.org
The link path   file:../../books.org

Each of the two has its own components that need to be considered:

/Users/weakty/wiki/notes/misc/originating_file.org
<-------a--------><----b-----><--------c--------->

In the originating file's path, we need to know the directory the wiki itself lives in (a), all the parent directories between the wiki root and the file (b), and the name of the file we came from (c).
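As a rough illustration of that split, here is a sketch using only the standard library. `split_source` is a hypothetical helper, and it assumes the wiki root is already known; it is not how Firn does it.

```rust
use std::path::{Path, PathBuf};

/// Split an absolute source path into (wiki root, parents inside the wiki, file name).
/// Returns None if `source` is not inside `wiki_root` or has no usable file name.
/// Hypothetical helper for illustration only.
fn split_source(wiki_root: &Path, source: &Path) -> Option<(PathBuf, PathBuf, String)> {
    // (b) + (c): everything below the wiki root.
    let relative = source.strip_prefix(wiki_root).ok()?;
    // (c): the file we came from.
    let file_name = relative.file_name()?.to_str()?.to_string();
    // (b): the directories between the wiki root and the file.
    let parents = relative.parent().unwrap_or(Path::new("")).to_path_buf();
    // (a): the wiki root itself.
    Some((wiki_root.to_path_buf(), parents, file_name))
}
```

For the example path, with `/Users/weakty/wiki` as the root, this gives `notes/misc` for (b) and `originating_file.org` for (c).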

file:../../books.org
<-a-><-b--><---c--->

In the link itself, we need to know that it is indeed a file link (a), whether and how many times the link moves up (b), and the file name, so it can be converted from .org to .html (c).
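Those three pieces of the link can be pulled apart with `std::path::Component`. The sketch below is an illustration under my own naming: `ParsedLink` and `parse_file_link` are hypothetical and do not appear in Firn.

```rust
use std::path::{Component, Path, PathBuf};

/// The pieces of a `file:` link: how many `..` it climbs, and the html target below that.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct ParsedLink {
    ups: usize,
    target: PathBuf,
}

fn parse_file_link(org_link: &str) -> Option<ParsedLink> {
    // (a) only handle `file:` links; anything else returns None.
    let raw = org_link.strip_prefix("file:")?;

    let mut ups = 0;
    let mut target = PathBuf::new();
    for comp in Path::new(raw).components() {
        match comp {
            // (b) count each step up.
            Component::ParentDir => ups += 1,
            // (c) keep every normal segment, not just the last one.
            Component::Normal(seg) => target.push(seg),
            _ => {}
        }
    }

    // Convert the `.org` extension to `.html`.
    if target.extension().map_or(false, |ext| ext == "org") {
        target.set_extension("html");
    }
    Some(ParsedLink { ups, target })
}
```

For `file:../../books.org` this yields two steps up and a target of `books.html`; a web link such as `https://example.com` yields `None` because it has no `file:` prefix.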

Mistakes were made

Let's look at some of the naive mistakes I made when I first started rewriting Firn in Rust. My first error? I should have started with some tests when it came time to start transforming one string into a path and back into a string.

Attempt 1

Here's my first attempt at handling links beyond a flat-file structure.

pub fn transform_org_link_to_html(
    base_url: BaseUrl,
    org_link_path: String,
    file_path: PathBuf,
) -> String {
    let clean_file_link = |lnk: String| -> String {
        // if it's a link up a directory...
        let mut result = String::from("");
        for i in lnk.split("../") {
            if !i.is_empty() || i != "file:" {
                result = i.to_string();
            }
        }
        str::replace(&result, "file:", "")
    };

    let mut link_path = org_link_path;

    // -- handle different types of links.

    // <1> -- It's a local org file.
    if is_local_org_file(&link_path) {
        link_path = clean_file_link(link_path);
        link_path = str::replace(&link_path, ".org", ".html");
        return base_url.build(link_path, file_path);

    // ....

The main thing we're looking at is clean_file_link, which I made a closure because I thought I would be using it with the varying link types org has. In its first iteration, clean_file_link split a link on ../ and was meant to keep everything except the file: prefix and the ../ pieces. (As written, the || condition is true for every piece, so it really just keeps whatever follows the last ../.) This seemed to work fine for most links - for anything that was "going up" to a parent, I assumed I could append the link's file name to the end of the site's baseurl.

This didn't work, but I didn't notice because the majority of my wiki is a flat-file system. Then I started blogging. The blog is one directory deep, and occasionally links up to top-level files. I won't go spelunking through my git history, but I believe that was one source of broken links.
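To see the problem concretely, here is the attempt 1 closure lifted into a standalone function (same splitting logic, but taking a `&str`). It only covers the cleaning step, not whatever `BaseUrl::build` did with the result at the time.

```rust
/// Standalone copy of the attempt 1 closure, as a plain function.
fn clean_file_link(lnk: &str) -> String {
    let mut result = String::new();
    for piece in lnk.split("../") {
        // Same condition as the original. It holds for every piece
        // (an empty piece is still != "file:"), so `result` simply
        // ends up as the last piece of the split.
        if !piece.is_empty() || piece != "file:" {
            result = piece.to_string();
        }
    }
    result.replace("file:", "")
}
```

Both `file:../books.org` and `file:../../books.org` come out as `books.org`, so the information about how far up the link climbs is gone before the url is ever built.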

Attempt 2

My second attempt at fixing linking between varying directory depths was pretty wretched. Look at "num_parents". Yuck!

    let mut num_parents = 0;
    // this closure is a bit of a mess
    // but basically, we want to count how many ".." are in the file link,
    // so we can pass it to the Baseurl::build method.
    let mut clean_file_link = |lnk: String| -> String {
        // first, remove the `file:` prefix.
        let stripped_lnk = str::replace(&lnk, "file:", "");
        // now, let's turn it into a path so we can break it up.
        let link_as_path = PathBuf::from(stripped_lnk.clone());
        // now just get the directory parents of the original file_path
        let mut file_path_without_file = file_path.parent().unwrap().to_path_buf();
        let mut final_res: Vec<String> = Vec::new();
        let mut result: String = stripped_lnk.clone();

        // now for each ParentDir (".."), we push the file_name (last item in the link path)
        // to a temporary holding place, which we will then reverse
        // and join into a url.
        for comp in link_as_path.components() {
            match comp {
                Component::ParentDir => {
                    let file_name_as_str = file_path_without_file.file_name().and_then(|s| s.to_str()).unwrap();
                    final_res.insert(0, file_name_as_str.to_string());
                    file_path_without_file.pop();
                    num_parents += 1;
                }
                Component::Normal(f) => {
                    result = f.to_str().unwrap().to_string();
                }
                _ => ()
            }
        }
        result
    };

    // ....

    let x = base_url.build(link_path, file_path, num_parents);

// over in config.rs in the BaseUrl struct

    pub fn build(self, link: String, file_path: PathBuf, parents_to_drop: i32) -> String {
        let mut parent_dirs = self.strip_source_cwd(file_path);
        let mut link_res = PathBuf::from(self.base_url.clone());
        for _n in 0..parents_to_drop {
            parent_dirs.pop();
        }

        if parent_dirs != PathBuf::from("") && !self.link_starts_with_data_dir(link.clone()) {
            link_res.push(parent_dirs);
        }
        link_res.push(link);

        util::path_to_string(&link_res)
    }

I was using a counter? Ugh, why? Not only that, but in the above example I'm doing some munging of link data in the util function, and then some more transforming of the link over in the build method on the BaseUrl struct.

To be fair to myself, I'm still pretty new to Rust - and each little mistake paves the way to walk a little faster when developing in the future!

Attempt 3

I realized links were broken again when I went to view a top level page linking down into the blog. Strangely, that was broken. This was the problem in the previous code:

// for comp in path components:
    Component::Normal(f) => {
        result = f.to_str().unwrap().to_string();
    }

When I loop through the path components I'm actually overwriting the result variable every time. Why didn't I notice this? I think because, again, I don't have many links that go more than one layer deep. Since there was usually only one instance of a Component::Normal in my paths, it escaped me.
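A small side-by-side makes the difference visible. Both functions are hypothetical reductions of the loop, written for this comparison: the first assigns on every `Component::Normal` like attempt 2, the second accumulates the segments instead.

```rust
use std::path::{Component, Path, PathBuf};

/// Attempt 2 behaviour: each Normal component overwrites the previous one.
fn last_normal_only(link: &str) -> String {
    let mut result = String::new();
    for comp in Path::new(link).components() {
        if let Component::Normal(f) = comp {
            result = f.to_string_lossy().into_owned();
        }
    }
    result
}

/// Accumulating instead keeps every segment after the `..` parts.
fn all_normals(link: &str) -> PathBuf {
    let mut tail = PathBuf::new();
    for comp in Path::new(link).components() {
        if let Component::Normal(f) = comp {
            tail.push(f);
        }
    }
    tail
}
```

For a link like `blog/2020/post.html`, the first function returns only `post.html`, while the second keeps the whole `blog/2020/post.html`. With a single-segment link the two agree, which is why the bug stayed hidden.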

Finally, I found something that (hopefully) will work for a while, and cleans up things more:

pub fn transform_org_link_to_html(
    base_url: BaseUrl,
    org_link_path: String,
    file_path: PathBuf,
) -> String {
    let num_parents = 0;

    let mut link_path = org_link_path;

    // -- handle different types of links.

    // <1> -- It's a local org file.
    if is_local_org_file(&link_path) {
        // link_path = clean_file_link(link_path);
        link_path = str::replace(&link_path, ".org", ".html");
        link_path = str::replace(&link_path, "file:", "");
        return base_url.build(link_path, file_path);

    // <2> it's a local image
    } else if is_local_img_file(&link_path) {
        link_path = str::replace(&link_path, "file:", "");
        return base_url.build(link_path, file_path);
    }

    // <3> is a web link (doesn't start with baseurl.)
    if !is_local_org_file(&link_path) {
        return link_path;
    }
    // <4> We don't know? Just return the link.
    link_path
}
// in the BaseUrl struct
    pub fn build(self, link: String, file_path: PathBuf) -> String {
        let mut parent_dirs = self.strip_source_cwd(file_path.clone());
        // start with just the baseurl.
        let mut link_res = PathBuf::from(self.base_url.clone());
        // all things after "../" in links
        let mut link_tail = PathBuf::from("");

        let link_as_path = PathBuf::from(link.clone());
        for comp in link_as_path.components() {
            match comp {
                Component::ParentDir => {
                    parent_dirs.pop();
                }
                Component::Normal(f) => {
                    link_tail.push(f);
                }
                _ => ()
            }
        }
        link_res.push(parent_dirs);
        link_res.push(link_tail);
        util::path_to_string(&link_res)
    }

You might have noticed that I still have the num_parents variable. I haven't cleaned that up yet, but at least that variable is not being used!

How did I get here? I added some tests. I had some in the previous broken iteration, but not enough, it seemed! I added a few more to try and account for the different permutations of link types from different file locations. As usual, you can scope the code here. If you know Rust or have ideas on how to improve what I've got, please get in touch! Hopefully I'm covered for now and things are a bit more reliable.
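For a flavour of what "permutations of link types from different file locations" can look like as a test, here is a table-driven sketch. It is not Firn's actual test suite: `resolve` is a hypothetical, simplified stand-in for `BaseUrl::build` that works on plain strings, and `https://example.com` is a placeholder base url.

```rust
use std::path::{Component, Path};

/// Hypothetical, simplified stand-in for `BaseUrl::build`.
/// `source` is the originating file relative to the wiki root;
/// `link` is the org link with `file:` already stripped.
fn resolve(base_url: &str, source: &str, link: &str) -> String {
    // Start from the directories that contain the source file.
    let mut segments: Vec<String> = Path::new(source)
        .parent()
        .map(|dir| {
            dir.components()
                .filter_map(|c| match c {
                    Component::Normal(s) => Some(s.to_string_lossy().into_owned()),
                    _ => None,
                })
                .collect::<Vec<String>>()
        })
        .unwrap_or_default();

    // Walk the link: `..` drops a parent, normal segments are appended.
    for comp in Path::new(link).components() {
        match comp {
            Component::ParentDir => {
                segments.pop();
            }
            Component::Normal(s) => segments.push(s.to_string_lossy().into_owned()),
            _ => {}
        }
    }

    // Plain string replace, mirroring the post's `.org` to `.html` step.
    let path = segments.join("/").replace(".org", ".html");
    format!("{}/{}", base_url.trim_end_matches('/'), path)
}

#[test]
fn links_resolve_from_every_depth() {
    // (source file, link, expected url)
    let cases = [
        ("index.org", "blog/post.org", "https://example.com/blog/post.html"),
        ("blog/post.org", "../books.org", "https://example.com/books.html"),
        ("blog/post.org", "other_post.org", "https://example.com/blog/other_post.html"),
        ("notes/misc/originating_file.org", "../../books.org", "https://example.com/books.html"),
    ];
    for &(source, link, expected) in cases.iter() {
        assert_eq!(resolve("https://example.com", source, link), expected);
    }
}
```

Each row pairs a source location with a link direction (down, up, straight, up twice), so a regression in any one direction shows up as a single failing case rather than a broken page found by accident.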

Thanks for reading!