Source: chenspec via Pixabay

A few years ago, Biella Coleman, an anthropologist who studies hacker culture, observed that the events surrounding the Snowden leaks constituted a "critical event" in the development of hacker culture: they were transformative and set things on a new, politically intensified course. Several earlier critical events, including Operation Sundevil (chronicled in Bruce Sterling's The Hacker Crackdown), were similarly consequential in shaping the politics and culture of hackers.

I suspect that the ongoing pandemic will turn out to be another such critical event, not just because of the issues raised by technological efforts to contain the virus itself, but also because of the myriad digitally mediated social and cultural phenomena that emerged or accelerated during the lockdowns. It is hard to say what, exactly, its upshot will be, and it is still too early for any definite proclamations. Here I just want to deal with one small piece of the puzzle: the kinds of side projects self-described hackers have taken on since the first COVID-19 lockdowns.

The significance of side projects

Side projects have long been part of hackerdom, whether as ways of stealing back time from employers or as ways of "increasing your luck surface area." They can be at the root of resistance, or, as in the case of corporate hackathons, they can be yet another ritualized way of extracting free labor. Side projects can be "side bets" that deepen one's commitment to working in the software industry, or they can set up a "pivot" to another line of work altogether. Some side projects end up being deployed and depended upon countless times; some are described as "just a hobby [that] won't be big and professional." Sometimes both of those things are true about the same project, leading to a situation brilliantly summarized by this XKCD strip:

XKCD 2347

Often side projects take the shape of apps and software packages, but they can also be quirky "digital gardens" or artistic projects. In the words of one tech CEO, Natalie Gordon:

Any project that you are doing for fun, not to make money, is a side project. That's not to say your side project can't make money, but that’s not why you're doing it. Those fun projects that get done outside of work can have a huge, disproportionate impact on your whole career.

Others have noted how "side project culture" sets up a barrier around an industry that is already hard to break into, because not everyone can afford the up-front investment of hope labor it demands.

Gordon, incidentally, launched her own business with a "Show HN" post on Hacker News. Show HN is one of several online venues where people launch their side projects. Others include /r/sideproject, Product Hunt, Indie Hackers, and MeFi Projects. Here I will focus on Show HN, which provides one (admittedly distorting) lens on the kinds of projects people take on and share publicly. Part of a site that itself originated as a side project, Show HN is intended for sharing "something you've made that other people can play with." Other readers of the Orange Website, as it is (not always affectionately) known, can then share their feedback, which they do with varying degrees of constructiveness. One oft-cited example of less constructive feedback is the 2007 HN discussion of the Dropbox launch, where one commentator dismissed the product as "trivial."

Show HN submissions started in 2009, when the site was still in its infancy, but only became an official feature of the site on July 3, 2014. Since then, there have been tens of thousands of submissions. Most submissions are links, though a small proportion of users (about five percent) opt to submit a "story" instead (which usually directs readers to the thing they can play with). More than a quarter of the links posted point to repositories hosted on GitHub.

In the following, I will undertake a data-driven exploration of these submissions over time, in hopes of learning what, if anything, changed as a result of the pandemic. This may offer some initial clues about the ways hackerdom is evolving in this moment.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
In [2]:
import seaborn as sns
sns.set_theme(context="notebook", 
              palette="deep",
              style="ticks", 
              rc={"figure.figsize": (18, 12)})
In [3]:
import pandas as pd
pd.set_option("display.min_rows", 100,
              "display.max_columns", None,
              "display.max_colwidth", None)
In [4]:
import textnets as tn
tn.params.update({"autodownload": True,
                  "seed": 54,
                  "resolution_parameter": 0.01,
                  "lang": "en_core_web_md"})
In [6]:
import scipy.cluster.hierarchy as hc
In [5]:
from shifterator import JSDivergenceShift as JSDS
In [7]:
from showhn_utils import timeseries_by_cluster, date_term_matrix, period_tn

Thanks to the Hacker News search API offered by Algolia, gathering all posts is just a matter of querying the search_by_date endpoint.
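
As a rough sketch of what that querying might look like (this is not the script that produced show-hn-hits.csv, which isn't shown here): as far as I know, the API returns at most 1,000 hits per query, so one way to collect everything is to keep moving a created_at_i cutoff backwards in time. The function names, the injectable fetch argument, and the use of the show_hn tag are my own assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

API = "https://hn.algolia.com/api/v1/search_by_date"


def build_url(before_ts=None, hits_per_page=1000):
    """Build a search_by_date URL for Show HN posts older than before_ts."""
    params = {"tags": "show_hn", "hitsPerPage": hits_per_page}
    if before_ts is not None:
        # created_at_i is the submission time as a Unix timestamp
        params["numericFilters"] = f"created_at_i<{before_ts}"
    return API + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Download one page of results and decode the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def collect_hits(fetch=fetch_json, max_requests=None):
    """Walk backwards in time until the API returns no more hits.

    Assumption: results arrive newest first, so the oldest timestamp on a
    page becomes the cutoff for the next request. Using a strict "<" keeps
    the loop from stalling, at the cost of possibly skipping posts that
    share the exact cutoff second.
    """
    hits, before_ts, requests = [], None, 0
    while max_requests is None or requests < max_requests:
        page = fetch(build_url(before_ts))["hits"]
        requests += 1
        if not page:
            break
        hits.extend(page)
        before_ts = min(hit["created_at_i"] for hit in page)
    return hits
```

Something like pd.DataFrame(collect_hits()).to_csv("show-hn-hits.csv") could then write the results to disk; passing a small max_requests is a polite way to try it out first.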

In [8]:
showhn = pd.read_csv("show-hn-hits.csv")
data = pd.DataFrame({
    "date": pd.to_datetime(showhn["created_at"], utc=True),
    "points": showhn["points"],
    "num_comments": showhn["num_comments"],
    "oid": showhn["objectID"].map(str),
    "title": showhn["title"],
    "author": showhn["author"],
    "url": showhn["url"],
    "story_text": showhn["story_text"]
}).set_index("date")
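
The proportions mentioned earlier (the share of text-only posts and the share of links pointing to GitHub) can be checked against a frame shaped like this one. What follows is a minimal sketch on made-up rows, not the calculation behind the figures quoted above; the function name and the rule that a missing url means a text post are assumptions.

```python
from urllib.parse import urlparse

import pandas as pd


def submission_shares(frame):
    """Return the share of text-only posts and the GitHub share among links.

    Assumption: a submission without a URL is a text post.
    """
    has_url = frame["url"].notna()
    hosts = frame.loc[has_url, "url"].map(lambda u: urlparse(u).netloc.lower())
    on_github = hosts.isin({"github.com", "www.github.com"})
    return {
        "text_share": float((~has_url).mean()),
        "github_share_of_links": float(on_github.mean()),
    }


# Made-up rows for illustration only
toy = pd.DataFrame({"url": [
    "https://github.com/a/b",
    None,
    "https://example.com/x",
    "https://github.com/c/d",
    "https://gitlab.com/e",
]})
shares = submission_shares(toy)
```

Calling submission_shares(data) on the real frame would give the corresponding proportions for the full set of submissions.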

Show HN through the ages

First, let's look at the distribution of Show HN posts over time.

In [9]:
with plt.xkcd():
    # Count distinct submissions (by object ID) in each calendar month
    sns.lineplot(x="date", y="oid", label="Monthly Submissions",
                 data=data.resample("1m").nunique().reset_index())
    # Three-month rolling mean; shift(-1) re-centers the trailing window
    sns.lineplot(x="date", y="oid", label="Rolling Average",
                 data=data.resample("1m").nunique().rolling(3).mean().shift(-1))
    ax = plt.gca()
    ax.set_ylabel("")
    ax.set_xlabel("")
    # Shade the area under the most recently drawn line (the rolling average)
    line = ax.lines[-1]
    x, y = line.get_xydata().T
    ax.fill_between(x, 0, y, color=line.get_color(), alpha=0.3)
    sns.despine(trim=True, left=True)

The initial big peak in July 2014 corresponds to the official release of the Show HN feature, which may have been prompted by the two earlier peaks in 2013. After July 2014, there were ordinarily between 800 and 1,000 submissions per month, but starting in early 2020 the number shot up to twice that earlier average and remained elevated until early 2021, nearly a year later. This striking pattern strongly suggests that increased side project activity during the pandemic was reflected in Show HN submissions.
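
One way to put a number on a jump like this is to compare mean monthly counts between two windows. The sketch below uses synthetic timestamps, and the window boundaries are placeholders I picked for illustration, not the periods behind the figures above; the helper names are likewise hypothetical.

```python
import pandas as pd


def monthly_counts(timestamps):
    """Count submissions per calendar month, keyed by "YYYY-MM" strings.

    Months with no submissions are simply absent from the result.
    """
    months = pd.DatetimeIndex(timestamps).strftime("%Y-%m")
    return pd.Series(months).value_counts().sort_index()


def ratio_of_means(counts, baseline, comparison):
    """Mean monthly count in `comparison` divided by the mean in `baseline`.

    Both windows are inclusive (first_month, last_month) pairs of "YYYY-MM".
    """
    def window_mean(window):
        mask = (counts.index >= window[0]) & (counts.index <= window[1])
        return counts[mask].mean()

    return window_mean(comparison) / window_mean(baseline)


# Synthetic data: two posts per month in early 2019, four per month in spring 2020
stamps = pd.to_datetime([
    "2019-01-05", "2019-01-20", "2019-02-03", "2019-02-17",
    "2020-03-01", "2020-03-09", "2020-03-15", "2020-03-28",
    "2020-04-02", "2020-04-11", "2020-04-19", "2020-04-30",
])
ratio = ratio_of_means(monthly_counts(stamps),
                       baseline=("2019-01", "2019-02"),
                       comparison=("2020-03", "2020-04"))
```

With the real data, monthly_counts(data.index) would stand in for the synthetic timestamps.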

What are side projects about? Looking at terms that tend to co-occur in submission titles can help to discern common themes. Here and in subsequent steps, I focus on submissions from July 2014 through the year 2021.
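
To make "co-occur" concrete, here is a deliberately simplified stand-in using only the standard library. It just counts how often two terms show up in the same title, which is not what textnets does internally; the tokenizer, the function name, and the example titles are all assumptions for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence(titles, min_docs=1):
    """Count how many titles each pair of terms appears in together.

    Terms found in fewer than min_docs titles are dropped first.
    """
    # Crude tokenizer: lowercase runs of letters and digits, one set per title
    docs = [set(re.findall(r"[a-z0-9]+", title.lower())) for title in titles]
    doc_freq = Counter(term for doc in docs for term in doc)
    keep = {term for term, n in doc_freq.items() if n >= min_docs}
    pairs = Counter()
    for doc in docs:
        # Sorting makes each pair an alphabetically ordered tuple
        pairs.update(combinations(sorted(doc & keep), 2))
    return pairs


# Invented titles, not taken from the data set
example_titles = [
    "Show HN: A Rust CLI tool",
    "Show HN: Rust web framework",
    "Show HN: Python CLI tool",
]
pairs = cooccurrence(example_titles, min_docs=2)
```

Pairs with high counts, such as ("cli", "tool") in this toy example, hint at terms that belong to a common theme.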

In [10]:
titles = data[(data.index >= "2014-07-01") &
              (data.index < "2022-01-01") &
              data["title"].notnull()].reset_index().set_index("oid")
In [11]:
corpus = tn.Corpus.from_df(titles[["title"]])
tokens = corpus.tokenized(remove_urls=False)
net = tn.Textnet(tokens, remove_weak_edges=True, min_docs=40, doc_attrs=titles[["date"]])
In [12]:
cluster_dates, cluster_terms, row_order, ts = timeseries_by_cluster(net, 75, freq="2w")
In [13]:
def fglabels(_, color, label):
    cluster = int(label.split(".")[0])
    ax = plt.gca()
    ax.text(0, .5,
            ', '.join(cluster_terms[cluster]) + 
            f" ({len(cluster_dates[cluster]):,})", 
            fontweight="bold", color="dimgray",
            family=["xkcd", "xkcd Script", "Humor Sans"],
            size="large",
            ha="left", va="center", transform=ax.transAxes)

g = sns.FacetGrid(ts, row="cluster", hue="cluster", 
                  row_order=row_order, hue_order=row_order,
                  aspect=18, height=1, 
                  palette=sns.dark_palette("orange", n_colors=75, input="xkcd", reverse=True))
g.map(sns.lineplot, "date", "count", linewidth=2)
g.refline(y=0, linewidth=2, linestyle="-", color=None, clip_on=False)

g.map(fglabels, "cluster")
g.set_titles("")
g.set(yticks=[], ylabel="")
g.despine(bottom=True, left=True)