Ah, the humble hashtag. It’s likely you’ve never thought about it in any great depth. It seems such a fixture of our culture that it is hard to imagine that only a few years ago it was an obscure bit of jargon and a few years before that it did not exist at all. Now you encounter it more or less daily and it doesn’t seem very interesting.
But it’s surprising how much you can learn by looking at the boring things we encounter in everyday life. A lot of our knowledge of antiquity comes from the study of coinage and pottery. The most fascinating sociological studies deconstruct everyday encounters to reveal the structure of human social relations. A history of the hashtag is a history of the Internet in miniature, and that has to have some value.
(I should note that I am writing for an educated but nontechnical audience, so I may gloss over some obscure details. I am not particularly concerned with the creation of a precise timeline of the development of certain concepts; my main consideration is making the concepts themselves and the way they stack on top of one another clear.)
At the dawn of time there was hypertext. We already had rich text, which could be created in programs like WordPerfect and WordStar, but hypertext had one crucial difference: the addition of interactive elements such as forms and, most notably, hyperlinks. A hypertext document is written in a markup language appropriately named HTML.
A markup document is made up of elements arranged in a hierarchy: in the typical HTML document, there is an html element, and within that there is a head element that contains information about the document (such as its title) and a body element that contains the document’s actual text. Each of these can be further subdivided in various ways depending on what you’re trying to accomplish. Here’s a simple example, which you can test by copying the code into your favourite text editor, saving it as a *.html file, and opening it in the web browser of your choice:
<html>
  <head>
    <title>My HTML Document</title>
  </head>
  <body>
    <p>A paragraph in my document.</p>
  </body>
</html>
This markup is parsed by the web browser and turned into rich text for your viewing pleasure. Forms and links are the only concession to interactivity in HTML. For the most part it is meant for front-end display. (There is a school of thought, which arose about ten years ago, holding that an HTML document should only contain the structure of the webpage, while all layout and formatting decisions should be made in a separate stylesheet document written in the CSS language. There is some merit to these ideas but in practice it’s hard to disentangle format and structure.)
The problem with a website in pure HTML is that it imposes limitations on what you can do. The process of making updates manually is cumbersome. Even to correct a single spelling error, you have to get out a text editor, dig through a lot of markup that is not formatted to be easily edited, make the change, and use an FTP program to re-upload the file. Heavier edits to HTML require a certain kind of structural thinking that takes a lot of concentration even if you’re good at it (and most people do not naturally think in this way).
For the most common changes made to a webpage, we are concerned with just the text body, not the metadata or even the other content that may persist from page to page (sidebars, menus, footers, etc). It is unnecessary—not to mention inefficient and insecure—to have the person responsible for the web page’s content digging through all that extra stuff just to make minor changes. Also, and importantly for our present purposes, things like archive or category pages take a pointless amount of work to manually adjust. All of this means that there is a severe ceiling on the size of any site that is created with pure HTML.
To raise this limit, there has long been software that will yield as its output an HTML document that is generated through a number of operations such as the inclusion of stock material, queries to databases, mathematical operations, and connections to other servers. Over time some programming languages and frameworks evolved specifically for the creation of these programs (PHP and Ruby on Rails, among others). Pages on a site can be automatically constructed by combining global templates with local metadata and content. This means that, for example, if you want to remove a link from the sidebar on every page of the site, you only have to edit one file, not every single page. It also means that things like monthly or topical archives can be generated automatically.
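For the curious, here is a rough sketch of the idea in Python (chosen only for readability; it stands in for whatever language a real site might use). The template, the sidebar list, and the function names are all invented for illustration, and the sketch skips details a real system needs, such as escaping special characters in the content.

```python
from string import Template

# Hypothetical site-wide template, shared by every page on the site.
PAGE_TEMPLATE = Template("""<html>
<head><title>$title</title></head>
<body>
<div class="sidebar">$sidebar</div>
<p>$content</p>
</body>
</html>""")

# Hypothetical sidebar links. Edit this one list and every
# generated page picks up the change.
SIDEBAR_LINKS = ["Home", "Archive", "About"]


def render_page(title, content):
    """Combine the global template with one page's local title and content."""
    sidebar = " | ".join(SIDEBAR_LINKS)
    return PAGE_TEMPLATE.substitute(title=title, sidebar=sidebar, content=content)


def monthly_archive(posts):
    """Group post titles by month, using the 'YYYY-MM' prefix of each date."""
    archive = {}
    for post in posts:
        month = post["date"][:7]
        archive.setdefault(month, []).append(post["title"])
    return archive
```

Calling render_page("My HTML Document", "A paragraph in my document.") would produce a page much like the hand-written example earlier, with the sidebar filled in automatically.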
When, at the beginning of the Web 2.0 explosion, self-hosted blogs (mostly based on the WordPress platform) became common, a certain way of organizing content was entrenched. We’ll refer to this by the name “taxonomy”. Taxonomy involves creating a series of hierarchical categories (much like biological taxonomy) that persist over time and assigning bits of content (posts, on a blog) to one or more of these categories. Software like WordPress would facilitate the creation and management of these categories, generate archives automatically, and create a fancy menu for navigating these archives (you can see an example in the sidebar of this very site).
Organizing content through taxonomy tends to emphasize and reward certain features of the content:
- Rereadability: The arrangement of archives makes rereading easy.
- Focus: The tendency is for a post to be in as few categories as possible, and there is a strong bias against the creation of new categories where unnecessary.
- Continuity: Because people are likely to read everything you ever wrote on, say, video games, there is a tendency to present a coherent position that evolves over time, rather than a series of unrelated ephemera. Content can’t just disappear into the ether because someone will miss it. You may even update old posts.
- Particularity: Categories are created to be uniquely appropriate to the site on which they are used. Sites are structured in a way that is appropriate to their content.
The creation of categories requires some thought and foresight. Just like creating a webpage in pure HTML, it benefits from a certain organized, hierarchical mindset that comes naturally to some and not to others. As Web 2.0 has continued in its relentless progression, this mindset has become anathema and a new system of organizing content has developed.
This system was given the rather silly name “folksonomy”. Folksonomy categorizes content through “tags” which resemble categories but are different in important ways. The important differences can be captured in the contrasting formality: category names are generally in title case and have multiple words where appropriate, whereas tags are generally in all lowercase and resist the inclusion of multiple words.
As informal forms of organization, tags are created on the fly to suit the needs of the moment. A single post may carry a dozen tags, some of which only barely reflect the content of the post (for instance, a previous post of mine is tagged “Atlus” even though I never once mention the developer). There is considerable room for duplication since, unlike categories, tags do not exist independently of the posts they are attached to. It is easy to have the same name rendered slightly differently in different places (“video games”, “videogames”, “video-games”, “games”) which reduces the value of tags as a method of finding content on a particular topic.
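To make the duplication problem concrete, here is a small Python sketch of the sort of cleanup a program might attempt. The function names and the normalization rule (lowercase, then drop spaces and hyphens) are my own assumptions, not a description of how any particular blogging platform works.

```python
from collections import defaultdict


def normalize_tag(tag):
    """Lowercase a tag and drop spaces and hyphens (an assumed, simplistic rule)."""
    return tag.lower().replace(" ", "").replace("-", "")


def group_variants(tags):
    """Map each normalized form to the set of spellings that collapse into it."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    return dict(groups)
```

Given the four spellings above, the first three collapse into a single group, but “games” stays on its own: a mechanical rule can undo differences in spelling, not differences in what the author chose to call the topic.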
In fact, for a while the purpose of tags was not entirely clear. For a long time they coexisted with more traditional categories (and still do to some extent) but nobody really used them for anything. As Web 2.0 marches on, a niche for them has emerged: they track, in an approximate way, the use of topics across different webpages. They can be processed automatically and fed into computers that analyze large-scale trends. Outside of individual posts, their paradigmatic appearance is in the tag cloud: a visual representation in which tags are arranged in a block in no particular order, with the more frequently used tags appearing in larger text.
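The arithmetic behind a tag cloud is simple enough to sketch in a few lines of Python. This is one plausible approach under my own assumptions (a straight-line scaling between a smallest and largest font size, with hypothetical names and default sizes), not the method of any specific site.

```python
from collections import Counter


def tag_cloud_sizes(posts_tags, min_size=10, max_size=30):
    """Return a font size for each tag, scaled linearly by how often it is used.

    posts_tags is a list of tag lists, one list per post.
    """
    counts = Counter(tag for tags in posts_tags for tag in tags)
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for tag, n in counts.items():
        if span == 0:
            # Every tag is used equally often, so all get the same size.
            sizes[tag] = min_size
        else:
            sizes[tag] = min_size + (max_size - min_size) * (n - low) // span
    return sizes
```

Note that nothing here looks at what the tags mean or where they came from; the only input is a count, which is precisely what makes tags convenient for machines.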
Used in this way, content organized primarily through tags exhibits very different tendencies from content organized through categories:
- Ephemerality: Tags are not used consistently, so they are not very useful for finding content to reread. This means no one will reread most content, which means that most content is not designed to be reread.
- Trendiness: Tags can be created on the spot for a topic that is popular today but will never be mentioned again.
- Discontinuity: The assumption is that no one, least of all the creator, will ever revisit the content. The past doesn’t exist.
- Generality: Tags are designed to be easy to track across web pages, so the idiosyncrasies of a particular place are less important than the wider world.
To summarize it very simply, categories are designed to organize content so it can more easily be found by people, whereas tags are designed to label content so it can more easily be tracked by computers. But there is still one more important innovation. Twitter was established in 2006 as a service where users can post content containing up to 140 characters. The service has evolved somewhat since then but it still revolves around this core.
Early on in Twitter’s history it was decided that tweets would be categorized using tags, and these tags would adopt a new syntax: the tag’s alphanumeric name is preceded by the symbol #, known variously as the pound, number sign, octothorpe, or hash. Since “pound tag” doesn’t have a nice ring to it and the other names are too long to be catchy, “hashtag” is what we’ve been stuck with since that time.
The difference between a hashtag and a regular tag seems superficial at first: it’s just an extra character added to the beginning. But there’s another important difference that is often overlooked. The characters in a hashtag are counted against the character count for the tweet. This means that the category information is included as part of the text body, eroding the distinction between data and metadata that had been upheld even by folksonomic tags.
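A short Python sketch shows how this works in practice. The pattern below is a deliberate simplification of my own (a # followed by letters, digits, or underscores), and the names are hypothetical; real services apply more detailed rules.

```python
import re

# Simplified assumption: a hashtag is '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")

# The 140-character limit described above.
TWEET_LIMIT = 140


def extract_hashtags(text):
    """Pull the tag names out of the body text itself."""
    return HASHTAG_PATTERN.findall(text)


def characters_left(text):
    """Characters remaining, counting the hashtags like any other text."""
    return TWEET_LIMIT - len(text)
```

The metadata is not stored anywhere separate: the program has to read the same text a person reads in order to find the tags, and every tag spends part of the same character budget.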
The result has been explosive. A few now familiar rhetorical devices would not be possible without this innovation, and did not exist before it. For one, there is the facetious categorization, in which a hashtag is applied simply for its rhetorical effect with no categorizing function intended. This confuses the computers, but no one cares about them. The technical appearance of the hashtag lends a different savour to the text it contains, separating it somewhat from the rest of the text and letting it be used for comic reversals or a comedian-straight man relationship. This was not possible with previous methods: there was a strong tendency for metadata to be truthful both to preserve its usefulness and because it could not effectively be leveraged for rhetorical effect (because nobody read it).
The viral tag is another new possibility. The hashtag is part of the reason that topics like #YesAllWomen have spread so quickly and effectively (and incidentally the reason that the hashtag is now a mandatory feature of every advertisement, even though most of these hashtags will never be used). We now have a convenient shorthand to refer to Internet happenings that functions across the major social media pages and is recognized when used elsewhere.
Before we conclude I would like to add my obligatory doomsaying message. What we see here is technology adapting to the desires of consumers. Technology’s ability to do this is quite strong, but we should not assume that the desires of consumers are necessarily good, nor that this technological evolution represents a positive development for the Internet. In my view it’s rather more of an arc shape. At first there was relatively static content that was limited in scope by the technology of the time. Then technology improved to encourage more frequent updates while the long form and relative thoughtfulness of most content was preserved.
But the new regime has taken away the tendency toward long-form and thoughtful content, and even turned frequency of updates into a vice by making them instantaneous and therefore usually ill-considered. There is now a strong tendency toward an inundation of shallow, half-formed thoughts rather than daily or weekly updates of more valuable fare.
It gets worse. I don’t think I’m alone in seeing a decline not just in how we’re thinking and writing, but in what we’re thinking and writing about. It’s difficult to explain what I mean, but it seems that the Internet of today takes things more for granted than the Internet of just a few years ago. Mirroring the technological changes, we think less about structure and more about surface. We argue quickly and loudly over what colour to paint a room without considering whether the room is to be a torture dungeon or a library.
Though it is a powerful invention, I do not consider the hashtag to be a net positive for humanity, at least not yet. This is why I try my best to conduct my business in defiance of its commands.