How I Tried, and Failed, to Create an Index of Magic Articles

It’s worth mentioning from the get-go that my professional background is as a librarian. I have a Master’s degree in Library Science from a prestigious university, and I’ve been working in and around libraries for more than ten years now. I mention this background because I want it to be clear that when I set out to create an “index” of Magic articles, I had a lot of education and experience behind me. I knew the Index was a thing that could be very useful, and I knew, more or less, what needed to happen for it to come to fruition.

I still think an Index is a great idea and would be useful, but I also realize now that I don’t really have the capacity, even with the help of a few capable individuals, to make it happen the way it needs to. If I worked on it myself full-time I could do it, but I’m not at a point in my life where I’m willing or able to do so. More is the pity.

In this article I’ll talk a little bit about what an Index is, why I thought it would be good, and why it failed.

What the Index?

In research, indexes are traditionally finding aids. The index in a book, for instance, helps you find other information in that book, listing key words alphabetically and pointing you to the page numbers where those words appear. In academia, there are entire sets (sometimes 10-20 giant books) that are subject indexes; you look for a subject, or an author, or another specific entry point, and the index points you to additional resources, often articles or books, to help you complete your research. Since most research happens online these days, many of the paper indexes have been phased out, but electronic ones still exist and are useful.

Whatever the subject or medium of an index, the purpose is the same: collect either a selective or exhaustive list of resources on a particular subject, or by a particular author, or in a particular domain, and then add consistent data to those list items to make those resources easy to find. The data added, called metadata, might include the obvious things (author, title, source, etc.), but could also include additional fields, such as publishing date, publishing city, subject, peer-review status, number of citations, and more. The more metadata you add to an index, the more ways your users have to find the resources they need.

Why an Index?

The purpose of an index of MTG articles, then, should start to become apparent. Take ALL the articles written about Magic on a daily basis and start keeping a list. Then, to that list, add consistent metadata so that people who use the index can search and find articles effectively. In this case the metadata would include format (Modern, Standard, Pauper, etc.), site of publication, and author. After that it would also have any pertinent subjects, tags, or cards that were covered. Articles about Commander, for instance, would always be indexed with the name of the Commander card itself. That way, over time, anyone could use the index to find all the Commander articles about [c]Mayael the Anima[/c] decks, or whichever Commander interested them the most. No matter who wrote them. No matter where they were published.
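To make that concrete, here is a minimal sketch of what one index record and a tag lookup might look like. It is only an illustration: the field names, the `build_tag_lookup` helper, and the sample entries (authors, sites, URLs) are all hypothetical and were not part of the actual Index.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One indexed article. Field names are illustrative assumptions."""
    title: str
    author: str
    site: str
    fmt: str                      # format, e.g. "Commander" or "Pauper"
    url: str
    tags: frozenset = frozenset()  # subjects, cards, series names


def build_tag_lookup(entries):
    """Map each lowercased tag to the list of entries that carry it."""
    lookup = defaultdict(list)
    for entry in entries:
        for tag in entry.tags:
            lookup[tag.lower()].append(entry)
    return lookup


# Hypothetical sample data: two sites, two authors, one shared Commander.
entries = [
    IndexEntry("A Mayael list", "Author A", "site-one.example", "Commander",
               "https://site-one.example/mayael",
               frozenset({"mayael the anima"})),
    IndexEntry("Budget Mayael", "Author B", "site-two.example", "Commander",
               "https://site-two.example/budget-mayael",
               frozenset({"Mayael the Anima", "budget"})),
    IndexEntry("Pauper primer", "Author C", "site-one.example", "Pauper",
               "https://site-one.example/pauper",
               frozenset({"primer"})),
]

lookup = build_tag_lookup(entries)
mayael = lookup["mayael the anima"]
for entry in mayael:
    print(entry.author, "-", entry.site, "-", entry.title)
```

The lookup returns both Mayael articles regardless of author or site, which is the whole point. It only works because every entry carries the same, consistently applied tag.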

There are examples of MTG Indexes out there that are basically glorified RSS feeds, and they don’t do any of the things that an index should really do. They don’t add metadata, most importantly, and they don’t apply consistent standards across articles and sources. If you pull a site’s metadata with their articles, for instance, you might end up with some sites sending you articles tagged “Pauper”, some tagged “Classic Pauper”, and some tagged “Pauper and other budget formats.” Plenty won’t come tagged by format at all. The only way to do this is by hand; there’s no automated way in the world to pull in articles and sort them by format, especially if they aren’t already tagged that way.
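The tag problem can be sketched as a small controlled vocabulary. This is an assumption-laden illustration, not something the Index actually ran: the `FORMAT_ALIASES` table and `normalize_format` function are hypothetical, and the mapping itself still has to be built and maintained by hand. Anything missing or unrecognized comes back as `None`, meaning a person has to look at the article.

```python
# Hand-maintained mapping from tags as sites publish them to one
# canonical format name. The three Pauper variants come from the
# paragraph above; the rest of the table is illustrative.
FORMAT_ALIASES = {
    "pauper": "Pauper",
    "classic pauper": "Pauper",
    "pauper and other budget formats": "Pauper",
    "standard": "Standard",
    "modern": "Modern",
    "commander": "Commander",
}


def normalize_format(raw_tag):
    """Return the canonical format, or None if a human must decide."""
    if raw_tag is None:
        return None
    return FORMAT_ALIASES.get(raw_tag.strip().lower())


incoming = ["Pauper", "Classic Pauper", "Pauper and other budget formats",
            None, "Kitchen Table"]
for tag in incoming:
    print(repr(tag), "->", normalize_format(tag))
```

A table like this only helps with articles that arrive tagged with something recognizable. Untagged articles, and tags nobody anticipated, still end up in the manual pile.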

Feedreaders are great. I used Google Reader for years, and I use InoReader now, and they are incredibly useful for keeping up on articles. They aren’t great, though, for finding articles from the past, and they don’t organize articles for you in any meaningful way. If I was going to create an index, it had to be something different, and more useful, than an RSS feed.

How an Index?

I started off trying to find ways to do the whole thing myself. I’m pretty good at finding efficient ways to accomplish routine tasks (i.e. tasks where you are basically doing the same thing over and over again). The trick is that every minute you shave off the process can save you an hour or two in the long run, so it’s well worth it to be as efficient as possible.

I started by collecting as many Magic feeds as I could find. Seriously, I have THE list of sites publishing Magic articles. I won’t claim it’s complete, but it is very large. Since some sites didn’t / don’t have feeds, I had to use online tools to make feeds out of site content. PucaTrade, for instance, has fine articles, but since it’s not originally a blogging platform, there are no feeds.

Once I had my list, and my feeds set up, I tried an RSS-to-Post converter for WordPress. This took the articles from my feeds and automatically created WordPress posts for each of them. I figured all I would have to do then is cut the description down to “snippet” length, add some data, and I’d be set. The problem was that I got a lot of junk data that came over with the articles, and I often still had to skim the articles themselves to figure out the topic, so I wasn’t saving much time with the whole “importing” process. By the time I got rid of the junk data and added good data, I had probably spent more time than I would have doing it manually.

So I switched to doing it manually. To index an article I would select one from the feed. Then I would either copy the article description used in the feed or use the initial paragraph of the article as a “snippet” or description for the index. Then I would add the author information at the top, and the link under the description telling viewers to “Read the original story at LINK”. Then I would add a category for the format, e.g. Pauper, and tags for all the other pertinent info, e.g. “commander, mayael the anima, josh, cmdr decks, uriah oxford, gatheringmagic.” That example used this article as a source, and you can see an issue that’s common in the process: the author information is hard to find. Uriah Oxford edits the CMDR Decks series, great. But Josh put this decklist together, and his name never shows up in full on the site. I’d like to give Josh credit, but who is he? What’s his last name? Having a tag for just “josh” doesn’t seem very helpful, but if I spend time trying to track down every little citation mystery, the process becomes far less efficient.
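The manual steps above can be sketched as a small helper that assembles the body of an entry: byline first, then a short snippet, then the link back. This is a hypothetical illustration rather than the tool I used; the function names, the 40-word cap, and the sample author, text, and URL are all assumptions.

```python
def make_snippet(text, max_words=40):
    """Trim text to at most max_words words, marking any cut with '...'."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + "..."


def format_entry(author, description, link, max_words=40):
    """Build the entry body: byline, snippet, then the link back."""
    parts = [
        f"By {author}",
        make_snippet(description, max_words),
        f"Read the original story at {link}",
    ]
    return "\n\n".join(parts)


# Hypothetical article data, for illustration only.
body = format_entry(
    author="Author A",
    description="This week we look at a big-creature Commander deck "
                "and the ramp package that makes it work.",
    link="https://site-one.example/mayael",
    max_words=8,
)
print(body)
```

Putting the byline and the link inside the generated body, and capping the snippet length in one place, is one way to keep those details from depending on each indexer remembering them.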

I realized pretty quickly that I wasn’t going to be able to do the whole index by myself. I needed help and I needed a good way to break things down so everyone’s workload was manageable. I put the whole thing aside, actually, for quite a few months. Then I started this website. With the site came authors who were willing to contribute to the Index, so I created some guidelines, broke the index down by format, and let them loose. For the most part, they did a really great job, but there were problems.

Problem #1: For an index to be useful, it has to be consistent

An index should either be exhaustive, trying to include EVERYTHING in its given domain, or it should be selective, carefully picking only the “best” things for inclusion. Either way, it should be clear to the reader which it is, and it should be consistent. Unfortunately, when people are working for free, and when they have busy lives and their own time constraints, it is difficult to ensure a consistent approach. We wanted to be exhaustive, but some articles were falling through the cracks, which was undermining the value of the entire thing.

Problem #2: An index should add to the value of the original resource

In this case, the idea was that the Index would point more readers to authors’ articles, and they would be happy for the additional traffic. The issue was that, in some cases, indexers were including too much of the original article in their snippets / descriptions, and some authors of those original articles felt like the index was basically “stealing articles” for our own profit. In every instance where too much of the original content was used, the issue was fixed within 24 hours. It never should have happened in the first place, but it did, and people noticed, and we lost some respect for a project that hadn’t even gotten its feet under it yet.

The byline was removed in the list, but still showed on the articles; I changed this to read “Index” and always included the byline of the original author first thing in the index entry, but byline issues still occurred and also drew criticism.

Links to original articles sometimes didn’t show up. I checked these daily and added them, but it was just another consistency problem to add to the pile, and more tinder for our critics when it did occur.

Problem #3: An index should be easy to use and should provide reliable results

For the first month or so we had the index, the main idea was to collect material, i.e. build the list, so that we had some entries to work with. This probably should have been done in private, really, but I wanted to have an easy way to see for myself and to show the progress we were making. It was, basically, just a list of articles other people had written. We had added subject categories for formats, and tags for sites and authors and content, but that was easy to ignore.

After that first month I looked at creating a search interface for the index. I was up to the task, but there were some hurdles. Namely, since the Index entries used a lot of the same metadata as our own original articles, separating the two was more challenging than I thought. WordPress is a great tool, but search options, even with plugins, are pretty lackluster. While I was fussing with getting search to work, indexers were going on vacation and missing articles, and I realized that all the work I was doing wasn’t going to make a difference. We weren’t consistent enough; articles would fall through the cracks and people who came to search the index for Mayael lists might only find 50%, maybe even 85%, of the articles published within our Index time-frame. The first time someone realized we had missed “X” article on a subject for our index, they would realize the index wasn’t reliable, and they would stop using it. And they’d be right.

So what?

So I scrapped working on search, and I scrapped the index, even though I think it would be useful if done right. Unfortunately, in the time we had it, we never got to show off the good side; we never got to demonstrate what the index was meant to be. All we did was have a list, for a while, of Magic articles people on other sites wrote; all that did for us was draw ill will, despite our best intentions.

I haven’t given up on the idea of an index of MTG articles, but I need to find an even better way to go about it. It needs to be on its own site, probably, and it needs to launch with a fully functional search tool. Most importantly it needs to be exhaustive, and that’s the hardest, most time-intensive part. I have an idea that I could get students in graduate school who are studying to be librarians, or even unemployed graduates who want experience, to work on the index. I’m still not sure that would be a consistent enough source of indexing, though. Interns come and go and it’s hard to get them really invested; the unemployed will only stick around until they get a paying job offer. Who can blame them?

For now, there’s an easy moral.

I set out to do something different and it didn’t work. I’m okay with that, but I’m disheartened that people saw the index as a way to capitalize on their work and not as the great tool it was meant to be. This article might not address all the concerns I heard along the way, but I wanted to provide some sense of transparency for the project, and its purpose, and, in the end, the reasons for its demise.

If you made it all the way down here, I congratulate you. Hopefully it was somewhat edifying, or interesting, or maybe even entertaining. If you have any questions about the Index, any at all, you can post them in the comments, or feel free to send me an email.