The Problem with Numerical Scores

William Shakespeare once penned a line for one of his most indelible characters to utter: “What’s in a name?” Shakespeare used the rose, which would smell the same if it were called something other than a rose, as an example of how titles can sometimes mislead. Within the journalistic genre in which I put my pen to the test (when I’m not writing books like this that you should read… I know, shameless self-promotion!) the problem is often not names. Though we could easily re-apply Shakespeare’s words to those times when bands change members/sounds/styles, as they are so prone to do, the problem for those of us who review albums (and those of us who read reviews) is of a different sort.

For us, the problem becomes the dreaded numerical score that we must (as a convention set in place long ago) ascribe to each album we review. Now, let’s be clear. This is not solely the domain of musical “products.” For a long time, films, artwork, albums, books, video games, and other forms of creative property have been looked at by critics and given a numerical value based upon their supposed merit. However, there are a few problems that are simply intrinsic to the process. Googling the words “IGN scoring system” alone provides enough insight into the problem to confound even the wisest of us (IGN has become infamous for changing its scoring system semi-frequently).

Now, I am a man of many hats. In addition to promoting my recent book (there, I did it again… shameless), writing for IVM, and working as a youth pastor, I am also an adjunct college professor. As a professor, I get to give my students a rubric, syllabus, and other tools that let them know exactly what I expect from written papers and how I will be grading them. When I grade these papers, I grade them on how well they wrestled with the content and adhered to the rubric. If a student failed to work through a topic or point that was clearly outlined, they lose credit.


Unfortunately, there can be no such rubric for creative works such as music. Trying to invoke such a guide goes directly against the heart of creativity itself. Creative projects of any form are always subjective products and, therefore, can only ever be “rated” subjectively. That is why all ratings given to creative projects of any kind are sort of wibbly-wobbly.

I came excruciatingly close to titling this article, “The Problem with Numerical Scores and Why Doctor Who Has the Answer to Everything.” You see, this summer I fell in love with Britain’s true gift to the world, Doctor Who. The show (spanning an incredible 50 years!) features the exploits of the Doctor, a time-traveling being who just can’t avoid frequently saving the world. In the show (starting during the tenure of the greatest of the Doctors, David Tennant) the Doctor describes time itself as “wibbly-wobbly timey-wimey stuff.” This is the explanation given as to how the Doctor can run into people at different points in their own lives, yet cannot cross back into his own, among other paradoxes.

So, when I see comments on our site about how two albums of disparate style can each receive the same score, my answer is that scoring an album is really just “Wibbly-Wobbly Album-Reviewy stuff.” Because there can never be a rubric for what music in general (or any of its sub-genres or artists in specific) can or must sound like, album reviews on any site are all subjective fare. For example, I gave August Burns Red’s and Leaders’ most recent albums each a 4/5. Are the albums exactly the same level of “good”? Wibbly-wobbly.

Just so I’m not adding a confusing element into the mix, I’m not aiming this article at any specific post on our site… or posts in general, but I do think that some specific posters have had some great points about the idea of the numerical scoring system in the first place. You see, even though ABR and Leaders are in the same overarching field of music… even in the same overall classification of “hard” music… there is no way to truthfully compare the two. One is metalcore, while the other is hardcore. While the two may overlap at times, each has its own history and set of rules. It would be even more mind-bending to try to compare August Burns Red’s newest album with, say, Hawk Nelson’s or Lecrae’s albums. Even though we’re still rating “music,” it’s like trying to compare a review of C. S. Lewis’ Chronicles of Narnia with one of Van Gogh’s Starry Night. The two items are completely different works of creativity. Even if both were given a perfect 5/5 (on IVM’s scale) for the simple sake of argumentation, the two ratings would still mean different things.

So, the numerical scores themselves are subject to the type of creativity they are supposed to be judging, but they are further bifurcated in the music world by what genre or style of music each represents. I would not expect to come away with the same score on a CCM album as Jonathan Andre, nor would I expect him to come away with the same score as me on Demon Hunter’s or For Today’s next album. This is why our site, like most other review sites, divides its writers up by their preferred genre.

So, it gets even more wibbly-wobbly. Not only are there different types of creative outlets with no ability to set a rubric, but there are also personal preferences. What if I were to tell you, purely for illustrative purposes, that The Dick Van Dyke Show is my favorite all-time TV program (with Doctor Who certainly making its way up the list) or that C. S. Lewis is my favorite author? What if I were to give each a 5/5 in the entirety of their corpus? You could still respond with, “Oh, that’s all well and good, but I really don’t like reading and anything that came out before 2005 on television is just a little too old for my TV tastes.” God forbid anyone not like reading or things that came out before “their time.”

So, when I give Leaders’ Indomitable the same numerical score as Jonathan Andre does for The City Harmonic’s newest, does that mean that the two albums are exactly equal in measure? Well, that’s sort of wibbly-wobbly. Perhaps the answer is in honing the ratings scale itself. Perhaps if we went back to a ten point scale, the problem would solve itself. Perhaps that is the answer?

Would the problem be solved if we were to use a 100 point scale? What about a 1,000 point scale (curse the day such a thing would ever exist)? No. No matter what scale is being used, the same underlying problems remain. Different people with different tastes are going to find differing levels of enjoyment out of artistic endeavors that, themselves, are not a fit for an objective ratings system.

You see, art is subjective. Now, I am a huge proponent of the fact that, contrary to what the cult in culture would have you believe, there is such a thing as absolute truth. There are truths that are true no matter who you are, no matter what era or epoch you’ve lived in, and no matter what circumstances come to bear on the situation. Absolute truth exists. What a revolutionary statement that is! However, numerical scores on creative projects are never an absolute truth.

Take for example my own review of Project 86’s Wait for the Siren. For some reason when I first reviewed the album I foolishly gave it a 4/5. Upon further listening to the masterpiece that Wait for the Siren is, I later changed the score to a 5/5 (read my reasons why, and how I tried to maintain journalistic integrity after). Or, take for example Disciple’s Oh God Save Us All, which I also gave a 4/5. In looking back on that album, I would probably give it a 3 at this point. While I still maintain that it holds up better than some of their discography, it simply hasn’t had a lasting impact on me or the genre with a little distance between the review and now.

So, does that mean I have no true journalistic integrity? You may be inclined to think so. You’d be even more inclined to think that if you knew just how many of my review scores I actually question whether I got right. But, I dare say I’m not alone in this conundrum. IGN, a popular media review machine itself, has famously changed its scale a few times, and is infamous for setting 7/10 as the “average” rather than 5/10. You see, numerical scores are always a problem with creative works. It’s all just a bit wibbly-wobbly.

So, do we do away with numerical scores altogether? Some of our lead staff have questioned the idea! That might make people read the reviews rather than skipping straight to the score. Doing away with numerical scores also immediately relieves the problem of too many reviewers giving 3/4’s (or 7/10 elsewhere) too often, among a host of other issues. You see, mathematically a “3” is the definition of “average,” but how do you define “average” for each genre and each artist within each genre? And, a bigger issue, does the statistical “average” change with each release, or at least each “game changing” release? How does that give new perspective on scores for past albums? Wibbly-wobbly.

Getting rid of numerical scores throws the baby out with the bathwater and does little to solve the problem. There is a reason that almost every site has a socially sourced review system in place. If I’m looking for a new blender, a new app, or a new album, I want to see that shiny numerical score staring back at me, comforting me with its warm glow. It lets me know that everything will be alright. I also know many people come to sites like ours and skip straight to the numerical score to gauge whether or not they want to even invest their time in reading about a creative project in the first place. Heck, I’ll admit it. I’ve done that. If I am only mildly interested in an album, I’ll sneak straight to the numerical score and decide from there if I want to read on or not.

So what is the conclusion from all this incoherent rambling? Could you have missed it? My main point, recurring throughout this post, is simply this: rating creative efforts with any sort of scale is like trying to give a cat a bath. You’re going to have to fight and struggle to make it happen. You’re going to get some cuts and bruises along the way. But, in the end, cats need baths.

Numerical scores may be imperfect, but they are a necessary imperfection. When you see a score on a site like ours (or, you know… ours), what you’re seeing is a personal (educated, I might hope) opinion about a specific album from a specific band in a specific genre. In truth, the number is the reflection of both the long history of music itself and the varying history of the reviewer who assigns it.

Or, to put it another way: It’s all just a bunch of wibbly-wobbly timey-wimey stuff.