Thursday, September 29, 2011

Issues with my scoring system

I am sure that in reading the reviews here you wondered to yourself why exactly I set up the scoring system as I did.  Well, here is a brief explanation for some of the more common issues that I could think of.  I am sure there are other questions, but these seemed to be the most likely ones.

Why do you need a scoring system in the first place?

I find that a scoring system helps organize and delineate a person's needs and wants. I made a scoring sheet for purchasing a house and we are pretty happy. I did it with each car purchase. It is a way of taking different things, looking at them closely, and figuring out not only what works but also what you want. My hope is that the criteria I have chosen are things that people are universally interested in and that the scoring system is both informative and intuitive. If it is both interesting and useful, then the scoring system has accomplished its intended purpose.

One of the things that really changed the way people view wine was the advent of Robert Parker's very systematic scoring system.  It gave people a standard language to reference and allowed people to compare different wines from different places quickly and easily.  I have no illusions that the scoring system I have is going to be like that, but it is a start.  Someone else, someone with more knowledge, more gear, and better writing skills will come along and take another crack at this thing and give gear folks a standard nomenclature.  I'll keep trying in the meantime. 

Where did the idea of giving stuff scores based on specific attributes come from?

It was a shameless rip off of the old GamePro scoring systems, combined with the scoring system.   One thing that really gave me the idea was that, in my super nerdiness, I created a checklist of features for the flashlight I was looking for to replace my old Lumapower LM303.  I discovered that checklist in a folder on my computer and voilà...

Why 20 points?

There were a lot of ways to do this.  I could have used a 1-10 scale or a "letter grade" scale, or even the Metacritic 1-100 scale.  There are two different problems. 

With a low number of possible scores, a 1-10 system or even a letter grade system leaves too few scores.  You tend to get a lot of clumping.  There are many, many things that would get an 8, 9, or 10, but there would be a lot of variation among items with the same score.  For example, on a 1-10 scale it would be hard to give a great blade like the Delica something less than a 9 or a 10, but it is not in the same league, in my opinion, as the Sebenza, which would be a 10.

Then there is the problem at the other end.  In the 100 point scale there is just not that much differentiation between a 99 and 100.  With so many scores a one-point difference is too slight to even verbalize.  It becomes more of a feel thing or arbitrary.  20 points seems to be the happy medium: enough scores to make a difference and not so many scores that a few points is meaningless.

Is an item that scores a 20/20 perfect?

No. Claiming an object is perfect, whether it is a car, a cathedral, or a pocket knife, is really hard to do. For me, the DFII ZDP-189 is pretty close to the PERFECT EDC knife, but I know tons of people who laugh at the tiny blade. I can defend this assertion, but again it is an opinion. It got a 20/20. So did the Muyshondt Aeon, the McGizmo Haiku, and the Small Sebenza. Of those, I think that all are excellent, but there are things I would do differently if I were the head designer (Aeon: scalloped or stippled bezel like on the G2X Pro; Haiku: shorten by 1/4 of an inch; Small Sebenza: flipper instead of thumb stud). They all work well and all deserve the score of 20/20, but still there are things I'd do differently. With the DFII ZDP-189 though, there is nothing I'd change. After a few months of carry, I can't really think of anything I'd do differently. So a 20/20 COULD be a perfect score, but not necessarily.

A score of 20/20 simply means that the item is very well done in all aspects.  

How do you decide what to review?

Thus far I have reviewed only those things that I have purchased with my own money, except for the Dark Sucks MC-18B, which was sent to me by Dark Sucks himself. I bought the Skyline, the Serac S3, and the Dragonfly II ZDP-189 for review, though I had planned on buying the DF II anyway. Like most of you, I am sure, I do a lot of research before I buy something, so the stuff that I am looking at has already passed through a filter of sorts.

Why are the scores, on average, higher than the average score of 10/20?

For the reason just mentioned: I have already filtered the stuff I am going to buy, so the very low scoring items, things that would score a 1-5, have already, hopefully, been eliminated by my pre-purchase research. Since all of the items I have reviewed are things I have bought (with one exception), I am going to start out, naturally, with a high score (otherwise I would not have bought the item in the first place). That said, stuff still slips through that does very poorly, such as the Arc6 (6/20) and the Kershaw Scallion (8/20).

Why is there no consideration for price?

Here is a bit about why I don't mention price. It all goes back to this idea of performance v. value. I am not sure what you, as a buyer, are looking for. I don't know what you want to do with a given item. Is it going to be a daily use tool or a shelf queen? Are you one of those people that can't handle lithium cells or liner locks? Do you have a limited gear purchasing budget? All of these are considerations that I can't know, and thus I don't determine value or factor price into a given item's score.

If you really want a value score, divide the price by the score. For example, the Spyderco Tenacious got an 11. It costs about $30. Value: about 2.73. Compare that to the Sebenza. It got a score of 20, but costs $330. Value: 16.5. The closer to zero, the better the value (imagine a light getting a score of 20 and being $20: it would have a value of 1, meaning for every dollar you spent you got 1 point of performance).
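For anyone who wants to run that math on their own gear list, here is a minimal sketch in Python. The function name `dollars_per_point` is just something made up for illustration, and the prices and scores are the ones quoted in this post; treat it as a rough calculator, not an official part of the scoring system.

```python
# Hypothetical helper for the price-divided-by-score idea described above.
# The name and the error handling are illustrative assumptions.
def dollars_per_point(price, score):
    """Return price divided by score; a lower result means better value."""
    if score <= 0:
        raise ValueError("score must be positive")
    return price / score


# Prices (in dollars) and scores (out of 20) as quoted in the post.
items = {
    "Spyderco Tenacious": (30, 11),
    "Sebenza": (330, 20),
}

for name, (price, score) in items.items():
    print(f"{name}: {dollars_per_point(price, score):.2f} dollars per point")
```

Rounded to two decimals, the Tenacious comes out at 2.73 and the Sebenza at 16.50.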

But there are two problems with this simple math.

First, value is not simply getting more. Sometimes there are things you just need, and without them the item is worthless.  Other times additional features are just useless to a given user. For me, a knife I can't open with one hand is worthless. No score will tell me that and thus, as a value proposition, comparing the score to the price doesn't matter. I can imagine a lot of people have features like that, features that, if absent, are deal breakers. So dividing price by the score can't tell you if something is a good value, because it may have a very low ratio of price to performance and still not do the one thing you want.

Another thing is that, for me, the scale is not just one point equals slightly better. Because there are so many different attributes I am looking at on a given item, the chance that the thing gets a perfect score is low, significantly lower than the chance that it has only one or two minor drawbacks. We live in an age of pretty darn good gear, so lots of stuff will be in the 12-17 range. This means, again, that value is difficult to quantify. If, as I have planned it, a score of 20 is exceedingly rare, then simply dividing the price by the score is not fair. A score of 20 is not, in my mind, simply one point better than a 19. It represents an exceedingly uncommon confluence of factors and should be seen as significantly better than a score in the 12-17 range and more than just slightly better than a 19.

Why is there no score rating how well a knife cuts?

I had originally planned on making a criterion related to cutting performance, but after a while I realized that this did not make sense. First, when we say that a knife cuts well we are actually looking at four very different things. A knife can be good at taking very fine cuts, i.e. slicing. It can be good at detail work, like picking splinters or cutting out newspaper articles. It can also be good at hacking work, like whittling. Finally, a knife could be good at stabbing tasks. All of these are things that we can be referencing when we say that a knife is good at cutting. But since there are so many different types of cutting, I prefer to explain a blade's cutting performance in different ways. Generally a knife's cutting performance is defined by three things: 1) the blade shape; 2) the blade grind; and 3) the steel. Each of these is a criterion I score, so if you really want to know if a knife is good at cutting and which kind of cutting it is good at, look at the score for each of these three criteria.

What is the difference between "Carry" and "Retention Method" in the Knife Scoring System?

Carry means how the item feels in the pocket.  Does it weigh you down?  Does it clog the pocket opening?  Does it bang around a lot?  Does it pinch, poke, or protrude?  Retention method is bound up with carry, in that usually we are talking about a pocket clip, but it could be a nice lanyard or even a sheath.  I separated the two because they are really different things.  For example, the Leafstorm carries very well, but it has a terrible retention method, in part because the clip is oddly placed on the knife.

These are all of the questions I can think of, myself.  If you have any, let me know in the comments and I will try to respond to them. 


  1. Tony,
    I love your reviews and appreciate your scoring system. I enjoy your pragmatism, clarity, and articulation. Very well done. I would however, like to see more reviews of blades and lights go across your table!

  2. I like the scoring system a lot. You put a lot of thought into it and it's proving to be a very smart way to evaluate things. Keep up the great work man, and I'm with "MostMen" - love the knife and flashlight reviews.

  3. Tony,

    Have you ever considered posting VIDEO reviews, either on your blog or on Youtube? I think many would appreciate it...

  4. I have thought about video reviews, but I don't really have the time to do them the way I want to do them. Having a little guy takes up an enormous amount of time.

  5. This comment has been removed by the author.

  6. I agree that there is not enough variation in a 1-10 point scale, and too much in a 1-100 point scale, but why don't you try to meet closer to the middle? Something like a 1-50 point scale, 5 points maximum for ten categories? The problem with the current system is that S30V is good, but S35VN and ZDP-189 are BETTER, yet with your current scale, three knives each with blades of one of these steels would all receive a 2/2 for materials...see what I mean?