Machina said:
JWeinCom said:

*snip*

Just wanted to say your last few posts in this thread were excellent. You researched our scores way back because you felt there was a bias and wanted to see if the data would confirm it; when the data instead showed no bias against any particular manufacturer, just that we're ~10% below the Metacritic average on the whole, you changed your conclusion. Props for that.

And you're correct - we, or at least I, have always aimed for VGC reviews to have a reputation for tough scoring and for being hard to please. That's something I've cultivated over the last 10 years or so by explicitly stating it during the recruitment process, by having a firm review methodology, and by having a peer review process for all reviews. We definitely do not have a 'you must score this below the Meta' rule or attitude, but we have descriptors for each of our scores that the text of the review needs to reflect, and those descriptors set a high bar (they were the result of discussion and compromise amongst all review staff). If the reviewer genuinely thinks the game deserves an 8 on that scale and the text of the review matches the criteria for an 8, then that's ultimately what the game will get from us, even if the Metacritic average for it is, say, 6.5. And vice versa of course.

I noticed in one of your posts you disagreed with my pride in this tough approach, which is fine and I understand why people feel that way. Why do I like our approach, though? Well, firstly because I'm quite a cynical and hard-to-please guy by nature. Another major reason is that I've always wanted our scores to actually mean something, especially on the upper end of the scale. If you give out 9-10s like candy to every hyped AAA game then, to me, your scores have no meaning or weight (and you're easily pleased). What use is that to an audience, really? And doubly so to people who want an honest assessment of the game they might be thinking of purchasing. But if your average over the last 8 years is 6.3 and you give a game 9.5, then I'm inclined to take notice and at least find out more about it, and if you give something a 10 then, well, it must be really fucking good.

A third and more minor point is that I also feel more of the ten-point scale should be used. Granted, the process of getting a game to market is arduous and self-filtering, so there are very few games in the 1-3 range (unless you're mostly reviewing all of Steam's new releases), but what's the point in having a ten-point scale and then only ever using five points on it (6 - and even that one rarely - then 7, 8, 9, and 10)?

Those are just my thoughts though, and while I ultimately make the final call on site policy, I'm not some sort of dictator. Others on the team have contributed, and continue to contribute, to our overall approach to reviewing, and I'll always take their feedback on board and try to reach a consensus where possible. Evan (aka Veknoid), for example, wrote most of the review methodology text (and did a great job imo). Lee's (coolbeans) input directly resulted in several word tweaks and a complete change to our method for giving out a 10. And during recent discussions, which eventually resulted in the methodology text being altered and scores for remasters being dropped, most of the team added their own views and we ultimately reached a majority decision on the changes.

You're in charge and you can do things how you want to. That being said, I strongly disagree with the review system.

The 1-10 scale is essentially a language. The purpose of language is to communicate clearly and effectively, and when you use language differently than everyone else, that's going to lead to confusion. Which is kind of what we see here. I obviously don't think Jaicee's accusation of bias was justified, but it's not hard to see how she came to that conclusion. Like I said, I got that impression as well. And most people probably have more active social lives and would not spend all that time actually testing things out.

I get how the review methodology works, but I think it's flawed. According to the methodology, anything above an 8 is a potential GOTY nominee, at least for some category. That means about a quarter of the possible scores are reserved for a handful of GOTY nominees. Meanwhile, 6.5 or below is considered "decent", with anything below a 6 classified as an unsatisfying or incomplete product. That's 12 of the possible values.

So if anything 8 or above is reserved for GOTY candidates and anything below 7 is, at best, decent, where does that leave games that are good but not quite great? Well, somewhere in the 7 range.

Which is what I found when I looked into it (at the time there were no half values, which would have helped a little). There were zero games that scored a 0, zero that scored a 1, one that scored a 2, zero that scored a 3, four that scored a 4, five that scored a 5, seventeen that scored a 6, thirty-nine that scored a 7, fourteen that scored an 8, five that scored a 9, and zero that scored a 10. Nearly half (43.8% to be exact) of all the games scored a 7. 57% scored either a 7 or a 6. 71% scored between a 6 and an 8. Maybe that has changed since I looked into it (I believe the most recent game I looked at was Xenoblade Chronicles HD), but as I see it the review methodology funnels everything towards a 7. Of the games I looked at, Resident Evil 3 Remake, DBZ Kakarot, Iron Man VR, Minecraft Dungeons, Shenmue 3, Trials of Mana, Hatsune Project Diva, Pokemon Sword/Shield, Retro Brawler Bundle, and The Last of Us 2 all received a 7. Are all those games really of equal quality? I know there are half points now, but the review methodology still leaves almost nowhere to put "good not great" games.

And honestly, most games should be in that "good but not great" range. Games get reviewed either a) because the publisher sent in a copy, or b) because the reviewer bought it themselves and wanted to review it. Companies generally aren't going to send out many games that are genuinely bad (which is why the data I have shows only 10 reviews at 5 or below out of nearly 100 games), and very few will be GOTY-worthy. So having so few options for "good" scores is a major problem.

If you think the typical 1-10 scale is flawed, then using it in an idiosyncratic way is not the answer. Again, whether justified or not, people are going to assume you're speaking the same language as other sites that use 1-10, and that's just going to lead to confusion. A much better solution is to use an entirely different system to score games. Gamexplain, for instance, uses a system that goes from something like "hated it" to "loved it", which is a really clear way to express how the reviewer felt about the game. An F to A+ system also works really well IMO because it's familiar to people and allows a wide range of scores. You have 12 different "passing" grades, so you can still reserve As and A+s for the cream of the crop while also having a nice range of possible scores for games that are above average but fall short of greatness.
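To make that concrete, here's a quick sketch. The ladder and the score mapping below are my own made-up illustration (not anyone's actual review policy) of how an F-to-A+ scale gives you thirteen grades, twelve of them passing, with plenty of room in the middle for "good not great":

```python
# Illustrative only: one possible F-to-A+ ladder, lowest to highest,
# plus an arbitrary even mapping from a 0-10 score onto it.
GRADES = ["F",
          "D-", "D", "D+",
          "C-", "C", "C+",
          "B-", "B", "B+",
          "A-", "A", "A+"]

def to_letter(score):
    """Map a 0-10 score onto the 13-grade ladder, evenly spaced."""
    idx = min(int(score / 10 * len(GRADES)), len(GRADES) - 1)
    return GRADES[idx]

passing = [g for g in GRADES if g != "F"]
print(len(passing))    # 12 passing grades
print(to_letter(7.0))  # a "good not great" score lands mid-ladder: B+
```

The point isn't these exact cutoffs; it's that a letter scale naturally spreads the "above average but not GOTY" middle across several distinct grades instead of piling everything on a 7.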

Again, it's not my site, and you can do things how you want, but since I did moderate review comments for a while, I can say that a lot of people take reviews the wrong way (which, to be fair, is maybe unavoidable on the internet). If your goal is for reviewers to convey their thoughts on a game as clearly as possible, I don't think the current system accomplishes that. By using a completely different system you get to do things differently from other sites with less risk of being misinterpreted. Everyone's on the same page, which is the whole point of communication.

Last edited by JWeinCom - on 07 July 2021