Snargfargle

A small test of the Matchmaker


5,418
[PSP]
Members
10,125 posts

Playing Tier VIII Atago all games. 

Conclusion:

  • MM is random for average WR.
  • MM is random for average XP.
  • Tiering is not biased.
  • Higher WR team not guaranteed a win.

Per-game team averages (OT = other team):

Tier   My team WR   My team XP   OT WR   OT XP   WR wins
10     47           1233         48      1406    1
8      50           1349         52      1122    1
8      47           1021         50      1250    0
8      48           904          49      916     1
10     42           1055         48      1058    0
8      52           1088         52      1097    0
8      52           1126         49      1142    0
8      51           1067         54      981     0
10     47           1334         52      1325    0
8      48           1141         48      1115    0

Means:
8.6    48.4         1131.8       50.2    1141.2  0.3

Mann-Whitney U (MWU) results, NS = not significant:
MWU NS   WR: 48.4 vs 50.2
MWU NS   XP: 1131.8 vs 1141.2

107
[WOLFG]
Members
222 posts
5,775 battles

Nice info, Snarg, but MM likes you and doesn't like me. Alas, when I play my Cleveland, which is tricked out a bit for AA, I get fewer battles with CVs. However, if I play my Graf Spee, it's torp bombers all over the place. Maybe it's just confirmation bias, but I want MM to like me, so I am prepared to make an offering, maybe sacrifice a goat.

569
[CVA16]
Members
2,788 posts
8,425 battles

Snarg, Ghostdog here, brother. You and I are tight. We've fought together, we talk offline, and you have included my ship, the BB DogPound, and myself in your parodies. But Snarg, I have to admit that you got me, Skip, Pippi and Buttkuss confused by your numbers. Remember, we are dogs and shoot red ships. Love you, man.

 


1,739
[GWG]
Members
6,518 posts

Um...  

Your team's WR was higher than the other team's in only 10% of the games.

Your team's XP was higher than the other team's in only 40% of the games.

Your WR for the set is 30%

I call this a bad RNG day.  What was the server population during these battles?
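
The 10% and 40% figures above can be recounted from the table in the first post. A minimal sketch, assuming the ten rows are transcribed correctly; the helper name `share_higher` is made up for illustration, and ties (two games had equal team WR) are counted as "not higher":

```python
# Per-game team averages transcribed from the table in the first post.
my_wr = [47, 50, 47, 48, 42, 52, 52, 51, 47, 48]
ot_wr = [48, 52, 50, 49, 48, 52, 49, 54, 52, 48]
my_xp = [1233, 1349, 1021, 904, 1055, 1088, 1126, 1067, 1334, 1141]
ot_xp = [1406, 1122, 1250, 916, 1058, 1097, 1142, 981, 1325, 1115]


def share_higher(mine, theirs):
    """Fraction of games in which `mine` is strictly greater than `theirs`."""
    higher = sum(1 for m, t in zip(mine, theirs) if m > t)
    return higher / len(mine)


wr_share = share_higher(my_wr, ot_wr)
xp_share = share_higher(my_xp, ot_xp)
print(f"My team had the higher average WR in {wr_share:.0%} of games")
print(f"My team had the higher average XP in {xp_share:.0%} of games")
```

Note that these counts only describe which side had the higher average in each game; they say nothing about who won.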

 

360
[TDG]
Members
1,736 posts
9,947 battles
15 minutes ago, AVR_Project said:

Um...  

Your team's WR was higher than the other team's in only 10% of the games.

Your team's XP was higher than the other team's in only 40% of the games.

Your WR for the set is 30%

I call this a bad RNG day.  What was the server population during these battles?

I think that 30% is for times the team with the higher WR won.

Based on what I can see, I cannot be sure what the actual WR was.  (Though I would guess at least 40% from what I can see.)

The only conclusion strongly supported is the last one.

Finally, @Snargfargle, are you looking at the averages of the Green team after removing yourself?

5,418
[PSP]
Members
10,125 posts

What I am looking at here is whether the matchmaker is biased or not. This is not the only data set I've collected. I've analyzed over 50 games and all show the matchmaker to be unbiased. Statistical analysis should put the matter to rest, but I can see that some still nitpick the data and try to make it say one thing or another. Most people don't understand comparative statistics. If a test cannot reject the hypothesis that two samples were drawn at random from the same population (or from different populations with the same characteristics), then the samples are not significantly different.

If you want to disprove the hypothesis that the matchmaker is random, collect your own data and perform the appropriate statistical tests; don't just look through the data and interpret it as you see fit. With small sample sizes, a Mann-Whitney U test is appropriate.
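
For anyone who wants to try this on their own games, here is a rough sketch of a two-sided Mann-Whitney U test in plain Python, run on the ten games from the first post. It is not necessarily the tool used for the original "NS" results: the function `mann_whitney_u` is a hypothetical helper, and it uses the normal approximation with tie and continuity corrections, which is only approximate for samples of 10. An exact test or a statistics package may give slightly different p-values.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p) where U is the smaller of the two U statistics.
    Ties are handled with half-counts and a tie-corrected variance.
    """
    n1, n2 = len(x), len(y)
    # U for x: each (a, b) pair scores 1 if a > b, 0.5 if tied, else 0.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    # Tie correction: count how often each value occurs in the pooled data.
    n = n1 + n2
    counts = {}
    for v in list(x) + list(y):
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u, 1.0

    # Continuity-corrected z score, clamped at zero.
    z = max(abs(u - n1 * n2 / 2.0) - 0.5, 0.0) / math.sqrt(var)
    p = math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))
    return u, min(p, 1.0)


# Per-game team averages transcribed from the table in the first post.
my_wr = [47, 50, 47, 48, 42, 52, 52, 51, 47, 48]
ot_wr = [48, 52, 50, 49, 48, 52, 49, 54, 52, 48]
my_xp = [1233, 1349, 1021, 904, 1055, 1088, 1126, 1067, 1334, 1141]
ot_xp = [1406, 1122, 1250, 916, 1058, 1097, 1142, 981, 1325, 1115]

u_wr, p_wr = mann_whitney_u(my_wr, ot_wr)
u_xp, p_xp = mann_whitney_u(my_xp, ot_xp)
print(f"WR: U = {u_wr}, p = {p_wr:.3f}")
print(f"XP: U = {u_xp}, p = {p_xp:.3f}")
```

Under these assumptions both p-values come out above 0.05, which is consistent with the "NS" entries in the table. With only ten games the test has little power, so a non-significant result here cannot by itself rule out a modest bias.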

Note too that a higher average win rate for one team doesn't necessarily mean the team will win. In fact, more often than not in these games the team with the higher average win rate was the one that lost. There are too many complexities in the game to say that one team or another is "better" just because its players have collectively better stats.

Edited by Snargfargle

1,657
[WOLF1]
Members
4,984 posts
2,259 battles
7 minutes ago, Snargfargle said:

but I can see that some still nitpick the data and try to make it say one thing or another.

Of course, Snarg.  Most interpret the statistics to say what they wanted to see in the first place.  Of course, the other 84.86% of the statistics are made up on the spot.

 

 

1,739
[GWG]
Members
6,518 posts

This was the Iron Duke when it appeared simultaneously in both accounts.

Both had the same upgrades and stock captains, set up in the exact same way.  There was a Random 'boost' mission going on.

These are the missions chronologically the way they were played.

As you can see, I collect a bit more info on the matchmaker: team composition. You will note some class disparity in some cases.

Noisy as it is, the data showed me a couple of quirks I needed to attend to. This was before I added population count to the roster.

Iron_Duke.jpg

437
[KRAB]
Members
877 posts
7,323 battles

How you managed to get put on a team with a 42% average WR...

 

That ONE outlier makes most of the difference. I wonder how rare teams that bad (or as good as ~58%) are?

 

Also, from my work with WoT I think win rate statistics will form more of a gamma distribution than a Gaussian, if it makes a difference. 

1,840
[ARRGG]
Members
5,770 posts

I love this, Snarg. However, it will never settle the MM bias whining from those who refuse to see the numbers. Thanks for doing the work.

5,418
[PSP]
Members
10,125 posts
7 hours ago, MaxL_1023 said:

How you managed to get put on a team with a 42% average WR...

This is why you use nonparametric tests when dealing with small data sets. Nonparametric tests such as the Mann-Whitney U work on ranks rather than raw values (in effect comparing medians rather than means) and are thus robust to outliers.
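
A small illustration of that robustness, using the "my team" WR column from the first post. This is only a sketch of the mean-versus-median point, not the Mann-Whitney test itself, and treating the 42% game as the outlier is an assumption made for the example:

```python
import statistics

# "My team" average WR per game, transcribed from the first post.
my_wr = [47, 50, 47, 48, 42, 52, 52, 51, 47, 48]

# Same data with the single 42% team dropped.
without_outlier = [w for w in my_wr if w != 42]

mean_all = statistics.mean(my_wr)
mean_trim = statistics.mean(without_outlier)
median_all = statistics.median(my_wr)
median_trim = statistics.median(without_outlier)

print(f"mean:   {mean_all:.2f} with the 42% game, {mean_trim:.2f} without")
print(f"median: {median_all} with the 42% game, {median_trim} without")
```

Dropping the one bad team moves the mean by roughly 0.7 percentage points, while the median stays at 48 either way.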

Edited by Snargfargle

1,739
[GWG]
Members
6,518 posts

There is a problem with averaging. 

If one bases their conclusions on averages, then we needn't be concerned with safety from school shootings or air travel safety.

I like to look at the blown data points in the raw data and ask, "What the heck happened there?"

But I'm a safety officer at work, so that is my mindset.  Chances are that no accident will ever occur from not having safety guards on the equipment...  but that one time...  To the safety officer, managers who average risk assessment are the biggest danger of all.  Morton-Thiokol engineers said, "Don't launch the Challenger" and raised the alarm.  Managers said, "we launched it like this before....  so Launch is GO".

But we are questioning MM and using statistics.

I'm finding out many compulsive gamblers are very good statisticians, number crunchers, and math whiz kids.

They deal with the uncertainty of what I call 'Fairness'.

If one flips a coin 20 times, and it lands on tails 20 times in a row, the chances of 21 tails are astronomical. So compulsive gamblers will risk their life's savings on HEADS... But the coin flipper is no longer a physical coin. It's a program inside a machine that sees the massive bet and makes another TAILS. To add insult to injury to the compulsive gambler, the next flip is a HEADS... But by then he considered the machine broken and only placed a minimal bet.

The wonder of THIS particular program which disturbs me the most, and I see it in the data: Over time, it knows when it's being watched and becomes more balanced and fair.

ADVICE for WOWS: Quit for two hours after 3 straight losses. Play the same ship as if you are grinding it. Data shows I'm not getting better; the game is just becoming more 'fair'. And if you think that's too much thought, I must remind you that the programmers are Russians. They are genius Russians.

Do I take my own advice? Heck no. I just like blowing stuff up.

I spend more time admiring the ships in port rather than playing them.

5,418
[PSP]
Members
10,125 posts
1 hour ago, AVR_Project said:

If one flips a coin 20 times, and it lands on tails 20 times in a row, the chances of 21 tails are astronomical. 

No, the chance of the 21st coin coming up tails is 50%. The chance of flipping 21 coins and having all 21 come up tails is astronomically small (1 in 2,097,152). Each individual coin flip has a 50% chance of coming up tails.
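
The arithmetic behind that distinction, as a short sketch that assumes a fair coin and independent flips (the variable names are just for illustration):

```python
from fractions import Fraction

p_tails = Fraction(1, 2)  # assumed fair coin

# Probability of a specific run of tails, decided before any flip is made.
p_20_tails = p_tails ** 20
p_21_tails = p_tails ** 21

# Probability that flip 21 is tails GIVEN the first 20 were already tails:
# P(21 tails) / P(20 tails).
p_next_given_run = p_21_tails / p_20_tails

print(f"P(21 tails in a row)       = 1 in {p_21_tails.denominator:,}")
print(f"P(21st is tails | 20 tails) = {p_next_given_run}")
```

The run as a whole is rare, but once the first 20 tails have happened, the rarity is already "paid for" and the next flip is still an even bet.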

Edited by Snargfargle

1,739
[GWG]
Members
6,518 posts
6 hours ago, Snargfargle said:

No, the chance of the 21st coin coming up tails is 50%. The chance of flipping 21 coins and having all 21 come up tails is astronomically small (1 in 2,097,152). Each individual coin flip has a 50% chance of coming up tails.

I know that.  You know that.

But a compulsive gambler will see it differently.

575
[KP]
Members
2,066 posts
22,015 battles
19 hours ago, CLUCH_CARGO said:

I love this, Snarg. However, it will never settle the MM bias whining from those who refuse to see the numbers. Thanks for doing the work.

I think the whining is more about being uptiered. I'm doing the same in my Mogami at the moment: my last 10 battles have been 6 at Tier 10, 2 at Tier 9, and only 2 where I was top tier. I decided to finally keep track of this stuff just to see if there is anything to the tinfoil hat crowd's theories. I wish I hadn't started keeping track, as it's starting to piss me off how many times I see +2 MM.
