Phydeux

Please explain to me how this is balanced matchmaking?

71 comments in this topic


16
[-N-]
Members
14 posts
14,806 battles

Pretty sure the purple and green clan members are far better than the guys on my side.  Does MM not take a player's WR, etc. into account?

World of Warships Screenshot 2021.02.17 - 21.18.04.02.png

  • Haha 2
  • Boring 4

725
[TIMT]
Members
1,365 posts
4,910 battles
6 minutes ago, Phydeux said:

Pretty sure the purple and green clan members are far better than the guys on my side.  Does MM not take a player's WR, etc. into account?

It does not, never has.

  • Cool 3
  • Thanks 2

5,032
[WOLF9]
Wiki Lead
15,759 posts
4,771 battles

WR?  What's that?

  • Funny 1

SuperTest Coordinator, Beta Testers
6,802 posts
12,597 battles

Win rate is not used for matchmaking. Ranked is a jumble of players mixed together.

1,211
[RAGE]
Members
2,004 posts
9,095 battles

I have pretty average WR in randoms, as do many in my clan.  Your success and rank in Clan Battles comes from coordination and coms, something very lacking in randoms (and ranked is just random battles with smaller teams and better rewards).  Purple clans are pretty selective, but I wouldn't read too much into how "good" a player is just based on their being in a storm or typhoon clan.

1,846
[RKLES]
Beta Testers, In AlfaTesters
6,488 posts
22,872 battles
7 minutes ago, DouglasMacAwful said:

I have pretty average WR in randoms, as do many in my clan.  Your success and rank in Clan Battles comes from coordination and coms, something very lacking in randoms (and ranked is just random battles with smaller teams and better rewards).  Purple clans are pretty selective, but I wouldn't read too much into how "good" a player is just based on their being in a storm or typhoon clan.

Shh.....

You're spoiling my reverse boosting....

Yes I hide behind my stats....

1st 10k games spent mostly derping around in dds with friends....

 

 

1,046
Members
5,651 posts
10,985 battles

13K battles and you don't know how MM works????  WR is a lousy way to determine any type of "skill".

Edited by CylonRed
  • Cool 1

6,382
[WOLFG]
Members
32,288 posts
9,989 battles
17 hours ago, Phydeux said:

Does MM not take a player's WR, etc. into account?

 

Ship type and tier, that's it.

258
[WOLF2]
Members
310 posts
7,287 battles
1 minute ago, Skpstr said:

Ship type and tier, that's it.

I'm genuinely curious -- how do you know that for certain? Has that ever actually been explicitly stated by Wargaming?

Members
0 posts
18 minutes ago, HTSMetal said:

I'm genuinely curious -- how do you know that for certain? Has that ever actually been explicitly stated by Wargaming?

Yes. In some rare circumstances, ship types won't be perfectly mirrored, but for the most part they are. The only time the tiers aren't mirrored is when you get stuck with a fail div.

Edited by JaeYuu80sevens

753
[-TRM-]
Members
2,513 posts

You don't get to choose.

In grade school gym, when two leaders have to pick people for a dodgeball game, the slow and unagile kids are the very last to be picked, and even then the ones with the fewest limitations go first.

You load into a map with whatever you want to fight with and play the game. Otherwise in Russia, your game plays you via MM etc. There is no point in stressing about stuff you cannot change as a player.

Members
7,360 posts
17,400 battles
9 hours ago, CylonRed said:

13K battles and you don't know how MM works????  WR is a lousy way to determine any type of "skill".

So what is skill? If I want to win games, I want players on my team that can win games. Not sure what other metric I would use. 

Edited by DuckyShot
  • Cool 2
  • Thanks 1

6,382
[WOLFG]
Members
32,288 posts
9,989 battles
39 minutes ago, DuckyShot said:

So what is skill? If I want to win games, I want players on my team that can win games. Not sure what other metric I would use. 

That's my take. If some guy can't hit the broadside of a barn, and can't miss running into an island, but he wins, I want him on my team.

7,068
[WORX]
Members
12,638 posts
19,907 battles
19 hours ago, Phydeux said:

Does MM not take a player's WR, etc. into account?

Nope... Never will either... As your pic shows... WR can be manipulated...

258
[WOLF2]
Members
310 posts
7,287 battles
1 hour ago, JaeYuu80sevens said:

yes. In some rare circumstances, ship types won't be perfectly mirrored but for the most part they are. The only times when the tiers aren't mirrored is when you get stuck with a fail div.

Because I was curious, I decided to do some Google-Fu about the Wargaming matchmaker and how it functions, and I discovered this:

Dynamic battle session matchmaking in a multiplayer game -- Patent No. US8425330B1 -- https://patents.google.com/patent/US8425330B1/en

This is a very interesting read, and if Wargaming (the patent holder) has implemented even a tenth of the complexity of the matchmaker described in the patent, it's FAR more complex than what is assumed here (just ship type & tier).

  • Boring 1

SuperTest Coordinator, Beta Testers
6,802 posts
12,597 battles
34 minutes ago, HTSMetal said:

Because I was curious, I decided to do some Google-Fu about the Wargaming matchmaker and how it functions, and I discovered this:

Dynamic battle session matchmaking in a multiplayer game -- Patent No. US8425330B1 -- https://patents.google.com/patent/US8425330B1/en

This is a very interesting read, and if Wargaming (the patent holder) has implemented even a tenth of the complexity of the matchmaker described in the patent, it's FAR more complex than what is assumed here (just ship type & tier).

 

The matchmaking described in that patent requires a significantly larger standing player base and was never practically implemented in Warships. We can tell because there are people with sub-45% WRs and people with over-60% WRs, and nothing is restraining either group from drifting further.

258
[WOLF2]
Members
310 posts
7,287 battles
2 hours ago, Compassghost said:

 

The patent for that matchmaking requires a significantly standing larger player base and was never practically implemented in Warships. We can tell this because there are people with sub-45% WRs and people with over-60% WRs with unrestrained growth.

I see that you're involved in supertesting. Are you confirming from a development perspective that this matchmaking patent is somewhat or entirely functionally irrelevant to how the current matchmaker in WOWS actually works? If so, I appreciate the insight!

I highlighted a portion of your quote to see if you could clarify that for me -- I don't specifically see anywhere in the patent where controlling player win rate is expressly stated as a goal of the matchmaking patent. It reads as if it's designed to control battle composition through ship tiers and the number of battles a player has played in a given ship, the success rate they've had in it, and also a check on battle difficulty depending on whether or not the ship is a premium. In reading this it almost sounds like, at times at least, the patented matchmaker can or does use a player's "success" rate in a specific ship to create and balance teams...so if that is the case, then there is more to the matchmaker than the simple "ship type and tier" that seems to be the common assumption if it has been implemented in some way in WOWS.

Either way I find this fascinating.

 

Edited by HTSMetal

1,150
[DRFTR]
Beta Testers
3,919 posts
27 minutes ago, HTSMetal said:

I see that you're involved in supertesting. Are you confirming from a development perspective that this matchmaking patent is somewhat or entirely functionally irrelevant to how the current matchmaker in WOWS actually works? If so, I appreciate the insight!

I highlighted a portion of your quote to see if you could clarify that for me -- I don't specifically see anywhere in the patent where controlling player win rate is expressly stated as a goal of the matchmaking patent. It reads as if it's designed to control battle composition through ship tiers and the number of battles a player has played in a given ship, the success rate they've had in it, and also a check on battle difficulty depending on whether or not the ship is a premium. In reading this it almost sounds like, at times at least, the patented matchmaker can or does use a player's "success" rate in a specific ship to create and balance teams...so if that is the case, then there is more to the matchmaker than the simple "ship type and tier" that seems to be the common assumption if it has been implemented in some way in WOWS.

Either way I find this fascinating.

 

this again...  (you didn't super sleuth anything that hasn't been seen before)  wg have stated many times it is not being used.  but you can collect your own data and decide for yourself.

it doesn't actually balance teams iirc... if you are doing well, it may put you in more matches as lower tier, doing poorly it may more than likely make you top tier...

i also don't think it has ever been used in tanks either, which has a much much higher player population.

 

 

 

 

  • Meh 1

SuperTest Coordinator, Beta Testers
6,802 posts
12,597 battles
31 minutes ago, HTSMetal said:

I see that you're involved in supertesting. Are you confirming from a development perspective that this matchmaking patent is somewhat or entirely functionally irrelevant to how the current matchmaker in WOWS actually works? If so, I appreciate the insight!

I highlighted a portion of your quote to see if you could clarify that for me -- I don't specifically see anywhere in the patent where controlling player win rate is expressly stated as a goal of the matchmaking patent. It reads as if it's designed to control battle composition through ship tiers and the number of battles a player has played in a given ship, the success rate they've had in it, and also a check on battle difficulty depending on whether or not the ship is a premium. In reading this it almost sounds like, at times at least, the patented matchmaker can or does use a player's "success" rate in a specific ship to create and balance teams...so if that is the case, then there is more to the matchmaker than the simple "ship type and tier" that seems to be the common assumption if it has been implemented in some way in WOWS.

Either way I find this fascinating.

 

 

The developers have publicly talked about it in the past. They have never used it. Most of my background in theoretical matchmaking comes from working with machine learning and P versus NP problems (https://en.wikipedia.org/wiki/P_versus_NP_problem).

When we examine a pre-defined scenario, it's easy for us to say "Oh, just use Win Rate" or "Try PR." These numbers certainly do represent "success," but they ultimately reduce a player's skill to a single number. The problem is that this assumes WOWS, a complex game with up to 12 v 12 and completely different scenarios in terms of difficulty, can be summed up by win rate, with players treated as adjustable pieces to balance teams out. This is much easier in a 1v1 game like chess, because both players have almost-identical scenarios (minus the 52/48 WR benefit for white over black). This game has not only 4 ship classes, but probably a few dozen sub-classes within them, as well as several hundred individual ships, some of which are so unique it's hard to categorize them consistently. And that's only from the player's perspective. A complex matchmaker would also have to take the player's opponents into account for improved matchmaking and balancing.

For example, a player who runs a Shimakaze may find great success against teams with low radar/hydro count, but have terrible games against carriers, Russian radar, and hunter-killer DDs. You can't judge that very easily by win rate or average damage, because the game's meta is fluid outside of damage.

Other factors a theoretical matchmaker would have to include would be modules and captain skills. Does Minotaur have Radar or Smoke? Should games be balanced by DFAA? Does a good player with a 0pt captain outweigh a meh player with a 21pt captain? These answers are tough to 100% determine even for us. Would we say 100% favors the good captain? 95%? There are so many variables that can be used to determine player strength and weakness that actually sitting down and listing even some of them leads to a very complex implementation.

It's easy to check our answers by looking at a match the random matchmaker made and say "Oh, this is terrible, we could do better." And that's a P versus NP style problem: checking a result is quick, but finding a better one is not. We can tell very quickly when MM poops the bed, but given the provided inputs, would a fully functional super-smart MM program actually be able to do better? There are only a few moving pieces for each slot, and with divs blocking and timers ticking, getting something 80% right is arguably pretty good for a system that is entirely based on RNG. Whether these individual tweaks would improve a matchmaker beyond 80%, and whether they are cost- and time-effective, are questions that need to be considered as well.
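
To make the "easy to check, hard to find" point concrete, here is a minimal Python sketch. It assumes every player has already been boiled down to one rating number (the very simplification questioned above), and `imbalance` and `count_splits` are hypothetical helper names, not anything taken from the game or the patent.

```python
from math import comb


def imbalance(team_a, team_b):
    """Checking a given lineup is cheap: one pass over each team."""
    return abs(sum(team_a) - sum(team_b))


def count_splits(n_players):
    """Number of distinct ways to split n_players into two equal teams.

    Dividing by 2 avoids counting a split and its mirror image twice.
    """
    return comb(n_players, n_players // 2) // 2


# Assumed example ratings (win-rate percentages), purely illustrative.
green = [60, 45]
red = [50, 55]
print(imbalance(green, red))   # gap between the two team totals
print(count_splits(24))        # candidate lineups for a 12 v 12
```

For a 12 v 12 that is 1,352,078 possible lineups before ship type, tier, and division constraints are even considered, and each extra variable per player makes scoring every candidate more expensive.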

 

Practically speaking, if a person or team COULD make a working machine learning algorithm to handle all this and derive a functional MM that actually works, it would be quite the accomplishment. The patent unfortunately does not properly delve into actual implementation, and I am not confident that "success rate" amounts to anything more than 2 or 3 numbers, which is too little to be all that effective, or even practical.

Edited by Compassghost
added conclusion!
  • Cool 3

1,046
Members
5,651 posts
10,985 battles
13 hours ago, DuckyShot said:

So what is skill? If I want to win games, I want players on my team that can win games. Not sure what other metric I would use. 

That is the problem and it is a rabbit hole every time someone goes thru the thought process.  Many anti-MM folks believe it is painfully easy but every time they have tried to make it "easy" - there are huge issues. 

The big problem is: how do you know that the lousy 38% WR player is not actually moving their WR up like I did?  There are so many variables that affect the player and their chances of winning (flags, capt skills/points, ship upgrades, divs, clans, users using OP ships the vast majority of the time....).   Defining "skill" is freakishly difficult, and WR does not even come close to defining skill.

258
[WOLF2]
Members
310 posts
7,287 battles
10 hours ago, Compassghost said:

 

The developers have publicly talked about it in the past. They have never used it. Most of my background in theoretical matchmaking comes from working with machine learning and P=NP problems (https://en.wikipedia.org/wiki/P_versus_NP_problem).

When we examine a pre-defined scenario, it's easy for us to say "Oh, just use Win Rate" or "Try PR." These numbers certainly do represent "success" but they are ultimately derivations of a player's skill down to a single number. The problem with that is it indicates that WOWS, a complex game with up to 12 v 12 and completely different scenarios pertaining to difficulty, can be portrayed by win rate, and using players as adjustable pieces to balance those out. This is much easier in a 1v1 game like chess, because both players have almost-identical scenarios (minus the 52/48 WR benefit for white over black). This game has not only 4 ship classes, but probably a few dozen sub-classes based on those ships, as well as several hundred individual ships, some of which are so unique it's hard to categorize them consistently. And that's only from the player's perspective. There's also the player's opponents that a complex matchmaker would have to take into account for improved matchmaking and balancing.

For example, a player who runs a Shimakaze may find great success against teams with low radar/hydro count, but very terrible games against carriers, Russian radar, and hunter killer DDs. You can't judge that very easily by win rate or average damage, because the game's meta is fluid outside of damage.

Other factors a theoretical matchmaker would have to include would be modules and captain skills. Does Minotaur have Radar or Smoke? Should games be balanced by DFAA? Does a good player with a 0pt captain outweigh a meh player with a 21pt captain? These answers are tough to 100% determine even for us. Would we say 100% favors the good captain? 95%? There's so many variables that can be used to determine player strength and weakness that actually sitting down and listing some of those out creates a very complex implementation.

It's easy to check our answers by looking at a match the random matchmaker made and say "Oh, this is terrible, we could do better." And that's a P=NP problem. We can tell very quickly when MM poops the bed, but given the provided inputs, would a fully functional super-smart MM program actually be able to do better? There's only a few moving pieces for each slot, and with divs blocking and timers ticking, getting something 80% right is arguably pretty good for a system that is entirely based on RNG, and whether these individual tweaks in a matchmaker would improve beyond 80%, and whether it is cost and time-effective, are questions that need to be considered as well.

 

Practically speaking, if a person or team COULD make a working machine learning algorithm to handle all this and derivate a functional MM that actually works, it would be quite the accomplishment. The patent unfortunately does not properly delve into actual implementation, and I am not confident that "success rate" has anything more to do than 2 or 3 numbers to be actually all that effective, or even practical.

Amazing and thoughtful response and that explains a LOT! Thank you!

40
[UWANK]
Members
24 posts
6,243 battles

No it doesn't even try to create balanced teams.

World of Warships just randomizes teams.  This would have been acceptable in 1999, but here we are in 2021, and most games are complete blowouts because they don't care enough (or are not capable enough) to even make any attempt whatsoever to create a challenging and fun match by trying to balance the teams.

It's frankly pathetic, and it's the biggest reason why this game isn't worth playing outside of clan battles any more.

40
[UWANK]
Members
24 posts
6,243 battles
16 hours ago, Compassghost said:

 

The developers have publicly talked about it in the past. They have never used it. Most of my background in theoretical matchmaking comes from working with machine learning and P=NP problems (https://en.wikipedia.org/wiki/P_versus_NP_problem).

When we examine a pre-defined scenario, it's easy for us to say "Oh, just use Win Rate" or "Try PR." These numbers certainly do represent "success" but they are ultimately derivations of a player's skill down to a single number. The problem with that is it indicates that WOWS, a complex game with up to 12 v 12 and completely different scenarios pertaining to difficulty, can be portrayed by win rate, and using players as adjustable pieces to balance those out. This is much easier in a 1v1 game like chess, because both players have almost-identical scenarios (minus the 52/48 WR benefit for white over black). This game has not only 4 ship classes, but probably a few dozen sub-classes based on those ships, as well as several hundred individual ships, some of which are so unique it's hard to categorize them consistently. And that's only from the player's perspective. There's also the player's opponents that a complex matchmaker would have to take into account for improved matchmaking and balancing.

For example, a player who runs a Shimakaze may find great success against teams with low radar/hydro count, but very terrible games against carriers, Russian radar, and hunter killer DDs. You can't judge that very easily by win rate or average damage, because the game's meta is fluid outside of damage.

Other factors a theoretical matchmaker would have to include would be modules and captain skills. Does Minotaur have Radar or Smoke? Should games be balanced by DFAA? Does a good player with a 0pt captain outweigh a meh player with a 21pt captain? These answers are tough to 100% determine even for us. Would we say 100% favors the good captain? 95%? There's so many variables that can be used to determine player strength and weakness that actually sitting down and listing some of those out creates a very complex implementation.

It's easy to check our answers by looking at a match the random matchmaker made and say "Oh, this is terrible, we could do better." And that's a P=NP problem. We can tell very quickly when MM poops the bed, but given the provided inputs, would a fully functional super-smart MM program actually be able to do better? There's only a few moving pieces for each slot, and with divs blocking and timers ticking, getting something 80% right is arguably pretty good for a system that is entirely based on RNG, and whether these individual tweaks in a matchmaker would improve beyond 80%, and whether it is cost and time-effective, are questions that need to be considered as well.

 

Practically speaking, if a person or team COULD make a working machine learning algorithm to handle all this and derivate a functional MM that actually works, it would be quite the accomplishment. The patent unfortunately does not properly delve into actual implementation, and I am not confident that "success rate" has anything more to do than 2 or 3 numbers to be actually all that effective, or even practical.


I understand that this is not an easy problem to solve.  I'm not expecting a perfect solution.  I am asking for only the most minimal effort.  ANY effort whatsoever would be better than what they do now, which is nothing at all.

Those situations where the team formation obviously poops the bed are a good place to start.   There is a lot of low-hanging fruit here.

The most basic stat-based balancing (WR or PR or avg-xp, or some formula using those) will not create perfectly balanced teams and eliminate blowouts, but it can improve them and reduce the number of stomps that we see.

Perfection is not an attainable goal, and that's never a reason to not even try to improve it.
 

725
[TIMT]
Members
1,365 posts
4,910 battles
1 hour ago, Commander_Corgi said:

The most basic stat-based balancing (WR or PR or avg-xp, or some formula using those) will not create perfectly balanced teams and eliminate blowouts, but it can improve them and reduce the number of stomps that we see.

Perfection is not an attainable goal, and that's never a reason to not even try to improve it.

You would think so, but when you try to come up with a detailed idea of how to handle this, you will notice it is not so easy. From personal experience, I tried to write a Python script to see how this could be done, and I ran into the following difficulties:

  • Finding an efficient balancing algorithm. Sounds stupid, and I am fairly sure that there are decent algorithms and implementations out there. Nevertheless, the underlying problem (dividing a set of numbers into two equal-sized groups so that their respective sums differ the least) is actually a really hard one. If I remember my basic computer science, it is a variant of the partition problem, which is NP-hard, so good luck finding an exact algorithm that scales well enough for widespread use.
  • Taking a value such as WR or PR as a representative of skill only works because MM is random. Given that the random aspects average out over large numbers of games, the only constant for a particular player is their own ability, thus their success (WR/PR) is related to their ability. Introduce another player-specific element (MM based on WR/PR) and now their WR/PR is also influenced by this. Example: If a 45% player suddenly starts seeing not the server average population, but only other 45% players, the WRs of all of these players will go towards 50%. The same goes for the 60% crowd. So now everyone's WR is driven towards 50%, meaning WR becomes worthless as a skill metric, and PR similarly becomes worthless after a while under this MM system. Further, all systems using formulas etc. built on these parameters, or on any parameters of just the players, become useless.
  • The point above would require measuring the skill of a player by something other than WR, like performance relative to the players they encounter on the other team (similar to the idea behind Elo-based systems). At this point you would need to look at player stats/performance in terms of how players did relative to the players they actually met in game (and how those people did previously, etc.). This is no longer anywhere near any basic MM balancing, unfortunately.
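
For what it's worth, the first difficulty above can be sidestepped with a heuristic rather than an exact answer. Below is a rough Python sketch of a greedy split, assuming one rating number per player and an even player count; `greedy_split` is a made-up name for illustration, and this is not the script I mentioned nor anything WG uses. It is not guaranteed to find the best split, only a reasonable one quickly.

```python
def greedy_split(ratings):
    """Greedy heuristic for two equal-sized teams with similar rating sums.

    Strongest players are placed first; each goes to whichever team
    currently has the lower total, unless that team is already full.
    """
    team_size = len(ratings) // 2
    team_a, team_b = [], []
    for rating in sorted(ratings, reverse=True):
        a_full = len(team_a) >= team_size
        b_full = len(team_b) >= team_size
        if b_full or (not a_full and sum(team_a) <= sum(team_b)):
            team_a.append(rating)
        else:
            team_b.append(rating)
    return team_a, team_b


# Assumed win-rate percentages for a 6 v 6, purely illustrative.
queue = [48, 52, 61, 44, 50, 55, 47, 53, 58, 46, 49, 51]
a, b = greedy_split(queue)
print(a, sum(a))
print(b, sum(b))
```

Note that this only addresses the arithmetic. It does nothing about the second and third difficulties, since the ratings fed into it would still be distorted by the matchmaker itself.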

I hope this gives you a better understanding of why people say that "just some basic balancing" is not a thing, and that random MM, with all its flaws, is actually not so bad when you actually have to think about the implementation of such a system.
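
As a footnote to the Elo remark in the list above, this is roughly what a relative-performance rating looks like in its simplest 1v1 form. It is the standard Elo update with an assumed K-factor of 32; how to extend it to 12 v 12 teams with different ships is exactly the hard part and is not attempted here. The function names are my own.

```python
def expected_score(rating, opponent_rating):
    """Probability-like expectation of winning under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def updated_rating(rating, opponent_rating, score, k=32):
    """New rating after one game; score is 1.0 for a win, 0.0 for a loss.

    k=32 is an assumed K-factor, not a value from any game.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


# Beating an equal opponent moves the rating by half of k.
print(updated_rating(1500, 1500, 1.0))
# Beating a much weaker opponent moves it far less.
print(updated_rating(1700, 1500, 1.0))
```

The key property is that the rating change depends on who you played against, so it does not collapse toward a meaningless 50% the way WR would under skill-based MM.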

  • Cool 3
