Tuesday, 19 November 2013

Facts of MODO


There are a lot of conversations about why MODO (Magic Online) is broken. This post is designed to collect information to inform that conversation. People correctly criticize me for opinions without facts. So I am going to try and go the other way.

I do not have any access to truly insider information from Wizards of the Coast or Hasbro.

What I do have access to:
Hasbro Financials (this includes transcripts from Investor Day, Accounting and Financial Statements, Annual Investor Report).
Facebook Comments (assorted from friends and acquaintances)
LinkedIn Profiles for various WotC employees
Hipstersofthecoast.com historical review of the MODO program (highly suggested reading).
Various salary and employment websites (glassdoor, salarylist, careerbliss)
Reddit, Wikipedia, Google etc.

The TL;DR estimates
Magic Revenues: $360M
Magic Players Worldwide: 3.3M - 12M
MODO Revenues: $140M
MODO Employees: 50-150
MODO Players: 500,000 - 700,000
MODO Developer Salaries: $60,000 (Industry Median is $75,000)
MODO Costs: $60-120M

The biggest problem with accuracy is that my statements will aggregate data from 2011-2013. And the timeline isn’t exactly clear even to me. So you might feel that the following answers are an unfair characterization of the situation.

What is the Problem with MODO?
            Hipstersofthecoast explains the problem with version 2 (note current client is v3 and Beta is v4) as said by Randy Buehler:

            “You might think that we could add more servers to deal with this problem, but that’s just not how Magic Online works. We can add more game servers to handle as many games as people want to play, but there is only one master server that handles everything else that goes on (chat, trading, ratings, etc.). Every time any user does anything outside of a “duel,” Magic Online has to spend some time thinking about that user. As we add more cool new features to the game, the amount of memory that needs to be allocated to each user keeps going up. At some point, when enough users are logged in doing enough things, the whole master server comes crashing down.”


I would highly suggest reading the full article here:

From what I can tell they attempted to go from one server organizing all non-game activity to multiple, but that clearly has not worked. It is unclear if we currently operate on one server still or multiple servers which do not scale well. Further discussion on Reddit suggests that MODO is built on antiquated language/framework (.NET /non-scalable etc…). I am in no way qualified to tell if this is true (EDIT 11/20: Reddit has also since commented that I am wrong).
           
How is Magic Doing?
Very good. Based on my readings of financial statements, there has been between 150%-300% growth in revenues during the 4 years since 2008. Additionally there is an expected growth of 35% in 2013.

Based on those numbers we would have revenues of 250-500M (Million) dollars this year.

Alternatively Hasbro reports 12M active Magic players (including digital). Assuming each spends only $30 per year on average, revenues are $360M. Hasbro’s revenues from “Games” are $1.2 Billion. It has stated that Magic is the biggest brand in the portfolio. Thus the estimates seem reasonable.

*EDIT 11/20: The Hasbro 2012 report states there are 3.3M players currently, despite there being an NBC article which quotes 12M as of 2013. The only official Hasbro source which uses the 12M number is a few years old, so I will be editing the range of players.

How is Hasbro doing? Any reason to think they are pinching pennies?
            As a company Hasbro had a rough 2012 (in terms of stock price). It has since rebounded during 2013 (though that was simply consistent with US Large Cap market in general). During 2012 things were bad enough that Hasbro was engaging in layoffs and restructuring as a cost saving measure.

            However there is little evidence that Hasbro makes many decisions re: the Magic the Gathering brand. I have read (in a statement by Aaron Forsythe I believe) that Hasbro has little input on the decisions to manage Wizards of the Coast properties. Sean McGowan (an analyst at Needham & Co.) says “The Best thing [Hasbro] did was leave [Magic] alone for several years.” when discussing the explosion in Magic’s popularity.

            In all financial statements Hasbro touts Magic as its model product. They reference digital (though in most cases Duels of the Planeswalkers) and paper growth. Many people associated with MODO since 2008 (when Buehler et al. were fired) have been promoted. This includes Worth, Aaron and Elaine Chase.
It is unclear whether MODO growth has mimicked paper.

How many people play MODO?
Note I will use multiple methodologies to arrive at different estimates and see how much they align.

According to the LinkedIn profile of the Vice President of Digital Technology (see description below), he alludes to “Direct Brand revenue impact of 150M+”.

I remember seeing somewhere that MODO revenue is equivalent to North American revenues for Magic, and also equal to about 30-50% of overall revenues. Assuming this is true (it was according to Worth in 2007), and using the base number of 360M revenues overall, we estimate MODO has a revenue of 144M (40% of the total, the midpoint of that range).

Assuming the average player spends $100 a year (which seems reasonable), then there are 1.4M people who touch MODO in a given year.

Based on personal observation there are only ~5000 people on MODO at peak times. Assuming each person plays 24 hours a year on average there would be 730,000 players.

According to Reddit other sources place MODO playerbase at 500,000.

So the True Number is likely somewhere between 500,000-1,400,000.

Note if we use the low end, then the average player is spending ~$300 a year, making MODO the highest revenue game per player that I could find.
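The estimates above are just arithmetic, so here is a rough sketch of it. The function names are made up for illustration. The 40% MODO share is the midpoint of the 30-50% range. The average-concurrency figure of 2,000 is my assumption: the text only gives the ~5000 peak, and 2,000 users online around the clock happens to reproduce the 730,000 number.

```python
HOURS_PER_YEAR = 365 * 24  # 8760

def players_from_revenue(revenue, spend_per_player):
    """Top-down estimate: total revenue divided by average yearly spend."""
    return revenue / spend_per_player

def players_from_concurrency(avg_concurrent, hours_per_player):
    """Bottom-up estimate: total player-hours in a year divided by the
    hours an average player logs in that year."""
    return avg_concurrent * HOURS_PER_YEAR / hours_per_player

magic_revenue = 360e6            # 12M players x $30 each
modo_share = 0.40                # assumed midpoint of the 30-50% range
modo_revenue = magic_revenue * modo_share       # roughly $144M

by_revenue = players_from_revenue(modo_revenue, 100)   # roughly 1.44M players
by_concurrency = players_from_concurrency(2000, 24)    # 730,000 players
# If 5000 were the average rather than the peak, this method would give
# about 1.8M players, which is above the revenue-based estimate.
by_peak_as_average = players_from_concurrency(5000, 24)

# Spend per player if the true player count is the low end (500,000)
spend_low_end = modo_revenue / 500_000                 # roughly $288 a year

print(round(by_revenue), round(by_concurrency), round(spend_low_end))
```

Changing `modo_share` to 0.30 or 0.50 shows how sensitive the revenue-based player count is to that one guess.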

To keep scale in perspective - a year old estimate puts League of Legends players at 32M active per month. Their revenues are in the $200M estimate range (Wikipedia).

How many resources are thrown at MODO?
           
According to the LinkedIn profile of the Vice President of Digital Technology he is “Responsible for managing the technology development and operations for the Magic Online, free-to-play digital objects business, Duels of the Planeswalkers game title (XBLA, Steam, iTunes, Android Market) and the subscription D&Di digital experience.”

He states he has a Budget of 40M+. Note the budget presumably wouldn't cover the fixed costs of developing MODO that he has no control over (office space, legal etc). I would assume MODO costs at least 150-300% of that budget, or $60-120M.

He also mentions having 200 employees (150 of whom work onsite). Wizards of the Coast has 1000-5000 employees total according to glassdoor. There are 550 on LinkedIn. Given that Hasbro has 6000 employees total (based on company documents), 600-1000 overall seems about right for WotC.

Assuming MODO comprises 2/3s of the Digital Team at WotC there are 100 people working on it.

To put this in perspective, Riot Games, which makes League of Legends, has 2013 estimates of 200M in revenues and 1000 employees. MODO should require fewer employees (because actual game design is not part of the product). Blizzard (with Revenues in the area of $2B) had 7061 employees in 2012.
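To make the comparison concrete, here is a quick sketch of the cost range and revenue per employee using only the figures quoted in this post. The MODO row depends on my own estimates (144M revenue, 100 staff), so treat it as a rough guess rather than a reported number.

```python
# Cost range: the stated 40M budget scaled by the assumed 150-300% multiplier
budget = 40e6
cost_low, cost_high = budget * 1.5, budget * 3.0    # $60M to $120M

# Headcount: assume MODO is 2/3 of the 150 onsite digital staff
modo_staff = 150 * 2 / 3                            # 100 people

# (revenue in dollars, employees) as quoted in the text
companies = {
    "MODO (est.)": (144e6, modo_staff),
    "Riot": (200e6, 1000),
    "Blizzard": (2e9, 7061),
}

revenue_per_employee = {
    name: revenue / staff for name, (revenue, staff) in companies.items()
}

for name, value in revenue_per_employee.items():
    print(f"{name}: ${value:,.0f} per employee")
```

On these numbers MODO would bring in far more revenue per employee than either Riot or Blizzard, which is what you would expect if it is run with a small team relative to its revenue.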

Are WotC software developers underpaid?
            According to Facebook WotC software interns earn $4000 less for a summer than other major software firms in Seattle. Getting an accurate measure of compensation for senior developers is much more complicated since few people actually report salaries.

Salary List Reports the following for Developers:
                        
                       “Wizards of the Coast Software Developer average salary is $59,000, median salary is $59,000 with a salary range from $59,000 to $59,000. Wizards of the Coast Software Developer salaries are collected from government agencies and companies. Each salary is associated with a real job position. Wizards of the Coast Software Developer salary statistics is not exclusive and is for reference only. They are presented "as is" and updated regularly.”


The median for Software Developers overall is $75,000 (average is slightly higher).

            If you look at the last 10 employee reviews on glassdoor.com for WotC, 7 out of the 10 rated Wizards 2/5 or worse on compensation. The 3 people who rated them higher worked in Graphics, Art and Game Design. A couple of people who interviewed for software positions complained that interviews were conducted by recruiters and not people working directly with the product. Those complaints cited that recruiters had a lack of knowledge (both technical and regarding Magic). Take this with a grain of salt since I assume most complainants did not receive job offers. I am unsure how common the use of recruiters is in the software industry. Blizzard had similar complaints surrounding it.

Also note that Hasbro is routinely voted one of the best places to work in the United States (via Fortune Magazine). However most of the benefits are perks and not direct compensation.

Are 3rd Party Developers a realistic option?
            Hasbro already pays EA studios to develop games for 8 of its various brands. It has also acquired a majority stake in Backflip Studios (a mobile game developer). Duels of the Planeswalkers is developed by a third party and the original version of MODO was developed by a professional studio. Current versions are made in-house.

Does Wizards have a track record re: in-house development? Are they planning to move it out of house?
            I would refer you to the Hipsters’ article linked above. Wizards has repeatedly made comments similar to the 11/2013 blog post. They have also removed premiere events before. The last time they were down for about 4-6 months. The 3.0 version was delayed by about 18 months (on top of the 18 month schedule).

Wizards is currently hiring Senior Magic Developers/Testers/Technology Project managers to work on MODO (as of Nov 5/2013).

Monday, 29 July 2013

Final Thoughts on the HoF and the Skill Paradox


A final thought on the HoF.


PT Top 8s in the modern era are worth less than PT Top 8s from earlier in Magic’s history. This is in spite of the average modern player being much more skilled.

**What I am writing about is an adaptation of a well known theory in investing, sabermetrics and poker. For those interested in a more detailed analysis I suggest Mauboussin and googling the ‘Skill Paradox’.

The Paradox of Skill

We start with the fairly simple assumption that

Performance = Skill + Luck

We also assume each person’s luck is drawn each tournament from some distribution that is the same for all players. E.g. LSV might have been luckier in PT Kyoto than Nassif because he opened Nicol Bolas at that specific tournament, however they had equal chances to open it.

Because a person’s skill and luck are uncorrelated, we arrive at the Paradox of Skill:

Variance of Population Performance =
Variance of Skill in population + Variance of Luck in Population

As the Variance in skill gets smaller, the variance associated with luck starts to dominate in determining the overall outcome of tournaments.

Consider the following example:
A)    A PT in 1999, where Jon Finkel is far and away the best player. The 100th best player barely knows how to draft and the rest of population is somewhere in between.

B)     A PT in 2013, where the top 100 players are all equal in skill to Jon Finkel at his 1999 level.

It should be clear that someone’s final position in PT A is strongly correlated with skill. In other words we can be confident that the person in 8th was better than the person in 16th.

In PT B, the opposite would be true. The only difference between someone who gets 8th and 15th was the amount of luck they had in that specific tournament.
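A toy simulation makes the difference between the two PTs easier to see. This is a sketch under assumptions that are mine, not measured: skill and luck are both drawn from normal distributions, luck has the same spread in both tournaments, and the field sizes and standard deviations are made up. Since a field of exactly equal players has no skill variance at all, PT B uses a very small skill spread instead of zero.

```python
import random
import statistics

def simulate_pt(skill_sd, luck_sd=1.0, n_players=400, seed=0):
    """Simulate one tournament where Performance = Skill + Luck.

    Returns (overlap, skill_share):
      overlap     - how many of the 8 best performers are also among the
                    8 most skilled players
      skill_share - variance of skill divided by variance of performance
    """
    rng = random.Random(seed)
    skills = [rng.gauss(0.0, skill_sd) for _ in range(n_players)]
    performance = [s + rng.gauss(0.0, luck_sd) for s in skills]

    players = range(n_players)
    top8_by_result = sorted(players, key=lambda i: performance[i], reverse=True)[:8]
    top8_by_skill = set(sorted(players, key=lambda i: skills[i], reverse=True)[:8])
    overlap = sum(1 for i in top8_by_result if i in top8_by_skill)

    skill_share = statistics.pvariance(skills) / statistics.pvariance(performance)
    return overlap, skill_share

# PT A: wide skill gap (hypothetical spread of 3.0 against luck of 1.0)
overlap_a, share_a = simulate_pt(skill_sd=3.0)
# PT B: everyone is nearly equal (hypothetical spread of 0.3)
overlap_b, share_b = simulate_pt(skill_sd=0.3)

print(f"PT A: skill explains about {share_a:.0%} of variance, top 8 overlap {overlap_a}")
print(f"PT B: skill explains about {share_b:.0%} of variance, top 8 overlap {overlap_b}")
```

With the wide spread, most of the variance in results comes from skill. With the narrow spread, most of it comes from luck, even though nobody in the field got worse.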

Assumptions I am using:
1)      The skill dispersion (especially at the top of the game) is much lower today than it was historically. In other words, the top 50 players in the game are much closer today (even if they are all much better) than they were historically.

And that’s it. Everything I have read from Kai, Finkel and Kibler on the topic would seem to support the view, but I haven’t bothered to try and prove the above assumption.

Just to reinforce that this situation isn’t completely impossible: in a world where the top 50 players attend 3 PTs a year and each have a 10% chance to top 8 a PT, we would still expect one person to top 8 two PTs a year. In other words the fact that some players do consistently well isn't enough to disprove assumption 1. If you have ever heard or read about the birthday paradox, the same principles apply.
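The claim above is a plain binomial calculation. A minimal sketch, using only the numbers in the paragraph (50 players, 3 PTs, 10% chance each) and assuming each player's results are independent from one PT to the next:

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent tries."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

n_players = 50
pts_per_year = 3
p_top8 = 0.10

# Chance that one given player makes two or more top 8s in a year
p_multi = prob_at_least(2, pts_per_year, p_top8)    # about 0.028

# Expected number of players in the group who do it
expected_players = n_players * p_multi              # about 1.4

print(f"{p_multi:.3f} per player, {expected_players:.1f} players expected per year")
```

So even with identical players, a little more than one of them per year would look "consistent" by pure chance.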

Practical Implications
We are seriously overweighting T8s and wins for modern era competitors. Instead we should focus on a looser metric (e.g. 32s/64s etc…). Rate metrics and consistency become much more important. For older players, top 8s are more likely to imply that they were one of the best 8 players in the tournament. And a top 32 is more likely to imply that they were NOT one of the top 8.

Recently in his SCG article Reid Duke made the point we shouldn’t punish anyone for having a few bad initial years on the PT. And I really wanted this to be true (Because me obv). But if we now know that luck is the major determinant in people’s short term success rates, things like 3 Yr medians should mean less for modern competitors. Forgiving a few “bad years” makes it more likely you select someone whose results are variance driven (as opposed to skill).

Putting this together, I think the HoF implications go as follows. Suppose someone has 2 PT Top 8s and 6 PT top 16s. In the Modern Age: I think “He’s unlucky”. If he is old school: I think “he probably wasn’t that good”.

Focusing on results through this lens I think we could argue:
Underrated (in no particular order):
1)      Shouta Yasooka
2)      Hoaen (do we consider him “modern”?)
3)      Osyp

Overrated
1)      Edel
2)      Saito (if “Modern”)
3)      Ikeda
4)      Gary

Final Unrelated HoF Thoughts:
Stats I used in my previous formula driven HoF Ballot:
Longevity = # of PTs, # of Pts
Consistency = PT Median, 3 Yr Median, Difference in Medians, T16s, GP Top 8s
Best in World = 3 Yr Median, POYs
Place in History = Top 8s, Money List, GP Top 8s, Pro Points. These are indicator variables (e.g. are you in the top 20%). In other words having 4 PT Top 8s is the same as zero because 80% of ballotees had 4 or fewer.

Skill = T16s per PT, Median Finish, POYs per years played.

My Ballot (which I don’t have):

1. LSV
2. Edel + Ikeda
These are the only two who are not top 5 stat wise. I think pioneers in a field deserve credit. I am willing to go beyond the stats if there is proof they did something truly unique. I feel the case for Ikeda is weaker than Edel (he has more similar analogues in Fujita, Oishi etc.). I could be convinced to vote for Osyp (easily the most underrated candidate on the ballot) instead.

3. Shouta Yasooka.
Stats + Skill Paradox already implied he was one of the best players skill wise on the ballot. Juza’s interview on cfb was a nice (if unnecessary) confirmation.

4. Saito.

1.      He was the best (or at least top 3) deckbuilder in the world for a long period of time. Still seems like he might be.

2.      He is one of the best players I have seen play. I can sometimes remember individual matches where I was blown away by the play I saw. Saito in TSP block is amongst those. Ditto San Juan. Most players I have talked to feel that he was easily amongst the best when he played.

3.      He was an angle shooter, but a lot of people on the PT are. Stalling in particular seems like one of the most hypocritical things for many players to call someone out on (based on my PT experience). So while he might be the scummiest of successful players (which I doubt), other players are close to that level. This might be too much apologizing for someone who is arguably a cheater (I differentiate between rules lawyers/cheaters/angle shooters), but I don’t believe (based on 1 and 2) that his results were significantly impacted by his angle shooting.

The honourable mentions: BenS, Efro, Gary.

Tuesday, 16 July 2013

One Game


It can be hard to figure out exactly how good you are.

You can play a whole game and make zero interesting decisions.

Or you could spend eight turns finding out you have a long way to go.

Three friends went to GP Providence. We had practiced a lot. We had a history of some success (2nd at the last Team GP) etc. But this isn’t a feel good tournament report. And it isn’t an appeal for pity.

I have finally found some time (and my notes) to do some honest reflection on facing two (maybe 3) future hall of famers; and then being weighed, being measured and being found wanting. Not a tournament report. Not a match report.

So don't call this a report so much as a story about one game against the best in the world.

Before the 2nd draft on Day 2, we were 9-3. That’s not the end of the world, but it is not a great place to face Cheon, Froelich and LSV.

I was summarily dispatched by LSV in the middle seat. Jamie beat Paul. Which means it would be Maksym vs Efro for all the marbles. In game 2, we played well and managed to find all the right attacks. It was one of those games, where you didn’t necessarily outplay your opponent. But rather we had managed not to snatch defeat from the jaws of victory.

I think most grinders would know the feeling.

So it’s time for game 3. The good news is that Maksym’s deck had Aetherling, Pack Rat and Soul Ransom. The bad news is that last year their team had more pro points than our lifetime totals combined. We were fighting the civil war of Ratinum. Efro’s deck was an aggressive Boros deck splashing blue for Ral Zarek and Beck//Call. We knew about at least one Weapon Surge and were on the draw for game 3.





Jamie and I do a mental high five when they take their mulligan and we see a rat. At least I think we do. Jamie’s probably too nice a guy to revel in our opponent’s misfortune, but I hate imagining myself on the solo end of a high five.

Grade = A++. Opponent on 6. We have turn 2 rat with 3 lands in hand. Played this part perfectly.



Grade = A. Nothing to screw up. Yet.



There are some small set of scenarios where not playing pack rat is correct (and Maksym broached the topic). However, against an aggressive deck I don’t think you can possibly afford to be that cautious.

Grade: A. Didn’t punt by not playing rat. No victories are too small for this story.

TURN 3



At 14 life we face our first real decision.

3) Should we spend a turn making rats or play a barrier?

3a) If we make a rat can we afford not to block?
If we don’t block next turn we will be at 10 (if he plays another creature), or 6 if he just double pumps.  After that we will have three 3/3s, but his Truefire Paladin is an abyss and his other guys are trading for rats.

3b) So we have to block if we make a rat. What are optimal blocks?
Presumably we would just block firstblade. A trick gets really costly here since we would be at ~10 with one rat facing 2 creatures. And again Truefire is close to abyss mode (assuming a 4th land).

3c) What’s the goal here?
We have soul ransom (he mulliganed) and tons of gas. So we just want this game to go as long as possible. Which means preserving life even if that means throwing away cards.

Hover Barrier makes the most sense in this context. It’s going to be especially good if his 4th land doesn’t allow for double pumps (e.g. isn’t a mountain). A reasonable guess given his mulligan and being on the play.

Grade: A. Found the important strategy for the game.

TURN 4




Cluestone gives him double pump mana. Truefire gives a way to grow an army that could potentially fight rats. The whole game is going to shit. But he only has one card in hand and we have Soul Ransom. We could also Fatal Fumes here. Millennial Gargoyle, Call of the Nightwing and making PR#2 all don’t do enough defensively.

4) Should we Fumes or Soul Ransom?

4a) Assuming Fumes, what’s the optimal target?
We can’t afford to let him have guildmage in the long game and Soul Ransom isn’t a permanent answer. So we would have to fumes guildmage. He then attacks with both.

4a – II) If he attacks with both what do we block?
Chumping with rat seems inadvisable (but maybe we should have considered it), so where to put the Barrier is the question. Paladin would kill it setting us up with a Pack Rat vs his board of two creatures and being at 10. We would ransom paladin, he would discard two and we would still be at 10 and have to chump with Pack Rat or go to 2. Not a winnable board state.

If we block the Viashino, he pumps twice we go to 6 and Soul Ransom his Truefire. He discards and we put Hover Barrier in front. Leaving us with a rat at 4 life versus his two creatures. Not a winnable board state.

4b) What about Soul Ransom? Optimal Target?
I think it’s safe to assume he is going to crack the Soul Ransom to get back whatever we take. If he gets it back immediately, taking the Truefire is better since he can’t attack right away and the paladin isn’t useful summoning sick. If he is holding a good card (or draws one) he might wait a turn or two to crack it. In which case taking the guildmage is better. I didn’t want to give him option value (e.g. the ability to draw cards just make dudes), so I suggested we take the Sunhome Guildmage.

Grade: B. Not playing the fatal fumes is good and not an obvious line. In retrospect taking the guildmage might have been bad, since we can always fatal it the next turn if he decides to wait.

TURN 5



After Efro plays Goblin Rally, it’s obvious he’s setting up to get his guildmage next turn.

5) Should we make a play, mainphase fatal fumes or hold up fatal fumes?

5a) Can we afford for him to get guildmage back?
No.

5b) So mainphase or wait for him to discard?
The first question is the interaction between Soul Ransom and Removal. Short answer is we get to draw 2. But, we had to ask a judge to confirm. Luckily LSV seemed to get the wrong read here (based on us asking the judge question). Maybe he assumed we knew basic rules interactions. Joke’s on him.

5b2) What happens if we wait, they figure it out and do nothing?
Well we have pack rat so our mana won’t really be wasted. And they won’t be able to attack. Seems like waiting is fine.

Grade A-: I think we made the right play, but it should have been obvious that we had removal because we had to talk to a judge. A massive leak which better players would avoid.

TURN 6


On his turn 6, Efro discards two cards and we respond with Fatal Fumes. I have listed the 3 cards drawn on our turn 6 (two from Soul Ransom). He still gets to attack his board into our Rat + Barrier. We could also chump with a rat.

6a) Who are we blocking with Hover Barrier?
If we don't block Truefire, we go to two life. We would also be facing 6 creatures, with 4 potential blockers. So we need to block Truefire Paladin.

6b) Should we chump with rat?
We need to start making creatures at this point and Rat can make 2 a turn. Can’t afford to chump block (on Efro’s T6). 

For our turn 6 making two rats is the only way to make two blockers and not die. He has 6 attackers and we have 3 blockers during his turn 7.

Grade: A. Made all the right plays, though it is not like there were real decisions.

TURN 7


After he attacks with everything (4 tokens, firstblade, truefire). We make 2 rats going up to 3 total and chump + kill 2 tokens. On our turn we face a bunch of possibilities given that we have 2 rats in play.

7) What are the options?
Plan to make two more rats on his turn (while holding up Cancel). Suppose we make the third rat and block everything but one token. He can either pump (+ first strike) his Paladin or not. If he does we make the 4th rat going to one but ending up with 3 rats. If he keeps his mana up we can trade boards and have cancel for his threat, followed by a threat. We can beat a burn spell (assuming it costs more than 2 mana) with this line.

Alternatively we can play a land and cast CotN.

7a) Why cast Call of the Nightwing (CotN)? Why Not?
He can’t block the ciphered rat (because we can make a third rat in combat). We end up with 2 bats, 2 rats (1 untapped) and the ability to make 1 more rat, but no Cancel. Our ciphered rat is unlikely to get in again. This is fine if he draws nothing.

However we lose to burn and maybe top decked tricks. We are also lower on cards in hand (because we need to make another land drop) so in a stalemate we could conceivably lose given his abyss Paladin.

Jamie and I thought we could afford to play around top decks (and hold up cancel).

Maksym wanted to CotN and try and end the game. Maksym was losing a game with turn 2 Pack Rat, so we overruled him. Just kidding. Kind of. Fuck Karma.

Grade: B. Upon further reflection I think it is definitely a close call. Also an important note was that his land didn’t make red.

TURN 8



With zero cards in hand. Untap. Upkeep. Efro draws his card.

Looks at LSV.

Cheon  ~ “We have to attack or eventually his rat will get us”.

Lucas -  “100% they drew weapon surge”.


Obviously we go into the tank.

8a) Could this be a bluff?
Very unlikely. If we didn’t have cancel we are essentially forced to make two rats and quad block. This goes very poorly for them if they don’t have anything (the board becomes our 3 rats versus their paladin + one token). It’s probably worse than just sending Paladin. And since we are dead, we can’t really afford to play around anything.

This just reinforces the Weapon Surge read. I would like to think they give us enough credit to realize that bluff here doesn’t work. On the other hand the way they Hollywooded before attacking is a signal they aren’t giving us too much credit.

8b) Then what Sherlock?
Well we have to make a dude because we are dead without 3 blockers.

Lucas – “First things first, make a third rat”.

Sometimes you need to be precise. To be honest, I hadn’t even thought about what to discard. It was obvious to me that we needed the first rat, and I wanted to take an action to buy more thinking time. 

Except we needed to think first, because what we discard is important.

Unfortunately we discarded Deathcult Rogue.

8c) Can we make 4 guys and block?
Not if we actually believe he has weapon surge since he can plague wind us.

8d) What happens if we block only 3 guys?
He can weapon surge or use 2 abilities from Paladin, but not both.

If he decides to surge, then we cast Cancel, he makes paladin a 4/2. That ends with us at 1 life facing a token. Him with zero cards, but we would have CotN and Deathcult Rogue. Pretty good spot.

Except we discarded Deathcult Rogue. So we would have Island and CotN. When he attacks with Token we have to chump with token. And it’s a topdeck war with us at 1 life. He has a cluestone he can crack to find an extra card as well. That isn’t great for us.

If he doesn’t cast weapon surge and instead makes a 4/2 first strike we can make another rat. He loses a goblin token and a Viashino Firstblade. We are at 1 life, but have 3 rats. Even better than above.

8e) Ok so, assuming he plays correctly (and weapon surges), what do we do now that we have discarded Deathcult Rogue?

Then the doubt creeps in.

What made me so sure he had drawn weapon surge? Obviously a snap read is based on intuition but if you put a gun to my head how sure would I have been really? 70%? 90%? How likely are we to win the games where he is actually bluffing, and we just call?

Some people would tell you the pressure was overwhelming or they felt the world on their shoulders. But it was nothing so dramatic. My lucky history in Magic has given me a wealth of experience on being embarrassed during feature matches.

Instead I gave my team “the speech”.

Lucas: “We fucked up. We are probably 10-20% to win if we play around my initial read. We are close to 100% to win if we don’t play around and my read was wrong.

What do you guys want to do?”

Them: YOLO.

Oddly enough this seems to primarily be the refrain of those in the process of committing suicide.

We would be no exception.

Final Thoughts:
They had the weapon surge. We lost.

We played a game for 8 turns (7 on our side) and made at least 3 mistakes, 2 of which may have cost us the match.

We played a turn 2 pack rat and lost.

Because we made the perfect read against one of the best teams in the world, we had a chance to win even when they were drawing pretty well.

It’s unfortunate that Magic chose that moment in time to be a skill game.

FIN.

Friday, 5 July 2013

HoF Voting. Quant Style.


I don’t want to get into an argument about the use of intangibles or subjective achievements (penalties as well) for use in HoF voting. This is just going to be a simple explanation and presentation of a mathematical approach to judging deservedness. It’s not necessarily how I would vote exactly, but I think it’s a more honest method than most.

1. The first cut.
Due to time constraints (aka lazy constraints) I only considered players with 5 top 16s or more. The cut is somewhat arbitrary, but it left me with 25 considerations and it seems like below that number you would have to rely on subjective arguments anyways (e.g. the Pikulas and Herberholzes of the world).

Once I did this, I did all analysis WITHOUT names attached, to remove as much bias (during the methodology creation) as possible.

2. The meat of the method.
I created 5 super categories. Each one has multiple components. Then I gave a weight to each of these categories. This creates an overall score for each player and the top 5 scores were reported. I like this methodology since it can tell you where a player has a deficit or strength. If you disagree with my weights it is simple enough to see how the rankings change based on your own personal preferences. For example, Lauren Lee doesn’t think consistency should matter whereas my friend Sam seemed to think it much more important.

Below I list the 5 categories, the weight assigned to each, an example of a subcomponent and some discussion of players who excel or fail in the category. Finally I might add some color commentary.

Note some weights changed since I posted on facebook based on discussions with people I respect.

Longevity (10%):  
How long was the player at a high level of Magic? The simplest subcomponent is # of PTs played.

Top 3 (always in order)
Ikeda, Yasooka, Stark

Bottom 3 (no order unless mentioned)
Krempels, Justice, Soh/Kaji.

This is one of the places where Justice gets really punished. If you put little weight on Longevity, I think it’s hard to argue that he shouldn’t get in.

Consistency (15%):
Was the person consistent at the highest level of play? PT Median Finish is here. Note this is somewhat independent of how long the player played.

Top 3)
LSV, Efro, Osyp/Mori

Bottom 3)
Jurkovic, Tiago Chan, Geoffrey Siron

I am not sure that consistency should be that important. If someone was bad early on in their PT career, but became dominant I could fully imagine they belong in the HoF.

Best in World (25%):
Could we consider the player amongst the best in the world during some extended period of time? I think it’s hard to argue that someone is among the best of all time if you can’t even provide evidence that they were the best during their time. 3 YR Median is one of the subcomponents.

Top 3)
LSV (Get used to this), Saito, Wafo-Tapa

Bottom 3)
Rietzl (Booooo), Ikeda (first good argument for why he shouldn’t be in), Jurkovic

Place in History (25%): How unusual is their resume? Do they have something that really stands out, that makes you say “Wow that would be hard to do”? How many standard deviations above the mean are their stats?

Top 3)
LSV, Saito, Yasooka (15 GP Top 8s, 16th on Money List, insanely high pro points)

Bottom 3)
Krempels (no idea who he is for good reason?), Tiago Chan, Justice (0 GP Top 8s, almost no pro points).

Again, this tries to quantify place in history. I know lots of people would argue Justice should be higher, but I need a data point and I don’t have one.

Skill (25%): This is probably the most controversial category. I think even if you had low skill but had results from the above categories, you might deserve to be in. % Top 16s is an example of a skill subcomponent.

Top 3)
Justice, LSV (now I wanna see a Justice/LSV grudge match), Efro

Bottom 3)
Tiago Chan, Ikeda, Fabiano

4th was a tie between Johns/Kaji.
The Final Top 10 (with scores; lower is better):
  1. LSV (1.65)
  2. Saito (6.4)
  3. Yasooka (7.55)
  4. Efro (7.85)
  5. Osyp (8.1)
  6. Gary (8.1)
  7. Stark (8.45)
  8. Wafo-Tapa (8.85)
  9. Mori (8.9)
  10.  Johns (9.05)
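
To make the scoring concrete, here is a minimal Python sketch. It assumes the overall score is a weighted average of a player's rank in each category, which is consistent with lower being better but is an assumption about the method. LSV's Consistency, Best in World, Place in History and Skill ranks come from the lists above; the Longevity rank of 5 is a hypothetical placeholder picked for illustration.

```python
# Category weights from the post (they sum to 1).
WEIGHTS = {
    "longevity": 0.10,
    "consistency": 0.15,
    "best_in_world": 0.25,
    "place_in_history": 0.25,
    "skill": 0.25,
}

def overall_score(ranks, weights=WEIGHTS):
    """Weighted average of per-category ranks; lower is better.

    Dividing by the total weight lets you pass in weights that
    do not sum to 1 (for example after zeroing out a category).
    """
    total = sum(weights.values())
    return sum(weights[c] * ranks[c] for c in weights) / total

# ASSUMPTION: longevity=5 is a hypothetical rank, not taken from the post.
lsv = {
    "longevity": 5,
    "consistency": 1,
    "best_in_world": 1,
    "place_in_history": 1,
    "skill": 2,
}
print(round(overall_score(lsv), 2))

# Try your own preferences, e.g. zero weight on Longevity and Place in History.
my_weights = dict(WEIGHTS, longevity=0.0, place_in_history=0.0)
print(round(overall_score(lsv, my_weights), 2))
```

Swapping in your own weights this way is how you would check how sensitive the ranking is to any one category.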

What do the rest of the top 10 have to do to move into the top 5:

Gary – literally anything to break the tie with Osyp.

Stark – Scores worst in Skill (low Median) and BiW (low 3 Year Median), which I am sure many would disagree with. Honestly, just weighting those two areas slightly lower and he’s in.

Wafo-Tapa – Consistency and Longevity were his two weakest areas and the ban definitely didn’t help that.

Mori – Skill and BiW need improvement.

Johns – Longevity is far and away his worst score. Hard Time to start PTQing IMO.

Justice – If you put 0 weight on Longevity/Place in History, Justice becomes a slam dunk candidate (2nd to LSV, with Saito falling to 8th in this case).

Monday, 17 June 2013

Welcome to Vegas

Welcome to Las Vegas

Vegas is going to be the largest Magic tournament ever, and it’s going to be the largest by a big margin. That makes it an interesting exercise to try and figure out how that affects records and the cut to top 8.

For those who read my Facebook notes, you will have seen my previous attempt to guess what records would top 4 the Player’s Championship, something I nailed with reasonable accuracy. The methodology is fairly simple. I assume everyone is 50/50 in every matchup and then simulate it a crapload of times. Draws are not allowed.

There were 3 problems adapting this framework to Vegas:

  • How to incorporate byes.

I figured Vegas would have about 4000 people and a large number of those would have some number of byes. The original program isn’t designed to handle that, so I figured I would compensate by just setting the number of participants at 5000.

  • Time to Run

The initial version was pretty slow, which wasn’t a big deal when I had to simulate a 12 round tournament with 16 people. I had to make some pretty big adjustments to speed it up for a 15 round 5000 person tournament. I don’t think I made any errors when making these adjustments, but who knows.

  • Trials.

I am down to about 100 trials because of how long this thing takes. The information regarding 12–3 players is based on 10 trials.
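
For readers who want to try this themselves, here is a minimal Python sketch of the kind of simulation described above. It is not the original program: the function names are made up for illustration, it pairs players by sorting on wins and matching neighbours, it gives an odd player out a bye, and it does not prevent repeat pairings.

```python
import random

def simulate_swiss(num_players, num_rounds, rng):
    """Simulate one Swiss tournament where every match is a coin flip.

    Returns a list with each player's final number of wins.
    ASSUMPTIONS: no draws, no drops, repeat pairings allowed.
    """
    wins = [0] * num_players
    for _ in range(num_rounds):
        order = list(range(num_players))
        rng.shuffle(order)  # random order inside each record bracket
        # Stable sort: best records first, shuffle order kept within ties.
        order.sort(key=lambda p: wins[p], reverse=True)
        for i in range(0, num_players - 1, 2):
            a, b = order[i], order[i + 1]
            winner = a if rng.random() < 0.5 else b
            wins[winner] += 1
        if num_players % 2 == 1:
            wins[order[-1]] += 1  # lowest-ranked player gets the bye
    return wins

def players_at_or_above(wins, min_wins):
    """Count players who finished with at least min_wins wins."""
    return sum(1 for w in wins if w >= min_wins)

# Small demo; for the Vegas numbers you would call
# simulate_swiss(5000, 15, rng) many times and average the counts.
demo = simulate_swiss(64, 6, random.Random(0))
print(players_at_or_above(demo, 5))
```

Sorting once per round keeps each round at roughly O(n log n), which matters once the field is 5000 players.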

Results

  1. The record needed to top 8 will be 13–2. In all 100 trials 8th place was 13–2.

  2. Between 17–20 people end the tournament with that record or better. The average was 18.72. So on average 10 people missed the cut at 13–2.

  3. An average of 87.8 people were 12–3 or better. Which means 20+ people were missing the money with a 12–3 record. The first GP I ever travelled to (GerryT winning in Denver) that record was a lock for t8. This actually makes me suspicious that I did something wrong (because the result is so ridiculous), but I haven’t found what it could be yet.
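
One way to sanity check these numbers without running the simulation: if every match is a coin flip and there are no draws, each player's win total follows a Binomial(15, 1/2) distribution no matter how the pairings fall, so the expected number of players at a given record can be computed directly. The sketch below assumes the same 5000 players and 15 rounds and ignores byes; the function name is made up for illustration.

```python
from math import comb

def expected_players_at_or_above(num_players, num_rounds, min_wins):
    """Expected number of players with at least min_wins wins
    when every match is an independent 50/50 coin flip."""
    ways = sum(comb(num_rounds, k) for k in range(min_wins, num_rounds + 1))
    return num_players * ways / 2 ** num_rounds

print(expected_players_at_or_above(5000, 15, 13))  # 13-2 or better
print(expected_players_at_or_above(5000, 15, 12))  # 12-3 or better
```

This gives roughly 18.5 players at 13–2 or better and roughly 87.9 at 12–3 or better, which is close to the simulated averages above and suggests the ridiculous-looking result is not a bug.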

Practical Implications

  1. You can drop at X–4.

  2. Draws are much better than they would otherwise be, especially late on day 1.

    For example, assume you are 7–1 going into the last round of Day 1. A draw is going to be significantly better than a loss here, assuming you care about t8 only. With either a draw or a loss you have to win out to have a shot at top 8ing. But with a draw you are 100% to top 8 assuming you win out. With a loss you are almost 100% eliminated from top 8.
    
  3. If you are 7–2 on day 1, your odds of top 8ing even if you win out are essentially zero.

Breakers are going to matter a lot for the top 8 cut, so losing early is costly (see above). Though you can still obviously qualify for Dublin.


Sunday, 31 March 2013

The Fog of War


Decklist


4 Breeding Pool
4 Temple Garden
2 Hallowed Fountain
1 Overgrown Tomb
1 Watery Grave
3 Sunpetal Grove
4 Glacial Fortress
4 Hinterland Harbor
1 Alchemist's Refuge
1 Nephalia Drownyard


3 Augur of Bolas
2 Snapcaster Mage

1 Gideon, Champion of Justice
1 Jace, Architect of Thought
1 Jace, Memory Adept
2 Tamiyo, the Moon Sage
1 Garruk Wildspeaker

4 Fog
1 Clinging Mists
2 Feeling of Dread
4 Supreme Verdict
1 Terminus
2 Azorius Charm
1 Selesnya Charm
3 Sphinx's Revelation
2 Urban Evolution
4 Farseek

SB:
1 Nephalia Drownyard
2 Loxodon Smiter
2 Thragtusk
1 Pithing Needle
2 Witchbane Orb
3 Dissipate
1 Dispel
1 Oblivion Ring
1 Detention Sphere
1 Curse of Echoes

Selesnya Charm - You need a way to kill Obzedat (it's basically the only relevant thing Junk Rites can do against you). It would be better if we could find an answer that hits Falkenrath as well.

Gideon - The best win condition against Junk Rites. Also good against any Lingering Souls deck.

Urban Evolution - Better than the 4th Revelation because it's slightly less clunky early. Also, going 5 into Fog is very common.


Strengths:

Junk Rites is a Bye.
Aggressive Red Decks are positive. Especially if they don't have Skullcrack.
Esper Control/UWR are at all-time lows.
Often has decent matchups against the niche crap which people bring to beat Reanimator, because they aren't really super interactive decks (Hexblade etc.)


Weaknesses:

Time Management.
Softness to blue cards especially when backed up by a clock.
Easy to hate (for example it would be easy to build a Jund deck which just kills this).

The deck isn't easy to play. Be very conscious of mana efficiency. It's often correct to Snap -> Fog before casting a Fog in your hand, to make a Revelation turn better. You need to be very precise technically (which is normally easy) but you also have to be super fast, since most games take 10+ minutes even when you win.

Friday, 1 February 2013

Simulations Part 2: Yuya, Jund v2, Bannings and PT Nagoya


Summary from Last time

I think there was a misunderstanding about what I was trying to do with my last post.

To be clear: If you could guess the metagame and win percentage matrix perfectly you would know the best deck.

But this is obviously both unlikely and extremely costly time wise. Instead I want to use the math, programming and examples to challenge commonly held beliefs of the “pro” community which may or may not be true. All of us rationalize deck choice and it is useful for us to try and at least take an analytical lens to these arguments. So I wanted to summarize the practical advice that applied from my last article:
  • Don’t play the most popular deck if it’s bad (even if you are good with it). The playskill edge you need to make this worth it is large. See later section for details.
  • If you want to top 8 (or “do well”), focus on beating the most popular deck.
    • But a complete glass cannon doesn’t work either. See example: Affinity/Scapeshift/Tron.
  • If you want to win (e.g. a PTQ), don’t worry about beating the most popular deck; worry about beating the decks that beat the most popular deck.
    • Top 8ing and winning can require distinct deck selection considerations. See RPS example from previous post.
  • Just looking at what percent of the metagame a deck makes up (even a winner’s metagame à la Karsten/Chapin) is not a good indicator of what the best deck is and, even more unintuitively, it may not be a good guide as to what you should be focusing on beating.

This Week: Motherfucking Science!!!?#$

  1. Addressing reader comments on the previous article.
  2. How much is being Yuya worth (other than 57 pro points)?
  3. What happened to Jund.
  4. Theory as applied to PT Nagoya
  5. Theory vs Simulations. Math! Proofs! Almost Rigorous!
  6. 5 minute break to relieve your boner from the last section.
  7. Conclusions

1. Comments from last time

I would like to take a moment and genuinely thank everyone who made comments on the previous post. Most of it was on my Facebook wall and it was cool to see how many people enjoyed a slightly different approach to Magical analysis. I appreciate every single comment and will try to address some of the points here.

Thanks to Paul Jordan, who did some more analysis and hooked me up with some Excel so that I could format things more easily.

Do More Trials: Short answer, I think this is a non-issue. I am not sure why 1000 trials is not enough. The distributions I am using aren’t exotic enough for me to think they warrant it, and my code is unbearably slow (1000 trials is already an overnight process). That being said, Jarvis has been sent the code and may be able to optimize it.

What happens now that Jund lost BBE? To be answered in a third and final post, hopefully next week. I also want to address the importance of the number of rounds. But I imagine it will require a subsequent post, because people only want to read so much boring math. Eli Priest already had the gist if you read his Facebook post.

What about writing for a major website? Unfortunately not possible right now. I really appreciate everyone who shared and retweeted the link for this blog since I don’t have reach any other way. Special thanks to Sperling (who tweeted) and whoever posted it on Reddit (4000 views from Reddit and 400 from MTGSalvation forums).

What about Model/Metagame uncertainty? Is this useful? Again, this isn’t a tool for predicting the tournament exactly ex-ante. Rather, it is a tool that helps us analyze “How We Should Think”.

Top 8 is 3 of 5, did you account for this? No. In a related point, one reader thought I would systematically underestimate a top 8 deck winning (conditional on having top 8ed). Empirically that might be true, and the 3 of 5 might have to do with that. A deck with a great sb will get an edge in the top 8, rendering the win percentage matrix not constant throughout the tournament. It’s a rather trivial addition to my program to correct for this, but I am not sure how I would get the correct assumptions. Moreover, many tournaments rely on 2 of 3 in the top 8 (GPs, PTQs, FNM etc.)

Pro Tours have limited: Nothing much I can do about that other than assume it has zero impact or flip coins for every limited match. I have asked Paul Jordan to look into how Jund players did during limited rounds at PT RTR (to get a sense of how much it matters), but it’s a ton of work I imagine.

What about variations in player skill and deck construction? See the section on what being Yuya is worth. Unfortunately I can only use Paul Jordan’s categorizations of decks. And he needs to aggregate disparate lists to get meaningful sample size on deck win percentages.

Is there anything else more practical we can do? Some ideas I have had:
  • How much can tie breakers actually move at the end of an X-round tournament (Sorry Conley!).
  • Is the MODO metagame rational? Are there “sticky” deck choices (for all you economists out there)? What’s the time-lag for information processing? Obviously you could check the IRL metagame as well.
  • Is Yuya a robot?
  • Prices. A long time ago I sent an involved article to Channelfireball about card prices, and I am not sure exactly what happened to it. If there is demand for this kind of thing, I would consider trying to find it or redoing it. Essentially I wanted to mythbust Magic finance.

2. Why I would rather be Ari Lax than Yuya Watanabe.

From this point on I will be using theoretical probability tools as well as simulations. For a detailed discussion of why this might matter, check out section 5. Otherwise take my word for it that the theory is sound.

We can measure how good a deck is in a given round by calculating its Expected Winning Percentage. Imagine Yuya has a 10% higher win percentage in every matchup (including the mirror). That’s a pretty substantial edge, especially at the Pro Tour level.

Chart 1.


How do we make sense of this? In round 1 Yuya has the best win percentage of everyone. Yet in round 10, the value of being him with Jund is worse than being an average player with Poison, Eggs or Tron.

If we stop and think, this is just a simple corollary of the previous post. By the time round 10 gets around, all of Jund’s good matchups have been drastically squeezed and its bad matchups have proliferated. This happens because of the popularity of Jund. So Jund’s win percentage at the top tables is in constant decline. Yuya still has his 10% edge, but it isn’t enough to overcome his deck selection disadvantage (theoretically anyways, since obviously he top 8s and thus implodes math).
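
Here is a minimal Python sketch of the Expected Winning Percentage idea with a flat skill edge. Since the Modern win percentage matrix is not reproduced in the text, it uses the Rock-Paper-Scissors matrix from the previous post as a stand-in. The late-round field is a hypothetical set of shares invented for illustration, and capping win probabilities between 0% and 100% is also an assumption.

```python
# WIN[a][b] is the chance that deck a beats deck b (RPS stand-in).
WIN = {
    "rock":     {"rock": 0.5, "paper": 0.0, "scissors": 1.0},
    "paper":    {"rock": 1.0, "paper": 0.5, "scissors": 0.0},
    "scissors": {"rock": 0.0, "paper": 1.0, "scissors": 0.5},
}

def expected_win_pct(deck, field, edge=0.0):
    """Chance of winning a match against a random opponent from `field`.

    `field` maps deck name -> share of opponents (shares sum to 1).
    `edge` is a flat skill bonus added to every matchup, capped to [0, 1].
    """
    total = 0.0
    for opponent, share in field.items():
        p = min(1.0, max(0.0, WIN[deck][opponent] + edge))
        total += share * p
    return total

round_one = {"rock": 0.40, "paper": 0.2667, "scissors": 0.3333}
# ASSUMPTION: a made-up late-round field where rock's prey has thinned out.
late_field = {"rock": 0.20, "paper": 0.50, "scissors": 0.30}

for name, field in (("round 1", round_one), ("late", late_field)):
    skilled_rock = expected_win_pct("rock", field, edge=0.10)
    average_scissors = expected_win_pct("scissors", field)
    print(name, round(skilled_rock, 3), round(average_scissors, 3))
```

In this toy setup the skilled rock player starts out ahead of an average scissors player but falls behind once the field shifts, which is the same shape as the Yuya result.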

3. But can we explain what happened after PT RTR - Jund Edition

I think it’s fairly clear to most of us that the Pre-PT Jund builds were often inferior to what would become the best version of the deck. Deathrite, Liliana and Lingering Souls weren’t even mainstays at that point. As a proxy for how the season developed, I reran the simulation for PT RTR but gave all Jund players a 5% bump in every non-mirror match. How did that change things?

Chart 2.


Note if the improvements over the course of the last 4 months were even larger, it’s reasonable to see how Jund might have won 75% of the GPs. But a large part of its dominance would still be due to its initial meta size. The “improved” Jund from this case only wins ~52% of its matches. If the newest versions “solve” the Affinity matchup, it wouldn’t up their win percentage by that much but would have changed their tournament win percentage to the ~35% range.

If an extremely popular deck has a positive expected win percentage (even if that edge is small), it will post DOMINANT results.

I wonder if there is some kind of psychological feedback mechanism in play at this point. The deck wins so people play it. But people playing it means it wins. Thus a deck seems dominant when in reality it would be perfectly rational to play a host of other reasonable choices. #ThinkingCapsOn

I don’t want this blog post to get sidetracked, but I think in the wake of the B&R announcement, it’s easy to see how Wizards might have made a rational overreaction. Banning might have been needed to break up the cultural inertia that had built up behind Jund. The metagame was stale not due to Jund’s dominance but because of its inertia. Bans are a way to encourage diversity by changing people’s perceptions (they think Jund is now as bad as it actually already was), but not the reality.

Let me know if this makes sense. Summarizing:
  • Jund is actually not a great deck (~53% with some bad matchups)
  • But people think it’s great (>60%) so a lot of them play it
  • The combination leads to a lot of success, kind of like 10,000 monkeys on 10,000 typewriters. This reinforces the erroneous beliefs.
  • Wizards bans BBE, which has zero impact on the actual viability of Jund but makes people adjust their beliefs regarding its power.
  • Now that its perceived power is equal to its actual power, people again begin trying alternatives.
  • Thus the metagame becomes more diverse.
  • If people were completely rational they would have tried new things even without a ban. But we needed a shock to the system because of incorrect perceptions/metagame inertia/some other reason.
Realistically Jund was probably overperforming too much for the above to be true, but I think it’s in the realm of possibility.

4. PT Nagoya

Per PV’s suggestion I thought Nagoya would be an interesting second case because the popular deck was actually very good. As of this very moment the Simulation for the PT is running but I would like to present my estimates based on theory for similar metrics to the last post. If the simulation ends up being drastically different than my predictions it will be reported.

In this case I am much less confident about how I filled in the win percentage matrix since I never played in the block format. If someone good wants to double check that for me shoot me a PM or comment. I also don’t have Infect or Tezzeret variations separated out.

Chart 3.




In this case I think intuition lines up much better with the results. The three best decks in terms of overall win percentages also top 8 the most. The two non-Tempered Steel decks with the best Tempered Steel matchup are the best decks for both top 8ing and winning the tournament. We can take away:
  • If the popular deck is good, it’s a fine play. Especially if you want to top 8 (as opposed to needing to win).
    • If you personally have a good mirror match, then the deck becomes a very good choice. Unlike the previous Yuya example, there is no adverse selection in the metagame: your bad matchups don’t get more popular.
  • Beating the most popular deck is much more important if the deck is good. This seems to be independent of your goal (Top 8 v Win) in this case.
  • If the popular deck is good, the number of viable decks is probably much smaller than when the most popular deck is bad (duh?).

5. Theory vs Simulations. Math! Proofs! Almost Rigorous!

Estimating the results by theory has a couple of huge advantages. The disadvantage is that I have to make even more assumptions. The advantage is mostly to do with speed and being able to adjust parameters and get results instantly.

A comparison of results for the original PT RTR example. Simulation vs my Theoretical results. Note for the top 8% theoretical I am using the theoretical metagame of X–1s or better. This obviously isn’t exactly equal to the top 8.

Chart 4.


The results are very close for the top 8, and kind of close for the Win %. Not sure if that’s because the simulation has noise, or the latter theoretical numbers are overburdened by the assumptions. Either way I am pretty comfortable pending the results of the Nagoya simulation.

6. Take a minute to please tweet this post. You can include me @toordeforce or not. Also feel free to share on fbook.

Don’t worry you can alt-tab. I’ll still be here.

7. Conclusion 

Obviously we are just barely scratching the surface of what’s possible here. I hope to do one last follow up post on simulating metagames and then move on to other things (possibly one of the questions mentioned previously).

Monday, 28 January 2013

Simulating Metagame Evolution and PT Return to Ravnica

The tl;dr

Generally:
  • Beating the most popular deck is overrated
  • If your goal is winning the tournament (as opposed to top 8ing or doing well generically) then the optimal strategy may be very different.
For PT: RTR
  • Eggs was the best deck in round 1.
  • Poison was the best deck in round 10 of modern.
  • Scapeshift, Robots and UW were worse than you know

Table of Contents

  1. Introduction
  2. The Rock Paper Scissors Example (Mascoli 2012)
  3. PT: RTR
    3.1 Relevant Assumptions
    3.2 Simulation Results
    3.3 Theoretical Results (to be updated at a later date)
    3.4 Final Notes

1. Introduction

Ask ten pros and you will get ten answers on how and whether you should metagame. However, almost all of us (or if I choose to keep it real “them”) are answering mostly based on experience and intuition. Until recently I thought that was fine. The truth is far more interesting.

Chris Mascoli (check Gatheringmagic.com) recently published an article on metagaming and magic which was the entire impetus for the work I have done here. In the comments surrounding the article Mike Flores was credited with coming up with some ancient work which originally exposited on the idea. I hope to build on what they have started.

I am going to try and give more explanation, do more theoretical work (as opposed to pure simulations) and finally apply it to the most recent pro tour. The conclusions are hopefully interesting and unobvious enough to be worth devoting a mammoth post to. The program I used to simulate the results was created completely independent from Chris’ work and this conveniently provides me a way to test the validity of the program (assuming Chris’ work was also correct).

I summarize the key insights in Metagame Rules that are bolded below.

2. Advanced Rock Paper Scissors

Consider a rock paper scissors tournament where you pick one strategy and must play that option every round. The tournament is run using standard Swiss rules and a mirror match is 50/50. What is the optimal deck if the tournament featured 300 players and 8 rounds with the following metagame:
  • 40% Rock
  • 33% Scissors
  • 27% Paper
Chart 1. Metagame Shares

Deck | Original Metagame | % of Top 8 (Chris) | % of Top 8 (Lucas) | Win % (Chris) | Win % (Lucas)
Rock | 40.00% | 13.57% | 14.85% | 13.33% | 14.10%
Paper | 26.67% | 55.43% | 53.90% | 16.36% | 17.70%
Scissors | 33.33% | 31.01% | 31.25% | 70.31% | 68.20%


I have included both the result of my simulation (1000 trials) and Chris’ so that we can gain a little confidence that the algorithm is working (admittedly using some assumptions). Note the use of 8 rounds (which is incorrect) was a small oversight by Chris but since the example still provides the intuition we are looking for I decided to run with it.
The key results are that:
  • Paper is the best deck for top8ing
  • Scissors is the play if you want to win
Metagame Rule 1: A popular but poorly situated deck will see its metagame share decline over the course of the tournament. The best-situated decks start to become over-represented relative to their initial popularity. The metagame evolves.

Metagame Rule 2: This significantly impacts who top 8s (and as a corollary who wins). Beating the most popular deck is good for getting a top 8. But to win you want to beat the well situated deck. Winning and doing well (defined as top 8) should be treated as different goals.
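
To see the metagame evolve without a full simulation, here is a minimal Python sketch that tracks only the undefeated bracket. It assumes undefeated players are always paired against each other at random, so a deck's share of next round's undefeated field is proportional to its current share times its expected win percentage against that field. This is a simplification of Swiss pairings, not the program used for the charts, and the function names are made up for illustration.

```python
# WIN[a][b] is the chance that deck a beats deck b.
WIN = {
    "rock":     {"rock": 0.5, "paper": 0.0, "scissors": 1.0},
    "paper":    {"rock": 1.0, "paper": 0.5, "scissors": 0.0},
    "scissors": {"rock": 0.0, "paper": 1.0, "scissors": 0.5},
}

def next_undefeated_field(field):
    """One round of play: shares of the players who are still undefeated."""
    survive = {
        deck: share * sum(field[opp] * WIN[deck][opp] for opp in field)
        for deck, share in field.items()
    }
    total = sum(survive.values())
    return {deck: value / total for deck, value in survive.items()}

def undefeated_history(field, rounds):
    """Return the undefeated field before round 1 and after each round."""
    history = [dict(field)]
    for _ in range(rounds):
        history.append(next_undefeated_field(history[-1]))
    return history

start = {"rock": 0.40, "paper": 0.2667, "scissors": 0.3333}
history = undefeated_history(start, 8)
for round_number, field in enumerate(history):
    print(round_number, {deck: round(share, 3) for deck, share in field.items()})
```

Shares can move in either direction from one round to the next, but over eight rounds rock ends up well below its starting 40% of the field.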

This chart shows what the above percentages mean from the perspective of a specific individual. For example, if I told you the result was that Paper and Rock both had 33% of the top 8, you might think there was no advantage to picking one deck. But there were different starting positions for the two strategies. While we might expect there to be 2.66 players for both strategies in the top 8, there were more people who started with rock than paper, thus showing up with rock is worse than paper for each individual rock player.

Chart 2. Player Value

Representative Tournament | Original Metagame (# Players) | # Players in Top 8 (Lucas) | # of Winners
Rock | 120.00 | 1.20 | 0.14
Paper | 80.00 | 4.32 | 0.18
Scissors | 100.00 | 2.48 | 0.68

In this case we can see that although roughly an equal number of tournaments are won by Paper and Rock, you personally are much more likely to win if you play paper (since fewer of them exist).
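
The per-player view is just a division, sketched below in Python using the numbers from Chart 2. The function name is made up for illustration.

```python
# Numbers from Chart 2: starting players, expected top 8 slots, expected wins.
CHART_2 = {
    "Rock":     {"players": 120, "top8": 1.20, "wins": 0.14},
    "Paper":    {"players": 80,  "top8": 4.32, "wins": 0.18},
    "Scissors": {"players": 100, "top8": 2.48, "wins": 0.68},
}

def per_player_odds(row):
    """Return (chance to top 8, chance to win) for one pilot of the deck."""
    return row["top8"] / row["players"], row["wins"] / row["players"]

for deck, row in CHART_2.items():
    top8, win = per_player_odds(row)
    print(f"{deck}: top 8 {top8:.2%}, win {win:.3%}")
```

Dividing by the number of pilots is what separates "how often does this deck win" from "how often do I win with this deck".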

Metagame Rule 3: Whether a deck is good does not depend on how much of the top 8 it expects to be, but rather on whether the deck is increasing its metagame percentage from the initial position and by how much it does so. You would rather be 5% of the initial meta and 10% of winners than part of a deck that was both 50% of the meta and 50% of winners. This is important for when we analyze the actual Pro Tour.

3. Pro Tour: Return to Ravnica

Assumptions for Simulation
  • There were 382 players who participated in every round
  • The tournament was 10 continuous rounds of Modern and had no limited portion
  • The win percentage matrix (see below)
  • The metagame consisted of all decks with an initial metagame share greater than 1.5%, and all other decks are lumped into “Other”
The Win Percentage Matrix is a table of every deck which describes the probability that the deck on the Y-Axis beats the deck on the X-Axis. For Rock-Paper-Scissors it looks like:

Chart 3. Win Percentages RPS

Win Probability | Rock | Paper | Scissors
Rock | 50% | 0% | 100%
Paper | 100% | 50% | 0%
Scissors | 0% | 100% | 50%
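
As a rough illustration of how a matrix like the Rock-Paper-Scissors one above can drive a simulated match, here is a minimal Python sketch. The names are made up for illustration; it also includes a consistency check worth running on any hand-filled matrix, namely that the chance A beats B and the chance B beats A add up to 100% (which only holds when draws are not allowed).

```python
import random

DECKS = ["rock", "paper", "scissors"]
# WIN[i][j] is the chance that DECKS[i] (row) beats DECKS[j] (column).
WIN = [
    [0.5, 0.0, 1.0],
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.5],
]

def is_consistent(win, tolerance=1e-9):
    """True if every pair of mirrored entries sums to 1 (no draws)."""
    n = len(win)
    return all(
        abs(win[i][j] + win[j][i] - 1.0) <= tolerance
        for i in range(n)
        for j in range(n)
    )

def play_match(a, b, win, rng):
    """Return True if deck index a beats deck index b in one match."""
    return rng.random() < win[a][b]

rng = random.Random(0)
print(is_consistent(WIN))
paper, rock = DECKS.index("paper"), DECKS.index("rock")
print(sum(play_match(paper, rock, WIN, rng) for _ in range(100)))
```

Because mirrored entries must sum to 1, only the cells above the diagonal really need to be estimated; the rest of the matrix follows from them.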

For PT RTR it looks like:
Chart 4. Win Percentages Modern



Some of this was filled out using Paul Jordan’s metagame article and some of it was just my subjective best guess. I tried two different versions. In version one I used my best guess for rough win percentages. In version 2 (the one from above), I tweaked version one until the probability of winning a random match was equal (or close) to their actual total win percentage at the Pro Tour (again from Paul’s article). The big difference comes in how the decks are treated when they play the nebulous “other” decks. We can see based on these assumptions the overall win percentage and how it compares to what actually happened. This is simply a sanity check.

Chart 5. Comparing Assumed (based on above) vs Actual Win percentages



Notes from the previous 2 charts
  • Affinity and Scapeshift have very good matchups against Jund, but not great matchups elsewhere
  • The two most popular decks have below average win percentages (Jund and Other)
  • Eggs has pretty much universally favorable matchups
  • Poison has a high overall win percentage but a poor Jund matchup
Simulation Results: This shows us how we should expect the tournament to shake out in terms of composition.

Chart 6.  Metagame and Top 8 Shares

The following gives us an idea of what the best decks are. The added value measure calculates your advantage compared to assuming that each person is exactly equally likely to top 8 (or win) the tournament. 100% means you are twice as likely to top 8 as someone with no advantage or disadvantage from deck choice. −50% means you are half as likely. The probabilities are the probability that a given individual would top 8 (or win). In other words, if I chose to play Jund at PT RTR I was giving myself a 0.1% chance of winning, which is distinct from Jund having an 11.8% chance of winning overall.
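
The added value measure can be sketched in a few lines of Python. The function names are made up for illustration, and the Jund figure below uses the rounded 0.1% from the text together with the 382-player field from the assumptions, so treat it as a rough illustration rather than the exact chart value.

```python
def added_value(outcome_share, field_share):
    """Added value from a deck's share of an outcome vs. its share of the field.

    0.0 means no advantage, 1.0 means twice as likely, -0.5 means half as likely.
    """
    return outcome_share / field_share - 1.0

def added_value_individual(p_individual, num_players, slots=1):
    """Same measure starting from one pilot's probability.

    The baseline is `slots / num_players`: every player equally likely
    to take one of the slots (1 for winning, 8 for a top 8).
    """
    baseline = slots / num_players
    return p_individual / baseline - 1.0

# Metagame Rule 3: 5% of the field and 10% of winners vs. 50% and 50%.
print(added_value(0.10, 0.05))
print(added_value(0.50, 0.50))

# Jund pilot at PT RTR: roughly 0.1% to win in a 382-player field.
print(round(added_value_individual(0.001, 382), 3))
```

Both functions compute the same quantity; which one is convenient depends on whether you start from deck-level shares or from a single pilot's odds.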

Chart 7 Finding the Best Deck.


My Takeaways
  1. Poison wasn’t hurt much by its bad Jund matchup. The rest of the field slowly whittled Jund down and Poison was able to prey on the decks that were doing that (Scapeshift, Tron, Eggs etc.).
  2. Affinity was hurt because its main source of +EV disappears in the later rounds. It also starts to form a large part of the meta as the tournament evolves and thus faces increasingly frequent mirror matches (making it harder for individual pilots to succeed).
  3. Only 15% of decks significantly improved upon the benchmark for the purposes of top 8ing.
  4. Edit: 01/29 Eggs was the second best deck choice for the tournament. But it was only the 4th most likely deck to win the tournament, of course this ignores the possibility that Cifka's list was better than generic eggs.
  5. The fact there was so much Jund in the t8 was probably due to above average limited performances (Ochoa 15, Edel 12, Yuya 15).
  6. Storm’s performance (which was fairly solid) suggests a lack of “average” Storm players. More likely there were some people with very good decks and some people with very bad versions.
Metagame Rule 4: In this environment most decks are bad choices. Arguably 85% of decks were below average choices. If the most popular deck is not the “best deck” the metagame decision is very important.

Metagame Rule 5: Being better with your deck (as opposed to trying to metagame) matters more when tournaments have fewer rounds. Imagine being a rock player with a 20% win percentage against paper and a 70% win percentage in the mirror. You still wouldn’t want to be in the average top 8 from the first example. In general we probably overestimate the value of being good with your pet deck. This will fall out of the theoretical work I plan to examine later.

Final Thoughts

I am going to add some further analysis based on theory to answer some hypotheticals that some may find interesting such as:
  • How much is being Yuya worth?
  • Does number of rounds matter a little or a lot?
  • Does the theory (which requires even more simplification) agree with the Simulations (spoiler: Science works)?
I would also like to run some numbers on another metagame where the “best deck” was actually the most popular deck. In what I expect will be the mother of all plot twists, I assume the math will say to play the best deck. Not sure where to find that stuff so I will do some digging. Maybe PT: Tempered Steel or Cawblade.