All those away goal graphics are misleading you. [Updated]

November 04, 2016 by Brit Byrd

They also told me that you look like you got fat.

Updated November 5, 7:57pm

Keeping track of all the possible outcomes of a 2-leg playoff series can be cumbersome. Luckily, over the past week a series of visualizations has popped up across the soccer internet to help us all out.

Quick reference guides for the 2nd legs of @MLS conference semifinals, accounting for away goals rule: pic.twitter.com/aWQojtfsqe
— Steve Fenn 🏆🛡️__ (@StatHunting) October 31, 2016

However, you are being misled. Very lightly and subtly, but misled nonetheless. The trouble is that these are great reference tables, but bad data visualizations -- and the key inclusion of color is ensuring that your eyes are interpreting these graphics as both.

By coloring in each entry according to the winning team, the grid moves beyond being a dry reference table and takes on a new life as a kind of area map. We’re not just looking at this to see how many goals Dallas needs to score if Seattle scores one, we’re looking at that huge wave of rave green and that tiny dot of red and thinking “wow Dallas are screwed.”

Which, as it happens, they are. Yet these graphics are still sloppy in communicating how much -- especially for the other, closer games. More space communicates better chances, so the shape and size of each entry in the grid matters. There are two ways in which I think the existing graphics fall short on this front:

uneven axes and rectangular grids
over-representing low-frequency results

Ultimately, adjusting for both of these aspects would leave us with something more like this:

Numbers in each grid represent % of occurrence of given score in select European leagues from 2005-2010

Uneven axes and rectangular grids

These things are made of rectangles. This is probably just because people didn't fudge with the Excel defaults, and didn't mind having the extra padding for text. But the result is that one team's axis is significantly shorter than the other. Now this is not the worst sin, as ultimately the area of each team's rectangles is the same. But our eyes interpret height and width differently. There's no reason to not try to control for this uncertainty by making them all squares, so that one team doesn’t command more of our attention from unconscious bias.

Additionally, the original graphics posted featured a 5x6 grid. I'm not sure why, but the result is that the home teams get an entire additional row of real estate -- and sometimes low information real estate at that (but more on that later). Now in this case it worked out so that it was the y-axis that received this extra row, which somewhat compensates for the "wider" x-axis. Maybe this was an intentional evening-out of sorts, I don't know. But either way it's imprecise and presents an uneven universe of possibilities, and there's no reason to not make the entire graphic a square.

Over-representing low-frequency results

Even with all the grids and axes evened out, we’re still over-representing certain data. Specifically, we are over-representing very high scoring games and blowouts. Many more games end 0-0 or 1-1 than 4-0 or 4-4, but the graphic is representing these score lines with the same visual weight. For reference, here is a distribution of scores in a similar grid:

@Viewfrom202 true, but w unusual incentives during playoffs I was uncomfortable with making scoreline dist central pic.twitter.com/TUg8u6cM4M
— Steve Fenn 🏆🛡️__ (@StatHunting) November 1, 2016

Consider that while the 5x5 grid (goals from 0-4) represents 96.7% of all scenarios, a 4x4 grid (goals from 0-3) still represents 89.5% of all scenarios. That final row and column represent just 7.2% of all actual outcomes, yet they are taking up 36% of the space. That’s a disproportionate amount of area representing pretty unlikely events.
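The arithmetic behind that comparison can be checked in a few lines. This is only a sketch: the two coverage figures are the ones quoted above, and the function name is made up for illustration.

```python
def outer_band_share(inner: int, outer: int) -> float:
    """Fraction of an outer x outer grid that lies outside its inner x inner corner."""
    return (outer * outer - inner * inner) / (outer * outer)

# Coverage figures quoted in the text (percent of all scenarios).
coverage_5x5 = 96.7
coverage_4x4 = 89.5

# The four-goal row and column are 9 of the 25 squares in a 5x5 grid.
area_share = outer_band_share(4, 5) * 100
# Scenarios that only show up once a team scores four.
outcome_share = coverage_5x5 - coverage_4x4

print(f"area: {area_share:.1f}%, outcomes: {outcome_share:.1f}%")
print(f"over-representation: {area_share / outcome_share:.1f}x")
```

Put another way, the four-goal row and column get roughly five times the space their frequency would justify.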

By shading in each entry according to its frequency, we can avoid some of these pitfalls. In the straight color version, Montreal occupies 72% of the graphic even though they get a score that sees them advance only 64.3% of the time. The gradient helps adjust for this.
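For anyone who wants to check the 72% figure, here is a rough sketch of the counting. It assumes Montreal won the first leg 1-0 at home, which is consistent with the numbers in this post but is not spelled out in it, and it uses a 5x5 grid of second-leg scores. The function and variable names are invented for the example.

```python
from itertools import product

# Assumption: Montreal won the first leg 1-0 at home; the second leg is at RBNY.
FIRST_LEG_MTL, FIRST_LEG_RBNY = 1, 0

def second_leg_result(rbny_goals: int, mtl_goals: int) -> str:
    """Who advances for a given second-leg score, using the away goals rule."""
    rbny_total = FIRST_LEG_RBNY + rbny_goals
    mtl_total = FIRST_LEG_MTL + mtl_goals
    if rbny_total != mtl_total:
        return "RBNY" if rbny_total > mtl_total else "MTL"
    # Level on aggregate: RBNY's away goals came in leg one, Montreal's in leg two.
    rbny_away, mtl_away = FIRST_LEG_RBNY, mtl_goals
    if rbny_away != mtl_away:
        return "RBNY" if rbny_away > mtl_away else "MTL"
    return "extra time"

def share(winner: str, weights: dict) -> float:
    """Weighted share of the grid belonging to `winner`."""
    total = sum(weights.values())
    won = sum(w for (rbny, mtl), w in weights.items()
              if second_leg_result(rbny, mtl) == winner)
    return won / total

# Equal weight per square is what the flat-color graphic implies.
cells = list(product(range(5), repeat=2))  # (RBNY goals, Montreal goals), 0-4 each
uniform = {cell: 1.0 for cell in cells}
print(share("MTL", uniform))  # 0.72
```

Passing a dictionary of actual score-line frequencies in place of `uniform` would give the frequency-weighted share instead. Those frequencies are not reproduced here, so the 64.3% figure is not recomputed.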

When I raised this stink on Twitter, the creator of the original graphics somewhat agreed, but offered that due to the unusual scoring incentives of the playoffs, he felt uncomfortable making score line distribution too central. (He also provided the very handy graphic of score line distributions which I have used to fill out these tables.) While it’s not ideal to be applying regular season data to the playoffs, I think it is certainly better than nothing. A 4-0 result is still rare and remarkable in the playoffs, and a 4-4 result especially so. And the playoff incentives are more likely to distort the distribution at the lower scores, as teams might bunker more than they would in a regular season game (as you would expect of Montreal on Sunday). I’m more interested in diluting the outer fringes of the board, where currently a score line with 0.1% frequency is occupying 4% of the space. Even if that 0.1% figure is off by an order of magnitude, it’s still a worthy change.

Also worth considering is how much information we’re really getting from some of these 4-0 squares anyway. The four-goal row is most useful when it’s telling us that the Red Bulls need to score four when Montreal scores just two. And Montreal scoring two is not a crazy scenario; it happens 19.3% of the time. But stepping back into a little bit of common sense tells us that the more extreme score lines are not very informative at all. Obviously the Red Bulls cannot afford to give up four goals, and obviously they cannot tie 4-4, as they are already behind. You know this without referencing the table or away goals. This part of the grid is only here as the extension of the more interesting part of the array, but we can de-emphasize it appropriately.

These 4-0 squares are kind of like if FiveThirtyEight somewhat prominently displayed one of their 10,000 simulations in which Hillary Clinton wins every single electoral vote. Sure it could happen, but it’s very unlikely, and in any case very obvious. The Red Bulls cannot afford to lose 0-4, just like Donald Trump cannot afford to lose every state. Now of course your brain knows this, but your eye doesn't; at a glance, it first picks up that this square takes up the same amount of space as the most likely score, a 1-1 draw.

If this were a more robust, respectable, and generally better publication, we might have a nuanced model a la FiveThirtyEight to generate a distribution of likely scores taking into account match-up specific Elo ratings rather than imported frequencies from other competitions. Alas, this is a pedantic rant conducted over a lunch hour so this is what you get. But we can at least avoid representations that we know to be bluntly misleading (and indeed, imperfect and low information analytical efforts can be better than doing nothing at all).

UPDATE: I've done the graphics for the other three match-ups.

The Toronto - NYCFC graphic was actually fairly close in its original form. The original graphic was overstating Toronto's chances by only 4%, and the gradient may be a little harsh to NYCFC's primary color of light blue. Even at full opacity, it might look weak compared to Toronto's strong red, and at the bottom of the spectrum, it barely registers at all, even though the 4-0 and 4-1 squares are half of all their entries.

LA - Colorado is, as you'd expect, the exact same thing as RBNY - Montreal. The original graphic is over-representing both teams' chances, but the gradient redirects our attention back to the more meaningful part of the graphic.

For Seattle - Dallas, the original graphic was actually understating how screwed FC Dallas are. Even though Seattle is being given all that low information but eye-catching 4 goal territory, the frequencies of the score lines Dallas needs are so low that they were still getting the better of the graphic. I've reintroduced the 5 goal row for Dallas here, as I think it effectively communicates the daunting task ahead of them, whereas this row was just not as relevant for the other match-ups. Ideally I would have been able to throw in a 5 goal column for the away team as well, but alas the data was not available.

***

comments, questions, and general snark can reach the author at brit.byrd@gmail.com

Targeted Points vs. Actual Points [Updated]

August 01, 2016 by Brit Byrd

On this page, you will find our best-case-scenario points projection (from Episode 019) along with an automatically updated sheet and chart on how we're doing for the season in relation to our targets.

An important distinction: the results in the "target" column are not predicted or expected results. As discussed in Episode 019, predicting 7 wins out of 12 and only one loss for the remainder of the season would be quite bold. Rather, they are targets that would deliver us 56 points for the whole season, and a solid chance at a first round bye in the playoffs. They can be considered as a "best case scenario" of sorts that, while very optimistic, acknowledges we're not going to win out.

Bill Reese Full Interview

July 22, 2016 by View From 202

You can check out our full interview with Bill here. We talk about the "Take 'em All" chant, RBA attendance, and T-shirts and branding.

RBNY vs. Orlando Predictions

July 13, 2016 by Brit Byrd

Hey you, we see you. You're on the PATH train, it just came out of the underground at Journal Square, and you've got about ten minutes til Harrison and don't know what to do with your newly resuscitated data service.

How about read our predictions?

Peaches

Orlando are not in a good place right now, with Adrian Heath recently departed and Kaka out with a calf injury. I don't expect Larin to be able to carry Orlando to a W at RBA. With Perrinelle not fully fit, and Zizzo seemingly not able to last a full 90, I expect to see Duvall and Collin return to the back 4 tonight. Veron looked dangerous against Portland, even if nobody could finish, but I expect us to be a bit sharper tonight.

2-0 RBNY.

Lineup: 4-2-2-2

Brit

Orlando is a team in turmoil, and we seem to be finding our legs. Now, this game isn't literally a "must win," in that it is still July and in the case that we don't, there's still plenty of chance that things can end up okay. Not great, but okay. But Dax is still right to say that "anything less than 3 points is unacceptable." This is a kind of game where elite (or even just good) teams prove themselves by relying on their individual quality and team organization, even if the ball hasn't been bouncing their way as of late.

I'm predicting a win, and I will admit this is not 100% anchored in reason (insofar as any of these predictions are). It's based on the premise that we are a top 3 team, the likes of which resoundingly beat FC Dallas and Toronto at home. If we don't bag all 3 points tonight, I will settle more comfortably into the frame of mind that we're somewhere between 3-6 in the East, scraping together points with an eye toward a good playoff run.

2-0 RBNY.

Lineup: 4-2-2-2

Sam

With Heath gone and Kaka injured, this could be the game to put wind back in our sails. Major concern right now is the back line. Highly anticipating DP/Collin in the back, although Duvall will suffice if need be. Key matchup is Shea/Zizzo, due entirely to Zizzo's mediocre run of form. I expect the offense to click more and for Veron to play well in the 4-2-2-2.

2-1 RBNY.

Lineup: 4-2-2-2

RBNY at RSL Predictions

June 22, 2016 by Brit Byrd

Due to this week's crazy schedule congestion, we will be recording after Wednesday's game in Utah, and including it with discussion of the Seattle game, the US Open Cup, as well as the latest USMNT action.

Before we kick off, we've supplemented our usual Twitter-prediction posts with some short blurbs and preferred starting 11s:

Sam: RSL are without Beckerman, and looked unconvincing against NYCFC and Portland. With two fully rested center backs, RBNY should take this one 2-1.

Peaches: It's going to be a tough time on the road, and we might be without Grelladinho. I think that RSL has enough offensive power to put 2 past us, and I think we'll rally back for a hard-earned road point. It seems like we're going to have some rotation (and be dealing with nagging injuries to the likes of Zubar), and we may even see young guns like Derrick Etienne, Jr. Draw, 2-2.

Brit: Marsch's gamble to rest Collin and Baah on Sunday paid off -- so far. Hopefully their fresh legs see us through another solid defensive game, if not a clean sheet. Also don't see the team losing its eye for goal quite yet. A big Q is if Jesse rotates the midfield, and how someone like Davis might perform. Draw, 1-1.

Gaby Kirschner Full Interview

June 18, 2016 by View From 202

You can check out our full interview with Gaby here. We talk about the culture of soccer support, differences between the US and England, and Adam Lallana.

By Official White House Photo by Pete Souza [Public domain], via Wikimedia Commons

2016 Presidential Candidates as MLS Teams

June 13, 2016 by Brit Byrd

The 2016 Presidential campaign has been a mess. A big fat waste. So has the 21st season of Major League Soccer. We think that there's room for comparison, so we made a 100% scientifically determined, 100% accurate list of Presidential candidates as MLS teams. In other words, we know lists, we have the best lists, guess what? This list just got 5 bullet points longer.

[Editor's note: This list was written mostly at the time of the Wisconsin primary, and for narrative purposes weaves in and out of considering this year’s results, last year’s results, and even the entire history of certain clubs in making these judgments. We welcome your angry and disappointed tweets.]

Jeb Bush - Seattle Sounders

By Michael Vadon - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=42198021

Conventional wisdom would have us believe that Jeb Bush was supposed to be a force to be reckoned with. Much like it had us believing that the Seattle Sounders would start their season off strong this year. Both had loads of money, but it's proven ineffective thus far. Both are big fans of call and response.

Jim Gilmore - Chicago Fire

By Michael Vadon - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=46751356ns

So lackluster that even their failures are overshadowed by the failures of others. They won something once, and it wasn’t insignificant! But that seems like a long time ago. Now people mostly just feel sorry for you, when they remember you at all.

Scott Walker - Colorado Rapids

By Michael Vadon - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=41706317

The Rapids, like Scott Walker, seemed to have somehow lost the season before the damn thing even began. Questionable internal leadership and upper level strategy seemed to doom both organizations -- a far cry from the victories of 2010, which in hindsight seem unbelievable. Yet, despite all the mockery, the embers of hope seem to be smoldering for both Walker and the Rapids. No, Walker isn’t going to become President, but he did flex a surprising bit of clout in the Wisconsin primary, and may well have an influential career left for him somewhere. The Rapids, meanwhile, pulled off a couple of personnel moves that don’t look bad at all really -- and now you find yourself commenting things online like, “actually, I never believed the Rapids hate in the first place.”

John Kasich - Columbus Crew

By Gage Skidmore from Peoria, AZ, United States of America - John Kasich, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=47475973

A no-frills team, with purportedly solid fundamentals and no glaring weak spots (according to the pundits). Less extreme in style. From nondescript parts of both the country and your imagination. “I’m sure Columbus is an alright place,” you imagine. “It can’t be the worst, right? I would’ve heard about it way earlier if it were the worst.” You never get to visiting Columbus, and you never quite get to researching John Kasich. Years from now, you remember them both as “pretty alright, I guess.”

But back to brass tacks -- a throwback to a simpler time, when success in MLS 1.0 and the GOP primary seemed predictable, albeit unsatisfactory. Like RBNY, they are widely respected by the establishment, but even though they seemed to have lagged behind the whole time, RBNY’s post-season ended earlier.

Marco Rubio - New York Red Bulls

By Gage Skidmore from Peoria, AZ, United States of America - Marco Rubio, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=46825578

Similar to Columbus, RBNY has cautious establishment approval, but not much to show for it yet. Unlike Columbus, RBNY is known, for some reason, to have some sort of “flair” to their appeal. But when put on a larger national stage, they seem predictable and vulnerable. Circa 2010, they were hyped as a standard bearer of a new incoming wave. But in reality, they’re as much of a product of the old establishment as there is -- albeit more extreme in some of their methods. With the arrival of some louder and easier to hate peers, they’ve become slightly less reviled (and sometimes even liked!). But ultimately in 2016 they find themselves on the outside looking in, as the establishment warms itself to its shiny new toys.

Now admittedly, this comparison would work best if, instead of retiring from public life, Rubio were buckling down and resolving to improve his previously nonexistent grassroots organization. But maybe Team Rubio knows something we don’t? Is his withdrawal an omen of changes at the top of RBNY? (No, almost certainly not.)

Hillary Clinton - LA Galaxy

By Gage Skidmore from Peoria, AZ, United States of America - Hillary Clinton, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=47942582

Both Hillary Clinton and the LA Galaxy ooze inevitability and success. They shuffle around personnel and positions, but it can be easy not to notice as the continued winning results feel business-as-usual. They always have their detractors and haters, yet in a way excessively hating on them also feels a bit passe. It certainly doesn’t stop many though, as the latest DP-rule-change/manufactured Benghazi “scandal” will put people into a froth.

Bill Clinton - DC United

By US Department of Labor - http://www.flickr.com/photos/usdol/8450163113/sizes/o/in/photostream/, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=24971217

Good ole Bill and DC United were killing it in the 90s, and they still have the residual respect to show for it. But even though they’ve stuck around, they haven’t really been competing for quite some time now. Around 2012/2013 they had a resurgent moment in the sun that reminded us of the glory days; for DCU, their unlikely US Open Cup run, and for Bill, his raucous keynote address at the 2012 Democratic National Convention. But now they’ve receded from the spotlight and we mostly see them when acting on behalf of others: for Bill, his support of his wife’s Presidential campaign, and for DC, gifting the Red Bulls an easy first round match-up in the playoffs.

Bernie Sanders - New England Revolution

By Gage Skidmore, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=46471539

Vermont Senator Bernie Sanders is from Vermont which is in New England, and he wants to lead a Revolution. This may not be the Bob Kraft-owned Revolution who play on artificial turf at Gillette Stadium, but both the NE Revolution and Bernie Sanders are likely to get depressingly close to their goals without reaching them altogether.

Donald Trump - NYCFC

By Gage Skidmore, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=47943348

Ok listen, smurfs. We don’t take this lightly. Being compared to Trump is pretty severe, we know. We’d really prefer to compare no one to Trump, but alas, the whole exercise would feel a little empty if we left him out. So we have to give it to someone. And, well, all this being said, we don’t really like you, you know that, we both know that. But many of you, we assume, are good people.

The branding has been years in the making. To a degree, you understand the appeal -- the money, the grandiose environs (that are actually owned by someone else), and the growing lineup of high profile, albeit late arriving, endorsements. You can’t deny that the crowds are impressive. Although unsettling and often containing unsavory elements, they’ve brought a lot of eyes and voices to the arena that weren’t necessarily there before, at least not in this way. As often as you wish you’d never heard from them at all, you know you can’t ignore them. At times you’re sympathetic to the argument that it’s necessary, perhaps even good for the league country, to publicly exorcise these demons and have a larger populace tuned in and participating week in and week out. And hey look at all this swag! It’s everywhere. It’s not a bad design, you know.

But then you snap out of it. Did it have to happen quite like this? A manufactured claim to authenticity, wearing the robes of (civic) nativism, and funded by the very sources of modern excess many of its adherents claim to abhor? Blind to the irony that many of those in attendance once called somewhere else home? Do we really need the ties to unethical labor practices and misogyny? The belief that, despite never having won anything in their life, they feel entitled to all the glory of a hard-working long-suffering base? The arrogance towards others, and insistence that they, single-handedly, will somehow both elevate the league country to a new level and return it to a false image of an imagined glorious past?

Look, we’ll be the first to tell you the corporatization of our team league country is a problem -- but there are better alternatives than this.

Ted Cruz - the Nazi Prison Guard Team from Escape to Victory

By Michael Vadon (Own work) [CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

We struggled mightily with this one. Ted Cruz’s hallmark is his ability to be hated by everyone, yet somehow still succeed. The only suitable comparison would be a team that isn’t just reviled by the Metro faithful (DC United), but by all teams. The LA Galaxy and NYCFC were the closest we could think of, but ultimately not as snug of a fit. Seattle registers closer to obnoxious than reviled. The Red Bulls themselves were once perhaps so widely hated, but the arrival of NYCFC seems to have diverted the generic NY sports hatred of many, and the shift toward a team-first and academy-building strategy seems to have even earned respect in some corners.

We’ve paired him with other ideological fanatics with a penchant for strong organization and a questionable persecution complex. Is this comparison haphazard and kind of a cop-out? Yes. Does it succumb to Godwin’s law? You betcha. Are there other candidates that have been more often compared with men in black? Sure.

But, Ted Cruz literally breaks our scale of hate-ability.

Think we're misguided, wrong, or hacks? We welcome your animus in the comment section.

Brent Gamit Full Interview

May 21, 2016 by View From 202

You can check out our full interview with Brent here. We talk about DC hate, rivalries, and times of Metro past.