Sunday, August 11, 2013

Thoughts on Stolen Bases

There is a widespread belief among sabermetrically knowledgeable people around the internet that in order for a base-stealer to be beneficial to his team, he has to have an extremely high success rate.  That belief is supported by data.  As in most applications of data analysis, though, a superficial understanding of what the data show can lead to erroneous conclusions.

In the case of stolen bases, you must account for two outcomes and the impact each has on the game.  There is the positive outcome of a successful SB, which advances a runner 1 base without costing an out.  There is also the negative outcome of a CS, which takes away a runner while adding an out.  Analysis of run-scoring averages in different situations enables statisticians to assign a numerical weighted value to both outcomes.  While there are minor differences of opinion, the most widely accepted "break-even" rate of success in stealing bases is right around 75%.
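For anyone who wants to see where a break-even rate comes from, here is a minimal sketch in Python.  The function name and the weights are my own illustration, not official run values: I have simply assumed a CS costs three times what a successful SB gains, because that is the ratio that reproduces the 75% figure quoted above.

```python
def break_even_rate(sb_gain, cs_cost):
    """Success rate at which steal attempts add zero expected runs.

    Solves p * sb_gain - (1 - p) * cs_cost = 0 for p.
    Both arguments are positive values in the same units.
    """
    return cs_cost / (sb_gain + cs_cost)


# Hypothetical relative weights, chosen only to reproduce the 75% figure.
SB_GAIN = 1.0   # value of one successful steal
CS_COST = 3.0   # cost of one caught stealing, in the same units

rate = break_even_rate(SB_GAIN, CS_COST)
print(f"Break-even success rate: {rate:.0%}")
```

Change the two weights and the break-even rate moves with them, which is really the point of the rest of this post: the weights are averages, not constants.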

Let's take the most prolific base-stealer of all time, Rickey Henderson.  He stole 1406 bases in his career, but was CS 335 times for a success rate of roughly 81%.  Over the course of his career, all those bases Rickey Henderson stole probably benefited his team by a marginal amount.  Wait a minute, you say!  Rickey played 25 seasons in MLB.  His success rate was dragged down by his later years when he was old and not as fast.  OK, then let's look at the year he set the single-season record for SB's at 130 in his 4th season.  He was CS 42 times for a success rate of 76%, which is very close to the break-even point.  Those 130 SB's probably were of no net positive value to his team because of the negative impact of the 42 CS.  In other words, the A's would probably have scored the same number of runs over the course of the season if Rickey had not attempted 1 SB the entire season!  Of course, there were other seasons where Rickey's success rate was much better, and his SB's probably made a positive contribution to the team's run-scoring, but you see the point.  Rickey Henderson was a very good ballplayer without the SB's, but probably not a HOF'er.  He is in the Hall of Fame for a record that probably did not contribute much to his teams' successes!
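Here is a quick check of the arithmetic, again as a rough Python sketch.  The SB and CS totals are the ones given above; the helper names are made up for illustration, and the "net steal-equivalents" measure assumes the 75% break-even rate holds in every situation, which is exactly the assumption the rest of this post questions.

```python
BREAK_EVEN = 0.75  # the break-even rate quoted in this post


def success_rate(sb, cs):
    """Fraction of steal attempts that succeeded."""
    return sb / (sb + cs)


def net_steal_equivalents(sb, cs, break_even=BREAK_EVEN):
    """Net value of a steal record, measured in successful steals.

    A break-even rate p implies one CS cancels p / (1 - p) steals,
    which is 3 steals when p is 0.75.
    """
    cs_cost = break_even / (1 - break_even)
    return sb - cs * cs_cost


for label, sb, cs in [("Career", 1406, 335), ("Record season", 130, 42)]:
    rate = success_rate(sb, cs)
    net = net_steal_equivalents(sb, cs)
    print(f"{label}: {rate:.1%} success, net {net:.0f} steal-equivalents")
```

Under that assumption the record season nets out to about 4 steals' worth of value: 130 SB against 42 CS that each wipe out 3.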

What the amateur sabermetricians miss here, though, is that the value of a SB is different for different situations depending on the makeup of the team and the game situation.  SB's can have a different positive value and CS can have a different negative value depending on the inning, score of the game and place in the lineup.

Let's take a successful SB as an example.  Intuitively, a successful SB should always have some positive impact on the team's chances of scoring a run or runs, but does it?  Take the following situation:  2 outs, a runner on first base, Barry Bonds at the plate.  If the runner at first base successfully steals 2B, the opposing team will almost certainly walk Barry Bonds to face the next batter.  If the next batter is Jeff Kent, then that may be a good thing for the Giants, but if it is Edgardo Alfonzo, then the successful SB probably made it less likely that they would score that inning!  In that situation, the SUCCESSFUL SB has a negative value.  Of course, if the runner is CS, the inning is over, Barry Bonds does not get to bat at all and there is clearly a strongly negative value to that.

Now let's take a look at a second situation:  Runner at first base, 2 outs and Marco Scutaro at the plate with Pablo Sandoval on deck.  Marco Scutaro gets a lot of base hits, but he does not hit many HR's.  Pablo Sandoval hits a few more HR's than Scutaro and has some extra-base power, but is hardly a league-leading slugger.  If the runner at first base steals 2B, the opposing team is not going to walk Scutaro to get to Sandoval.  The runner will likely not score from first base on a hit by Scutaro, but will likely score from 2B, so a successful SB has significantly increased the team's chances of scoring.  Wait a minute, you say!  What if the runner stays at first and waits for Scutaro to advance him on a hit, and now you have Sandoval up with 2 outs and 2 runners on base?  Now, we Giants fans intuitively know that is not a great situation, but that is born more out of years of frustration watching Pablo leave runners on base than any deep insight into sabermetric theory.  Here are the numbers:  Let's just say that Scutaro is a .320 hitter and Pablo is a .270 hitter.  There is a 32% chance that Scutaro will get a base hit in that AB, but if the runner does not steal 2B, it will take consecutive hits from Scutaro and Pablo to score the run.  The probability of getting those 2 consecutive hits is .32 x .27 = .086.  There is just an 8.6% chance of getting the 2 consecutive base hits required to score the run!  So, a successful SB increases the chances of scoring a run from 8.6% to 32%.  If the runner is CS, the chances of him scoring from 1B in the inning were only 8.6% anyway.  The positive value of the SB is enhanced at the same time the negative value of a CS is mitigated by the situation.
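The same numbers can be laid out as a small Python sketch.  It uses only the simplifications made above: any hit scores the runner from 2B, it takes two straight hits to score him from 1B, and walks, errors and extra-base hits are ignored.  The function names are made up for illustration, and the break-even figure at the end is my own extension of the example that counts nothing but this one runner's chance of scoring.

```python
SCUTARO_AVG = 0.320   # batting averages assumed in the example above
SANDOVAL_AVG = 0.270


def p_score_after_steal(batter_avg):
    """Runner on 2B with 2 outs: any hit by the batter scores him."""
    return batter_avg


def p_score_if_runner_stays(batter_avg, on_deck_avg):
    """Runner on 1B with 2 outs: needs hits from both batters in a row."""
    return batter_avg * on_deck_avg


def situational_break_even(batter_avg, on_deck_avg):
    """Steal success rate s where s * P(score after steal) = P(score staying)."""
    stay = p_score_if_runner_stays(batter_avg, on_deck_avg)
    return stay / p_score_after_steal(batter_avg)


steal = p_score_after_steal(SCUTARO_AVG)
stay = p_score_if_runner_stays(SCUTARO_AVG, SANDOVAL_AVG)
needed = situational_break_even(SCUTARO_AVG, SANDOVAL_AVG)
print(f"Scores after a steal: {steal:.1%}")
print(f"Scores if he stays:   {stay:.1%}")
print(f"Break-even steal success rate in this spot: {needed:.0%}")
```

In this toy model the break-even rate works out to the on-deck hitter's batting average, far below 75%.  It leaves out the value of the other baserunners and of the extra out, so treat it as a way to see the direction of the effect rather than its true size.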

Let's take a look at a broader picture.  If you have a team that hits a lot of HR's, and produces most of its runs as a result of those HR's, there is little positive value in stealing a base.  The runner will score from 1B on a dinger just as well as from 2B.  On the other side of the equation, if the runner gets thrown out, he loses his chance of scoring ahead of a dinger.  The Baltimore Orioles, with a lineup that has Chris Davis, Adam Jones, J.J. Hardy, Matt Wieters and Manny Machado, should not be trying to steal bases!

On the other hand, if you have a team with a pop-gun offense that plays in a park where dingers are hard to hit, a successful SB enhances the chances of scoring, while the CS is less harmful because the runner who stays at 1B is not likely to score anyway.

In conclusion:  Successful SB's have a smaller positive impact, and unsuccessful SB's a greater negative impact on teams that hit for power.  Successful SB's have a larger positive impact and unsuccessful SB's have a smaller negative impact on teams that hit for less power.

PS:  Historically, you could say that Maury Wills' baserunning was more valuable to the Dodgers of the 60's than Rickey Henderson's was to the A's and Yankees of the 80's just because of the context of the teams they played on.


  1. I think it would be good to comment that a successful basestealer affects defenses, what pitches are thrown, and also draws the pitcher's attention away from the batter. Having a threat and speed at first base is always a positive even at a 75% clip.

    1. While I don't dispute the truth of that, it is extremely difficult to prove and even more difficult to quantify. Thanks for pointing it out, though.

    2. I would also add that sabermetrics has tried to measure those effects and has failed to identify anything yet. That does not mean they do not exist - I've believed in clutch, for which saber studies had not found any evidence until recent years - I just wanted to note that sabers have tried.

      Speaking of which, I would be surprised if there hasn't been a study yet, but to DrB's point about where you are in the game and type of team you have, there is a line of study that looks at each event in a game and calculates a number (sorry, don't remember the metric) representing how much that event changed the game's win/loss dynamics, based on averages. Does not cover DrB's point about type of offensive teams, but still this could add some nuance if anyone has seen such a study.