The Early Bird Bonus and you ... Now with math!
Seca
Joined: 05/05/2014
Posts: 5201

Waterloo Dinosaurs
Legends

Broken Bat Baseball
The perceived advantage of drafting early has long been a contentious topic. With the new draft approaching, which will reveal more information about prospects before drafting, I suspect concerns will only grow.

This is an attempt to put some actual math (with as few "fanny numbers" as possible) into the discussion. Note that although I am good at math (physics and computers teacher with 20+ years' experience), these aren't tools I use very often. If you think I've calculated something wrong or applied a tool improperly, please say so. I am open to making corrections.

Pool Size and Sampling

I am going to assume all pools begin with 10% (<-- fanny number) of the players as "good". Note:
- this is probably wrong (too big? too small?) but since it's constant I don't think it's a problem
- Hardwood has shown there is annual variation in the number of good players in a pool. On average they probably share a percentage, but in any given year one pool might have a greater fraction of good players than another.
- good or not good is binary. BrokenBat players have a much broader spectrum of quality.

Arriving First

Pool of 10 (1 good player)
There is only 1 way to draw 10 players.
The first person to arrive has a 100% chance of getting a good player.

Pool of 11 (1 good player - only example not 10%)
There are now 11 ways the players can be drawn. 10 of these contain the good player. First to arrive has a 91% chance of getting a good player.

Pool of 20 (2 good players)
Doesn't take long for numbers to start getting big. :)
There are now 184756 (20 choose 10) combinations of players that may be drawn.
There are 43758 combinations that do not contain a good player (18 choose 10).
First to arrive has 1 - (43758/184756) = 76.3% chance of getting a good player.

Pool of 50 (5 good players)
1 - (45 choose 10)/(50 choose 10)
1 - (3.19E9/1.027E10)
69% chance of a good player.

We see the sensitivity of sampling from small pools. As we increase the size of the pool while maintaining the percentage of good players, the chance of getting a good player drops.

The effect flattens out relatively quickly. If we apply the above math to our pools:
International (100 players, 10 good) = 67%
Asian (250 players, 25 good) = 66%
High School (3000 players, 300 good) = 65%

So. If you are very first in line and have your choice of pools, the pool you choose has little effect on your success rate.
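
For anyone who wants to check these figures, here is a minimal Python sketch. It is not from the original post: the name p_good and the `shown` parameter are made up for illustration, and it assumes (as the examples above do) that each manager sees a random list of 10 players and takes a good one whenever the list has one.

```python
from math import comb

def p_good(pool_size, good, shown=10):
    """Chance that a random list of `shown` players has at least one good player."""
    if shown > pool_size:
        raise ValueError("cannot show more players than the pool holds")
    # comb() returns 0 when there are fewer than `shown` non-good players,
    # which correctly gives a 100% chance (e.g. the pool of 10).
    return 1 - comb(pool_size - good, shown) / comb(pool_size, shown)

# (pool size, good players) for the pools discussed in the post
pools = {
    "Pool of 10": (10, 1),
    "Pool of 11": (11, 1),
    "Pool of 20": (20, 2),
    "Pool of 50": (50, 5),
    "International": (100, 10),
    "Asian": (250, 25),
    "High School": (3000, 300),
}

for name, (size, good) in pools.items():
    print(f"{name:14s} {p_good(size, good):.1%}")

# Large-pool limit at 10% good: ten nearly independent tries.
limit = 1 - 0.9 ** 10
print(f"{'Limit':14s} {limit:.1%}")
```

The last line hints at why the effect flattens: in a very large pool at 10% good, the 10 players shown behave almost like 10 independent tries, so the odds settle toward 1 - 0.9^10, roughly 65%.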

Arriving Later

Starting with the small pools again:

Pool of 10 (1 good):
1st to arrive = 100%
2nd to arrive = 0

Pool of 11 (1 good):
1st to arrive = 91%
2nd to arrive =
0 if 1st hit
100% if 1st whiffed

Pool of 20 (2 good):
1st to arrive = 76.3%
2nd to arrive =
1 - (18 choose 10)/(19 choose 10) = 53% if 1st player hit
1 - (17 choose 10)/(19 choose 10) = 79% if 1st player whiffed

What we see from these examples is the effect of faster depletion of smaller pools. They are much more "granular" (not smooth), with your odds changing significantly based on the success of those who have come before.
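
The pool-of-20 case can be checked the same way. This is a rough sketch under the post's assumptions (10 players shown, a good player is always taken if present); p_good and the variable names are illustrative, not from the original. The blended figure at the end is an extra step the post does not compute: it weights the two branches by how likely the first manager was to hit.

```python
from math import comb

def p_good(pool_size, good, shown=10):
    """Chance that a random list of `shown` players has at least one good player."""
    return 1 - comb(pool_size - good, shown) / comb(pool_size, shown)

# Pool of 20 with 2 good players; the first manager removes one player.
first = p_good(20, 2)          # odds for the 1st to arrive
after_hit = p_good(19, 1)      # 2nd to arrive, if the 1st took a good player
after_whiff = p_good(19, 2)    # 2nd to arrive, if the 1st took a dud

# Overall odds for the 2nd arrival, before knowing how the 1st did.
blended = first * after_hit + (1 - first) * after_whiff

print(f"1st: {first:.1%}")
print(f"2nd after a hit:   {after_hit:.1%}")
print(f"2nd after a whiff: {after_whiff:.1%}")
print(f"2nd overall:       {blended:.1%}")
```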

Putting this into the context of our pools,

Let's say you are person #51 to arrive.
25 of the previous 50 used the Asian pool.
25 of the previous 50 used the High School pool.
Let's assume each group had a 60% success rate (<-- fanny number. Fair?)

When you go to pick,
The Asian pool contains 225 players, 10 are good (60% of 25 picks = 15 good players taken).
1 - (215 choose 10)/(225 choose 10) = 37% chance of getting a good player
The high school pool contains 2975 players, 285 are good (again 15 taken).
1 - (2690 choose 10)/(2975 choose 10) = 64%

(Fanny 60% probably ok for HS, but seems inaccurate for Asian. So Asian looks a little worse than it likely should).
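
The person #51 scenario can be sketched as below. The helper names (p_good, after_picks) are invented for this example, and the 60% success rate is the post's own guess rather than a measured value. The sketch assumes each earlier manager removed exactly one player from the pool.

```python
from math import comb

def p_good(pool_size, good, shown=10):
    """Chance that a random list of `shown` players has at least one good player."""
    return 1 - comb(pool_size - good, shown) / comb(pool_size, shown)

def after_picks(pool_size, good, picks, success_rate):
    """Pool left once `picks` managers each took one player,
    with a `success_rate` share of those picks being good players."""
    good_taken = min(good, round(picks * success_rate))
    return pool_size - picks, good - good_taken

# 25 earlier picks from each pool at the assumed 60% success rate
asian = after_picks(250, 25, picks=25, success_rate=0.60)
high_school = after_picks(3000, 300, picks=25, success_rate=0.60)

print("Asian:", asian, f"{p_good(*asian):.0%}")
print("High School:", high_school, f"{p_good(*high_school):.0%}")
```

Changing `picks` or `success_rate` shows how quickly the small pool reacts compared with the large one.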

What we see:
- Early Bird bonus can work against you. If you think you are early and use the Asian pool, you might actually be hurting your chances. It doesn't take very many people ahead of you with average success rates to drop your percentage. (Anecdotal, but this is consistent with what I saw when I looked at top 90 picks this season. The Asian pool went cold around 40-50.)
- The big pools are more resilient. If person #51 uses the high school pool their chances of getting a good player (64%) are basically the same as the first person in line (65%-67%).
- The big pools are "smoother". The success rate drops more gradually and is not as contingent on the success of those before.

The next step ... Round 2

I think the obvious next step is to apply this thinking to round 2 of the draft. I'd love to do that, but the quantity (and quality) of fanny numbers explodes. How many players were drafted in round 1? What pools did they come from? What were the success rates? I'll think about it some more, but I fear any numbers I could generate would start to look more like opinion than fact.

A number that came up elsewhere was pick 300. If we took all of those players out of the high school pool and use 51% (<-- my best fanny number) as a success rate,

2700 players, 147 good (51% of 300 = 153 good players taken)
= 1 - (2553 choose 10)/(2700 choose 10)
= 43%
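
Since the 51% is a guess, it may help to see how much the answer moves when the guess changes. A small sketch, with a hypothetical helper odds_at and a few alternative success rates picked only for illustration (only the 51% case comes from the post):

```python
from math import comb

def p_good(pool_size, good, shown=10):
    """Chance that a random list of `shown` players has at least one good player."""
    return 1 - comb(pool_size - good, shown) / comb(pool_size, shown)

def odds_at(success_rate, pool_size=3000, good=300, picks=300):
    """Odds for the next manager after `picks` players left the high school pool."""
    good_left = good - min(good, round(picks * success_rate))
    return p_good(pool_size - picks, good_left)

# 0.51 is the post's figure; the other rates are illustrative alternatives.
for rate in (0.40, 0.51, 0.60):
    print(f"success rate {rate:.0%}: {odds_at(rate):.0%} chance of a good player")
```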

When do we have 300 drawn from the HS pool? Start of round 2? End of round 2? Round 3?

Anyway, thanks for reading. Open to discussion and corrections.
xLee227
Joined: 07/06/2015
Posts: 269

Inactive

Broken Bat Baseball
Thanks for getting this discussion started, Seca! Those are some really interesting results, especially with regard to the differences between draft pools, and I think they help support the anecdotal evidence we've heard over the last few years.

To add a bit more of an empirical perspective, I used data from all drafted players in the first round of the 2038 draft to create a plot of rolling averages:

[Image kdZW7Up.png: rolling average of player potential plotted against player draft number]

There were 664 players drafted in Round 1 of 2038, and a rolling average was calculated in 100-player chunks, i.e., players 1-100, then players 2-101, then players 3-102, and so on. The player draft number refers to the order in which players were drafted - the first player drafted had a draft number of 1, the second player drafted had a draft number of 2, and so on. The plot above shows the rolling average of players' potential, which exhibits a generally downward trend - the later in the draft, the lower the average potential, although there were some visible fluctuations as well.
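
A rolling average of this kind could be computed along these lines. This is only a sketch: the function name rolling_mean is made up, and the `potentials` list is placeholder data, since the actual 2038 draft data is not included in the thread.

```python
def rolling_mean(values, window=100):
    """Mean of each consecutive `window`-sized slice: items 1-100, 2-101, ..."""
    if window <= 0 or window > len(values):
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window one player: add the newcomer, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means

# Placeholder potentials in draft order (NOT the real data).
potentials = [13, 12, 14, 12, 13, 11] * 110 + [12, 13, 12, 11]  # 664 values
averages = rolling_mean(potentials, window=100)
print(len(potentials), "players give", len(averages), "rolling averages")
```

With 664 players and a window of 100 there are 565 points to plot, one for each window from players 1-100 through players 565-664.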

I also did a quick chi-squared test using Python to determine whether there was a difference in the potential between different draft groups. Contingency table is below. The first group of players is the first 100 players to be drafted, the second group is the second 100 players to be drafted, and so on. The potential bins refer to players with 12 or lower potential, 13 potential, and 14 or higher potential.

[Image ibzrVHn.png: contingency table of draft group by potential bin]

The p-value was 6.02%, meaning there is a 6.02% chance we get a chi-squared statistic equal to or larger than the observed statistic of 20.383 assuming there is no difference in the potential between different draft groups. Although this is higher than the typical 5% threshold, I believe this in conjunction with seca's theoretical work above helps support that there is a difference in the potential based on how early a player was drafted.
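
The quoted p-value can be sanity-checked from the statistic alone. The sketch below is not the code used for the test above; it is a standard-library recomputation that assumes the table has 7 draft groups (664 players in groups of 100, the last one partial) and 3 potential bins, giving 12 degrees of freedom. The function name chi2_sf_even is invented, and the closed form it uses only works for an even number of degrees of freedom.

```python
from math import exp

def chi2_sf_even(x, dof):
    """P(X >= x) for a chi-squared variable with an even number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive, even dof")
    half = x / 2
    term = 1.0
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return exp(-half) * total

groups, bins = 7, 3                  # assumed table shape
dof = (groups - 1) * (bins - 1)      # 12
p_value = chi2_sf_even(20.383, dof)
print(f"dof = {dof}, p-value = {p_value:.2%}")
```

Under that assumed table shape, the result lands at about 6.02%, in line with the figure quoted above.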

Updated Sunday, May 12 2019 @ 1:44:59 pm PDT
Rock777
Joined: 09/21/2014
Posts: 9602

Haverhill Halflings
III.1

Broken Bat Baseball
That is nice work.

Although I think those of us who understand math didn't really need these calculations to understand what was happening, while those who don't understand math are likely still unconvinced...

Updated Sunday, May 12 2019 @ 2:15:34 pm PDT
Brewnoe
Joined: 03/25/2014
Posts: 818

Fall River Naughty Dawgs
IV.5

Broken Bat Baseball
*(&*&^% the 8th round limit

and twice for starting with the 3rd round
Seca
Joined: 05/05/2014
Posts: 5201

Waterloo Dinosaurs
Legends

Broken Bat Baseball
Appreciate the input. :) Please don't take any of this as criticism. I enjoy the discussion.

"I believe this in conjunction with seca's theoretical work above helps support that there is a difference in the potential based on how early a player was drafted."

There isn't any question whether there is an effect (Round 1 > Round 8). It's all about the significance.

In my view, the significance of the significance is more significant in the first 100. I wouldn't be surprised if the same 60 or so managers showed up in the first 100 season after season. Conversely, those in the 600s probably have more season-to-season variation in draft position. If there is an advantage, it's only a problem if it's systemic. That would happen early in the round (imo).

My own conclusion is that the significance can be largely mitigated by pool strategy. When I look at the draft data from 2038, I picture decay curves for the different pools superimposed on each other, all with different rates and coefficients. A lower line at 300 is blending depleted small pools and those that aren't. With "skill" I can achieve an above average result.

Skill is definitely a thing. My numbers assume that if there is a good player in the list, that player will be chosen. That's not always the case. The early birds are the keeners and, generally speaking, have more skill and a better chance of identifying good players than those who arrive late in the draft. At the other end of the spectrum, bots (are they the dip at 240-270?) have no skill at all.

The concern in the new draft is that it will require less skill (potential visible). May turn out that way. Then again, looking at written scouting and amateur stats and crossing that against an uncertain potential may actually mean more skill is needed.

The other little pitfall is associating potential with player quality. It's our best way to quantify it, and it probably has a chi squared under 5% :), but it does introduce an error bar.

I found my analysis comforting. I am less concerned about the small pools. My main worry moving into the new draft is the influence of being able to draft pitchers.
buffmckagan
Joined: 12/22/2013
Posts: 651

Scranton Bears
Legends

Broken Bat Baseball
I appreciate all the work put into this. Do you think early bird effect will matter if you are drafting for a specific position, as teased by Steve?
BUDude
Joined: 05/05/2019
Posts: 54

Inactive

Broken Bat Baseball
I want to make sure I understand what is being said here. If I'm understanding correctly, your work suggests that within the first round of the draft, the players who log in and make their pick earliest are more likely to have a higher potential pick?
buffmckagan
Joined: 12/22/2013
Posts: 651

Scranton Bears
Legends

Broken Bat Baseball
The premise I got:

TL;DR
- Drafting from the smaller pools is not a good strategy unless you're sure you are drafting super duper early
- Drafting from the bigger pools is a safer bet overall (especially HS)
Rock777
Joined: 09/21/2014
Posts: 9602

Haverhill Halflings
III.1

Broken Bat Baseball
Still always going to have a big advantage the earlier you draft regardless of the pool size. But yes, it gets worse the smaller the pool gets.
BUDude
Joined: 05/05/2019
Posts: 54

Inactive

Broken Bat Baseball
Cool, thanks for clarifying that for me. I'm getting better at understanding all the math-y wordings, but sometimes ya just need a good TLDR.
Yuri84
Joined: 10/14/2014
Posts: 639

Apple Valley Raccoons
IV.4

Broken Bat Baseball
Urgh, math... can you explain it using history or geography instead? Even biology will do... probably. :)
Frankebasta
Joined: 09/15/2013
Posts: 885

Kodiak Mules
III.3

Broken Bat Baseball
Hmm... I'll give it a try with history:

Those who came first got to choose the best land in the New World, with minimal hassle, because there was room for all.
Well, actually the VERY FIRST got screwed, because they landed too far north and there was not much to farm there...
that's for those who draft HS at the very beginning and get a Pot11 Very Good :)

Then, they had to go further and fight the harsh climate and those brave, desperate, local inhabitants (that's Texas and Utah settlers).
Still got a nice place to settle in, but it was not easy. Required lots of skills.

There's a late surge then: the route to California opened up, and the Oklahoma land rush too. Easy Peasy :)

The last ones to come had to stay in NYC or Boston for a while, as indentured laborers.
That's still a 1st rounder, yes, but not everyone succeeds.

Makes sense??? :)

Updated Friday, May 17 2019 @ 5:42:51 am PDT
Yuri84
Joined: 10/14/2014
Posts: 639

Apple Valley Raccoons
IV.4

Broken Bat Baseball
Thanks, it kind of makes sense now. I only have one question... will I be able to buy Alaska back? I'm willing to pay in... apples? :)

Updated Friday, May 17 2019 @ 5:17:55 am PDT
Frankebasta
Joined: 09/15/2013
Posts: 885

Kodiak Mules
III.3

Broken Bat Baseball
hahahaha

well, I think Alaska here would be.... that pot16, who was scouted as Good and fell through the cracks till the 8th round :)

as for the site's currency, yes I think Apples are the standard ;)
Seca
Joined: 05/05/2014
Posts: 5201

Waterloo Dinosaurs
Legends

Broken Bat Baseball
"Still always going to have a big advantage the earlier you draft regardless of the pool size"

Hehe. Thought you implied you understood the numbers? :) The emphasis is mine, but this is the type of rhetoric I was hoping to curb.

"Do you think early bird effect will matter if you are drafting for a specific position, as teased by Steve?"

The concern with pitchers is:

All of the "good player" references in the original post change to "good pitcher". May be very significant. I.e., being able to view amateur stats is huge for pitchers.

The pool size is unknown. Probably bigger than the big pools? Or about the same? 70:30 position players to pitchers? Is it going to see Asian-style depletion if 70 of the first 100 go for a hurler?

The "advantage" of drafting early in the first round is that your choice of pool is not constrained. You can get a good percentage from every pool, so you choose the pool that best suits your needs. If you draft later in the first round, your choice is guided more by strategy. You might like an Asian player that would be ready a season or so sooner, but you are compelled to go elsewhere to max out your %.

That isn't really a big deal currently. Need a pitcher? Pick a pool and cross your fingers. The small pools being depleted doesn't matter. There are lots of good pitchers to be found in the second round.

With draft by position? Remains to be seen. That pool may show the effects of depletion in the first round. If you draft in the 400-500s you are likely compelled to avoid the pitcher pool to keep your % up, and by the end of the 2nd round the chances of finding a star may be minimal.

I argued against draft by position, and still think it's a bad idea. The current structure is so elegant. Each of the pools has clear advantages and disadvantages. There is some definite strategy involved in your decision.

With draft by position, the groups lack advantages and disadvantages. The cream (pitchers) may be skimmed off the top. The rest of the groups are incoherent randomness. Deciding between the 1B pool and OF pool seems more like throwing darts than strategy.
Rock777
Joined: 09/21/2014
Posts: 9602

Haverhill Halflings
III.1

Broken Bat Baseball
Yes. I do understand the numbers. It is a big advantage. I suspect you aren't appreciating the magnitude of a .1 delta in SI.
MukilteoMike
Joined: 08/09/2014
Posts: 3294

Inactive

Broken Bat Baseball
I also dislike draft by position for precisely the same reasons you do, Seca. I dislike the current pools as well because, as you know, you're screwed if you pick a small pool that has been drafted from a lot. Hmmm. An idea for a suggestion just came to me--each pool should show how many are left along with the initial number available. For example, Asian Pool 124 left of 136.
buffmckagan
Joined: 12/22/2013
Posts: 651

Scranton Bears
Legends

Broken Bat Baseball

"...each pool should show how many are left along with the initial number available. For example, Asian Pool 124 left of 136."

Love this idea. Could save a lot of potential heartache for people who think they're early but aren't
admin
Joined: 01/27/2010
Posts: 4985

Administrator
Broken Bat Baseball
Interesting...right now I think we only show the current pool size. I guess I could record the pool size at the beginning of the season -- but what would we do for the cases where people select a position?

Also, players might be in the pool for more seasons...so things are getting a little more complicated.


Steve
Seca
Joined: 05/05/2014
Posts: 5201

Waterloo Dinosaurs
Legends

Broken Bat Baseball
Seeing the original pool size would be nice information for everyone, but does that help those drafting later? Part of the protection for those drafting later is having some that draft early not realize a pool has been hit.

"It's a big advantage. I suspect you aren't appreciating the magnitude of a 0.1 delta in SI."

At least you left the "always" out this time. Progress! :)

"Big" is subjective. We are now firmly in the realm of opinion. I do not consider 0.1 SI delta "big". In my view:
- a manager has had the ability to draft above the curve through pool choice and good player selection
- the draft order is not fixed, the 0.1 is not perpetuated round to round or draft season to draft season
- SI has a strong correlation with player quality, but it's not perfect. A great hitting 12 pot may be a better player than a good hitting 14 pot.

To the last point, 2 of your 26 ML players are 12 pot. Among my 29 ML players I have two 11 pots and a dozen 12 pots. I've cut 13 pot pitching and position prospects this season so I could keep 12s. Seems clear we have different takes on the importance of SI.
amalric7
Joined: 01/20/2016
Posts: 2238

New York Lancers
V.4

Broken Bat Baseball
"I guess I could record the pool size at the beginning of the season"

Well, technically anyone can, although you would be recording the numbers in each pool when you come in to draft rather than when each round opens. At least it gives you some idea how many have been taken from each pool from round to round. I do it on a spreadsheet, but hey, pen and paper still works (last I checked).
Rock777
Joined: 09/21/2014
Posts: 9602

Haverhill Halflings
III.1

Broken Bat Baseball
"A great hitting 12 pot may be a better player than a good hitting 14 pot."

True, but on average a Great Hitting POT 13 will be better than a Great Hitting POT 12. On average, getting that higher POT player is going to net you a better player, which is why there is a big difference between drafting 12.5 POT on average and 12.2 POT on average. The 12.5 POT group will always end up with more quality stars than the 12.2 POT group over the long haul. 0.1 on average POT makes a big difference.

You are also glossing over # of choices here. That guy who drafted in the first group may have only gotten a POT 13, but he also could have had 4 Very Good options to choose from. While the guy in the last group was probably stuck with one Very Good. So sure he got a POT 13, but not necessarily one with an optimal build.

The best built players go fast, just like the best build managers. No one is going to pass up that Very Good, Great Hitting, Power Slugger. But plenty of Very Good, No hitting comment guys will slide through to the later picks.

