
Why We Need Anti-Discrimination Laws: a computational approach

My libertarian friends, back when I had libertarian friends, often imagined that anti-discrimination laws were unnecessary because "the market will take care of it".

The argument goes like this: companies compete for the best employees, and employee quality is a significant determinant of corporate success. Companies that discriminate against a subset of the population will therefore under-perform those that don't, because they will necessarily forgo the best candidate in some cases, and that will result in lower-than-average employee quality and an increased rate of corporate failure.

This is an imaginary argument, which is to say: it is not an argument at all. While such propositions stimulate thought, they ask us to do something that is far beyond the capabilities of the human imagination: accurately envision the working out of a diverse set of forces represented by probability distributions.

In particular, the way the argument is usually deployed is intended to focus our limited attentional resources on the high end of the employee skill distribution. But this is wrong: the average person is average, and for discrimination to have an effect it has to occur in a situation where dropping the minority out of the population somehow changes the average skill of available workers. When both groups have the same skill distribution, this is mathematically impossible.

Furthermore, remember that the whole trick of industrial capitalism is to create great products with average workers. This is why Wedgwood and others were able to create so much wealth, and why common items like pottery and clothing are now so cheap we hardly notice them, whereas before industrialization they were so dear that it was possible to live by stealing them.

It follows from this that the average worker in the average industry in the average capitalist economy is... average. Therefore it is mathematically impossible for discrimination against a minority to materially affect the success of a business, because the minority population will have on average the same distribution of skills as the majority population. Dropping the minority population from consideration would therefore have a trivial effect on hiring decisions in the average case, and the exceptional case is not frequent enough to punish discriminatory businesses to the point of affecting their success.
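The claim that removing a same-distribution minority leaves the average essentially unchanged is easy to check numerically. A minimal sketch (not part of the original code; the population size and seed are arbitrary choices for illustration):

```python
import numpy as np

# If minority and majority skills come from the same distribution,
# excluding the minority barely moves the candidate pool's mean skill.
rng = np.random.default_rng(0)
n = 100_000
minority = rng.random(n) < 0.10        # 10% minority flags
skill = rng.normal(10.0, 5.0, n)       # same skill distribution for everyone

full_mean = skill.mean()
majority_only_mean = skill[~minority].mean()
print(full_mean, majority_only_mean)   # the two means differ only by sampling noise
```
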

It's worth looking at some examples of distributions before considering a more complete simulation. The image below considers a case where a company interviews 100 workers for a position where there is a 10% minority representation amongst the worker population. Worker "skill" has a Gaussian (bell curve) distribution with a mean of 10 and standard deviation of 5. Anti-skilled workers (people who negatively contribute to the company) exist. Both majority and minority populations have the same distribution.

[Figure: The best candidate is probably in the majority because any random worker is probably in the majority.]

If we assume a strict meritocracy where the company chooses the best worker as measured by some particular skill, it is easy to see that discrimination or lack thereof will have no effect on the outcome: the best worker is probably a member of the majority because any randomly selected worker is probably a member of the majority. This is what "majority" means.

Even if we relax the criterion somewhat and say the company takes the first candidate who exceeds some particular level of skill--say 15 on the graph--we can see that the odds are still in favour of the first worker to meet that criterion being in the majority, again because any randomly selected worker is probably in the majority.
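Both observations can be verified by simulation. The sketch below (illustrative, not from the original code) estimates how often the single best of 100 candidates is a minority member when both groups share the skill distribution described above; by symmetry the answer is just the minority's share of the pool, about 10%:

```python
import numpy as np

# Estimate P(best of 100 candidates is a minority member) when both
# groups draw skill from the same N(10, 5) distribution.
rng = np.random.default_rng(1)
trials = 20_000
wins = 0
for _ in range(trials):
    minority = rng.random(100) < 0.10      # 10% minority representation
    skill = rng.normal(10.0, 5.0, 100)
    wins += minority[np.argmax(skill)]     # did a minority member top the pool?
print(wins / trials)                       # close to 0.10
```
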

It takes about 200 lines of Python to simulate a situation where there are 100 businesses with 50-300 employees (in the SME range) who hire out of a population of about 18,000 workers (resized to generate 3-6% average unemployment, which can be thought of as the effects of migration or economic fluctuations) with 10% minority representation. Each business has a "discrimination factor" between 0 and 1 that multiplies the skill level of minority workers for comparison during hiring, so a value of 0 means essentially no minority worker is ever hired and 1 means minorities are treated the same as the majority workers.

Every year there is 5% attrition as employees move on for various reasons, and every year companies are tested by the market to see if they have sufficiently skilled workers to survive. The test is applied by finding the average worker skill and adding a random value to it. Worker skill is over 80% of the company test score, so while randomness plays a role (as it does in the real world) worker skill is dominant.

The test has been set up so 10-20% of companies fail each year, which is pretty harsh. They are immediately replaced by new companies with random characteristics. If discrimination is a significant effect we will see discriminatory companies go out of business more rapidly, and the population of companies will slowly evolve toward more egalitarian ones.

The hiring process consists of companies taking the best worker out of ten chosen at random. Worker skills are distributed on a bell curve with a mean of 1 and standard deviation of 0.5, with the additional condition that the skill-level be positive. As noted above, companies multiply worker skills by the corporate "discrimination factor" for comparison during hiring, so some companies almost never hire minority workers, even when they are highly skilled.

Here's the code (updated to reflect changes discussed in the edit below):

import random

import numpy as np

"""For a population of two types, one of which is in the minority and
is heavily discriminated against, does the discrimination materially
harm businesses that engage in it?"""


class Person:

    def __init__(self, bMinority):
        self.bMinority = bMinority
        self.fSkill = np.random.normal(1.0, 0.5)
        while self.fSkill < 0:  # constrain skill to be positive
            self.fSkill = np.random.normal(1.0, 0.5)
        self.bEmployed = False

    def getHireFactor(self, fDiscriminationFactor):
        """Skill as seen by a hiring company: minority skill is discounted"""
        if self.bMinority:
            return self.fSkill*fDiscriminationFactor
        else:
            return self.fSkill


class Business:

    def __init__(self, nEmployees, fChanceFactor, fDiscriminationFactor):
        self.nEmployees = int(nEmployees)
        self.lstEmployees = []
        self.fChanceFactor = fChanceFactor
        self.fDiscriminationFactor = fDiscriminationFactor

    def hire(self, lstUnemployed, lstEmployed):
        """Take the best person out of first 10 at random"""
        if len(self.lstEmployees) < self.nEmployees and lstUnemployed:
            random.shuffle(lstUnemployed)  # randomize unemployed
            pHire = lstUnemployed[0]
            fBest = pHire.getHireFactor(self.fDiscriminationFactor)
            for pPerson in lstUnemployed[1:10]:
                fFactor = pPerson.getHireFactor(self.fDiscriminationFactor)
                if fFactor > fBest:
                    fBest = fFactor
                    pHire = pPerson
            pHire.bEmployed = True
            lstEmployed.append(pHire)
            lstUnemployed.remove(pHire)
            self.lstEmployees.append(pHire)

    def test(self, fThreshold):
        """Survive the year if average skill plus a random boost beats the threshold"""
        fAvg = sum(pEmployee.fSkill for pEmployee in self.lstEmployees)/len(self.lstEmployees)
        fSum = fAvg + random.random()*self.fChanceFactor
        return fSum > fThreshold

    def attrition(self, lstEmployed, lstUnemployed):
        """Each employee has a 5% chance of moving on each year"""
        lstMovedOn = [pEmployee for pEmployee in self.lstEmployees
                      if random.random() < 0.05]
        for pEmployee in lstMovedOn:
            self.lstEmployees.remove(pEmployee)
            pEmployee.bEmployed = False
            lstEmployed.remove(pEmployee)
            lstUnemployed.append(pEmployee)


nBusinesses = 100
nYears = 250  # length of the run (the figures below show 100- and 250-year runs)
nEmployeeRange = 250  # businesses have 50-300 employees
fChanceFactor = 1.15  # equal to average employee skill (> 1 due to eliminating negative values)

lstBusinesses = []
nTotalWorkers = 0
for _ in range(nBusinesses):
    lstBusinesses.append(Business(50+random.random()*nEmployeeRange, fChanceFactor, random.random()))
    nTotalWorkers += lstBusinesses[-1].nEmployees

fMinorityFraction = 0.1
lstUnemployed = []
lstEmployed = []
nMinority = 0
nMajority = 0
fFullEmploymentFactor = 1.03
nPopulation = int(nTotalWorkers*fFullEmploymentFactor)
print(nPopulation)
for _ in range(nPopulation):
    lstUnemployed.append(Person(random.random() < fMinorityFraction))
    if lstUnemployed[-1].bMinority:
        nMinority += 1
    else:
        nMajority += 1
print(nMajority, nMinority)

print("Initial hiring. This may take a few minutes...")
while True:  # initial hiring phase
    random.shuffle(lstBusinesses)
    nFull = 0
    for pBusiness in lstBusinesses:
        pBusiness.hire(lstUnemployed, lstEmployed)
        nFull += pBusiness.nEmployees == len(pBusiness.lstEmployees)
    if nFull == len(lstBusinesses):
        break
print(len(lstEmployed), len(lstUnemployed))

nMajorityUnemployed = 0
nMinorityUnemployed = 0
for pPerson in lstUnemployed:
    if pPerson.bMinority:
        nMinorityUnemployed += 1
    else:
        nMajorityUnemployed += 1
print(nMinorityUnemployed/nMinority, nMajorityUnemployed/nMajority)

print("Starting iteration...")
lstLastTenFailCount = []
outFile = open("adjusted.dat", "w")
fTestThreshold = 0.98  # about 2% failure rate overall to start
for nYear in range(nYears):  # yearly iteration
    random.shuffle(lstBusinesses)
    lstFailed = []  # first test businesses
    for pBusiness in lstBusinesses:
        if not pBusiness.test(fTestThreshold):
            lstFailed.append(pBusiness)
            for pPerson in pBusiness.lstEmployees:
                pPerson.bEmployed = False
                lstUnemployed.append(pPerson)
                lstEmployed.remove(pPerson)
    # nudge the threshold to hold the failure rate between 1.5% and 2.5%
    lstLastTenFailCount.append(len(lstFailed))
    if len(lstLastTenFailCount) > 10:
        lstLastTenFailCount.pop(0)
    nTotalFail = sum(lstLastTenFailCount)
    if nTotalFail < 15:
        fTestThreshold += 0.01
    if nTotalFail > 25:
        fTestThreshold -= 0.01
    print(nTotalFail, fTestThreshold)
    for pBusiness in lstFailed:
        lstBusinesses.remove(pBusiness)
        nTotalWorkers -= pBusiness.nEmployees
    for pBusiness in lstBusinesses:  # attrition from remaining businesses
        pBusiness.attrition(lstEmployed, lstUnemployed)
    while len(lstBusinesses) < nBusinesses:  # creation of new businesses
        lstBusinesses.append(Business(50+random.random()*nEmployeeRange, fChanceFactor, random.random()))
        nTotalWorkers += lstBusinesses[-1].nEmployees
    # migration keeps unemployment between 3% and 6%
    nWorkers = len(lstUnemployed)+len(lstEmployed)
    nPopulation = int(nTotalWorkers*fFullEmploymentFactor+fFullEmploymentFactor*random.random())
    random.shuffle(lstUnemployed)
    while nWorkers < nPopulation:
        lstUnemployed.append(Person(random.random() < fMinorityFraction))
        if lstUnemployed[-1].bMinority:
            nMinority += 1
        else:
            nMajority += 1
        nWorkers += 1
    while nWorkers > nPopulation:
        pWorker = lstUnemployed.pop()
        if pWorker.bMinority:
            nMinority -= 1
        else:
            nMajority -= 1
        nWorkers -= 1
    while True:  # hiring
        random.shuffle(lstBusinesses)
        for pBusiness in lstBusinesses:
            pBusiness.hire(lstUnemployed, lstEmployed)
        nFull = 0
        for pBusiness in lstBusinesses:
            nFull += pBusiness.nEmployees == len(pBusiness.lstEmployees)
        if nFull == len(lstBusinesses):
            break
    fDiscrimination = 0.0
    for pBusiness in lstBusinesses:  # how discriminatory are we now?
        fDiscrimination += pBusiness.fDiscriminationFactor
    nMajorityUnemployed = 0
    nMinorityUnemployed = 0
    for pPerson in lstUnemployed:
        if pPerson.bMinority:
            nMinorityUnemployed += 1
        else:
            nMajorityUnemployed += 1
    outFile.write(str(len(lstFailed))+" "+str(fDiscrimination)+" "+str(nMinorityUnemployed/nMinority)+" "+str(nMajorityUnemployed/nMajority)+" "+str(nMinorityUnemployed)+" "+str(nMinority)+" "+str(nMajorityUnemployed)+" "+str(nMajority)+"\n")
    print(len(lstFailed), fDiscrimination, nMinorityUnemployed/nMinority, nMajorityUnemployed/nMajority, nMinorityUnemployed, nMinority, nMajorityUnemployed, nMajority)
outFile.close()

Feel free to run it and tweak it. If you talk about the results please give me attribution for the original.

The only thing I can say with any degree of certainty about this code is that it's a better, more accurate, more useful representation of the actual market than anything your imagination can possibly run on its own.

Note that so far as the arguments I'm making here go, it doesn't matter if you once heard a story about a company that failed because they refused to hire a black genius: anecdotal accounts of singular events are not arguments. Social policy should be based on large-scale averages, not one-offs.

So what does the code show?

Unsurprisingly, based on the arguments above, there is insufficient selection effect to drive discriminatory companies out of business on a timescale of less than centuries... by which time all of the companies would be out of business anyway for reasons that have nothing to do with discrimination. This result follows from the fact that the average worker is average, and the strength of capitalism is making great products with average workers.

Here's a typical run of the code that simulates 100 years of this little micro-economy:

[Figure: Discrimination Over a Century]

The discrimination factor is simply the sum over all companies' individual discrimination factors, and it can be seen to rise slowly (which is equivalent to decreasing discrimination) by about 20% over the course of a century.

So the notion that "the market will take care of it" isn't entirely insane; it is merely far too weak to make a meaningful difference over the lifetime of currently discriminated-against workers. Furthermore, the simulation is almost insanely generous to the hypothesis under test. It assumes, for example, that there is zero cost to hiring minority workers, whereas history shows this is false: the US is replete with stories of shutdowns, protests and other work actions by majority workers in the face of minority hiring. If we add even moderate costs to the model it will generate segregation, not integration.

I'm fairly surprised the model shows any effect at all. The effect demonstrated under extremely favourable assumptions is certainly far too small to be socially significant, and the model was not intended to closely emulate the real world, but to explore the numerical reality behind the historical fact that no free market anywhere ever has produced an integrated society without benefit of regulation.

Edit: the stuff below is follow-up to what goes above, as I thought it interesting to dig into the model parameters to see how realistic they are, and discovered that "not very" was the answer.

The most important factor in determining how efficient the market is in fighting discrimination is the rate of business turn-over. Small businesses (< 500 employees) account for the majority of employment in developed countries. It turns out there is quite a bit of data available on the rate at which such businesses fail, and the number is not 15% per year but somewhere around 2%. The UK data linked above gives the distribution of ages, which can be reasonably well modeled with a 3% failure rate, and the US data gives the mid-400 thousands for the birth/death rate, which is a 1.5% turn-over rate on a population of 28.2 million businesses.
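The turnover arithmetic can be checked directly. Taking "mid-400 thousands" as 450,000 (an assumed round number for the sake of the calculation):

```python
# Yearly business turnover rate implied by the quoted US figures.
# 450,000 is an assumption standing in for "mid-400 thousands".
births_deaths = 450_000
businesses = 28_200_000
rate = births_deaths / businesses
print(rate)   # roughly 0.016, i.e. about 1.5-2% per year, not 15%
```
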

So my 15% estimate was wrong by an order of magnitude. It's also the case that chance plays a bigger role than the model allows, so I tweaked it such that chance (or rather, anything other than employee skills... it might be CEO competency, better competition, etc) accounts for 50% of the overall test score on average, instead of the average of 20% in the original model. I've updated the code above to reflect the changes.

Critics might say that I've fine-tuned these parameters to reach my preferred conclusion, which is nonsense on two counts: the first is that I'd far rather have markets take care of discrimination than leaving it to the nice people who work for the government. The second is that my parameter choices are empirically-driven in the first case and extremely generous in the second. I've worked for (and owned) small businesses where a few exceptional people were vital to the successes we did have, and which still went broke due to other factors. Anyone who claims things other than employee skills don't have a very large impact on business success has never run a business.

My business experience is entirely in high-tech, working in areas where employee skills are vastly more important than in any other area, and it is still a matter of ordinary empirical fact that the success or failure of the business was only weakly tied to the quality of the team.

There is a larger question here, though. Is it reasonable to critique a computational model on the basis of parameter choice when the alternative is the un-aided and highly-fallible human imagination? Is it reasonable to say, "My imaginary argument doesn't require any difficult parameter choices, so it's better than your computational argument that does!"?

Does it weaken my argument because you can see how it works, analyze it and criticize it based on that analysis?

I don't think so.

Most of what passes for argument about social policy between ideologues comes down to disagreements about what they imagine will happen under various conditions. Since we know with as much certainty as we know anything that our imaginations are terrible guides to what is real, it follows that ideological critiques of numerical, probabilistic arguments--Bayesian arguments--don't hold much water.

Yet we very often feel as if a model like the one I'm presenting here is "too simplistic" to capture the reality of the system it is simulating in a way that would give us the power to draw conclusions from it.

It's true that we should be cautious about over-interpreting models, but given that, how much more cautious should we be about over-interpreting our known-to-be-lousy imaginings?

If this model is too simplistic, it is certainly vastly more sophisticated--and accurate--than the unchecked, untested, unverified results of anyone's imagination.

And what is the result of this model with somewhat more realistic parameters? I added a little code to adjust the test threshold dynamically to maintain a failure rate of between 1.5 and 2.5% (otherwise as time went on good companies tended to dominate, suggesting I am still setting the role of chance far too low) and tweaked the role of chance up to about 50%. The results of 250 years are shown in the figure below. Remember, this is more time than most nations in the developed world have existed as nations, so there has certainly been nothing like stable market conditions anywhere on Earth over this time such that this experiment might actually be performed.

[Figure: Note scale is 250 years, not 100 as in the previous figure]

The line fitted to the results has a slope of about 0.02/year, so after 1000 years less than half the original bias will be gone from this magically stable population of companies. This is indistinguishable from no progress at all when we try to apply it across the broad sweep of human history, which has in fact seen more and less discriminatory times as companies, industries, nations and empires come and go.
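The quoted slope is the kind of number an ordinary least-squares line fit produces. A sketch with synthetic data standing in for the simulation's yearly summed discrimination totals (the 0.02/year trend and the noise level are assumptions, included only to show the fit recovering them):

```python
import numpy as np

# Synthetic stand-in for 250 years of summed discrimination factors:
# an assumed 0.02/year upward trend plus noise, then a least-squares fit.
rng = np.random.default_rng(2)
years = np.arange(250)
discrimination = 50.0 + 0.02*years + rng.normal(0.0, 1.0, 250)
slope, intercept = np.polyfit(years, discrimination, 1)
print(slope)   # recovers a slope close to the assumed 0.02/year
```
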

We can also look at the unemployment rate of the majority and minority population.

[Figure: Minority Unemployment Stays High]

The overall unemployment rate varies between 3% and 6%, but the majority never sees much fluctuation. Instead, the minority--whose unemployment rate runs at about twice the majority in a typical year--gets hammered. This is also what we see in the real world, which speaks to the model's basic soundness.

So there you have it. Numerical evidence that the proposition "The market will take care of discrimination" is not very plausible. "I imagine it otherwise" is not a counter-argument, or evidence for the proposition. If you want to argue against me, improve the model, don't deploy the awesome power of your imagination, because your imagination isn't any good at this stuff. Neither is mine. Neither is anyone else's.

Copyright (C) 2018 TJ Radcliffe