Identifying and Removing Disparate Impact
Employers implementing a mass layoff often agonize over the decisions they have to make, not just because of the hardship for employees and the potential impact on the business, but because employees might claim their selection for layoff was discriminatory. A recent paper published by Michael Feldman, et al., on the website arXiv.org, Certifying and Removing Disparate Impact, offers an interesting point of departure for considering possible discrimination claims. As I explain below, however, employers should be cautious about adopting the authors’ recommended solution, which effectively rescores employees with formulas that vary by protected category, as this may be statistically attractive but is legally questionable.
“Disparate impact” claims are relevant in this context because, unlike more common discrimination claims such as race discrimination or sexual harassment, they don’t require any showing of intent. A claim of disparate impact asserts that a specific employment practice or selection criterion, which may appear neutral with respect to protected categories, nevertheless has an adverse impact on a protected group (usually a particular race, sex, or age bracket), and that reasonable factors other than the prohibited criterion cannot explain the discrepancy. If the employee states a prima facie case of disparate impact, the employer has the burden of showing that the rule it applied was job-related and justified by business necessity. If the employer meets this burden, the complainant can still establish a claim of discrimination by showing that an alternative method, which the employer refuses to adopt, would have achieved the same goal with less adverse impact. According to the EEOC, a set of decisions about which employees to retain and to discharge in a particular workforce, together with their race, sex, and age (which, for convenience, I’ll refer to as a “selection pattern”) is at significant risk of exhibiting unlawful disparate impact if members of one category are selected for termination at less than 80% of the highest rate at which members of any relevant category are chosen. For example, if African-American employees were selected for layoff at a rate of 50%, and 36% of Hispanic employees in the workforce were selected for layoff, an argument could be made that the employer’s selection process, whatever it was, had an adverse disparate impact on African-Americans because 36% is less than 80% of 50%. (This is the so-called “Four-Fifths Rule.” See the EEOC’s website for further discussion.)
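The Four-Fifths Rule comparison described above is simple arithmetic, sketched here in Python. The function name and head counts are hypothetical (100 employees per group, chosen only to reproduce the 50% and 36% rates in the example). The sketch follows this article's framing of comparing termination rates and is an illustration rather than an official calculation.

```python
def termination_rate_ratios(selected, totals):
    """Return each group's layoff rate divided by the highest group's layoff rate."""
    rates = {group: selected[group] / totals[group] for group in totals}
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no employees were selected for layoff")
    return {group: rate / highest for group, rate in rates.items()}


# Hypothetical head counts matching the rates in the example above.
totals = {"African-American": 100, "Hispanic": 100}
selected = {"African-American": 50, "Hispanic": 36}

ratios = termination_rate_ratios(selected, totals)

# Groups whose layoff rate is below four-fifths of the highest rate. Under the
# framing used here, such a gap supports an argument of adverse impact on the
# group with the HIGHEST layoff rate.
below_four_fifths = {g: r for g, r in ratios.items() if r < 0.8}

print(ratios)
print(below_four_fifths)
```

With these numbers the Hispanic layoff rate is 72% of the African-American rate, which falls below the 80% threshold.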
Courts (and, in some cases, the EEOC) are also willing to entertain disparate impact claims using an argument based on standard deviations from a hypothetically “average” case. The argument contemplates a completely random, lottery-style selection process for determining which employees would be selected for layoff that ignores all protected categories, including race, sex, and age. If, in this hypothetical situation, the actual selection pattern in the case would have a probability of less than 5% of occurring, then a court may agree that the employee has raised legitimate grounds for concern, and the employer must show that its selection process was job-related for the position in question and consistent with business necessity. In our example above, some employers have cited an urgent and well-documented business need for retaining bilingual English/Spanish employees, and pointed to selection criteria that identified language ability, regardless of ethnic group, to explain the selection pattern. In some cases, that defense might be appropriate.
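The lottery comparison above can be sketched as follows. I am assuming a hypergeometric model (a fixed number of layoffs drawn at random, without replacement, from the whole pool) and a one-sided tail probability; both are simplifying assumptions of mine. The head counts are hypothetical and the function names are my own.

```python
from math import comb, sqrt


def lottery_tail_probability(pool, group, laid_off, group_laid_off):
    """Probability that at least `group_laid_off` members of the group are drawn
    when `laid_off` employees are picked at random from `pool` employees."""
    upper = min(group, laid_off)
    favourable = sum(
        comb(group, i) * comb(pool - group, laid_off - i)
        for i in range(group_laid_off, upper + 1)
    )
    return favourable / comb(pool, laid_off)


def standard_deviations_from_expected(pool, group, laid_off, group_laid_off):
    """How far the observed count sits from the lottery average, in standard deviations."""
    share = group / pool
    expected = laid_off * share
    # Variance of a hypergeometric draw (includes the finite-pool correction).
    variance = laid_off * share * (1 - share) * (pool - laid_off) / (pool - 1)
    return (group_laid_off - expected) / sqrt(variance)


# Hypothetical workforce: 200 employees, 100 in the group of interest,
# 86 laid off in total, 50 of them from that group.
pool, group, laid_off, group_laid_off = 200, 100, 86, 50

p_value = lottery_tail_probability(pool, group, laid_off, group_laid_off)
z = standard_deviations_from_expected(pool, group, laid_off, group_laid_off)

print(f"tail probability under a pure lottery: {p_value:.4f}")
print(f"standard deviations above the lottery average: {z:.2f}")
```

A tail probability under 0.05 corresponds to the 5% threshold discussed above. A result near the line, in either direction, is a reason to look more closely rather than a conclusion.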
The growing reliance on statistical analysis to articulate or defend against potential claims of discrimination does not, however, seem to have resulted in widespread comfort with statistical techniques among lawyers or human resources professionals. Two techniques, binomial analysis and chi-squared analysis, are overwhelmingly popular, but are perhaps used somewhat uncritically. What these tests measure can be hard to explain without resorting to jargon. In addition, I wouldn’t want to guess how many practitioners are attentive to the instability of chi-squared analysis when working with small groups, or to the fact that it may be misleading when there are very few minority employees and very few terminated employees compared to the overall size of the workforce. Less commonly used tools can address some of these concerns. The Quetelet Index answers the question, “Based on the pattern of selections presented here, how do my chances of being selected for termination change because of my membership in a particular protected group?” The Kulczynski metric isn’t affected by large numbers of non-minority and non-selected individuals in the employee pool. However, I wouldn’t recommend using any of these tests in isolation.
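For readers who want to see the two less common measures in concrete form, here is a minimal sketch using their textbook definitions as I understand them. The Quetelet Index is taken as the relative change in the probability of termination given group membership, and the Kulczynski measure as the average of the two conditional probabilities. The four-cell layout, the function names, and the counts are all hypothetical.

```python
def quetelet_index(a, b, c, d):
    """Relative change in the chance of termination given group membership.

    a: in group, terminated      b: in group, retained
    c: not in group, terminated  d: not in group, retained
    """
    p_terminated_given_group = a / (a + b)
    p_terminated = (a + c) / (a + b + c + d)
    return p_terminated_given_group / p_terminated - 1


def kulczynski(a, b, c, d):
    """Average of P(terminated | group) and P(group | terminated).

    The count d (not in group, retained) is deliberately unused, which is
    why the measure is unaffected by a large unaffected majority.
    """
    return 0.5 * (a / (a + b) + a / (a + c))


# Hypothetical counts: 50 of 100 group members terminated, 36 of 100 others.
q = quetelet_index(50, 50, 36, 64)
k_small_pool = kulczynski(50, 50, 36, 64)
# Same terminations, but a far larger pool of retained non-group employees.
k_large_pool = kulczynski(50, 50, 36, 64000)

print(f"Quetelet index: {q:+.3f}")
print(f"Kulczynski: {k_small_pool:.3f} (small pool), {k_large_pool:.3f} (large pool)")
```

A positive Quetelet value means group membership raises the chance of termination relative to the workforce as a whole. The Kulczynski figure is identical for both pools, which illustrates the insensitivity mentioned above.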
Feldman et al.’s approach is conceptually similar to the Quetelet Index, but moves in the opposite direction and asks, in essence, “How accurately can we guess an individual’s membership in a particular protected category from the remainder of the data set, including information about who was selected for termination?” Their solution presents a plausible tool for assessing potential disparate impact, which I’ll consider incorporating into future advice in this area. The authors do, however, acknowledge that their test may become unreliable if the selection pool is much larger than the set of affected employees. Interested readers can turn to their paper, linked above, for the statistical techniques, but of greater practical importance for managers and human resource professionals is the authors’ application of their statistical measure.
In short, the authors suggest we can use their formula to “certify” a particular pattern of employment selections as free of unlawful disparate impact, and they recommend an automated algorithm to “remove” any disparate impact from the selection pattern while preserving, to the extent possible, the ranking of employees otherwise evident from the data set. In essence, the authors seem to be proposing a rescoring scheme that preserves each employee’s percentile ranking within their own protected category, and then maps those different rankings onto a single curve that would be used to create the selection pattern. Although I appreciate the attraction of a streamlined, impartial, and automated process, such an approach seems misguided. Under U.S. law, selection quotas or criteria based on race, sex, age, or other protected categories are generally unlawful, and I fail to see how an algorithmic manner of “repairing” a pattern of termination selections that arguably had a disparate impact based on race – in other words, smoothing out the disparate impact – by subjecting employees to different performance standards according to their protected categories could be regarded as anything other than a discriminatory employment decision. Appellate courts, including the Second Circuit in Briscoe v. City of New Haven, have not considered the authors’ proposal, but have concluded in other contexts that employers may make certain adjustments only when they have a strong basis in evidence to conclude that they would otherwise face liability on a disparate impact theory. The results of a single statistical screening, however, are probably insufficient to establish such liability, and the authors’ approach seems to encourage the reliance on supposedly “fair” but unlawful numerical quotas, rather than specific business needs, as the basis for a particular selection pattern. 
By following such an approach, employers may forget the original goal of the exercise, which was to reduce their litigation risks.
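To make the concern concrete, here is a simplified sketch of the kind of rescoring described above. It is my own illustration and not the authors' actual algorithm. Each employee keeps their percentile rank within their own group, and the new score is the median, across groups, of the score found at that percentile. The function name, group labels, and scores are hypothetical.

```python
from statistics import median


def rescore_by_group_percentile(scores_by_group):
    """Map each employee's within-group percentile onto one shared score curve."""
    sorted_scores = {g: sorted(s) for g, s in scores_by_group.items()}

    def score_at_percentile(group, p):
        values = sorted_scores[group]
        return values[min(int(p * len(values)), len(values) - 1)]

    rescored = {}
    for group, scores in scores_by_group.items():
        # Employee positions ordered from lowest to highest raw score.
        order = sorted(range(len(scores)), key=lambda i: scores[i])
        new_scores = [0.0] * len(scores)
        for rank, i in enumerate(order):
            p = (rank + 0.5) / len(scores)  # percentile within own group
            new_scores[i] = median(score_at_percentile(g, p) for g in sorted_scores)
        rescored[group] = new_scores
    return rescored


# Hypothetical performance scores for two groups.
raw = {"A": [60, 70, 80, 90], "B": [50, 60, 70, 80]}
rescored = rescore_by_group_percentile(raw)

print(rescored["A"])  # the employee with a raw 60 in group A
print(rescored["B"])  # the employee with a raw 60 in group B
```

In this toy data an employee with a raw score of 60 is rescored to 55 in group A but to 65 in group B. That is exactly the feature that troubles me: identical performance is treated differently depending on protected category.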
Most of the disparate impact analyses of workforce reductions that I’ve conducted or reviewed for clients have presented some statistical snarl that made it hard to conclude that a finding of disparate impact based on statistical evidence alone was unlikely. (In fact, if the selection too closely fits the average breakdown predicted in a perfectly non-discriminatory, lottery-style selection, I might become concerned about unlawful quotas.) Rather than simply prompting a warning to the client that they’re incurring risks, which they already know, these statistical concerns present an opportunity to look deeper into the underlying business decisions creating the disparate impact, because those business concerns may be entirely benign. Especially memorable to me is one case where the client and I determined that a seemingly pernicious disparate impact against women of a particular ethnic group could be traced to the shutdown of a particular assembly line manufacturing delicate electronics, where only employees with a certain diminutive hand size could perform the intricate soldering that was needed; in other words, the composition of the department didn’t resemble the company as a whole for understandable business reasons. I’ve found in many cases that a proper investigation of these details allows the employer to formulate a sound defense to a disparate impact claim, if one should ever be raised, so that it can reasonably decide to proceed with the workforce reduction using the business-based selection criteria it thought most appropriate in the first place. In other cases, of course, we dig deeper and find that the selection criteria are not as reliable as the client originally assumed, which may present an opportunity to substitute more appropriate business criteria.
As a result, taking action to “remove” a possible disparate impact without identifying the selection criteria responsible for the statistical disparity may not only create a potential discrimination claim where none existed before, but may also fail to identify and correct arguably discriminatory selection criteria that contributed to the original disparity.
If they have to implement a layoff, employers want to make sure they’re not exposing themselves to liability and risk of further loss in the process, and it’s understandable that they’d like to rely on an objective statistical formula to protect themselves against liability. But the relevant case law in this area suggests that, while Feldman et al.’s statistical measures may be helpful to a point, employers shouldn’t rely on them to the exclusion of other methods and, more importantly, shouldn’t unthinkingly treat the numbers as an end in themselves. At best, statistical analysis should be treated as one tool among many to help companies exercise sound judgments in their business decisions.