Distribution inequality can be measured via a Lorenz Curve and/or Gini Coefficient. Here's a best attempt at defining both, in simple terms:
- Lorenz Curve - a line chart plotting the cumulative percentage of something (e.g. the wealth in a given country) against the cumulative percentage of the population, ordered from smallest share to largest
- Gini Coefficient - a number from 0 to 1 representing the degree of inequality (0 = Complete equality | 1 = Complete inequality)
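To make the Lorenz Curve definition concrete, here is a minimal sketch that builds the points of the curve using only the standard library. The function name `lorenz_points` and the five incomes are hypothetical, invented purely for illustration.

```python
from itertools import accumulate


def lorenz_points(values):
    """Return (population_share, value_share) pairs for a Lorenz curve.

    Hypothetical helper for illustration; assumes non-negative values
    with a positive total.
    """
    ordered = sorted(values)  # smallest holders first
    total = sum(ordered)
    n = len(ordered)
    # The curve starts at the origin: 0% of people hold 0% of the total.
    points = [(0.0, 0.0)]
    for i, running_total in enumerate(accumulate(ordered), start=1):
        points.append((i / n, running_total / total))
    return points


# Made-up incomes, not real data.
incomes = [10, 20, 30, 40, 100]
for population_share, income_share in lorenz_points(incomes):
    print(f"bottom {population_share:.0%} of people hold {income_share:.0%} of income")
```

Under perfect equality every point would sit on the diagonal (the bottom 40% would hold 40%); the further the points sag below that diagonal, the more unequal the distribution.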
What is this used for? A common area is wealth distribution. For example, one could use it to put a statistical lens over the popular US political slogan, "We are the 99%". Knowing the Gini Coefficient for a country's distribution of wealth gives a deeper understanding of the topic, as well as a measurement that can be compared against other geographic locations.
The Lorenz Curve and Gini Coefficient can be used to represent the level of inequality in many areas beyond wealth distribution as well.
For example, they could be used to measure:
- revenue concentration by customers,
- usage of datasets in a BI Tool, or
- usage of features in a Web Application.
General steps for computing the Gini Coefficient:
- Load data of interest – in the example below we're using a dataset containing 25 fictitious annual gross salaries
- Sort data in ascending order
- Store the line of perfect equality in an array
- Store the cumulative % of total income, accumulated over the sorted individual annual incomes, in an array
- Compute the area between the line of perfect equality and the distribution of income, then divide it by the total area under the line of perfect equality to get the Gini Coefficient
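As a compact illustration of the steps above, here is a standard-library-only sketch. It is not the walkthrough's own code: the function name `gini`, the use of the trapezoid rule for the area, and the sample incomes are all assumptions made for this example.

```python
from itertools import accumulate


def gini(values):
    """Approximate the Gini coefficient with the trapezoid rule.

    Hypothetical helper; assumes non-negative values with a positive total.
    """
    ordered = sorted(values)  # sort ascending
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")

    # Line of perfect equality: the bottom i/n of people hold i/n of the total.
    equality = [i / n for i in range(n + 1)]
    # Lorenz curve: cumulative share actually held, starting at 0.
    lorenz = [0.0] + [running / total for running in accumulate(ordered)]

    # Gap between the two lines at each point, then trapezoids of width 1/n.
    gaps = [e - l for e, l in zip(equality, lorenz)]
    area_between = sum((gaps[i] + gaps[i + 1]) / 2 for i in range(n)) / n

    # The area under the line of perfect equality is a triangle of area 1/2.
    return area_between / 0.5


# Made-up incomes, not the dataset used in this post.
print(gini([10, 20, 30, 40, 100]))
print(gini([50, 50, 50, 50]))
```

With a finite sample, the most unequal case (one person holds everything) gives `(n - 1) / n` rather than exactly 1, so small datasets top out slightly below the theoretical maximum.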
Now, let's rock this out in Python: