binscatter is inspired by Stata's binscatter, described fully by Michael Stepner here. You can use it in essentially the same way you use Matplotlib functions like plot and scatter. A more extensive description of this package is here.

Getting started

  1. Copy and paste: Binscatter's meaningful code consists of just one file. You can copy binscatter/binscatter.py into the directory that holds the rest of your code.

  2. Install via pip: To make it easier to use binscatter in multiple projects and directories, open a terminal and

Usage

import binscatter
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

# Create fake data
n_obs = 1000
data = pd.DataFrame({'experience': np.random.poisson(4, n_obs) + 1})
data['tenure'] = data['experience'] + np.random.normal(0, 1, n_obs)
data['wage'] = data['experience'] + data['tenure'] + np.random.normal(0, 1, n_obs)
fig, axes = plt.subplots(2)

# Binned scatter plot of wage vs tenure
axes[0].binscatter(data['wage'], data['tenure'])
axes[0].set_ylabel('Wage')
axes[0].set_xlabel('Tenure')

# Binned scatter plot that partials out the effect of experience
axes[1].binscatter(data['wage'], data['tenure'], controls=data['experience'])
axes[1].set_xlabel('Tenure (residualized)')
axes[1].set_ylabel('Wage (residualized, recentered)')
plt.show()
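
For readers curious about what a binned scatter plot computes, here is a minimal sketch of the general idea. It is not this package's implementation. The helper names `binned_means` and `residualize`, the default of 20 bins, and the choice to recenter residuals at the variable's mean are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def binned_means(x, y, n_bins=20):
    """Average x and y within equal-sized quantile bins of x (hypothetical helper)."""
    df = pd.DataFrame({'x': np.asarray(x, dtype=float),
                       'y': np.asarray(y, dtype=float)})
    # Rank first so that tied x values cannot produce duplicate bin edges.
    bins = pd.qcut(df['x'].rank(method='first'), n_bins, labels=False).rename('bin')
    return df.groupby(bins)[['x', 'y']].mean()


def residualize(v, controls):
    """Residuals from regressing v on controls plus an intercept,
    recentered at the mean of v (hypothetical helper)."""
    v = np.asarray(v, dtype=float)
    c = np.asarray(controls, dtype=float).reshape(len(v), -1)
    design = np.column_stack([np.ones(len(v)), c])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef + v.mean()


# Illustrative data with the same structure as the example above.
rng = np.random.default_rng(0)
n_obs = 1000
experience = rng.poisson(4, n_obs) + 1
tenure = experience + rng.normal(0, 1, n_obs)
wage = experience + tenure + rng.normal(0, 1, n_obs)

# Plain binned means of wage against tenure.
raw_points = binned_means(tenure, wage)

# Partial out experience from both variables, then bin the residuals.
controlled_points = binned_means(residualize(tenure, experience),
                                 residualize(wage, experience))
```

Each row of `raw_points` or `controlled_points` is one dot in the corresponding binned scatter plot. Residualizing both variables on the controls before binning is what "partialling out" refers to in this sketch.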