Let's look at a few convenient methods for dealing with missing data in pandas:
import numpy as np
import pandas as pd
The keys of the dictionary become the column names. Use np.nan to mark a missing (null) value.
df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})
df
Use dropna() to remove every ROW that contains at least one null/missing value. It returns a new DataFrame and leaves df unchanged.
df.dropna()
Use dropna(axis=1) to remove every COLUMN that contains at least one null/missing value.
df.dropna(axis=1)
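As a quick check of the two calls above, the following sketch rebuilds the same DataFrame and prints which rows and columns survive. The variable names rows_kept and cols_kept are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

rows_kept = df.dropna()        # only row 0 has no missing values
cols_kept = df.dropna(axis=1)  # only column C has no missing values

print(rows_kept.index.tolist())    # [0]
print(cols_kept.columns.tolist())  # ['C']
print(df.shape)                    # (3, 3): the original df is untouched
```

Because dropna() returns a new object, assign the result to a name if you want to keep it.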
Use the thresh argument to set a minimum number of non-NA values. A row is kept if its number of non-NA values is >= thresh.
df.dropna(thresh=2)
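To see why thresh=2 keeps the rows it does, it can help to count the non-NA values per row first. This is a small sketch on the same DataFrame; non_na_per_row, kept and same are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

# Count the non-missing values in each row
non_na_per_row = df.notna().sum(axis=1)
print(non_na_per_row.tolist())  # [3, 2, 1]

# Rows with at least 2 non-NA values are kept, so row 2 is dropped
kept = df.dropna(thresh=2)
print(kept.index.tolist())      # [0, 1]

# The same selection written as an explicit boolean filter
same = df[non_na_per_row >= 2]
```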
Missing values are displayed as NaN. We can replace them with fillna()
df.fillna(value='FILL VALUE')
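A string fill value is handy for display, but columns A and B then hold a mix of numbers and strings. If the columns should stay numeric, one option is to pass a dict that maps column names to fill values. The sketch below shows both; the fill values 0 and -1 are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

# One fill value for every missing cell
filled = df.fillna(value='FILL VALUE')
print(filled.loc[2, 'A'])  # FILL VALUE

# A different (numeric) fill value per column
per_column = df.fillna(value={'A': 0, 'B': -1})
print(per_column['A'].tolist())  # [1.0, 2.0, 0.0]
print(per_column['B'].tolist())  # [5.0, -1.0, -1.0]
```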
Set the fill value to the mean of the column. mean() skips NaN by default, so the mean of A here is (1 + 2) / 2 = 1.5
df['A'].fillna(value=df['A'].mean())
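The call above returns a new Series and does not modify df. The sketch below checks that, and then shows one way to fill every column with its own mean in a single call by passing df.mean() as the fill value. The names filled_a and df_filled are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

# Fill column A with its mean; the result is a new Series
filled_a = df['A'].fillna(value=df['A'].mean())
print(filled_a.tolist())           # [1.0, 2.0, 1.5]
print(int(df['A'].isna().sum()))   # 1: df itself still has the NaN

# df.mean() gives one mean per column; fillna matches them by column name
df_filled = df.fillna(value=df.mean())
print(df_filled['B'].tolist())     # [5.0, 5.0, 5.0]
```

To keep the result, assign it back, for example df['A'] = df['A'].fillna(value=df['A'].mean()).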