This is a small Python function that shows the variance, percentage of missing values, and number of unique values for each feature in a dataset!
Installation instructions:
pip install quickpt
from quickpt.quickpt import quickpt
quickpt(df, graph=None, encode=True, width=800, height=400)
or, in a notebook (Jupyter or Colab):
!pip3 install quickpt
from quickpt.quickpt import quickpt
quickpt(df, graph=None, encode=True, width=800, height=400)
quickpt library
Creates a DataFrame showing the missing values, total unique values, data type, and variance of each feature.
If the graph argument is passed, a bar chart of the specified statistic is displayed.
Parameters
graph : var, null, uniq (default is None)
encode : True, False (default is True)
width : int (default is 800)
height : int (default is 400)
Description of Parameters
- var = variance
- null = fraction of missing values (a decimal between 0 and 1)
- uniq = number of unique values
- encode –> True = Uses LabelEncoder to encode categorical variables so they are included in the summary statistics
- encode –> False = Shows the DataFrame/visualization for only the original numeric variables of the input data
- width = sets the graph width
- height = sets the graph height
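The three statistics above can be reproduced by hand with pandas, which may help when reading the output. The following is a minimal sketch, not quickpt's actual implementation: the helper name summarize and the output column names are hypothetical, and it assumes sample variance (pandas' default, ddof=1), which this README does not specify.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-column dtype, variance, null fraction, unique count."""
    numeric = df.select_dtypes(include="number")  # numeric columns only
    return pd.DataFrame(
        {
            "dtype": numeric.dtypes.astype(str),
            "var": numeric.var(),           # sample variance (ddof=1), NaNs skipped
            "null": numeric.isna().mean(),  # fraction missing, e.g. 0.25 means 25%
            "uniq": numeric.nunique(),      # distinct non-missing values
        }
    )


df = pd.DataFrame({"age": [20.0, 30.0, None, 40.0], "score": [1, 1, 2, 2]})
summary = summarize(df)
print(summary)
```

Each row of the result corresponds to one feature of the input, so a column with a high null value or a variance near zero stands out at a glance.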
Use
- Intended for preprocessed datasets that have only numerical features
- If the data has categorical features, set encode=True to temporarily LabelEncode them to numeric :D
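To illustrate what temporarily encoding categorical features means, here is a minimal sketch. It swaps in pandas category codes for scikit-learn's LabelEncoder so that the example depends only on pandas; the helper name encode_categoricals is hypothetical, and this is not quickpt's own code. It works on a copy, so the original data is left untouched.

```python
import pandas as pd


def encode_categoricals(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with non-numeric columns as integer codes."""
    encoded = df.copy()  # leave the caller's DataFrame unchanged
    for col in encoded.select_dtypes(exclude="number").columns:
        # Categories are sorted, so string codes follow alphabetical order;
        # missing values are encoded as -1.
        encoded[col] = encoded[col].astype("category").cat.codes
    return encoded


df = pd.DataFrame({"city": ["Oslo", "Lima", "Oslo"], "temp": [3.5, 22.0, 4.0]})
encoded = encode_categoricals(df)
print(encoded["city"].tolist())  # [1, 0, 1]
print(df["city"].tolist())       # original strings are unchanged
```

Once every column is numeric, statistics such as variance can be computed for the formerly categorical features too, though the variance of arbitrary integer codes should be read with care.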
Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request