Outdpik: The fundamental toolkit for outlier search and visualization
Note
Usage in programming languages
This package is designed to be used only in Python.
General Parameters
First, instantiate the outdpik class: outdpik = outdpik(). It requires no arguments.
Searching for outliers:
- def outliers(df, columns, method):
It returns a dictionary with the selected columns and the outliers found in each.
- Parameters
df (pandas.DataFrame, pandas.Series, numpy.array or list, optional(default = None)) – The data set to explore.
columns (list or string, optional(default = "all")) – Selected columns. "all" returns the outliers for all the numeric columns.
method (string, optional(default = "all")) – Method to use for the outlier search. "iqr", "zscore" and "all" are available. "all" returns the outliers found by every method.
- Returns
Dictionary of outliers
- Return type
dict
Warning
The columns selected must be numeric.
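As a rough illustration of what the "iqr" and "zscore" options conventionally compute, here is a minimal standard-library sketch. It is not outdpik's implementation: the function names, the dict-of-lists input, the nested return shape, and the thresholds (1.5 × IQR, |z| > 3) are all assumptions chosen for the example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed convention)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Values whose absolute z-score exceeds the threshold (3.0 is assumed)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # constant column: nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]


# Hypothetical method registry mirroring the documented option names.
METHODS = {"iqr": iqr_outliers, "zscore": zscore_outliers}


def find_outliers(table, columns="all", method="all"):
    """Return {column: {method: [outliers]}} for a dict of numeric lists."""
    if columns == "all":
        columns = list(table)
    elif isinstance(columns, str):
        columns = [columns]
    names = list(METHODS) if method == "all" else [method]
    return {col: {name: METHODS[name](table[col]) for name in names}
            for col in columns}


# Made-up sample data: one obvious outlier in "age", none in "height".
sample = {
    "age": [10, 12, 11, 13, 12, 11, 10, 12, 11, 12, 13, 10, 11, 12, 95],
    "height": [170, 172, 171, 169, 173],
}
result = find_outliers(sample)
print(result)
```

Note that the z-score rule only flags a value when the sample is large enough: with very few observations, no single point can reach |z| > 3, which is one reason to compare both methods.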
Plot outliers:
- def outliers_plot(df, columns, method, size, palette):
It returns a strip plot with the outliers marked in a different color.
- Parameters
df (pandas.DataFrame, pandas.Series, numpy.array or list, optional(default = None)) – The data set to explore.
columns (list or string, optional(default = None)) – Selected column. Only one numeric column can be selected.
method (string, optional(default = "all")) – Method to use for the outlier search. "iqr", "zscore" and "all" are available. "all" marks the outliers found by every method.
size (list, optional(default = [5, 7])) – Size of the plot.
palette (tuple, optional(default = ((133/255, 202/255, 194/255), (38/255, 70/255, 83/255)))) – Color palette to use. It must be a tuple of colors, each given as a tuple of 3 elements, as in the default.
- Returns
Strip plot of the selected column
- Return type
plt.figure
Warning
The selected column must be numeric.
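The default palette is written as RGB channels divided by 255. A small sketch of building a palette in that same shape from 0-255 values follows; the helper name rgb255_to_unit is hypothetical and not part of outdpik, and which color is applied to outliers is not specified here.

```python
def rgb255_to_unit(color):
    """Convert an (R, G, B) tuple of 0-255 values to 0-1 floats."""
    if len(color) != 3:
        raise ValueError("each color needs exactly 3 components")
    return tuple(channel / 255 for channel in color)


# Rebuilds the documented default: a tuple of two 3-element color tuples.
palette = (
    rgb255_to_unit((133, 202, 194)),
    rgb255_to_unit((38, 70, 83)),
)
```

A tuple built this way has the same structure as the documented default, so it could presumably be passed as the palette argument.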