Outdpik: The fundamental toolkit for outlier search and visualization

Note

Usage in programming languages

This package is designed to be used only in Python.

General Parameters

First, instantiate the outdpik class: outdpik = outdpik(). It requires no arguments.

Searching for outliers:

def outliers(df, columns, method):

It returns a dictionary with the selected columns and their outliers.

Parameters
  • df (pandas.DataFrame, pandas.Series, numpy.array or list, optional (default = None)) – The data set to explore.

  • columns (list or string, optional (default = "all")) – Selected columns. "all" returns the outliers for all the numeric columns.

  • method (string, optional (default = "all")) – Method used to search for outliers. Available values are "iqr", "zscore" and "all". "all" returns the outliers found by every method.

Returns

Dictionary of outliers

Return type

dict

Warning

The selected columns must be numeric.
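
To make the "iqr" and "zscore" options more concrete, here is a minimal sketch of both rules in plain Python. It is not outdpik's implementation. The function names (iqr_outliers, zscore_outliers, find_outliers), the 1.5 × IQR fence, the z-score threshold of 3, the quartile method and the shape of the returned dictionary are all assumptions chosen for illustration; the package may use different cut-offs and a different output layout.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds threshold (3.0 is an assumption)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # constant data has no outliers under this rule
    return [v for v in values if abs(v - mean) / std > threshold]


def find_outliers(columns, method="all"):
    """Hypothetical helper: map each column name to its outliers per method.

    columns is a dict of column name -> list of numbers.
    """
    detectors = {"iqr": iqr_outliers, "zscore": zscore_outliers}
    if method == "all":
        selected = detectors
    elif method in detectors:
        selected = {method: detectors[method]}
    else:
        raise ValueError(f"unknown method: {method!r}")
    return {
        name: {label: detect(values) for label, detect in selected.items()}
        for name, values in columns.items()
    }


data = {"price": [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 95]}
print(find_outliers(data, method="all"))
```

In this sample, both rules flag only the value 95. The two rules can disagree on other data, which is one reason to compare them with method = "all".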

Plotting outliers:

def outliers_plot(df, columns, method, size, palette):

It returns a strip plot with the outliers marked in a different color.

Parameters
  • df (pandas.DataFrame, pandas.Series, numpy.array or list, optional (default = None)) – The data set to explore.

  • columns (list or string, optional (default = None)) – Selected column. Only one column can be selected, and it must be numeric.

  • method (string, optional (default = "all")) – Method used to search for outliers. Available values are "iqr", "zscore" and "all". "all" uses every method.

  • size (list, optional (default = [5, 7])) – Size of the plot.

  • palette (tuple, optional (default = ((133/255, 202/255, 194/255), (38/255, 70/255, 83/255)))) – Color palette to use. It must be a tuple of color tuples with 3 elements each.

Returns

Strip plot of the selected column

Return type

plt.figure

Warning

The selected column must be numeric.
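
The default palette is written as fractions of 255, which suggests that each color is a triple of values between 0 and 1 (an inference from the default, not something the documentation states). Under that assumption, a small helper can build a palette from ordinary 0 to 255 color values. The name rgb255_to_unit is hypothetical and not part of outdpik.

```python
def rgb255_to_unit(rgb):
    """Convert a triple of 0-255 channel values to floats in the 0-1 range."""
    if len(rgb) != 3 or any(not 0 <= channel <= 255 for channel in rgb):
        raise ValueError("expected three channel values between 0 and 255")
    return tuple(channel / 255 for channel in rgb)


# Rebuilds the documented default palette from its 0-255 values.
palette = (
    rgb255_to_unit((133, 202, 194)),
    rgb255_to_unit((38, 70, 83)),
)
print(palette)
```

The resulting tuple has the same shape as the documented default and could be passed as the palette argument, provided the assumption about the 0 to 1 range holds.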