dashi package
Subpackages
Submodules
dashi.constants module
dashi.utils module
Utils functions
- format_data(input_dataframe, *, date_column_name=None, source_column_name=None, date_format='%y/%m/%d', verbose=False, numerical_column_names=None, categorical_column_names=None)
Transforms the date column into Python date format and casts the listed numerical and categorical columns to their dtypes.
- Parameters:
input_dataframe (pd.DataFrame) – A pandas DataFrame with at least one column of dates.
date_column_name (Optional[str]) – The name of the column containing the dates. If you are not performing a temporal analysis, set this parameter to None.
source_column_name (Optional[str]) – The name of the column containing the source information. If you are not performing a multi-source analysis, set this parameter to None.
date_format (Optional[str]) – The format string describing how the dates are written. Defaults to ‘%y/%m/%d’.
verbose (bool) – Whether to display additional information during the process. Defaults to False.
numerical_column_names (Optional[List[str]]) – A list containing all the numerical column names in the dataset. If this parameter is None, the variable types must be managed by the user.
categorical_column_names (Optional[List[str]]) – A list containing all the categorical column names in the dataset. If this parameter is None, the variable types must be managed by the user.
- Returns:
A pandas.DataFrame with each column cast to its correct dtype and any rows containing missing values in the date or source fields dropped.
- Return type:
pd.DataFrame
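The default date_format can be easy to misread, so the following minimal sketch shows what ‘%y/%m/%d’ accepts. It assumes the format string uses Python's standard strftime/strptime codes. How dashi parses dates internally is not stated here, and parse_date and matches_format are hypothetical helpers written for this illustration, not part of dashi.

```python
from datetime import date, datetime

# Default value of the date_format parameter, copied from the signature above.
DEFAULT_DATE_FORMAT = "%y/%m/%d"


def parse_date(text: str, date_format: str = DEFAULT_DATE_FORMAT) -> date:
    """Hypothetical helper (not part of dashi): parse one date string."""
    return datetime.strptime(text, date_format).date()


def matches_format(text: str, date_format: str = DEFAULT_DATE_FORMAT) -> bool:
    """Hypothetical helper: report whether a string fits the given format."""
    try:
        parse_date(text, date_format)
    except ValueError:
        return False
    return True


# %y is a two-digit year, so "21/03/15" is read as 15 March 2021.
print(parse_date("21/03/15"))
# A four-digit year does not fit the default format...
print(matches_format("2021/03/15"))
# ...but it does fit a format that uses %Y instead of %y.
print(matches_format("2021/03/15", "%Y/%m/%d"))
```

If the dates in your data carry four-digit years, passing a format such as date_format='%Y/%m/%d' is likely the appropriate setting, under the same assumption about format codes.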