dashi package
Subpackages
Submodules
dashi.constants module
dashi.utils module
Utils functions
- format_data(input_dataframe, *, date_column_name=None, source_column_name=None, date_format='%y/%m/%d', verbose=False, numerical_column_names=None, categorical_column_names=None)
- Converts the date column to Python date format, casts the listed columns to their dtypes, and drops rows with missing date or source values.
- Parameters:
- input_dataframe (pd.DataFrame) – Pandas dataframe object with at least one column of dates. 
- date_column_name (Optional[str]) – The name of the column containing the dates. If you are not performing a temporal analysis, set this parameter to None. 
- source_column_name (Optional[str]) – The name of the column containing the source information. If you are not performing a multi-source analysis, set this parameter to None. 
- date_format (Optional[str]) – Format string describing how the dates are written. Defaults to ‘%y/%m/%d’. 
- verbose (bool) – Whether to display additional information during the process. Defaults to False. 
- numerical_column_names (Optional[List[str]]) – A list containing all the numerical column names in the dataset. If this parameter is None, the variable types must be managed by the user. 
- categorical_column_names (Optional[List[str]]) – A list containing all the categorical column names in the dataset. If this parameter is None, the variable types must be managed by the user. 
 
- Returns:
- A pandas.DataFrame with each column cast to its correct dtype and any rows containing missing values in the date or source fields dropped. 
- Return type:
- pd.DataFrame
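
Example
The parameters and return value described above can be illustrated with a small pandas-only sketch. This is not the dashi implementation: format_data_sketch is a hypothetical helper, and the sample column names (date, site, age, sex) are assumptions made for the example. It only approximates the documented behaviour: parse the date column with date_format, drop rows with a missing date or source, and cast the listed numerical and categorical columns.

```python
import pandas as pd


def format_data_sketch(df, *, date_column_name=None, source_column_name=None,
                       date_format="%y/%m/%d",
                       numerical_column_names=None,
                       categorical_column_names=None):
    """Hypothetical approximation of the behaviour documented for format_data."""
    out = df.copy()

    if date_column_name is not None:
        # Values that do not match date_format become NaT (assumption:
        # unparseable dates are treated as missing).
        out[date_column_name] = pd.to_datetime(
            out[date_column_name], format=date_format, errors="coerce"
        )

    # Drop rows with missing values in the date or source columns only.
    key_columns = [c for c in (date_column_name, source_column_name)
                   if c is not None]
    if key_columns:
        out = out.dropna(subset=key_columns)

    # Cast the columns the caller declared as numerical or categorical.
    for name in numerical_column_names or []:
        out[name] = pd.to_numeric(out[name], errors="coerce")
    for name in categorical_column_names or []:
        out[name] = out[name].astype("category")

    return out.reset_index(drop=True)


# Illustrative data: one row lacks a date, another lacks a source.
raw = pd.DataFrame({
    "date": ["21/03/15", "21/04/02", None, "21/05/20"],
    "site": ["A", None, "B", "B"],
    "age": ["34", "51", "47", "29"],
    "sex": ["F", "M", "F", "M"],
})

clean = format_data_sketch(
    raw,
    date_column_name="date",
    source_column_name="site",
    numerical_column_names=["age"],
    categorical_column_names=["sex"],
)
print(clean.dtypes)
print(len(clean), "rows kept")
```

With dashi installed, the corresponding call would presumably be format_data(raw, date_column_name="date", source_column_name="site", numerical_column_names=["age"], categorical_column_names=["sex"]), following the signature listed above. Its exact handling of unparseable dates may differ from this sketch.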