bblocks.dataframe_tools.add
Functions
- Validate parameters to use in an add column function type
- Add population column to a dataframe
- Add poverty headcount column to a dataframe
- Add population density column to a dataframe
- Add GDP column to a dataframe
- Add Government Expenditure column to a dataframe
- Add value as share of GDP column to a dataframe
- Add population share column to a dataframe
- Add value as share of Government Expenditure column to a dataframe
- Add an income levels column to a dataframe
- Add short names column to a dataframe
- Add ISO3 column to a dataframe
- Add median observation column to a dataframe
- Add flourish geometries column to a dataframe
Module Contents
- bblocks.dataframe_tools.add.__validate_add_column_params(*, df: pandas.DataFrame, id_column: str, id_type: str | None, date_column: str | None) → tuple
Validate parameters to use in an add column function type
- bblocks.dataframe_tools.add.add_population_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'population', update_data: bool = False) → pandas.DataFrame
Add population column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the population for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent population data from the World Bank is used.
target_column – the column where the population data will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the population data.
- Return type:
DataFrame
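The effect of date_column can be illustrated with plain pandas. This is only a sketch of the behaviour described above, not the library's implementation: the lookup table, its made-up figures, the helper name add_population_sketch, and the assumption that the date column holds datetimes are all hypothetical.

```python
import pandas as pd

# Hypothetical lookup table standing in for the World Bank population data
# (the figures are invented for illustration).
population = pd.DataFrame(
    {
        "iso_code": ["FRA", "FRA", "KEN", "KEN"],
        "year": [2020, 2021, 2020, 2021],
        "population": [67_000_000, 67_500_000, 52_000_000, 53_000_000],
    }
)

df = pd.DataFrame(
    {
        "iso_code": ["FRA", "KEN", "KEN"],
        "date": pd.to_datetime(["2020-06-30", "2021-01-01", "1900-01-01"]),
        "value": [1.0, 2.0, 3.0],
    }
)


def add_population_sketch(df, lookup, id_column, date_column=None,
                          target_column="population"):
    """Attach a population column from `lookup` (illustrative only)."""
    out = df.copy()
    if date_column is None:
        # No date column: use the most recent year available per country.
        latest = lookup.sort_values("year").groupby("iso_code")["population"].last()
        out[target_column] = out[id_column].map(latest)
        return out
    # Date column given: match each row on its country and the year of its date.
    keys = pd.DataFrame(
        {"iso_code": out[id_column].to_numpy(),
         "year": out[date_column].dt.year.to_numpy()}
    )
    merged = keys.merge(lookup, on=["iso_code", "year"], how="left")
    out[target_column] = merged["population"].to_numpy()
    return out


with_year = add_population_sketch(df, population, "iso_code", date_column="date")
latest_only = add_population_sketch(df, population, "iso_code")
```

In with_year the 1900 row has no match in the lookup table, so its population is missing. In latest_only every row receives the 2021 figure for its country.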
- bblocks.dataframe_tools.add.add_poverty_ratio_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'poverty_ratio', update_data: bool = False) → pandas.DataFrame
Add poverty headcount column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the poverty ratio for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent data is used.
target_column – the column where the poverty ratio data will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the poverty data.
- Return type:
DataFrame
- bblocks.dataframe_tools.add.add_population_density_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'population_density', update_data: bool = False) → pandas.DataFrame
Add population density column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the population density for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent data is used.
target_column – the column where the population density data will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the population density data.
- Return type:
DataFrame
- bblocks.dataframe_tools.add.add_gdp_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'gdp', usd: bool = True, include_estimates: bool = False, update_data: bool = False) → pandas.DataFrame
Add GDP column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the GDP for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent data is used.
include_estimates – Whether to include years for which the WEO data is labelled as estimates.
usd – Whether to add the data in US dollars (True) or Local Currency Units (False).
target_column – the column where the GDP data will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the GDP data from the IMF World Economic Outlook.
- Return type:
DataFrame
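A rough illustration of what include_estimates controls, using plain pandas. The toy table, its "estimate" flag column, its values, and the helper latest_gdp are hypothetical stand-ins for the WEO data; the library's internals may differ.

```python
import pandas as pd

# Hypothetical stand-in for WEO data: one country, with the last year
# flagged as an estimate (values are invented).
weo = pd.DataFrame(
    {
        "iso_code": ["FRA", "FRA", "FRA"],
        "year": [2021, 2022, 2023],
        "gdp": [100.0, 110.0, 120.0],
        "estimate": [False, False, True],
    }
)


def latest_gdp(weo, include_estimates=False):
    """Return the most recent GDP per country, optionally skipping estimates."""
    data = weo if include_estimates else weo[~weo["estimate"]]
    return data.sort_values("year").groupby("iso_code")["gdp"].last()


actual_only = latest_gdp(weo)            # most recent non-estimate year (2022)
with_estimates = latest_gdp(weo, True)   # most recent year overall (2023)
```

With estimates excluded, the "most recent" value comes from the last year that is not flagged as an estimate, which is why the two calls can return different figures for the same country.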
- bblocks.dataframe_tools.add.add_gov_expenditure_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'gov_exp', usd: bool = True, include_estimates: bool = False, update_data: bool = False) → pandas.DataFrame
Add Government Expenditure column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the expenditure for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent data is used.
include_estimates – Whether to include years for which the WEO data is labelled as estimates.
usd – Whether to add the data in US dollars (True) or Local Currency Units (False).
target_column – the column where the expenditure data will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the expenditure data from the IMF World Economic Outlook.
- Return type:
DataFrame
Add value as share of GDP column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the GDP for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent data is used.
value_column – the column containing the value to be converted to a share of GDP.
decimals – the number of decimals to use in the returned column.
include_estimates – Whether to include years for which the WEO data is labelled as estimates.
usd – Whether to use the data in US dollars (True) or Local Currency Units (False).
target_column – the column where the value as a share of GDP will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the data as a share of GDP, using the IMF World Economic Outlook.
- Return type:
DataFrame
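The share calculation and the role of decimals can be sketched with plain pandas. This assumes the share is expressed as a percentage (value divided by GDP, times 100), which the documentation above does not state; the helper share_of, the "gdp_share" column name, and the numbers are hypothetical.

```python
import pandas as pd

# Toy frame that already carries a GDP column (invented values).
df = pd.DataFrame(
    {
        "iso_code": ["AAA", "BBB"],
        "value": [5.0, 1.0],
        "gdp": [200.0, 3.0],
    }
)


def share_of(df, value_column, denominator_column, target_column, decimals=2):
    """Express value_column as a percentage of denominator_column, rounded."""
    out = df.copy()
    share = 100 * out[value_column] / out[denominator_column]
    out[target_column] = share.round(decimals)
    return out


result = share_of(df, "value", "gdp", "gdp_share", decimals=2)
```

The value and the GDP figure need to be in the same units (both in US dollars or both in Local Currency Units) for the share to be meaningful, which is presumably why the usd flag is exposed here as well.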
Add population share column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the population for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent population data from the World Bank is used.
value_column – the column containing the value to be used in the calculation.
target_column – the column where the population share data will be stored.
decimals – the number of decimals to use in the returned column.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing value as share of population.
- Return type:
DataFrame
Add value as share of Government Expenditure column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
date_column – Optionally, a date column can be specified. If so, the expenditure data for that year will be used. If no value is available for that year, it will be missing in the returned column as well. If the date isn’t specified, the most recent data is used.
value_column – the column containing the value to be converted to a share of expenditure.
include_estimates – Whether to include years for which the WEO data is labelled as estimates.
usd – Whether to use the data in US dollars (True) or Local Currency Units (False).
target_column – the column where the value as a share of expenditure will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the data as a share of expenditure, using the IMF World Economic Outlook.
- Return type:
DataFrame
- bblocks.dataframe_tools.add.add_income_level_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'income_level', update_data: bool = False) → pandas.DataFrame
Add an income levels column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
target_column – the column where the income level data will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the income level data.
- Return type:
DataFrame
- bblocks.dataframe_tools.add.add_short_names_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'name_short') → pandas.DataFrame
Add short names column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.
target_column – the column where the short names will be stored.
- Returns:
the original DataFrame with a new column containing short names.
- Return type:
DataFrame
- bblocks.dataframe_tools.add.add_iso_codes_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'iso_code') → pandas.DataFrame
Add ISO3 column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.
target_column – the column where the ISO3 codes will be stored.
- Returns:
the original DataFrame with a new column containing ISO3 codes.
- Return type:
DataFrame
- bblocks.dataframe_tools.add.add_median_observation(df: pandas.DataFrame, group_by: str | list = None, value_columns: str | list[str] = 'value', append: bool = True, group_name: str | None = None) → pandas.DataFrame
Add median observation column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
group_by – the column(s) by which to group the data to calculate the median.
value_columns – the column(s) containing the values to be used for the median.
append – if True, the median observation will be appended to the dataframe. If False, the median observation will be stored in a new column.
group_name – the name of the group to be used in the id_column for the median observations, or as the name of the column containing the median.
- Returns:
the original dataframe with added rows for the median (if append is True) or a new column containing the median observations (if append is False).
- Return type:
DataFrame
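The two modes selected by append can be pictured with plain pandas. This is a sketch of the described outcome rather than the library's code; the column names, the "Median" label, and the data are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "year": [2020, 2020, 2020, 2021, 2021],
        "country": ["A", "B", "C", "A", "B"],
        "value": [1.0, 3.0, 10.0, 2.0, 6.0],
    }
)

# Median of "value" within each group (here: each year).
medians = df.groupby("year", as_index=False)["value"].median()

# Append-style result: one extra row per group, labelled with a group name.
median_rows = medians.assign(country="Median")
appended = pd.concat([df, median_rows], ignore_index=True)

# Column-style result: every row carries the median of its own group.
as_column = df.assign(median=df.groupby("year")["value"].transform("median"))
```

The appended form suits charts where the median should appear as one more series, while the column form is handier for comparing each observation against its group median.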
- bblocks.dataframe_tools.add.add_flourish_geometries(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'geometry') → pandas.DataFrame
Add flourish geometries column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer the type using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.
target_column – the column where the flourish geometries will be stored.
- Returns:
the original DataFrame with a new column containing the flourish geometries.
- Return type:
DataFrame