bblocks
Attributes
- __version__
Classes
- WorldBankData: An object to help download data from the World Bank.
- GHED: An object to extract GHED data.
- WFPData: Class to download and read WFP inflation and insufficient food data.
- WorldEconomicOutlook: World Economic Outlook data.
- Aids: An object to extract data from UNAIDS.
- DebtIDS: Import data from the World Bank's International Debt Statistics database.
Functions
- get_dsa: Extract DSA data from the IMF website.
- add_iso_codes_column: Add an ISO3 column to a dataframe.
- add_income_level_column: Add an income levels column to a dataframe.
- add_short_names_column: Add a short names column to a dataframe.
- clean_number: Clean a string and return it as a float or integer.
- clean_numeric_series: Clean a numeric column in a Pandas DataFrame, or a Pandas Series which is meant to be numeric.
- to_date_column: Convert a Pandas series into a date series.
- convert_id: Take a Pandas series with country IDs and convert them into the desired type.
- date_to_str: Convert a Pandas series into a string series.
- format_number: Format a Pandas numeric series into a formatted string series.
- filter_by_continent: Filter a DataFrame by continent.
- filter_by_un_region: Filter a DataFrame by UN region (for example, "Western Africa" or "Eastern Africa").
- filter_eu_countries: Filter a DataFrame to keep only the current members of the European Union.
- filter_african_countries: Filter a DataFrame to keep only African countries.
- filter_latest_by: Calculate the latest value of one or more columns over a period of time.
Package Contents
- bblocks.__version__ = '1.4.1'
- class bblocks.WorldBankData
Bases:
bblocks.import_tools.common.ImportData
An object to help download data from the World Bank. To use it, create an instance of this class, then call the load_data method to load an indicator. This can be done multiple times. If the data for an indicator has never been downloaded, it is downloaded; if it has been downloaded before, it is loaded from disk. If update_data is set to True when creating the object, the data is refreshed from the World Bank for each indicator. To force a refresh of the data stored on disk, call update_data. To get a DataFrame of the data, call get_data.
- _indicators: dict[str, tuple[pandas.DataFrame, dict]]
- load_data(indicator: str | list[str], start_year: int | None = None, end_year: int | None = None, most_recent_only: bool = False, db: int = 2, **kwargs) WorldBankData
Get an indicator from the World Bank API
- Parameters:
indicator – the code from the World Bank data portal (e.g. “SP.POP.TOTL”)
start_year – The first year to include in the data
end_year – The last year to include in the data
most_recent_only – If True, only get the most recent non-empty value for each country
db – The database to use. By default, use the WDI database (2)
- Returns:
The same object to allow chaining
- update_data(reload_data: bool = True) bblocks.import_tools.common.ImportData
Update the data saved on disk for the different indicators
When called, it goes through each indicator and updates the saved data based on the parameters passed to load_data.
- Returns:
The same object to allow chaining
- get_data(indicators: str | list = 'all', **kwargs) pandas.DataFrame
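The download-or-load-from-disk behaviour described for this class can be illustrated with a small standard-library sketch. It is not the bblocks implementation: load_or_download, fake_download, and the JSON file layout are hypothetical names and choices made only for this example.

```python
import json
import tempfile
from pathlib import Path


def load_or_download(indicator, data_dir, download, update_data=False):
    """Return the data for `indicator`, downloading only when needed."""
    # Hypothetical layout: one JSON file per indicator.
    path = Path(data_dir) / f"{indicator}.json"
    if update_data or not path.exists():
        # Never downloaded, or a refresh was requested: fetch and store.
        records = download(indicator)
        path.write_text(json.dumps(records))
        return records
    # Already on disk: read the stored copy instead of downloading again.
    return json.loads(path.read_text())


calls = []


def fake_download(indicator):
    # Stand-in for a real API request; records every call.
    calls.append(indicator)
    return [{"indicator": indicator, "year": 2020, "value": 1.0}]


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_download("SP.POP.TOTL", tmp, fake_download)
    second = load_or_download("SP.POP.TOTL", tmp, fake_download)
    third = load_or_download("SP.POP.TOTL", tmp, fake_download, update_data=True)
```

In this sketch the second call is served from disk, so the stand-in downloader runs only for the first call and for the forced refresh.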
- class bblocks.GHED
Bases:
bblocks.import_tools.common.ImportData
An object to extract GHED data.
To use, create an instance of the class and call the load_data method. If the data is already downloaded, it is loaded from disk. If not, it is downloaded. If update_data is set to True, the data is downloaded regardless of whether it is already on disk. To force an update, call the update_data method. To get the data, call the get_data method. To get the metadata, call the get_metadata method.
- _metadata: pandas.DataFrame = None
- load_data() bblocks.import_tools.common.ImportData
Load GHED data
- Returns:
The same object to allow chaining
- update_data(reload_data: bool) bblocks.import_tools.common.ImportData
Update GHED data
- Returns:
The same object to allow chaining
- get_metadata() pandas.DataFrame
Get GHED metadata as a pandas dataframe
- Returns:
A pandas dataframe with the metadata
- class bblocks.WFPData
Bases:
bblocks.import_tools.common.ImportData
Class to download and read WFP inflation and insufficient food data
- property available_indicators: KeysView
View the available indicators from WFP
- _country_codes() dict
- load_data(indicator: str | list) None
Load an indicator into the WFPData object
- update_data(reload_data: bool = True) None
Update the data for all the indicators currently loaded
- class bblocks.WorldEconomicOutlook
Bases:
bblocks.import_tools.common.ImportData
World Economic Outlook data
- year: int | None = None
- release: int | None = None
- __post_init__() None
- __repr__() str
- __load_data() None
Load the WEO as a clean DataFrame
- Parameters:
latest_y – optional; pass it only to override the default behaviour of getting the latest release year for the WEO
latest_r – optional; pass it only to override the default behaviour of getting the latest released value
- _check_indicators(indicators: str | list | None = None) None | dict
- load_data(indicator: str | list) bblocks.import_tools.common.ImportData
Loads a specific indicator from the World Economic Outlook data
- update_data(reload_data: bool = True) None
Update the stored WEO data, using the WEO package.
- available_indicators() None
Print the available indicators in the dataset
- get_data(indicators: str | list = 'all', keep_metadata: bool = False) pandas.DataFrame
- class bblocks.Aids
Bases:
bblocks.import_tools.common.ImportData
An object to extract data from UNAIDS.
To use, create an instance of the class. Then load indicators using the load_data method. This can be done multiple times. To return a DataFrame of all indicators available to load, use the available_indicators property. If the data for an indicator has never been downloaded, it is downloaded. If it has been downloaded, it is loaded from disk. If update_data is set to True, the data is downloaded each time an indicator is loaded. You can force an update by calling update_data, and all indicators will be reloaded into the object. You can get a DataFrame by calling get_data, passing the indicator name(s) (or None to return all indicators) and the area grouping(s) (‘all’ by default).
- property available_indicators: pandas.DataFrame
Returns a dataframe of available indicators
- load_data(indicator: str, area_grouping: str = 'all') bblocks.import_tools.common.ImportData
Load an indicator to the object
- Parameters:
indicator (str) – The name of the indicator to load. To see a DataFrame of available indicators, use the available_indicators property.
area_grouping (str) – The grouping to use. Choose from [“country”, “region”, “all”].
- Returns:
The same object to allow chaining
- update_data(reload_data: bool)
Update all loaded indicators saved on the disk
When called, it will go through each loaded indicator/area grouping combination and update the data saved on disk.
- Returns:
The same object to allow chaining
- get_data(indicators: str | list | None = None, area_grouping: str = 'all') pandas.DataFrame
Get the data as a Pandas DataFrame
- Parameters:
indicators – By default, all indicators are returned in a single DataFrame. If a list of indicators is passed, only those indicators will be returned. A single indicator can be passed as a string as well.
area_grouping (str) – The area grouping to use. Choose from [“country”, “region”, “all”]. Default is “all”.
- Returns:
A Pandas DataFrame with the requested indicator data
- class bblocks.DebtIDS
Bases:
bblocks.import_tools.common.ImportData
Import data from the World Bank’s International Debt Statistics database.
To use this object, first create an instance of it. Then use the load_data method to load indicators. One or more indicators can be loaded at a time, and a starting and end year must be specified.
If the data has not been downloaded before, it will be downloaded from the World Bank API. If the data has been downloaded before, it will be loaded from the local data folder.
To get a DataFrame, use the get_data method. You can get the data for one or more, or for all indicators at once.
To update the data, use the update_data method. This will download the latest data from the World Bank API and overwrite the local data.
To get a list of available indicators, use the get_available_indicators method.
To get a list of available debt service indicators, use the debt_service_indicators method.
To get a list of available debt stocks indicators, use the debt_stocks_indicators method.
- __post_init__()
Set the path to the data folder and create it if it doesn’t exist
- _check_stored_data(indicator: str, start_year: int, end_year: int) str | bool
Check if the data is already stored locally
This also checks if the years requested are inside another file.
- Parameters:
indicator (str) – The indicator to check
start_year (int) – The start year of the data
end_year (int) – The end year of the data
- Returns:
The filename of the data if it exists, or False if the data doesn’t exist
- Return type:
str | bool
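The check described above (is the requested year range already covered by a stored file?) can be sketched as follows. This is an illustration only: the stored-name format "CODE_START-END" and both helper functions are assumptions made for the example, not the naming scheme or code that bblocks actually uses.

```python
def indicator_parameters(name):
    """Split a hypothetical stored name 'CODE_START-END' into its parts."""
    code, years = name.rsplit("_", 1)
    start, end = years.split("-")
    return code, int(start), int(end)


def check_stored_data(indicator, start_year, end_year, stored_names):
    """Return the stored name covering the request, or False if none does."""
    for name in stored_names:
        code, start, end = indicator_parameters(name)
        # The request is covered when its years fall inside the stored range.
        if code == indicator and start <= start_year and end_year <= end:
            return name
    return False


stored = ["DT.DOD.DECT.CD_2000-2020"]
covered = check_stored_data("DT.DOD.DECT.CD", 2005, 2010, stored)
missing = check_stored_data("DT.DOD.DECT.CD", 1995, 2010, stored)
```

Here the 2005 to 2010 request is served by the stored 2000 to 2020 file, while the request starting in 1995 is not covered and would need a new download.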
- static _indicator_parameters(indicator: str) tuple[str, int, int]
Get the indicator, start year and end year from the indicator name.
- classmethod get_available_indicators() dict
Get a dictionary of all available indicators in the IDS database.
- classmethod debt_service_indicators(detailed_category: bool = True) dict
Get a dictionary of Debt Service indicators in the IDS database.
- classmethod debt_stocks_indicators(detailed_category: bool = True) dict
Get a dictionary of Debt Stocks indicators in the IDS database.
- _get_indicator(indicator: str, start_year: int, end_year: int) bblocks.import_tools.common.ImportData
Get data for an indicator. This method is not meant to be accessed directly. Instead, use the .get_data() method.
- Parameters:
indicator – The indicator to get. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To view all available indicators, call .get_available_indicators().
- Returns:
The same object to allow chaining of methods
- load_data(indicators: str | list, start_year: int, end_year: int) bblocks.import_tools.common.ImportData
Load the data for an indicator or a list of indicators.
- Parameters:
indicators – The indicator(s) to load. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To view all available indicators, call .get_available_indicators().
start_year – The first year to include in the data
end_year – The last year to include in the data
- update_data(reload_data: bool = True) bblocks.import_tools.common.ImportData
Update the data for all loaded indicators.
- get_data(indicators: str | list = 'all', **kwargs) pandas.DataFrame
Get the data for an indicator or a list of indicators.
- Parameters:
indicators – The indicator(s) to get. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To get all available indicators, set indicators=”all”.
- Returns:
A pandas dataframe with the requested data.
- bblocks.get_dsa(update=False, local_path: str = None) pandas.DataFrame
Extract the most recent Debt Sustainability Assessment (DSA) data for PRGT-eligible countries from the IMF website. URL = https://www.imf.org/external/Pubs/ft/dsa/DSAlist.pdf
- Parameters:
local_path – where the downloaded PDF will be stored
update (bool) – if True, updates the data from the IMF website. Otherwise it loads the data from the local file. If a local file does not exist, the data is extracted from the website.
- Returns:
pandas dataframe with country, latest publication date, and risk of debt distress
- bblocks.add_iso_codes_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'iso_code') pandas.DataFrame
Add ISO3 column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.
target_column – the column where the iso codes will be stored.
- Returns:
the original DataFrame with a new column containing ISO3 codes.
- Return type:
DataFrame
- bblocks.add_income_level_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'income_level', update_data: bool = False) pandas.DataFrame
Add an income levels column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
target_column – the column where the income level data will be stored.
update_data – whether to update the underlying data or not.
- Returns:
the original DataFrame with a new column containing the income level data.
- Return type:
DataFrame
- bblocks.add_short_names_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'name_short') pandas.DataFrame
Add short names column to a dataframe
- Parameters:
df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.
target_column – the column where the short names will be stored.
- Returns:
the original DataFrame with a new column containing short names.
- Return type:
DataFrame
- bblocks.clean_number(number: str | pandas.Series, to: Type = float) float | int
Clean a string and return it as a float or integer. When to=int is selected, Python’s default round behaviour is used.
- Parameters:
number – the string to clean
to – the type to convert to (int or float)
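A rough standard-library sketch of this kind of cleaning is shown below. It is not the bblocks code: clean_number_sketch is a hypothetical name, and the rule for which characters are dropped is an assumption. The point it illustrates is that Python's built-in round sends exact halves to the nearest even integer.

```python
import re


def clean_number_sketch(number, to=float):
    """Strip non-numeric characters from a string and convert the result."""
    # Assumption: keep digits, the decimal point and a minus sign only.
    cleaned = re.sub(r"[^\d.\-]", "", str(number))
    value = float(cleaned)
    if to is int:
        # Built-in round() rounds exact halves to the nearest even integer.
        return round(value)
    return value


as_float = clean_number_sketch("1,234.50")
half_down = clean_number_sketch(" $2.5 ", to=int)
half_up = clean_number_sketch("3.5", to=int)
```

With Python's rounding rule, "2.5" becomes 2 while "3.5" becomes 4, which can be surprising when converting with to=int.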
- bblocks.clean_numeric_series(data: pandas.Series | pandas.DataFrame, series_columns: str | list | None = None, to: Type = float) pandas.DataFrame | pandas.Series
Clean a numeric column in a Pandas DataFrame, or a Pandas Series which is meant to be numeric. When to=int is selected, Python’s default round behaviour is used.
- Parameters:
data – a series or a dataframe. If a dataframe is passed, the column(s) to clean must be specified
series_columns – optional (only used when data is a dataframe). The column or columns to clean.
to – the type to convert to (int or float)
- bblocks.to_date_column(series: pandas.Series, date_format: str | None = None) pandas.Series
Converts a Pandas series into a date series. The series must contain integers or strings that can be converted into datetime objects
- bblocks.convert_id(series: pandas.Series, from_type: str = 'regex', to_type: str = 'ISO3', not_found: str | None = None, *, additional_mapping: dict = None) pandas.Series
Takes a Pandas series with country IDs and converts them into the desired type.
- Parameters:
series – the Pandas series to convert
from_type – the classification type according to which the series is encoded. Available types come from the country_converter package (https://github.com/konstantinstadler/country_converter#classification-schemes) For example: ISO3, ISO2, name_short, DACcode, etc.
to_type – the target classification type. Same options as from_type
not_found – what to do if the value is not found. Can pass a string or None. If None, the original value is passed through.
additional_mapping – Optionally, a dictionary with additional mappings can be used. The keys are the values to be converted and the values are the converted values. The keys follow the same datatype as the original values. The values must follow the same datatype as the target type.
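The not_found and additional_mapping options described above can be illustrated with a plain dictionary lookup. This sketch does not call country_converter or bblocks: convert_ids and the tiny name_to_iso3 table are hypothetical, and letting additional_mapping take precedence over the base mapping is an assumption of the example.

```python
def convert_ids(values, mapping, not_found=None, additional_mapping=None):
    """Convert each value with `mapping`, extended by `additional_mapping`."""
    # Assumption: entries in additional_mapping override the base mapping.
    lookup = {**mapping, **(additional_mapping or {})}
    result = []
    for value in values:
        if value in lookup:
            result.append(lookup[value])
        elif not_found is None:
            # No replacement given: pass the original value through.
            result.append(value)
        else:
            result.append(not_found)
    return result


name_to_iso3 = {"France": "FRA", "Kenya": "KEN"}
names = ["France", "Atlantis"]

passed_through = convert_ids(names, name_to_iso3)
flagged = convert_ids(names, name_to_iso3, not_found="not found")
extended = convert_ids(names, name_to_iso3, additional_mapping={"Atlantis": "ATL"})
```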
- bblocks.date_to_str(series: pandas.Series, date_format: str = '%d %B %Y') pandas.Series
Converts a Pandas’ series into a string series.
- Parameters:
series – the Pandas series to convert to a formatted date string
date_format – the format to use for the date string. The default is “%d %B %Y”
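As a quick reference for the date_format argument, the sketch below applies the default pattern “%d %B %Y” and one alternative using the standard library's strftime. It works on plain date objects rather than a Pandas series, and the month names assume Python's default (English) locale.

```python
from datetime import date

dates = [date(2023, 1, 5), date(2021, 12, 31)]

# Default pattern: zero-padded day, full month name, four-digit year.
formatted = [d.strftime("%d %B %Y") for d in dates]

# An alternative pattern: year and month only.
year_month = [d.strftime("%Y-%m") for d in dates]
```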
- bblocks.format_number(series: pandas.Series, as_units: bool = False, as_percentage: bool = False, as_millions: bool = False, as_billions: bool = False, decimals: int = 2, add_sign: bool = False, other_format: str = '{:,.2f}') pandas.Series
Formats a Pandas’ numeric series into a formatted string series.
- Parameters:
series – the series to convert to a formatted string
as_units – formatted with commas to separate thousands and the specified decimals
as_percentage – formatted as a percentage with the specified decimals. This assumes that the series contains numbers where 1 would equal 100%.
as_millions – divided by 1 million, formatted with commas and the specified decimals
as_billions – divided by 1 billion, formatted with commas and the specified decimals
decimals – the number of decimals to use
add_sign – add a plus sign to positive numbers
other_format – Other formats to use. This option can only be used if all others are false. Examples are available at: https://mkaz.blog/code/python-string-format-cookbook/
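The options above map closely onto Python's format specification mini-language. The following sketch formats a single value rather than a series and is not the bblocks implementation: format_value is a hypothetical helper, and details such as appending “%” for percentages are assumptions.

```python
def format_value(value, as_units=False, as_percentage=False, as_millions=False,
                 as_billions=False, decimals=2, add_sign=False,
                 other_format="{:,.2f}"):
    """Format one number following the options described above."""
    sign = "+" if add_sign else ""
    if as_percentage:
        # The % presentation type multiplies by 100, so 1 becomes 100%.
        return f"{value:{sign},.{decimals}%}"
    if as_millions:
        return f"{value / 1e6:{sign},.{decimals}f}"
    if as_billions:
        return f"{value / 1e9:{sign},.{decimals}f}"
    if as_units:
        return f"{value:{sign},.{decimals}f}"
    # No option selected: fall back to the custom format string.
    return other_format.format(value)


units = format_value(1234567.891, as_units=True)
percentage = format_value(0.256, as_percentage=True, decimals=1)
millions = format_value(2_500_000, as_millions=True, decimals=1, add_sign=True)
billions = format_value(3_200_000_000, as_billions=True, decimals=1)
custom = format_value(1234.5)
```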
- bblocks.filter_by_continent(df: pandas.DataFrame, continent: str, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame
Filter a DataFrame by continent.
- Parameters:
df – the DataFrame to filter
continent – the continent to filter by (e.g. “Africa”, “Europe”, “EU”)
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)
- Returns:
A filtered copy of the DataFrame.
- bblocks.filter_by_un_region(df: pandas.DataFrame, region: str, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame
Filter a DataFrame by UN region. This includes, for example, “Western Africa”, “Eastern Africa”, “Southern Asia”, “Northern America”, “Central America”, “Eastern Asia”.
- Parameters:
df – the DataFrame to filter
region – the region to filter by (e.g. “Western Africa”, “Eastern Africa”, etc.)
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)
- Returns:
A filtered copy of the DataFrame.
- bblocks.filter_eu_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame
Filter a DataFrame to keep only European Union countries. The current list of members of the European Union is always used.
- Parameters:
df – the DataFrame to filter
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)
- Returns:
A filtered copy of the DataFrame.
- bblocks.filter_african_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame
Filter a DataFrame to keep only African countries.
- Parameters:
df – the DataFrame to filter
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)
- Returns:
A filtered copy of the DataFrame.
- bblocks.filter_latest_by(data: pandas.DataFrame, date_column: str, value_columns: str | list | None = None, group_by: str | list | None = None) pandas.DataFrame
Calculate the latest value of (a) column(s) over a period of time.
- Parameters:
data – a DataFrame with a date column (datetime or int) and one or more numeric columns
date_column – the name of the date (datetime or int) column
value_columns – one or more columns for which to get the latest value
group_by – Optionally, specify which columns to consider for the latest operation
- Returns:
A DataFrame with the latest value of the specified columns
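The "latest value per group" idea can be sketched without Pandas, using a list of dictionaries as rows. latest_by is a hypothetical helper written for this example, and treating group_by as the set of key columns is an assumption about how the grouping works.

```python
def latest_by(rows, date_column, group_by):
    """Keep, for each group, the row with the most recent date."""
    latest = {}
    for row in rows:
        key = tuple(row[column] for column in group_by)
        # Replace the stored row when this one is more recent.
        if key not in latest or row[date_column] > latest[key][date_column]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"iso_code": "KEN", "year": 2019, "value": 1.0},
    {"iso_code": "FRA", "year": 2020, "value": 2.0},
    {"iso_code": "KEN", "year": 2021, "value": 3.0},
]

result = latest_by(rows, date_column="year", group_by=["iso_code"])
```

Each group appears once in the result, in the order its key was first seen, carrying the values from its most recent row.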
- bblocks.set_bblocks_data_path(path)