bblocks

Submodules

Attributes

__version__

Classes

WorldBankData

An object to help download data from the World Bank.

GHED

An object to extract GHED data

WFPData

Class to download and read WFP inflation and insufficient food data

WorldEconomicOutlook

World Economic Outlook data

Aids

An object to extract data from UNAIDS.

DebtIDS

Import data from the World Bank's International Debt Statistics database.

Functions

get_dsa(→ pandas.DataFrame)

Extract DSA data from the IMF website.

add_iso_codes_column(→ pandas.DataFrame)

Add ISO3 column to a dataframe

add_income_level_column(→ pandas.DataFrame)

Add an income levels column to a dataframe

add_short_names_column(→ pandas.DataFrame)

Add short names column to a dataframe

clean_number(→ float | int)

Clean a string and return as float or integer.

clean_numeric_series(→ pandas.DataFrame | pandas.Series)

Clean a numeric column in a Pandas DataFrame or a Pandas Series which is meant to be numeric.

to_date_column(→ pandas.Series)

Converts a Pandas series into a date series.

convert_id(→ pandas.Series)

Takes a Pandas series with country IDs and converts them into the desired type.

date_to_str(→ pandas.Series)

Converts a Pandas series into a string series.

format_number(→ pandas.Series)

Formats a Pandas numeric series into a formatted string series.

filter_by_continent(→ pandas.DataFrame)

Filter a DataFrame by continent.

filter_by_un_region(→ pandas.DataFrame)

Filter a DataFrame by UN region. This includes, for example, "Western Africa", "Eastern Africa", "Southern Asia", "Northern America", "Central America", "Eastern Asia".

filter_eu_countries(→ pandas.DataFrame)

Filter a DataFrame to keep only European Union countries. The current list of members of the European Union is always used.

filter_african_countries(→ pandas.DataFrame)

Filter a DataFrame to keep only African countries.

filter_latest_by(→ pandas.DataFrame)

Calculate the latest value of (a) column(s) over a period of time.

set_bblocks_data_path(path)

Package Contents

bblocks.__version__ = '1.4.1'
class bblocks.WorldBankData

Bases: bblocks.import_tools.common.ImportData

An object to help download data from the World Bank. To use it, create an instance of this class, then call the load_data method to load an indicator. This can be done multiple times. If the data for an indicator has never been downloaded, it will be downloaded. If it has been downloaded, it will be loaded from disk. If update_data is set to True when creating the object, the data will be updated from the World Bank for each indicator. To force a refresh of the data stored on disk, call update_data. To get a DataFrame of the data, call get_data.

_indicators: dict[str, tuple[pandas.DataFrame, dict]]
load_data(indicator: str | list[str], start_year: int | None = None, end_year: int | None = None, most_recent_only: bool = False, db: int = 2, **kwargs) WorldBankData

Get an indicator from the World Bank API

Parameters:
  • indicator – the code from the World Bank data portal (e.g. “SP.POP.TOTL”)

  • start_year – The first year to include in the data

  • end_year – The last year to include in the data

  • most_recent_only – If True, only get the most recent non-empty value for each country

  • db – The database to use. By default, use the WDI database (2)

Returns:

The same object to allow chaining

update_data(reload_data: bool = True) bblocks.import_tools.common.ImportData

Update the data saved on disk for the different indicators

When called, it will go through each indicator and update the data saved on disk, based on the parameters passed to load_data.

Returns:

The same object to allow chaining

get_data(indicators: str | list = 'all', **kwargs) pandas.DataFrame
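The workflow described for this class (create an instance, load an indicator, then request a DataFrame) might look like the sketch below. It uses only the signatures listed above; the wrapper function `population_frame` is a hypothetical name introduced for illustration, and running it assumes that bblocks is installed and that the World Bank API is reachable.

```python
def population_frame(start_year=2015, end_year=2020):
    """Load total population from the World Bank and return it as a DataFrame."""
    # Imported lazily so that defining this function does not require bblocks.
    from bblocks import WorldBankData

    wb = WorldBankData()
    # load_data returns the same object, so get_data can be chained directly.
    return wb.load_data(
        "SP.POP.TOTL", start_year=start_year, end_year=end_year
    ).get_data("SP.POP.TOTL")
```

Calling `population_frame()` would download the indicator on first use and read it from disk afterwards, as described in the class documentation.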
class bblocks.GHED

Bases: bblocks.import_tools.common.ImportData

An object to extract GHED data

To use, create an instance of the class and call the load_data method. If the data is already downloaded, it will be loaded from disk. If not, it will be downloaded. If update_data is set to True, the data will be downloaded regardless of whether it is already on disk. To force an update, call the update_data method. To get the data, call the get_data method. To get the metadata, call the get_metadata method.

_metadata: pandas.DataFrame = None
load_data() bblocks.import_tools.common.ImportData

Load GHED data

Returns:

The same object to allow chaining

update_data(reload_data: bool) bblocks.import_tools.common.ImportData

Update GHED data

Returns:

The same object to allow chaining

get_metadata() pandas.DataFrame

Get GHED metadata as a pandas dataframe

Returns:

A pandas dataframe with the metadata
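The "load from disk if already downloaded, otherwise download" behaviour described for the importer classes can be illustrated with a small standard-library sketch. This is not the bblocks implementation: `CachedImporter`, `fake_fetch` and the JSON file layout are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path


class CachedImporter:
    """Illustrative stand-in for an importer; not the bblocks implementation."""

    def __init__(self, data_dir, fetch, update_data=False):
        self.data_dir = Path(data_dir)
        self.fetch = fetch              # hypothetical download callable
        self.update_data = update_data  # if True, always download again
        self._data = {}

    def load_data(self, indicator):
        path = self.data_dir / f"{indicator}.json"
        if self.update_data or not path.exists():
            # Never downloaded (or refresh requested): fetch and store on disk.
            path.write_text(json.dumps(self.fetch(indicator)))
        # Always read back from disk so both branches behave the same way.
        self._data[indicator] = json.loads(path.read_text())
        return self  # returning self allows chaining

    def get_data(self, indicator):
        return self._data[indicator]


calls = []


def fake_fetch(indicator):
    """Pretend download that records how often it was called."""
    calls.append(indicator)
    return {"indicator": indicator, "values": [1, 2, 3]}


with tempfile.TemporaryDirectory() as tmp:
    first = CachedImporter(tmp, fake_fetch).load_data("demo").get_data("demo")
    second = CachedImporter(tmp, fake_fetch).load_data("demo").get_data("demo")

print(len(calls))  # the second load is served from disk, not fetched again
```

Passing `update_data=True` to the second instance would trigger another fetch, which mirrors the refresh behaviour described above.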

class bblocks.WFPData

Bases: bblocks.import_tools.common.ImportData

Class to download and read WFP inflation and insufficient food data

property available_indicators: KeysView

View the available indicators from WFP

_country_codes() dict
load_data(indicator: str | list) None

Load an indicator into the WFPData object

update_data(reload_data: bool = True) None

Update the data for all the indicators currently loaded

class bblocks.WorldEconomicOutlook

Bases: bblocks.import_tools.common.ImportData

World Economic Outlook data

year: int | None = None
release: int | None = None
__post_init__() None
__repr__() str
__load_data() None

Load the WEO data as a clean DataFrame.

Parameters:
  • latest_y – passed only optionally, to override the behaviour of getting the latest release year for the WEO.

  • latest_r – passed only optionally, to override the behaviour of getting the latest released value.

_check_indicators(indicators: str | list | None = None) None | dict
load_data(indicator: str | list) bblocks.import_tools.common.ImportData

Loads a specific indicator from the World Economic Outlook data

update_data(reload_data: bool = True) None

Update the stored WEO data, using the WEO package.

available_indicators() None

Print the available indicators in the dataset

get_data(indicators: str | list = 'all', keep_metadata: bool = False) pandas.DataFrame
class bblocks.Aids

Bases: bblocks.import_tools.common.ImportData

An object to extract data from UNAIDS.

To use, create an instance of the class. Then load indicators using the load_data method. This can be done multiple times. To get a DataFrame of all indicators available to load, use the available_indicators property. If the data for an indicator has never been downloaded, it will be downloaded. If it has been downloaded, it will be loaded from disk. If update_data is set to True, the data will be downloaded each time an indicator is loaded. You can force an update by calling update_data, and all indicators will be reloaded into the object. To get a DataFrame, call get_data, passing the indicator name(s) (or None to return all indicators) and the area grouping(s) (‘all’ by default).

property available_indicators: pandas.DataFrame

Returns a dataframe of available indicators

load_data(indicator: str, area_grouping: str = 'all') bblocks.import_tools.common.ImportData

Load an indicator to the object

Parameters:
  • indicator (str) – The name of the indicator to load. To see a DataFrame of available indicators, use the available_indicators property.

  • area_grouping (str) – The grouping to use. Choose from [“country”, “region”, “all”].

Returns:

The same object to allow chaining

update_data(reload_data: bool)

Update all loaded indicators saved on the disk

When called, it will go through each loaded indicator/area grouping combination and update the data saved on disk.

Returns:

The same object to allow chaining

get_data(indicators: str | list | None = None, area_grouping: str = 'all') pandas.DataFrame

Get the data as a Pandas DataFrame

Parameters:
  • indicators – By default, all indicators are returned in a single DataFrame. If a list of indicators is passed, only those indicators will be returned. A single indicator can be passed as a string as well.

  • area_grouping (str) – The area grouping to use. Choose from [“country”, “region”, “all”]. Default is “all”.

Returns:

A Pandas DataFrame with the requested indicator data

class bblocks.DebtIDS

Bases: bblocks.import_tools.common.ImportData

Import data from the World Bank’s International Debt Statistics database.

To use this object, first create an instance of it. Then use the load_data method to load indicators. One or more indicators can be loaded at a time, and a starting and end year must be specified.

If the data has not been downloaded before, it will be downloaded from the World Bank API. If the data has been downloaded before, it will be loaded from the local data folder.

To get a DataFrame, use the get_data method. You can get the data for one or more, or for all indicators at once.

To update the data, use the update_data method. This will download the latest data from the World Bank API and overwrite the local data.

  • To get a list of available indicators, use the get_available_indicators method.

  • To get a list of available debt service indicators, use the debt_service_indicators method.

  • To get a list of available debt stocks indicators, use the debt_stocks_indicators method.

__post_init__()

Set the path to the data folder and create it if it doesn’t exist

_check_stored_data(indicator: str, start_year: int, end_year: int) str | bool

Check if the data is already stored locally

This also checks if the years requested are inside another file.

Parameters:
  • indicator (str) – The indicator to check

  • start_year (int) – The start year of the data

  • end_year (int) – The end year of the data

Returns:

The filename of the data if it exists, or False if the data doesn’t exist

Return type:

str | bool
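The check described here (is the requested year range already covered by a stored file?) can be sketched as follows. The file naming scheme `<indicator>_<start>-<end>.csv` and the function name `find_stored_file` are assumptions made for illustration only; the documentation above does not state how bblocks actually names its stored files.

```python
def find_stored_file(stored_files, indicator, start_year, end_year):
    """Return a stored filename covering the requested years, or False."""
    for name in stored_files:
        stem = name.rsplit(".", 1)[0]  # drop the file extension
        try:
            file_indicator, years = stem.rsplit("_", 1)
            file_start, file_end = (int(y) for y in years.split("-"))
        except ValueError:
            continue  # not in the assumed naming format
        if file_indicator != indicator:
            continue
        # The stored range must fully contain the requested range.
        if file_start <= start_year and end_year <= file_end:
            return name
    return False


files = ["DT.DOD.DECT.CD_2010-2020.csv", "DT.TDS.DECT.CD_2000-2005.csv"]
print(find_stored_file(files, "DT.DOD.DECT.CD", 2012, 2018))
print(find_stored_file(files, "DT.DOD.DECT.CD", 2005, 2018))
```

The first lookup falls inside the stored 2010 to 2020 file, so its name is returned; the second starts before 2010, so the result is False and a download would be needed.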

static _indicator_parameters(indicator: str) tuple[str, int, int]

Get the indicator, start year and end year from the indicator name.

classmethod get_available_indicators() dict

Get a dictionary of all available indicators in the IDS database.

classmethod debt_service_indicators(detailed_category: bool = True) dict

Get a dictionary of Debt Service indicators in the IDS database.

classmethod debt_stocks_indicators(detailed_category: bool = True) dict

Get a dictionary of Debt Stocks indicators in the IDS database.

_get_indicator(indicator: str, start_year: int, end_year: int) bblocks.import_tools.common.ImportData

Get data for an indicator. This method is not meant to be accessed directly. Instead, use the .get_data() method.

Parameters:

indicator – The indicator to get. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To view all available indicators, call .get_available_indicators().

Returns:

The same object to allow chaining of methods

load_data(indicators: str | list, start_year: int, end_year: int) bblocks.import_tools.common.ImportData

Load the data for an indicator or a list of indicators.

Parameters:
  • indicators – The indicator(s) to load. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To view all available indicators, call .get_available_indicators().

  • start_year – The first year to include in the data

  • end_year – The last year to include in the data

update_data(reload_data: bool = True) bblocks.import_tools.common.ImportData

Update the data for all loaded indicators.

get_data(indicators: str | list = 'all', **kwargs) pandas.DataFrame

Get the data for an indicator or a list of indicators.

Parameters:

indicators – The indicator(s) to get. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To get all available indicators, set indicators=”all”.

Returns:

A pandas dataframe with the requested data.
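A minimal usage sketch for this class, based only on the signatures listed above, might look like this. The wrapper name `external_debt_stocks` is hypothetical, and running it assumes that bblocks is installed and that the World Bank API is reachable.

```python
def external_debt_stocks(start_year=2015, end_year=2020):
    """Load the DT.DOD.DECT.CD indicator from IDS and return it as a DataFrame."""
    # Imported lazily so that defining this function does not require bblocks.
    from bblocks import DebtIDS

    ids = DebtIDS()
    # Both start_year and end_year are required by load_data.
    ids.load_data("DT.DOD.DECT.CD", start_year=start_year, end_year=end_year)
    return ids.get_data("DT.DOD.DECT.CD")
```

Other indicator codes can be looked up with `DebtIDS.get_available_indicators()` before loading.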

bblocks.get_dsa(update=False, local_path: str = None) pandas.DataFrame

Extract DSA data from the IMF website.

Extract the most recent Debt Sustainability Assessment (DSA) data for PRGT-Eligible Countries from the IMF website. URL = https://www.imf.org/external/Pubs/ft/dsa/DSAlist.pdf

Parameters:
  • local_path – where the downloaded PDF will be stored

  • update (bool) – if True, updates the data from the IMF website. Otherwise it loads the data from the local file. If a local file does not exist, the data will be extracted from the website.

Returns:

pandas dataframe with country, latest publication date, and risk of debt distress

bblocks.add_iso_codes_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'iso_code') pandas.DataFrame

Add ISO3 column to a dataframe

Parameters:
  • df – the dataframe to which the column will be added

  • id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.

  • id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.

  • target_column – the column where the iso codes will be stored.

Returns:

the original DataFrame with a new column containing ISO3 codes.

Return type:

DataFrame

bblocks.add_income_level_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'income_level', update_data: bool = False) pandas.DataFrame

Add an income levels column to a dataframe

Parameters:
  • df – the dataframe to which the column will be added

  • id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.

  • id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.

  • target_column – the column where the income level data will be stored.

  • update_data – whether to update the underlying data or not.

Returns:

the original DataFrame with a new column containing the income level data.

Return type:

DataFrame

bblocks.add_short_names_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'name_short') pandas.DataFrame

Add short names column to a dataframe

Parameters:
  • df – the dataframe to which the column will be added

  • id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.

  • id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.

  • target_column – the column where the short names will be stored.

Returns:

the original DataFrame with a new column containing short names.

Return type:

DataFrame

bblocks.clean_number(number: str | pandas.Series, to: Type = float) float | int

Clean a string and return as float or integer. When selecting to=int, the default python round behaviour is used.

Parameters:
  • number – the string to clean

  • to – the type to convert to (int or float)
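The note about Python's default rounding matters in practice: `round` sends ties to the nearest even integer, so 2.5 becomes 2 while 3.5 becomes 4. The sketch below illustrates the idea with a hypothetical helper, `clean_number_sketch`; the character-stripping rule is an assumption and may differ from what bblocks does internally.

```python
import re


def clean_number_sketch(number, to=float):
    """Strip everything except digits, the decimal point and a minus sign."""
    cleaned = re.sub(r"[^\d.\-]", "", str(number))
    value = float(cleaned)
    # round() uses banker's rounding: ties go to the nearest even integer.
    return round(value) if to is int else value


print(clean_number_sketch("1,234.56"))
print(clean_number_sketch("2.5", to=int))
print(clean_number_sketch("3.5", to=int))
```

The three calls give 1234.56, 2 and 4 respectively, which shows the tie-to-even behaviour when `to=int` is selected.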

bblocks.clean_numeric_series(data: pandas.Series | pandas.DataFrame, series_columns: str | list | None = None, to: Type = float) pandas.DataFrame | pandas.Series

Clean a numeric column in a Pandas DataFrame or a Pandas Series which is meant to be numeric. When selecting to=int, the default python round behaviour is used.

Parameters:
  • data – it accepts a series or a dataframe. If a dataframe is passed, the column(s) to clean must be specified

  • series_columns – optionally declared (only when data is a dataframe). To apply to one or more columns.

  • to – the type to convert to (int or float)

bblocks.to_date_column(series: pandas.Series, date_format: str | None = None) pandas.Series

Converts a Pandas series into a date series. The series must contain integers or strings that can be converted into datetime objects
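As a rough illustration of the two accepted input kinds, the standard-library sketch below converts integers and strings into datetime objects. Treating integers as years and falling back to ISO parsing when no `date_format` is given are assumptions of this sketch, and `to_dates` is a hypothetical name; the library itself operates on Pandas series.

```python
from datetime import datetime


def to_dates(values, date_format=None):
    """Convert integers (assumed to be years) or strings into datetimes."""
    dates = []
    for value in values:
        if isinstance(value, int):
            # Assumption: an integer represents a year, mapped to 1 January.
            dates.append(datetime(value, 1, 1))
        elif date_format is not None:
            dates.append(datetime.strptime(value, date_format))
        else:
            # Assumption: strings without a format are ISO dates.
            dates.append(datetime.fromisoformat(value))
    return dates


print(to_dates([2020, 2021]))
print(to_dates(["30/06/2021"], date_format="%d/%m/%Y"))
```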

bblocks.convert_id(series: pandas.Series, from_type: str = 'regex', to_type: str = 'ISO3', not_found: str | None = None, *, additional_mapping: dict = None) pandas.Series

Takes a Pandas series with country IDs and converts them into the desired type.

Parameters:
  • series – the Pandas series to convert

  • from_type – the classification type according to which the series is encoded. Available types come from the country_converter package (https://github.com/konstantinstadler/country_converter#classification-schemes) For example: ISO3, ISO2, name_short, DACcode, etc.

  • to_type – the target classification type. Same options as from_type

  • not_found – what to do if the value is not found. Can pass a string or None. If None, the original value is passed through.

  • additional_mapping – Optionally, a dictionary with additional mappings can be used. The keys are the values to be converted and the values are the converted values. The keys follow the same datatype as the original values. The values must follow the same datatype as the target type.
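The interplay of `not_found` and `additional_mapping` can be sketched without Pandas. In the example below, `convert_values` and the tiny `NAME_TO_ISO3` lookup are hypothetical stand-ins for the country_converter machinery, and letting the additional mapping take precedence over the base lookup is an assumption of the sketch.

```python
NAME_TO_ISO3 = {"France": "FRA", "Kenya": "KEN"}  # illustrative lookup only


def convert_values(values, mapping, not_found=None, additional_mapping=None):
    """Map each value; unknown values become `not_found` or pass through."""
    lookup = {**mapping, **(additional_mapping or {})}
    converted = []
    for value in values:
        if value in lookup:
            converted.append(lookup[value])
        elif not_found is None:
            converted.append(value)  # original value is passed through
        else:
            converted.append(not_found)
    return converted


names = ["France", "Kenya", "Atlantis"]
print(convert_values(names, NAME_TO_ISO3))
print(convert_values(names, NAME_TO_ISO3, not_found="n/a"))
print(convert_values(names, NAME_TO_ISO3, additional_mapping={"Atlantis": "ATL"}))
```

With the default `not_found=None` the unknown name survives unchanged, with `not_found="n/a"` it is replaced, and with an additional mapping it is converted like any other value.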

bblocks.date_to_str(series: pandas.Series, date_format: str = '%d %B %Y') pandas.Series

Converts a Pandas series into a string series.

Parameters:
  • series – the Pandas series to convert to a formatted date string

  • date_format – the format to use for the date string. The default is “%d %B %Y”
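To show what the default format string produces, the snippet below applies `%d %B %Y` to plain `datetime.date` objects using the standard library. It illustrates the format codes only, not the library function itself, and the month names assume Python's default (C) locale.

```python
from datetime import date

dates = [date(2023, 3, 5), date(2021, 12, 31)]

# %d is the zero-padded day, %B the full month name, %Y the four-digit year.
formatted = [d.strftime("%d %B %Y") for d in dates]
print(formatted)
```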

bblocks.format_number(series: pandas.Series, as_units: bool = False, as_percentage: bool = False, as_millions: bool = False, as_billions: bool = False, decimals: int = 2, add_sign: bool = False, other_format: str = '{:,.2f}') pandas.Series

Formats a Pandas numeric series into a formatted string series.

Parameters:
  • series – the series to convert to a formatted string

  • as_units – formatted with commas to separate thousands and the specified decimals

  • as_percentage – formatted as a percentage with the specified decimals. This assumes that the series contains numbers where 1 would equal 100%.

  • as_millions – divided by 1 million, formatted with commas and the specified decimals

  • as_billions – divided by 1 billion, formatted with commas and the specified decimals

  • decimals – the number of decimals to use

  • add_sign – add a plus sign to positive numbers

  • other_format – Other formats to use. This option can only be used if all others are false. Examples are available at: https://mkaz.blog/code/python-string-format-cookbook/
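The formatting options above map naturally onto Python's format specification mini-language. The following single-value sketch shows one way the rules could be expressed; `format_value` is a hypothetical helper, and details such as appending a percent sign are assumptions rather than confirmed library behaviour.

```python
def format_value(value, *, as_percentage=False, as_millions=False,
                 as_billions=False, decimals=2, add_sign=False):
    """Format one number following the options described above."""
    sign = "+" if add_sign else ""
    if as_percentage:
        # The % presentation type multiplies by 100, so 1 is shown as 100%.
        return f"{value:{sign}.{decimals}%}"
    if as_millions:
        value = value / 1e6
    elif as_billions:
        value = value / 1e9
    # A comma separates thousands; decimals sets the precision.
    return f"{value:{sign},.{decimals}f}"


print(format_value(1234567.891))
print(format_value(0.256, as_percentage=True, decimals=1))
print(format_value(2_500_000, as_millions=True))
print(format_value(3_200_000_000, as_billions=True, decimals=1, add_sign=True))
```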

bblocks.filter_by_continent(df: pandas.DataFrame, continent: str, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame

Filter a DataFrame by continent.

Parameters:
  • df – the DataFrame to filter

  • continent – the continent to filter by (e.g. “Africa”, “Europe”, “EU”)

  • id_column – the name of the column to use for the id (default: “iso_code”)

  • id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.
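Conceptually, the filter looks up a continent for each row's identifier and keeps the matching rows as a copy. The sketch below works on a list of dictionaries instead of a DataFrame; `CONTINENT_BY_ISO3` is a tiny hypothetical lookup and `filter_rows_by_continent` is an illustrative name, neither of which comes from bblocks.

```python
CONTINENT_BY_ISO3 = {  # illustrative lookup only
    "NGA": "Africa",
    "KEN": "Africa",
    "FRA": "Europe",
    "JPN": "Asia",
}


def filter_rows_by_continent(rows, continent, id_column="iso_code"):
    """Return copies of the rows whose id maps to the given continent."""
    return [
        dict(row)  # copy, so the input rows are never modified
        for row in rows
        if CONTINENT_BY_ISO3.get(row[id_column]) == continent
    ]


rows = [
    {"iso_code": "NGA", "value": 1},
    {"iso_code": "FRA", "value": 2},
    {"iso_code": "KEN", "value": 3},
]
african = filter_rows_by_continent(rows, "Africa")
print([row["iso_code"] for row in african])
```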

bblocks.filter_by_un_region(df: pandas.DataFrame, region: str, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame

Filter a DataFrame by UN region. This includes, for example, “Western Africa”, “Eastern Africa”, “Southern Asia”, “Northern America”, “Central America”, “Eastern Asia”.

Parameters:
  • df – the DataFrame to filter

  • region – the region to filter by (e.g. “Western Africa”, “Eastern Africa”, etc.)

  • id_column – the name of the column to use for the id (default: “iso_code”)

  • id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.filter_eu_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame

Filter a DataFrame to keep only European Union countries. The current list of members of the European Union is always used.

Parameters:
  • df – the DataFrame to filter

  • id_column – the name of the column to use for the id (default: “iso_code”)

  • id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.filter_african_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') pandas.DataFrame

Filter a DataFrame to keep only African countries.

Parameters:
  • df – the DataFrame to filter

  • id_column – the name of the column to use for the id (default: “iso_code”)

  • id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.filter_latest_by(data: pandas.DataFrame, date_column: str, value_columns: str | list | None = None, group_by: str | list | None = None) pandas.DataFrame

Calculate the latest value of (a) column(s) over a period of time.

Parameters:
  • data – a DataFrame with a date column (datetime or int) and one or more numeric columns

  • date_column – the name of the date (datetime or int) column

  • value_columns – one or more columns for which to get the latest value

  • group_by – Optionally, specify which columns to consider for the latest operation

Returns:

A DataFrame with the latest value of the specified columns
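The "latest value per group" idea can be sketched with plain dictionaries: for each group, keep the row with the largest date. `latest_by_group` is a hypothetical helper, and this sketch ignores missing values, which the library may treat differently.

```python
def latest_by_group(rows, date_column, group_by):
    """Keep, for each group, the row with the largest value in date_column."""
    latest = {}
    for row in rows:
        key = tuple(row[column] for column in group_by)
        if key not in latest or row[date_column] > latest[key][date_column]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"iso_code": "KEN", "year": 2019, "value": 10.0},
    {"iso_code": "KEN", "year": 2021, "value": 12.5},
    {"iso_code": "KEN", "year": 2020, "value": 11.0},
    {"iso_code": "FRA", "year": 2020, "value": 7.0},
    {"iso_code": "FRA", "year": 2018, "value": 6.5},
]
result = latest_by_group(rows, date_column="year", group_by=["iso_code"])
print(result)
```

Here the 2021 row is kept for KEN and the 2020 row for FRA, regardless of the order in which the rows appear.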

bblocks.set_bblocks_data_path(path)