bblocks

Submodules

Attributes

__version__

Classes

`WorldBankData`	An object to help download data from the World Bank.
`GHED`	An object to extract GHED data
`WFPData`	Class to download and read WFP inflation and insufficient food data
`WorldEconomicOutlook`	World Economic Outlook data
`Aids`	An object to extract data from UNAIDS.
`DebtIDS`	Import data from the World Bank's International Debt Statistics database.

Functions

`get_dsa`(→ pandas.DataFrame)	Extract DSA data from the IMF website
`add_iso_codes_column`(→ pandas.DataFrame)	Add ISO3 column to a dataframe
`add_income_level_column`(→ pandas.DataFrame)	Add an income levels column to a dataframe
`add_short_names_column`(→ pandas.DataFrame)	Add short names column to a dataframe
`clean_number`(→ float \| int)	Clean a string and return as float or integer.
`clean_numeric_series`(→ pandas.DataFrame \| pandas.Series)	Clean a numeric column in a Pandas DataFrame or a Pandas Series which is meant to be numeric.
`to_date_column`(→ pandas.Series)	Converts a Pandas series into a date series.
`convert_id`(→ pandas.Series)	Takes a Pandas series with country IDs and converts them into the desired type.
`date_to_str`(→ pandas.Series)	Converts a Pandas series into a string series.
`format_number`(→ pandas.Series)	Formats a Pandas numeric series into a formatted string series.
`filter_by_continent`(→ pandas.DataFrame)	Filter a DataFrame by continent.
`filter_by_un_region`(→ pandas.DataFrame)	Filter a DataFrame by UN region (for example, "Western Africa").
`filter_eu_countries`(→ pandas.DataFrame)	Filter a DataFrame to keep only European Union countries.
`filter_african_countries`(→ pandas.DataFrame)	Filter a DataFrame to keep only African countries.
`filter_latest_by`(→ pandas.DataFrame)	Calculate the latest value of (a) column(s) over a period of time.
`set_bblocks_data_path`(path)

Package Contents

bblocks.__version__ = '1.4.1'

class bblocks.WorldBankData

Bases: bblocks.import_tools.common.ImportData

An object to help download data from the World Bank. To use it, create an instance of this class, then call the load_data method to load an indicator. This can be done multiple times. If the data for an indicator has never been downloaded, it will be downloaded. If it has been downloaded, it will be loaded from disk. If update_data is set to True when creating the object, the data will be updated from the World Bank for each indicator. To refresh the data stored on disk, force an update by calling update_data. To get a dataframe of the data, call get_data.

_indicators: dict[str, tuple[pandas.DataFrame, dict]]

load_data(indicator: str | list[str], start_year: int | None = None, end_year: int | None = None, most_recent_only: bool = False, db: int = 2, **kwargs) → WorldBankData

Get an indicator from the World Bank API

Parameters:

indicator – the code from the World Bank data portal (e.g. “SP.POP.TOTL”)
start_year – The first year to include in the data
end_year – The last year to include in the data
most_recent_only – If True, only get the most recent non-empty value for each country
db – The database to use. By default, use the WDI database (2)

Returns:

The same object to allow chaining

update_data(reload_data: bool = True) → bblocks.import_tools.common.ImportData

Update the data saved on disk for the different indicators

When called, it will go through each indicator and update the data saved based on the parameters passed to load_data.

Returns:

The same object to allow chaining

get_data(indicators: str | list = 'all', **kwargs) → pandas.DataFrame
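The load-from-disk-or-download behaviour and the method chaining described for this class can be illustrated with a minimal, standard-library-only sketch. The class `CachedIndicators`, its JSON file layout, and the `_download` placeholder are hypothetical and are not the bblocks implementation; they only mirror the behaviour stated in the description above.

```python
import json
import tempfile
from pathlib import Path


class CachedIndicators:
    """Hypothetical stand-in for an importer; not the bblocks implementation."""

    def __init__(self, data_dir: Path, update_data: bool = False):
        self.data_dir = data_dir
        self.update_data = update_data
        self.loaded = {}    # indicator code -> list of rows
        self.downloads = 0  # counts simulated API calls

    def _download(self, indicator: str) -> list:
        # Placeholder for a real API request (assumption: returns a list of rows).
        self.downloads += 1
        return [{"indicator": indicator, "year": 2020, "value": 1.0}]

    def load_data(self, indicator: str) -> "CachedIndicators":
        path = self.data_dir / f"{indicator}.json"
        # Download only when forced or when nothing is stored on disk yet.
        if self.update_data or not path.exists():
            path.write_text(json.dumps(self._download(indicator)))
        self.loaded[indicator] = json.loads(path.read_text())
        return self  # returning self is what allows chaining

    def get_data(self) -> list:
        return [row for rows in self.loaded.values() for row in rows]


with tempfile.TemporaryDirectory() as tmp:
    # First object: nothing on disk yet, so both indicators are downloaded.
    first = CachedIndicators(Path(tmp)).load_data("SP.POP.TOTL").load_data("NY.GDP.MKTP.CD")
    # Second object: the file already exists, so it is read from disk.
    second = CachedIndicators(Path(tmp)).load_data("SP.POP.TOTL")
    first_downloads, second_downloads = first.downloads, second.downloads
    rows = first.get_data()
```

In this sketch, passing `update_data=True` to the constructor would force a fresh download on every `load_data` call, which corresponds to the update behaviour described above.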

class bblocks.GHED

Bases: bblocks.import_tools.common.ImportData

An object to extract GHED data

To use, create an instance of the class and call the load_data method. If the data is already downloaded, it will be loaded from disk. If not, it will be downloaded. If update_data is set to True, the data will be downloaded regardless of whether it is already on disk. To force an update, call the update_data method. To get the data, call the get_data method. To get the metadata, call the get_metadata method.

_metadata: pandas.DataFrame = None

load_data() → bblocks.import_tools.common.ImportData

Load GHED data

Returns:

The same object to allow chaining

update_data(reload_data: bool) → bblocks.import_tools.common.ImportData

Update GHED data

Returns:

The same object to allow chaining

get_metadata() → pandas.DataFrame

Get GHED metadata as a pandas dataframe

Returns:

A pandas dataframe with the metadata

class bblocks.WFPData

Bases: bblocks.import_tools.common.ImportData

Class to download and read WFP inflation and insufficient food data

property available_indicators: KeysView: View the available indicators from WFP

_country_codes() → dict

load_data(indicator: str | list) → None: Load an indicator into the WFPData object

update_data(reload_data: bool = True) → None: Update the data for all the indicators currently loaded

class bblocks.WorldEconomicOutlook

Bases: bblocks.import_tools.common.ImportData

World Economic Outlook data

year: int | None = None

release: int | None = None

__post_init__() → None

__repr__() → str

__load_data() → None

Load the WEO data as a clean dataframe

Parameters:

latest_y – optional; overrides the default behaviour of getting the latest WEO release year
latest_r – optional; overrides the default behaviour of getting the latest WEO release

_check_indicators(indicators: str | list | None = None) → None | dict

load_data(indicator: str | list) → bblocks.import_tools.common.ImportData: Loads a specific indicator from the World Economic Outlook _data

update_data(reload_data: bool = True) → None

Update the stored WEO data, using the WEO package.

available_indicators() → None: Print the available indicators in the dataset

get_data(indicators: str | list = 'all', keep_metadata: bool = False) → pandas.DataFrame

class bblocks.Aids

Bases: bblocks.import_tools.common.ImportData

An object to extract data from UNAIDS.

To use, create an instance of the class. Then load indicators using the load_data method. This can be done multiple times. To return a dataframe of all available indicators to load, use the available_indicators property. If the data for an indicator has never been downloaded, it will be downloaded. If it has been downloaded, it will be loaded from disk. If update_data is set to True, the data will be downloaded each time an indicator is loaded. You can force an update by calling update_data, and all indicators will be reloaded into the object. You can get a dataframe by calling get_data, passing the indicator name(s) (or None to return all indicators) and the area grouping(s) (‘all’ by default).

property available_indicators: pandas.DataFrame: Returns a dataframe of available indicators

load_data(indicator: str, area_grouping: str = 'all') → bblocks.import_tools.common.ImportData

Load an indicator to the object

Parameters:

indicator (str) – The name of the indicator to load. To see a DataFrame of available indicators, use the available_indicators property.
area_grouping (str) – The grouping to use. Choose from [“country”, “region”, “all”].

Returns:

The same object to allow chaining

update_data(reload_data: bool)

Update all loaded indicators saved on the disk

When called, it will go through each loaded indicator/area grouping combination and update the data saved on disk.

Returns:

The same object to allow chaining

get_data(indicators: str | list | None = None, area_grouping: str = 'all') → pandas.DataFrame

Get the data as a Pandas DataFrame

Parameters:

indicators – By default, all indicators are returned in a single DataFrame. If a list of indicators is passed, only those indicators will be returned. A single indicator can be passed as a string as well.
area_grouping (str) – The area grouping to use. Choose from [“country”, “region”, “all”]. Default is “all”.

Returns:

A Pandas DataFrame with the requested indicator data

class bblocks.DebtIDS

Bases: bblocks.import_tools.common.ImportData

Import data from the World Bank’s International Debt Statistics database.

To use this object, first create an instance of it. Then use the load_data method to load indicators. One or more indicators can be loaded at a time, and a starting and end year must be specified.

If the data has not been downloaded before, it will be downloaded from the World Bank API. If the data has been downloaded before, it will be loaded from the local data folder.

To get a DataFrame, use the get_data method. You can get the data for one or more, or for all indicators at once.

To update the data, use the update_data method. This will download the latest data from the World Bank API and overwrite the local data.

To get a list of available indicators, use the get_available_indicators method.
To get a list of available debt service indicators, use the debt_service_indicators method.
To get a list of available debt stocks indicators, use the debt_stocks_indicators method.

__post_init__(): Set the path to the data folder and create it if it doesn’t exist

_check_stored_data(indicator: str, start_year: int, end_year: int) → str | bool

Check if the data is already stored locally

This also checks if the years requested are inside another file.

Parameters:

indicator (str) – The indicator to check
start_year (int) – The start year of the data
end_year (int) – The end year of the data

Returns:

The filename of the data if it exists, or False if the data doesn’t exist

Return type:

str | bool

static _indicator_parameters(indicator: str) → tuple[str, int, int]: Get the indicator, start year and end year from the indicator name.
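The stored-data check and the name parsing described for `_check_stored_data` and `_indicator_parameters` can be sketched as follows. The naming convention `<indicator>_<start>-<end>` and the helper names `parse_name` and `find_stored` are assumptions made for illustration only; the source does not state how bblocks actually names its files.

```python
from typing import Union


def parse_name(name: str) -> tuple[str, int, int]:
    """Split a hypothetical '<indicator>_<start>-<end>' name into its parts."""
    indicator, years = name.rsplit("_", 1)
    start, end = years.split("-")
    return indicator, int(start), int(end)


def find_stored(
    stored_names: list[str], indicator: str, start_year: int, end_year: int
) -> Union[str, bool]:
    """Return a stored name whose year range covers the request, else False."""
    for name in stored_names:
        stored_indicator, stored_start, stored_end = parse_name(name)
        same_indicator = stored_indicator == indicator
        # The requested years must sit fully inside the stored range.
        covers_years = stored_start <= start_year and end_year <= stored_end
        if same_indicator and covers_years:
            return name
    return False


stored = ["DT.DOD.DECT.CD_2000-2020"]
hit = find_stored(stored, "DT.DOD.DECT.CD", 2005, 2010)
miss = find_stored(stored, "DT.DOD.DECT.CD", 1995, 2010)
```

Here `hit` is the stored name, because 2005 to 2010 lies inside 2000 to 2020, while `miss` is `False`, because 1995 falls before the stored range.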

classmethod get_available_indicators() → dict: Get a dictionary of all available indicators in the IDS database.

classmethod debt_service_indicators(detailed_category: bool = True) → dict: Get a dictionary of Debt Service indicators in the IDS database.

classmethod debt_stocks_indicators(detailed_category: bool = True) → dict: Get a dictionary of Debt Stocks indicators in the IDS database.

_get_indicator(indicator: str, start_year: int, end_year: int) → bblocks.import_tools.common.ImportData

Get data for an indicator. This method is not meant to be accessed directly. Instead, use the .get_data() method.

Parameters:

indicator – The indicator to get. It must be in the IDS format (e.g. DT.DOD.DECT.CD). To view all available indicators, call .get_available_indicators().

Returns:

The same object to allow chaining of methods

load_data(indicators: str | list, start_year: int, end_year: int) → bblocks.import_tools.common.ImportData

Load the data for an indicator or a list of indicators.

Parameters:

indicators – The indicator(s) to load. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To view all available indicators, call .get_available_indicators().
start_year – The first year to include in the data
end_year – The last year to include in the data

update_data(reload_data: bool = True) → bblocks.import_tools.common.ImportData: Update the data for all loaded indicators.

get_data(indicators: str | list = 'all', **kwargs) → pandas.DataFrame

Get the data for an indicator or a list of indicators.

Parameters:

indicators – The indicator(s) to get. They must be in the IDS format (e.g. DT.DOD.DECT.CD). To get all available indicators, set indicators="all".

Returns:

A pandas dataframe with the requested data.

bblocks.get_dsa(update=False, local_path: str = None) → pandas.DataFrame

Extract the most recent Debt Sustainability Assessment (DSA) data for PRGT-eligible countries from the IMF website. URL = https://www.imf.org/external/Pubs/ft/dsa/DSAlist.pdf

Parameters:

local_path – where the downloaded PDF will be stored
update (bool) – if True, updates the data from the IMF website. Otherwise it loads the data from the local file. If a local file does not exist, the data will be extracted from the website.

Returns:

pandas dataframe with country, latest publication date, and risk of debt distress

bblocks.add_iso_codes_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'iso_code') → pandas.DataFrame

Add ISO3 column to a dataframe

Parameters:

df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.
target_column – the column where the iso codes will be stored.

Returns:

the original DataFrame with a new column containing ISO3 codes.

Return type:

DataFrame

bblocks.add_income_level_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'income_level', update_data: bool = False) → pandas.DataFrame

Add an income levels column to a dataframe

Parameters:

df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DACcode” must be passed.
target_column – the column where the income level data will be stored.
update_data – whether to update the underlying data or not.

Returns:

the original DataFrame with a new column containing the income level data.

Return type:

DataFrame

bblocks.add_short_names_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'name_short') → pandas.DataFrame

Add short names column to a dataframe

Parameters:

df – the dataframe to which the column will be added
id_column – the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
id_type – the type of ID used in the id_column. The default ‘regex’ tries to infer using the rules from the ‘country_converter’ package. For the DAC codes, “DAC” must be passed.
target_column – the column where the short names will be stored.

Returns:

the original DataFrame with a new column containing short names.

Return type:

DataFrame

bblocks.clean_number(number: str | pandas.Series, to: Type = float) → float | int

Clean a string and return it as a float or an integer. When selecting to=int, Python’s default rounding behaviour is used.

Parameters:

number – the string to clean
to – the type to convert to (int or float)
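The note about rounding matters because Python's built-in `round` rounds halves to the nearest even integer. The sketch below is not the bblocks implementation: the function name `clean_number_sketch` is hypothetical, and the choice of which characters to strip (everything except digits, the decimal point, and the minus sign) is an assumption.

```python
import re


def clean_number_sketch(number: str, to: type = float):
    """Strip non-numeric characters and convert to float or int."""
    # Assumption: keep only digits, the decimal point, and the minus sign.
    cleaned = re.sub(r"[^\d.\-]", "", number)
    value = float(cleaned)
    # round() uses "round half to even", so 2.5 becomes 2 and 3.5 becomes 4.
    return round(value) if to is int else value


as_float = clean_number_sketch("1,234.5")
with_currency = clean_number_sketch("$ 1,000")
half_down = clean_number_sketch("2.5", to=int)
half_up = clean_number_sketch("3.5", to=int)
```

With these inputs, `as_float` is `1234.5`, `with_currency` is `1000.0`, `half_down` is `2`, and `half_up` is `4`.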

bblocks.clean_numeric_series(data: pandas.Series | pandas.DataFrame, series_columns: str | list | None = None, to: Type = float) → pandas.DataFrame | pandas.Series

Clean a numeric column in a Pandas DataFrame or a Pandas Series which is meant to be numeric. When selecting to=int, Python’s default rounding behaviour is used.

Parameters:

data – it accepts a series or a dataframe. If a dataframe is passed, the column(s) to clean must be specified
series_columns – optionally declared (only when data is a dataframe). To apply to one or more columns.
to – the type to convert to (int or float)

bblocks.to_date_column(series: pandas.Series, date_format: str | None = None) → pandas.Series: Converts a Pandas series into a date series. The series must contain integers or strings that can be converted into datetime objects

bblocks.convert_id(series: pandas.Series, from_type: str = 'regex', to_type: str = 'ISO3', not_found: str | None = None, *, additional_mapping: dict = None) → pandas.Series

Takes a Pandas series with country IDs and converts them into the desired type.

Parameters:

series – the Pandas series to convert
from_type – the classification type according to which the series is encoded. Available types come from the country_converter package (https://github.com/konstantinstadler/country_converter#classification-schemes). For example: ISO3, ISO2, name_short, DACcode, etc.
to_type – the target classification type. Same options as from_type
not_found – what to do if the value is not found. Can pass a string or None. If None, the original value is passed through.
additional_mapping – Optionally, a dictionary with additional mappings can be used. The keys are the values to be converted and the values are the converted values. The keys follow the same datatype as the original values. The values must follow the same datatype as the target type.
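The interaction between `not_found` and `additional_mapping` can be illustrated with a small, dependency-free sketch. The lookup table `NAME_TO_ISO3` and the function `convert_ids_sketch` are hypothetical stand-ins: the real function works on a Pandas series and relies on the country_converter package. That the additional mapping takes precedence over the built-in table is also an assumption.

```python
from typing import Optional

# Tiny stand-in for a real classification table (illustrative only).
NAME_TO_ISO3 = {"France": "FRA", "Kenya": "KEN"}


def convert_ids_sketch(
    values: list,
    not_found: Optional[str] = None,
    additional_mapping: Optional[dict] = None,
) -> list:
    """Convert each value, handling unknown values as described above."""
    # Assumption: user-supplied mappings override the built-in table.
    mapping = {**NAME_TO_ISO3, **(additional_mapping or {})}
    converted = []
    for value in values:
        if value in mapping:
            converted.append(mapping[value])
        elif not_found is None:
            converted.append(value)  # pass the original value through
        else:
            converted.append(not_found)
    return converted


names = ["France", "Atlantis"]
passthrough = convert_ids_sketch(names)
flagged = convert_ids_sketch(names, not_found="unknown")
extended = convert_ids_sketch(names, additional_mapping={"Atlantis": "ATL"})
```

In this sketch, `passthrough` keeps "Atlantis" unchanged, `flagged` replaces it with "unknown", and `extended` maps it to "ATL" through the extra dictionary.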

bblocks.date_to_str(series: pandas.Series, date_format: str = '%d %B %Y') → pandas.Series

Converts a Pandas series into a string series.

Parameters:

series – the Pandas series to convert to a formatted date string
date_format – the format to use for the date string. The default is “%d %B %Y”
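To show what the default format string produces, the following sketch applies `"%d %B %Y"` with the standard library's `strftime`. It uses plain `datetime.date` objects rather than a Pandas series, and it assumes the default (English) locale, since `%B` is locale-dependent.

```python
from datetime import date

dates = [date(2023, 1, 5), date(2021, 12, 31)]

# "%d" is the zero-padded day, "%B" the full month name, "%Y" the 4-digit year.
formatted = [d.strftime("%d %B %Y") for d in dates]
```

Under the default locale, `formatted` is `["05 January 2023", "31 December 2021"]`.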

bblocks.format_number(series: pandas.Series, as_units: bool = False, as_percentage: bool = False, as_millions: bool = False, as_billions: bool = False, decimals: int = 2, add_sign: bool = False, other_format: str = '{:,.2f}') → pandas.Series

Formats a Pandas numeric series into a formatted string series.

Parameters:

series – the series to convert to a formatted string
as_units – formatted with commas to separate thousands and the specified decimals
as_percentage – formatted as a percentage with the specified decimals. This assumes that the series contains numbers where 1 would equal 100%.
as_millions – divided by 1 million, formatted with commas and the specified decimals
as_billions – divided by 1 billion, formatted with commas and the specified decimals
decimals – the number of decimals to use
add_sign – add a plus sign to positive numbers
other_format – Other formats to use. This option can only be used if all others are false. Examples are available at: https://mkaz.blog/code/python-string-format-cookbook/
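The options above correspond closely to Python's format specification mini-language. The sketch below formats a single value and is not the bblocks implementation: the function name `format_value` is hypothetical, and whether the library appends any unit suffix for millions or billions is not stated in the source, so none is added here.

```python
def format_value(
    value: float,
    *,
    as_percentage: bool = False,
    as_millions: bool = False,
    as_billions: bool = False,
    decimals: int = 2,
    add_sign: bool = False,
) -> str:
    """Format one number using thousands separators and fixed decimals."""
    sign = "+" if add_sign else ""
    if as_percentage:
        # The "%" presentation type multiplies by 100, so 1 equals 100%.
        return f"{value:{sign},.{decimals}%}"
    if as_millions:
        value = value / 1e6
    elif as_billions:
        value = value / 1e9
    return f"{value:{sign},.{decimals}f}"


units = format_value(1234567.891)
percentage = format_value(0.256, as_percentage=True, decimals=1)
millions = format_value(2_500_000, as_millions=True, decimals=1)
signed = format_value(1500, add_sign=True, decimals=0)
```

These calls give `"1,234,567.89"`, `"25.6%"`, `"2.5"`, and `"+1,500"` respectively.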

bblocks.filter_by_continent(df: pandas.DataFrame, continent: str, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame by continent.

Parameters:

df – the DataFrame to filter
continent – the continent to filter by (e.g. “Africa”, “Europe”, “EU”)
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.filter_by_un_region(df: pandas.DataFrame, region: str, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame by UN region. This includes, for example, “Western Africa”, “Eastern Africa”, “Southern Asia”, “Northern America”, “Central America”, “Eastern Asia”.

Parameters:

df – the DataFrame to filter
region – the region to filter by (e.g. “Western Africa”, “Eastern Africa”, etc.)
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.filter_eu_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame to keep only European Union countries. The current list of members of the European Union is always used.

Parameters:

df – the DataFrame to filter
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.filter_african_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame to keep only African countries.

Parameters:

df – the DataFrame to filter
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.filter_latest_by(data: pandas.DataFrame, date_column: str, value_columns: str | list | None = None, group_by: str | list | None = None) → pandas.DataFrame

Calculate the latest value of (a) column(s) over a period of time.

Parameters:

data – a DataFrame with a date column (datetime or int) and one or more numeric columns
date_column – the name of the date (datetime or int) column
value_columns – one or more columns for which to get the latest value
group_by – Optionally, specify which columns to consider for the latest operation

Returns:

A DataFrame with the latest value of the specified columns
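The "latest value per group" operation can be sketched in plain Python, using a list of dictionaries in place of a DataFrame so that the example runs without dependencies. The function name `latest_by` and the sample rows are hypothetical; the real function operates on a Pandas DataFrame and may handle ties or missing values differently.

```python
def latest_by(rows: list, date_column: str, group_by: list) -> list:
    """Keep, for each group, the row with the largest date value."""
    latest = {}
    for row in rows:
        key = tuple(row[column] for column in group_by)
        # Replace the stored row only when this one is more recent.
        if key not in latest or row[date_column] > latest[key][date_column]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"iso_code": "KEN", "year": 2019, "value": 1.0},
    {"iso_code": "KEN", "year": 2021, "value": 3.0},
    {"iso_code": "FRA", "year": 2020, "value": 2.0},
]
result = latest_by(rows, date_column="year", group_by=["iso_code"])
```

Here `result` holds one row per country: the 2021 row for KEN and the 2020 row for FRA.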

bblocks.set_bblocks_data_path(path)