bblocks.cleaning_tools.filter

Functions

`filter_latest_by`(→ pandas.DataFrame)	Return the latest value of one or more columns over a period of time.
`_filter_by`(→ pandas.DataFrame)	Helper function to filter a DataFrame by membership to a specific grouping.
`filter_by_continent`(→ pandas.DataFrame)	Filter a DataFrame by continent.
`filter_by_un_region`(→ pandas.DataFrame)	Filter a DataFrame by UN region.
`filter_african_countries`(→ pandas.DataFrame)	Filter a DataFrame to keep only African countries.
`filter_eu_countries`(→ pandas.DataFrame)	Filter a DataFrame to keep only European Union member countries.

Module Contents

bblocks.cleaning_tools.filter.filter_latest_by(data: pandas.DataFrame, date_column: str, value_columns: str | list | None = None, group_by: str | list | None = None) → pandas.DataFrame

Return the latest value of one or more columns over a period of time.

Parameters:

data – a DataFrame with a date column (datetime or int) and one or more numeric columns
date_column – the name of the date (datetime or int) column
value_columns – one or more columns for which to return the latest value
group_by – optionally, the column(s) to group by when finding the latest value

Returns:

A DataFrame with the latest value of the specified columns
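The behaviour described above can be approximated in plain pandas. The following is a minimal sketch, not the library's implementation: the helper name `latest_by_group` and the sample data are hypothetical, and it assumes "latest" means the row with the most recent date within each group.

```python
import pandas as pd


def latest_by_group(data, date_column, value_columns, group_by):
    """Keep, for each group, the row with the most recent date (illustrative only)."""
    # Accept either a single column name or a list of names
    value_columns = [value_columns] if isinstance(value_columns, str) else list(value_columns)
    group_by = [group_by] if isinstance(group_by, str) else list(group_by)

    # Sort by date, then take the last row of every group
    latest = data.sort_values(date_column).groupby(group_by).tail(1)

    return latest[group_by + [date_column] + value_columns].reset_index(drop=True)


# Hypothetical sample data: two countries observed in different years
df = pd.DataFrame(
    {
        "iso_code": ["KEN", "KEN", "FRA", "FRA"],
        "year": [2020, 2021, 2019, 2021],
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

result = latest_by_group(df, date_column="year", value_columns="value", group_by="iso_code")
```

Here `result` holds one row per `iso_code`, carrying the value from that country's most recent year.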

bblocks.cleaning_tools.filter._filter_by(df: pandas.DataFrame, by: str, by_value: str, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Helper function to filter a DataFrame by membership to a specific grouping. The groupings come from those available through the country_converter package. More info available at: https://github.com/konstantinstadler/country_converter#classification-schemes

Parameters:

df – the DataFrame to filter
by – the type of grouping to filter by (e.g. “Continent”, “UNRegion”, “EU”)
by_value – the value of the grouping to filter by (e.g. “Africa”, “Europe”, “EU”)
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.
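To illustrate the idea of filtering by membership in a grouping, here is a rough pandas-only sketch. It is an assumption-laden stand-in: the lookup dictionary `CONTINENT_BY_ISO3` and the helper `filter_by_membership` are hypothetical and hand-written for this example, whereas the actual function is documented as drawing its groupings from the country_converter package.

```python
import pandas as pd

# Hypothetical lookup for illustration only; the real groupings are
# documented as coming from the country_converter package.
CONTINENT_BY_ISO3 = {
    "KEN": "Africa",
    "NGA": "Africa",
    "FRA": "Europe",
    "IND": "Asia",
}


def filter_by_membership(df, lookup, by_value, id_column="iso_code"):
    """Return a filtered copy with only the rows whose id maps to `by_value`."""
    mask = df[id_column].map(lookup) == by_value
    # .copy() so that later edits to the result do not touch the input
    return df.loc[mask].copy()


df = pd.DataFrame({"iso_code": ["KEN", "FRA", "NGA", "IND"], "value": [1, 2, 3, 4]})

african = filter_by_membership(df, CONTINENT_BY_ISO3, "Africa")
```

The input `df` is left unchanged, which mirrors the "filtered copy" contract stated in the return description.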

bblocks.cleaning_tools.filter.filter_by_continent(df: pandas.DataFrame, continent: str, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame by continent.

Parameters:

df – the DataFrame to filter
continent – the continent to filter by (e.g. “Africa”, “Europe”, “Asia”)
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.cleaning_tools.filter.filter_by_un_region(df: pandas.DataFrame, region: str, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame by UN region. This includes, for example, “Western Africa”, “Eastern Africa”, “Southern Asia”, “Northern America”, “Central America”, “Eastern Asia”.

Parameters:

df – the DataFrame to filter
region – the region to filter by (e.g. “Western Africa”, “Eastern Africa”, etc.)
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.cleaning_tools.filter.filter_african_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame to keep only African countries.

Parameters:

df – the DataFrame to filter
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.

bblocks.cleaning_tools.filter.filter_eu_countries(df: pandas.DataFrame, id_column: str = 'iso_code', id_type: str = 'regex') → pandas.DataFrame

Filter a DataFrame to keep only European Union member countries. The current list of members of the European Union is always used.

Parameters:

df – the DataFrame to filter
id_column – the name of the column to use for the id (default: “iso_code”)
id_type – the type of id to use (default: “regex”)

Returns:

A filtered copy of the DataFrame.