bblocks.dataframe_tools.add
===========================

.. py:module:: bblocks.dataframe_tools.add


Functions
---------

.. autoapisummary::

   bblocks.dataframe_tools.add.__validate_add_column_params
   bblocks.dataframe_tools.add.add_population_column
   bblocks.dataframe_tools.add.add_poverty_ratio_column
   bblocks.dataframe_tools.add.add_population_density_column
   bblocks.dataframe_tools.add.add_gdp_column
   bblocks.dataframe_tools.add.add_gov_expenditure_column
   bblocks.dataframe_tools.add.add_gdp_share_column
   bblocks.dataframe_tools.add.add_population_share_column
   bblocks.dataframe_tools.add.add_gov_exp_share_column
   bblocks.dataframe_tools.add.add_income_level_column
   bblocks.dataframe_tools.add.add_short_names_column
   bblocks.dataframe_tools.add.add_iso_codes_column
   bblocks.dataframe_tools.add.add_median_observation
   bblocks.dataframe_tools.add.add_flourish_geometries
   bblocks.dataframe_tools.add.add_value_as_share


Module Contents
---------------

.. py:function:: __validate_add_column_params(*, df: pandas.DataFrame, id_column: str, id_type: str | None, date_column: str | None) -> tuple

   Validate the parameters used by the *add column* functions.


.. py:function:: add_population_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'population', update_data: bool = False) -> pandas.DataFrame

   Add population column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the population for that year will be used. If it's missing, it will be missing in the returned column as well.
      If the date isn't specified, the most recent population data from the World Bank is used.
   :param target_column: the column where the population data will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the population data.
   :rtype: DataFrame


.. py:function:: add_poverty_ratio_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'poverty_ratio', update_data: bool = False) -> pandas.DataFrame

   Add poverty headcount column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the poverty ratio for that year will be used. If it's missing, it will be missing in the returned column as well. If the date isn't specified, the most recent data is used.
   :param target_column: the column where the poverty ratio data will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the poverty data.
   :rtype: DataFrame


.. py:function:: add_population_density_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'population_density', update_data: bool = False) -> pandas.DataFrame

   Add population density column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package.
      For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the population density for that year will be used. If it's missing, it will be missing in the returned column as well. If the date isn't specified, the most recent data is used.
   :param target_column: the column where the population density data will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the population density data.
   :rtype: DataFrame


.. py:function:: add_gdp_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'gdp', usd: bool = True, include_estimates: bool = False, update_data: bool = False) -> pandas.DataFrame

   Add GDP column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the GDP for that year will be used. If it's missing, it will be missing in the returned column as well. If the date isn't specified, the most recent data is used.
   :param include_estimates: Whether to include years for which the WEO data is labelled as estimates.
   :param usd: Whether to add the data as US dollars or Local Currency Units.
   :param target_column: the column where the GDP data will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the GDP data from the IMF World Economic Outlook.
   :rtype: DataFrame


.. py:function:: add_gov_expenditure_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, target_column: str = 'gov_exp', usd: bool = True, include_estimates: bool = False, update_data: bool = False) -> pandas.DataFrame

   Add Government Expenditure column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the expenditure for that year will be used. If it's missing, it will be missing in the returned column as well. If the date isn't specified, the most recent data is used.
   :param include_estimates: Whether to include years for which the WEO data is labelled as estimates.
   :param usd: Whether to add the data as US dollars or Local Currency Units.
   :param target_column: the column where the expenditure data will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the expenditure data from the IMF World Economic Outlook.
   :rtype: DataFrame


.. py:function:: add_gdp_share_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, value_column: str = 'value', target_column: str = 'gdp_share', decimals: int = 2, usd: bool = False, include_estimates: bool = False, update_data: bool = False) -> pandas.DataFrame

   Add value as share of GDP column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column.
      The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the GDP for that year will be used. If it's missing, it will be missing in the returned column as well. If the date isn't specified, the most recent data is used.
   :param value_column: the column containing the value to be converted to a share of GDP.
   :param decimals: the number of decimals to use in the returned column.
   :param include_estimates: Whether to include years for which the WEO data is labelled as estimates.
   :param usd: Whether to use GDP data in US dollars or Local Currency Units.
   :param target_column: the column where the share of GDP will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the value as a share of GDP, using the IMF World Economic Outlook.
   :rtype: DataFrame


.. py:function:: add_population_share_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, value_column: str = 'value', target_column: str = 'population_share', decimals: int = 2, update_data: bool = False) -> pandas.DataFrame

   Add population share column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the population for that year will be used. If it's missing, it will be missing in the returned column as well. If the date isn't specified, the most recent population data from the World Bank is used.
   :param value_column: the column containing the value to be used in the calculation.
   :param target_column: the column where the population share will be stored.
   :param decimals: the number of decimals to use in the returned column.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the value as a share of population.
   :rtype: DataFrame


.. py:function:: add_gov_exp_share_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, date_column: str | None = None, value_column: str = 'value', target_column: str = 'gov_exp_share', usd: bool = False, include_estimates: bool = False, update_data: bool = False) -> pandas.DataFrame

   Add value as share of Government Expenditure column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param date_column: Optionally, a date column can be specified. If so, the expenditure data for that year will be used. If it's missing, it will be missing in the returned column as well. If the date isn't specified, the most recent data is used.
   :param value_column: the column containing the value to be converted to a share of expenditure.
   :param include_estimates: Whether to include years for which the WEO data is labelled as estimates.
   :param usd: Whether to use expenditure data in US dollars or Local Currency Units.
   :param target_column: the column where the share of expenditure will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the value as a share of expenditure, using the IMF World Economic Outlook.
   :rtype: DataFrame


.. py:function:: add_income_level_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'income_level', update_data: bool = False) -> pandas.DataFrame

   Add an income levels column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DACcode, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DACcode" must be passed.
   :param target_column: the column where the income level data will be stored.
   :param update_data: whether to update the underlying data or not.

   :returns: the original DataFrame with a new column containing the income level data.
   :rtype: DataFrame


.. py:function:: add_short_names_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'name_short') -> pandas.DataFrame

   Add short names column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DAC" must be passed.
   :param target_column: the column where the short names will be stored.

   :returns: the original DataFrame with a new column containing short names.
   :rtype: DataFrame


.. py:function:: add_iso_codes_column(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'iso_code') -> pandas.DataFrame

   Add ISO3 column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package.
      For the DAC codes, "DAC" must be passed.
   :param target_column: the column where the ISO codes will be stored.

   :returns: the original DataFrame with a new column containing ISO3 codes.
   :rtype: DataFrame


.. py:function:: add_median_observation(df: pandas.DataFrame, group_by: str | list = None, value_columns: str | list[str] = 'value', append: bool = True, group_name: Optional[str] = None) -> pandas.DataFrame

   Add median observation column to a dataframe

   :param df: the dataframe to which the column will be added
   :param group_by: the column(s) by which to group the data to calculate the median.
   :param value_columns: the column(s) containing the values to be used for the median.
   :param append: if True, the median observation will be appended to the dataframe. If False, the median observation will be stored in a new column.
   :param group_name: the name of the group to be used in the id_column or as the name of the column containing the median observations.

   :returns: the original dataframe with added rows for the median (if append is True) or a new column containing the median observations (if append is False).
   :rtype: DataFrame


.. py:function:: add_flourish_geometries(df: pandas.DataFrame, id_column: str, id_type: str | None = None, target_column: str = 'geometry') -> pandas.DataFrame

   Add Flourish geometries column to a dataframe

   :param df: the dataframe to which the column will be added
   :param id_column: the column containing the name, ISO3, ISO2, DAC code, UN code, etc.
   :param id_type: the type of ID used in the id_column. The default 'regex' tries to infer it using the rules from the 'country_converter' package. For the DAC codes, "DAC" must be passed.
   :param target_column: the column where the Flourish geometries will be stored.

   :returns: the original DataFrame with a new column containing the Flourish geometries.
   :rtype: DataFrame


.. py:function:: add_value_as_share(df: pandas.DataFrame, value_col: str, share_of_value_col: str, target_col: str | None = None, decimals: int = 2) -> pandas.DataFrame
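
The entry for ``add_value_as_share`` gives only a signature. The sketch below shows what a helper with that signature might plausibly do, using plain pandas. It is not the bblocks implementation: the function name ``value_as_share_sketch``, the percentage formula, the fallback column name, and the choice to return a copy are all assumptions made for illustration.

```python
from typing import Optional

import pandas as pd


def value_as_share_sketch(
    df: pd.DataFrame,
    value_col: str,
    share_of_value_col: str,
    target_col: Optional[str] = None,
    decimals: int = 2,
) -> pd.DataFrame:
    """Hypothetical stand-in for a 'value as share' helper (not bblocks code)."""
    # Assumed fallback name for the new column; purely illustrative.
    if target_col is None:
        target_col = f"{value_col}_as_share_of_{share_of_value_col}"

    # Work on a copy so the caller's DataFrame is left untouched (an assumption).
    out = df.copy()

    # Assumed formula: value expressed as a percentage of the reference column,
    # rounded to the requested number of decimals.
    out[target_col] = (100 * out[value_col] / out[share_of_value_col]).round(decimals)
    return out


# Usage on a small made-up table.
example = pd.DataFrame({"value": [25.0, 10.0], "total": [200.0, 30.0]})
result = value_as_share_sketch(example, "value", "total", target_col="share")
```

Whether the real function returns a percentage or a fraction, and whether it modifies ``df`` in place, should be checked against the bblocks source before relying on it.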