naampy package

Subpackages

Submodules

naampy.in_rolls_fn module

class naampy.in_rolls_fn.InRollsFnData

Bases: object

InRollsFnData class.

classmethod in_rolls_fn_gender(df: DataFrame, namecol: str, state: str | None = None, year: int | None = None, dataset: str = 'v2_1k') → DataFrame

Appends columns from the female-ratio data to the input DataFrame, matched on first name.

Strips extra whitespace from the name, then checks whether the name appears in the Indian electoral rolls data. If it does, the output includes the data from the matching row.

Parameters:
  • df (DataFrame) – Pandas DataFrame containing the first name column.

  • namecol (str) – Column name containing the first name.

  • state (str or None) – The state whose Indian electoral rolls data should be used. (default is None for all states)

  • year (int or None) – The year of the Indian electoral rolls to be used. (default is None for all years)

  • dataset (str) – Version of the dataset. (default is 'v2_1k')

Returns:

Pandas DataFrame with the following additional columns, by first name:

‘n_female’, ‘n_male’, ‘n_third_gender’, ‘prop_female’, ‘prop_male’, ‘prop_third_gender’

Return type:

DataFrame
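The lookup described above (clean the name, check whether it is in the reference data, and append that row's columns if so) can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper `append_gender_counts`, the lowercase matching, the use of None for unmatched names, and the counts in `lookup` are all assumptions made for the example.

```python
def append_gender_counts(rows, namecol, lookup):
    """Return copies of rows with count columns appended when the name is found."""
    extra_cols = ("n_female", "n_male", "n_third_gender")
    out = []
    for row in rows:
        # Trim the ends and collapse repeated whitespace before matching.
        # Lowercasing is an assumption of this sketch.
        key = " ".join(str(row[namecol]).split()).lower()
        match = lookup.get(key)
        new_row = dict(row)
        for col in extra_cols:
            # Names absent from the lookup get None rather than a guess.
            new_row[col] = match[col] if match else None
        out.append(new_row)
    return out


# Made-up counts for illustration only; not taken from the electoral rolls.
lookup = {"priya": {"n_female": 90, "n_male": 10, "n_third_gender": 0}}
rows = [{"name": "  Priya "}, {"name": "zzz"}]
result = append_gender_counts(rows, "name", lookup)
```

The original name value is left untouched in the output; only the matching key is cleaned.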

static list_states(dataset: str = 'v2_1k') → ndarray

Parameters:

dataset (str) – Version of the dataset.

Returns:

Array of states in the dataset.

static load_naampy_data(dataset: str) → str

Parameters:

dataset (str) – Version of the dataset.

Returns:

Path to the data.

classmethod predict_fn_gender(first_names: list[str]) → DataFrame

Predicts gender based on first name.

Parameters:
  • first_names (list of str) – List of first names.

Returns:

Pandas DataFrame with predicted labels and probabilities.

Return type:

DataFrame

naampy.in_rolls_fn.in_rolls_fn_gender(df: DataFrame, namecol: str, state: str | None = None, year: int | None = None, dataset: str = 'v2_1k') → DataFrame

Appends columns from the female-ratio data to the input DataFrame, matched on first name.

Strips extra whitespace from the name, then checks whether the name appears in the Indian electoral rolls data. If it does, the output includes the data from the matching row.

Parameters:
  • df (DataFrame) – Pandas DataFrame containing the first name column.

  • namecol (str) – Column name containing the first name.

  • state (str or None) – The state whose Indian electoral rolls data should be used. (default is None for all states)

  • year (int or None) – The year of the Indian electoral rolls to be used. (default is None for all years)

  • dataset (str) – Version of the dataset. (default is 'v2_1k')

Returns:

Pandas DataFrame with the following additional columns, by first name:

‘n_female’, ‘n_male’, ‘n_third_gender’, ‘prop_female’, ‘prop_male’, ‘prop_third_gender’

Return type:

DataFrame
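The column names suggest that each prop_* value is that gender's share of the total count for a name. The source does not state the formula, so the sketch below is an assumption: the helper `add_proportions` is hypothetical and the counts are made up.

```python
def add_proportions(counts):
    """Return a copy of counts with prop_* shares added (assumed formula)."""
    genders = ("female", "male", "third_gender")
    total = sum(counts[f"n_{g}"] for g in genders)
    out = dict(counts)
    for g in genders:
        # Assumed: share of this gender among all records for the name.
        # A zero total yields None instead of dividing by zero.
        out[f"prop_{g}"] = counts[f"n_{g}"] / total if total else None
    return out


# Made-up counts for illustration only.
example = add_proportions({"n_female": 90, "n_male": 10, "n_third_gender": 0})
```

Under this reading, the three prop_* values for a name sum to 1 whenever the name has at least one record.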

naampy.in_rolls_fn.main(argv=['-M', 'html', 'source', 'build'])

Main method for shell support

naampy.in_rolls_fn.predict_fn_gender(first_names: list[str]) → DataFrame

Predicts gender based on first name.

Parameters:
  • first_names (list of str) – List of first names.

Returns:

Pandas DataFrame with predicted labels and probabilities.

Return type:

DataFrame

naampy.utils module

naampy.utils.download_file(url: str, target: str) → bool

naampy.utils.find_ngrams(vocab: list, text: str, n: int) → list

Finds and returns the indices of n-grams in the vocabulary list.

Generates the n-grams of the given text, looks each one up in the vocabulary list, and returns the list of indices found.

Parameters:
  • vocab (list) – Vocabulary list.

  • text (str) – Input text.

  • n (int) – Size of the n-grams.

Returns:

List of the indices of the n-grams in the vocabulary list.

Return type:

list
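The n-gram lookup described above can be sketched as follows. This is an illustrative stand-in, not the library's code: the name `find_ngrams_sketch`, the choice of character-level n-grams, and the `unknown_index` fallback for n-grams missing from the vocabulary are assumptions of the example.

```python
def find_ngrams_sketch(vocab, text, n, unknown_index=0):
    """Map each character n-gram of text to its index in vocab."""
    # Character n-grams: every window of length n sliding over the text.
    ngrams = [text[i:i + n] for i in range(len(text) - n + 1)]

    # Record the first position of each vocabulary entry for fast lookup.
    positions = {}
    for idx, gram in enumerate(vocab):
        positions.setdefault(gram, idx)

    # Assumed behaviour: n-grams not in the vocabulary map to unknown_index.
    return [positions.get(gram, unknown_index) for gram in ngrams]


vocab = ["", "ab", "bc", "cd"]
indices = find_ngrams_sketch(vocab, "abcx", 2)
```

Here the text "abcx" yields the bigrams "ab", "bc", and "cx"; the first two are in the vocabulary and the last falls back to the unknown index.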

naampy.utils.get_app_file_path(app_name: str, filename: str) → str

Module contents

naampy.in_rolls_fn_gender(df: DataFrame, namecol: str, state: str | None = None, year: int | None = None, dataset: str = 'v2_1k') → DataFrame

Appends columns from the female-ratio data to the input DataFrame, matched on first name.

Strips extra whitespace from the name, then checks whether the name appears in the Indian electoral rolls data. If it does, the output includes the data from the matching row.

Parameters:
  • df (DataFrame) – Pandas DataFrame containing the first name column.

  • namecol (str) – Column name containing the first name.

  • state (str or None) – The state whose Indian electoral rolls data should be used. (default is None for all states)

  • year (int or None) – The year of the Indian electoral rolls to be used. (default is None for all years)

  • dataset (str) – Version of the dataset. (default is 'v2_1k')

Returns:

Pandas DataFrame with the following additional columns, by first name:

‘n_female’, ‘n_male’, ‘n_third_gender’, ‘prop_female’, ‘prop_male’, ‘prop_third_gender’

Return type:

DataFrame

naampy.predict_fn_gender(first_names: list[str]) → DataFrame

Predicts gender based on first name.

Parameters:
  • first_names (list of str) – List of first names.

Returns:

Pandas DataFrame with predicted labels and probabilities.

Return type:

DataFrame