Utils API

ccrvam.checkerboard.utils

Module Contents

Classes

DataProcessor

Data processing engine for contingency table analysis.

Functions

gen_contingency_to_case_form

Convert a multi-dimensional contingency table to case-form data.

gen_case_form_to_contingency

Convert case form data to a multi-dimensional contingency table.

API

class ccrvam.checkerboard.utils.DataProcessor

Data processing engine for contingency table analysis.

static load_data(data: Union[str, numpy.ndarray, pandas.DataFrame], data_form: str, dimension: tuple, var_list: Optional[List[str]] = None, category_map: Optional[Dict[str, Dict[str, int]]] = None, named: bool = False, delimiter: str = None) -> numpy.ndarray

Load and process data for contingency table analysis.

Input Arguments

  • data : Data source - file path, raw data array, or data frame

  • data_form : Format of the data: “case_form”, “frequency_form”, or “table_form”

  • dimension : A tuple giving the number of categories for each variable. The length of the tuple is the number of variables, and each element is the number of categories of the corresponding variable.

  • var_list : Names of variables in order of appearance in the data (optional)

  • category_map : Mapping of categorical labels to numeric indices for each variable (optional)

  • named : Whether the first row contains variable names (for file input)

  • delimiter : Column separator character for text files (optional)

Outputs

Processed contingency table for statistical analysis

Warnings/Errors

  • ValueError : If data_form is invalid or inputs are inconsistent

  • FileNotFoundError : If the specified data file cannot be found
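The three data_form values describe the same counts in different layouts. The following is a minimal NumPy sketch of that relationship, not a call into ccrvam. The exact column layouts (category codes first, count last for frequency form) and the 1-based category coding are assumptions, and the example values are made up for illustration.

```python
import numpy as np

# Hypothetical example: variable X has 2 categories, variable Y has 3.
dimension = (2, 3)

# Table form: the contingency table itself.
table_form = np.array([[1, 0, 2],
                       [0, 3, 1]])

# Frequency form (assumed layout): one row per observed category
# combination, with 1-based category codes followed by the count.
frequency_form = np.array([
    [1, 1, 1],
    [1, 3, 2],
    [2, 2, 3],
    [2, 3, 1],
])

# Case form (assumed layout): one row per observation, no count column.
# Each category combination is repeated as many times as its count.
case_form = np.repeat(frequency_form[:, :-1], frequency_form[:, -1], axis=0)

# Rebuild the table from frequency form: shift the codes to 0-based
# indices and accumulate the counts into the matching cells.
rebuilt = np.zeros(dimension, dtype=int)
cell_index = tuple((frequency_form[:, :-1] - 1).T)
np.add.at(rebuilt, cell_index, frequency_form[:, -1])
```

In this sketch, `rebuilt` equals `table_form`, and `case_form` has one row for each of the 7 observations.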

static _apply_category_mapping(data: numpy.ndarray, category_map: Dict[str, Dict[str, int]], var_list: List[str], data_form: str) -> numpy.ndarray

Internal helper to convert qualitative categories to numerical categories (1, 2, …).
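A category_map of type Dict[str, Dict[str, int]] can be read as "variable name, then label, then numeric code". The sketch below illustrates that idea on case-form data. The function `apply_mapping_sketch`, the variable names, and the labels are hypothetical, and this is not the library's internal implementation.

```python
import numpy as np

# Hypothetical variable names and label-to-code mappings (codes start at 1).
var_list = ["pain", "treatment"]
category_map = {
    "pain": {"worse": 1, "same": 2, "better": 3},
    "treatment": {"placebo": 1, "drug": 2},
}

# Case-form data with qualitative labels, one row per observation.
raw_cases = np.array([
    ["worse", "drug"],
    ["better", "placebo"],
    ["same", "drug"],
])

def apply_mapping_sketch(data, category_map, var_list):
    """Replace each label with its numeric code, column by column."""
    coded = np.empty(data.shape, dtype=int)
    for col, var in enumerate(var_list):
        mapping = category_map[var]
        coded[:, col] = [mapping[label] for label in data[:, col]]
    return coded

coded_cases = apply_mapping_sketch(raw_cases, category_map, var_list)
```

A label missing from the mapping raises KeyError in this sketch. How the library handles that case is not stated here.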

static _process_frequency_form(data: numpy.ndarray, shape: tuple) -> numpy.ndarray

Internal helper to convert frequency form data to contingency table.

static _process_case_form(data: numpy.ndarray, shape: tuple) -> numpy.ndarray

Internal helper to convert case form data to contingency table.

static _process_table_form(data: numpy.ndarray, shape: tuple) -> numpy.ndarray

Internal helper to process table form data.

ccrvam.checkerboard.utils.gen_contingency_to_case_form(contingency_table: numpy.ndarray) -> numpy.ndarray

Convert a multi-dimensional contingency table to case-form data.

Input Arguments

  • contingency_table : Multi-dimensional contingency table containing frequency counts

Outputs

Array of case-form data in which each row is an individual observation and each column is a categorical variable
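The conversion amounts to repeating each cell's index as many times as its count. The sketch below shows this with plain NumPy. `contingency_to_cases_sketch` is a hypothetical stand-in rather than the library function, and it assumes 0-based category indices, which may differ from the coding ccrvam returns.

```python
import numpy as np

def contingency_to_cases_sketch(table):
    """Expand a table of counts into one row of indices per observation."""
    table = np.asarray(table)
    nonzero = table > 0
    # Index tuples of the non-empty cells, in row-major order.
    cells = np.argwhere(nonzero)
    # Counts of the same cells, also in row-major order.
    counts = table[nonzero]
    # Repeat each index tuple once per observation in that cell.
    return np.repeat(cells, counts, axis=0)

table = np.array([[2, 0],
                  [1, 3]])
cases = contingency_to_cases_sketch(table)
```

Here `cases` has 6 rows, one per observation, and works for tables with any number of dimensions.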

ccrvam.checkerboard.utils.gen_case_form_to_contingency(cases: numpy.ndarray, shape: tuple, axis_order: Optional[list] = None) -> numpy.ndarray

Convert case form data to a multi-dimensional contingency table.

Input Arguments

  • cases : Array where each row represents an observation with categorical variables

  • shape : Dimensions of the output contingency table

  • axis_order : (Optional) List specifying how case columns map to contingency table dimensions. For example, if cases has columns [A,B,C] and axis_order is [2,0,1], then column A maps to dimension 2, B to 0, and C to 1 in the contingency table. If None, assumes sequential mapping [0,1,2,…].

Outputs

Multi-dimensional contingency table.
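The axis_order mapping described above can be sketched as follows. `cases_to_contingency_sketch` is a hypothetical illustration, not the library's code. It assumes 0-based category indices and one case column per table dimension.

```python
import numpy as np

def cases_to_contingency_sketch(cases, shape, axis_order=None):
    """Count observations into a table; column j feeds axis axis_order[j]."""
    cases = np.asarray(cases)
    if axis_order is None:
        # Sequential mapping: column 0 -> axis 0, column 1 -> axis 1, ...
        axis_order = list(range(cases.shape[1]))
    table = np.zeros(shape, dtype=int)
    # Build one index array per table axis from the matching case column.
    index = [None] * len(shape)
    for col, axis in enumerate(axis_order):
        index[axis] = cases[:, col]
    # np.add.at accumulates correctly when the same cell appears repeatedly.
    np.add.at(table, tuple(index), 1)
    return table

# Columns are [A, B, C]; with axis_order [2, 0, 1], A indexes axis 2,
# B indexes axis 0, and C indexes axis 1, so cells are table[B, C, A].
cases = np.array([
    [1, 0, 2],
    [1, 0, 2],
    [0, 1, 0],
])
table = cases_to_contingency_sketch(cases, shape=(2, 3, 2), axis_order=[2, 0, 1])
```

With this input, the two identical observations land in `table[0, 2, 1]` and the third in `table[1, 0, 0]`.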