Schema
An Arize class that organizes the columns of your pandas DataFrame that contain model data and maps their names to the corresponding Arize fields.
Import and initialize Arize Schema from arize.utils.types
Parameter | Data Type | Expected Type In Column | Description |
---|---|---|---|
| str | Contents must be a string limited to 128 characters | (Optional) A unique string to identify a prediction event. Required to match a prediction to delayed actuals or feature importances in Arize. If the column is not provided, Arize will generate a random prediction id. |
| List[str] or TypedColumns | Feature values can be int, float, string, list of strings | (Optional) Column names for features. If TypedColumns is used, the columns will be cast to the provided types prior to logging. |
| Learn more here | (Optional) Dictionary mapping embedding display names to | |
| str | The content of this column must be int Unix Timestamps in seconds | (Optional) Column name for timestamps |
| str | The content of this column must be convertible to string | (Optional) Column name for categorical prediction values |
| str | The content of this column must be int/float. For Multi-Class models, content of this column must be a dictionary, mapping class name to int/float prediction scores. | (Optional) Column name for numeric prediction values |
| str | The content of this column must be convertible to string | (Optional) Column name for categorical ground truth values |
| str | The content of this column must be int/float. For Multi-Class models, content of this column must be a dictionary, mapping class name to int/float actual scores. | (Optional) Column name for numeric ground truth |
| List[str] or TypedColumns | Tag values can be int, float, string. Limited to 1k values | (Optional) Column names for tags. If TypedColumns is used, the columns will be cast to the provided types prior to logging. |
| Dict[str,str] | The content of this column must be int/float | (Optional) dict of k-v pairs where k is the feature_colname and v is the corresponding shap_val_col_name. For example, your dataframe contains features columns |
| str | The content of this column must be string and is limited to 128 characters | (Required*) Column name for ranking groups or lists in ranking models *for ranking models only |
| str | The content of this column must be an integer between 1 and 100 | (Required*) Column name for the rank of each element within its group or list *for ranking models only |
| str | The content of this column must be int/float | (Required*) Column name for ranking model type numeric ground truth values *for ranking models only |
| str | The content of this column must be a string | (Required*) Column name for ranking model type categorical ground truth values *for ranking models only |
| ObjectDetectionColumnNames | Learn more here | ObjectDetectionColumnNames object containing information defining the predicted bounding boxes' coordinates, categories, and scores. |
| ObjectDetectionColumnNames | Learn more here | ObjectDetectionColumnNames object containing information defining the actual bounding boxes' coordinates, categories, and scores. |
| EmbeddingColumnNames | Learn more here | EmbeddingColumnNames object containing the embedding vector data (required) and raw text (optional) for the input text your model acts on |
| EmbeddingColumnNames | Learn more here | EmbeddingColumnNames object containing the embedding vector data (required) and raw text (optional) for the text your model generates |
| PromptTemplateColumnNames | Learn more here | PromptTemplateColumnNames object containing the prompt template and prompt template version, both optional |
| LLMConfigColumnNames | Learn more here | LLMConfigColumnNames object containing the LLM model name (optional) and its hyper-parameters (optional) used at inference time |
| LLMRunMetadataColumnNames | Learn more here | LLMRunMetadataColumnNames object containing metadata about the LLM inference, such as token counts and response latency |
| str | The contents of this column must be a list of entries convertible to strings | Column name for retrieved document ids |
| str | Contents of this column must be a dictionary mapping string class names to float scores. Learn more here | (Optional) Column name used only for Multi-Label Multi-Class models and determines the minimum prediction value for a class to be considered a positive prediction. |
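
Several of the content constraints in the table can be checked on the client side before logging. The following is a minimal sketch in plain Python and is not part of the Arize SDK: the helper names and the column names are hypothetical, and it assumes the 1 to 100 rank range is inclusive.

```python
def is_valid_prediction_id(value):
    """Prediction ids must be strings of at most 128 characters."""
    return isinstance(value, str) and len(value) <= 128


def is_valid_timestamp(value):
    """Timestamps must be int Unix timestamps in seconds (bool is rejected)."""
    return isinstance(value, int) and not isinstance(value, bool)


def is_valid_rank(value):
    """Ranks must be integers from 1 to 100 (assumed inclusive)."""
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and 1 <= value <= 100
    )


def is_valid_feature_value(value):
    """Feature values can be int, float, string, or a list of strings."""
    if isinstance(value, list):
        return all(isinstance(item, str) for item in value)
    return isinstance(value, (int, float, str))


def find_problems(row, checks):
    """Return the names of columns in `row` whose value fails its check."""
    return [
        col for col, check in checks.items()
        if col in row and not check(row[col])
    ]


# Hypothetical column names, chosen only for this illustration.
checks = {
    "prediction_id": is_valid_prediction_id,
    "prediction_ts": is_valid_timestamp,
    "rank": is_valid_rank,
    "feature_2": is_valid_feature_value,
}

row = {
    "prediction_id": "1fcd50f4689",
    "prediction_ts": 1637538845,
    "rank": 101,  # out of range on purpose
    "feature_2": ["ca", "ak"],
}
print(find_problems(row, checks))  # prints ['rank']
```

Running checks like these per row is slow on large dataframes; the sketch is meant to make the table's constraints concrete, not to replace the validation the SDK performs.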
Code Example
prediction id | feature_1 | feature_2 | tag_1 | tag_2 | prediction_ts | prediction_label | actual_label | embedding |
---|---|---|---|---|---|---|---|---|
1fcd50f4689 | ca | [ca, ak] | female | 25 | 1637538845 | No Claims | No Claims | [1.27346, -0.2138, ...] |
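
The row above could be mapped to a Schema roughly as follows. This is a sketch under stated assumptions, not the page's original example: the keyword names passed to Schema and EmbeddingColumnNames are recalled from the Arize pandas SDK and do not appear in the table above, so check them against your installed version. The embedding display name and the helper `referenced_columns` are hypothetical. The imports are guarded so that the sketch still runs where pandas or the SDK is not installed.

```python
# One record matching the example row, keyed by the column names shown above.
records = [{
    "prediction id": "1fcd50f4689",
    "feature_1": "ca",
    "feature_2": ["ca", "ak"],
    "tag_1": "female",
    "tag_2": 25,
    "prediction_ts": 1637538845,
    "prediction_label": "No Claims",
    "actual_label": "No Claims",
    # The source truncates this vector; only the two visible values are used.
    "embedding": [1.27346, -0.2138],
}]

# Assumed Schema keyword names (verify against your SDK version).
schema_kwargs = {
    "prediction_id_column_name": "prediction id",
    "feature_column_names": ["feature_1", "feature_2"],
    "tag_column_names": ["tag_1", "tag_2"],
    "timestamp_column_name": "prediction_ts",
    "prediction_label_column_name": "prediction_label",
    "actual_label_column_name": "actual_label",
}


def referenced_columns(kwargs):
    """Flatten the column names referenced by the schema arguments."""
    columns = []
    for value in kwargs.values():
        columns.extend(value if isinstance(value, list) else [value])
    return columns


# Every column the schema refers to should exist in the data.
missing = [c for c in referenced_columns(schema_kwargs) if c not in records[0]]
print(missing)  # prints []

try:
    import pandas as pd
    from arize.utils.types import EmbeddingColumnNames, Schema
except ImportError:
    schema = None  # pandas or the Arize SDK is not available here
else:
    df = pd.DataFrame(records)
    schema = Schema(
        embedding_feature_column_names={
            # Hypothetical display name mapped to the vector column.
            "embedding": EmbeddingColumnNames(vector_column_name="embedding"),
        },
        **schema_kwargs,
    )
```

Checking `missing` before constructing the Schema catches misspelled column names early, which is the most common way a mapping like this goes wrong.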