SDK Reference
Datasets
Using evaluation datasets
An evaluation dataset assesses model performance by supplying the model with specific contexts, instructions, and questions, then comparing the generated outputs against expected answers.
Keys
Nomadic can ingest evaluation datasets in the following format. Each entry in the evaluation dataset is a dictionary with the following keys:
| Key | Required | Description | Example |
| --- | --- | --- | --- |
| Context | No | Enhances the prompt. Supplies additional background information that helps the model generate accurate responses. | “You are a doctor writing a visit note from a transcript of the doctor-patient conversation.” |
| Instruction | No | Enhances the prompt. Gives the model specific guidance on what action to perform. | “Absolutely do not hallucinate. Capture only factual information.” |
| Question | Yes | The user input or query that prompts the model to generate a response. This is the only key that is always required. | “What were the main topics discussed?” |
| Answer | Only when using a supervised evaluator such as cosine similarity | The expected output or response from the model, which serves as the benchmark for evaluation. | “Investment strategies, retirement planning, and risk management.” |
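When an Answer is provided, a supervised evaluator scores the model output against it. As an illustration of the idea (not Nomadic's implementation), the sketch below computes a bag-of-words cosine similarity between a generated answer and the expected one, using only the standard library:

```python
import math
import re
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity between two strings (0.0 to 1.0)."""
    tokenize = lambda s: re.findall(r"[a-z0-9']+", s.lower())
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    # Dot product over the shared vocabulary.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

expected = "Investment strategies, retirement planning, and risk management."
generated = "The call covered retirement planning, risk management, and investment strategies."
score = cosine_similarity(generated, expected)
```

Production evaluators typically use embedding vectors rather than raw token counts, but the comparison principle is the same: a score near 1.0 means the generated output closely matches the benchmark Answer.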
Example Entry
Here is an example of an entry in a sample evaluation dataset:
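A hypothetical entry is sketched below. The Question and Answer values are taken from the table above; the Context and Instruction values are illustrative stand-ins chosen to match them:

```python
# One entry in an evaluation dataset. Only "Question" is always required.
eval_entry = {
    # Hypothetical context/instruction, written to fit the Q&A pair below.
    "Context": "You are a financial advisor summarizing a transcript of a client call.",
    "Instruction": "Capture only factual information from the transcript.",
    "Question": "What were the main topics discussed?",
    "Answer": "Investment strategies, retirement planning, and risk management.",
}
```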
Basic Usage
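The snippet below is a minimal sketch of how a dataset in this format might be consumed, not the Nomadic API itself: `build_prompt` and `run_model` are hypothetical stand-ins showing how the optional keys combine into a prompt and how the output would be compared against the Answer.

```python
dataset = [
    {
        "Question": "What were the main topics discussed?",
        "Answer": "Investment strategies, retirement planning, and risk management.",
    },
]

def build_prompt(entry: dict) -> str:
    # Context and Instruction are optional; include them only when present.
    parts = [entry.get("Context"), entry.get("Instruction"), entry["Question"]]
    return "\n".join(p for p in parts if p)

def run_model(prompt: str) -> str:
    # Hypothetical stand-in; in practice this calls the model under evaluation.
    return "Retirement planning and risk management were the main topics."

for entry in dataset:
    output = run_model(build_prompt(entry))
    # When an Answer is present, score `output` against it with a
    # supervised evaluator such as cosine similarity.
```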
Custom Dataset Ingress
Coming soon!