# Advanced Usage Guide

## Metric selection strategies

Use `__all__` to compute all registered metrics:

```bash
t2s run -d ck25 -j ./datasets/ck25/eval/ -m __all__ -ee http://localhost:8886/
```

Or run focused subsets depending on your evaluation question:

- Structural fidelity: `query_exact_match`, `token_f1`, `codebleu`
- Result quality: `answerset_precision`, `answerset_recall`, `answerset_f1`
- Ranking behavior: `mrr`, `ndcg`, `hit@1`, `p@1`

## Parallel execution

Enable multiprocessing across systems/files:

```bash
t2s run -d ck25 -j ./datasets/ck25/eval/ -m query_execution answerset_f1 -ee http://localhost:8886/ -p
```

## Export controls

Useful flags:

- `-eq` to include per-query scores
- `-ep` to write output to a custom location
- `-s` to set explicit system names

Example:

```bash
t2s run \
  -d ck25 \
  -s AIFB DBPEDIA-CG \
  -j ./datasets/ck25/eval/AIFB.jsonl ./datasets/ck25/eval/DBPEDIA-CG.jsonl \
  -m query_execution answerset_f1 \
  -ee http://localhost:8886/ \
  -eq \
  -ep ./datasets/ck25/results/custom-run.json
```

## LLM-based metrics

If selected metrics require LLM support, configure the Ollama model:

```bash
t2s run -d ck25 -j ./datasets/ck25/eval/ -m llm_judge -ee http://localhost:8886/ -lo gemma3:4b
```