LLM-Assisted Generation

Use a language model to produce a richer schema from an open-ended story. This is useful for complex or unusual domains where the rule-based parser falls short.

```bash
pip install "misata[llm]"
```

Supported providers

| Provider | Env var | Notes |
|---|---|---|
| groq | GROQ_API_KEY | Fast, free tier available |
| openai | OPENAI_API_KEY | GPT-4o / GPT-4-turbo |
| anthropic | ANTHROPIC_API_KEY | Claude Sonnet / Opus |
| gemini | GOOGLE_API_KEY | Gemini Pro via OpenAI-compat endpoint |
| ollama | (none) | Fully local, no API key |
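
Before constructing a generator, it can help to check which providers are usable in the current environment. The sketch below is not part of misata: `PROVIDER_ENV_VARS` and `available_providers` are hypothetical names, and the mapping is simply copied from the table above.

```python
import os

# Provider name -> API-key environment variable, copied from the table above.
# This mapping and the helper below are illustrative, not part of misata.
PROVIDER_ENV_VARS = {
    "groq": "GROQ_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "ollama": None,  # fully local, no API key required
}


def available_providers(env=None):
    """Return providers whose API key is set (or that need no key)."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if var is None or env.get(var)
    ]


if __name__ == "__main__":
    print(available_providers())
```

Note that `ollama` is always listed because it needs no key; whether a local Ollama server is actually running is not checked here.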

Usage

```python
import misata
from misata import LLMSchemaGenerator

# Pick a provider; the matching API key must be set in the environment.
gen = LLMSchemaGenerator(provider="groq")
# gen = LLMSchemaGenerator(provider="anthropic")
# gen = LLMSchemaGenerator(provider="ollama", model="llama3")

# Describe the dataset in plain language; the LLM returns a schema.
schema = gen.generate_from_story(
    "A fraud detection dataset — 2% positive rate, FICO scores, "
    "transaction velocity features, device fingerprints"
)

# Generate the tables from the schema.
tables = misata.generate_from_schema(schema)
```

When to use it

  • Your domain is niche and the story parser returns a generic schema
  • You need column-level semantics that require world knowledge (e.g. realistic medical codes)
  • You want to iterate on schema design in natural language before committing to YAML

LLM → YAML → version control

Generate once with the LLM, save the schema to YAML, then commit it so future runs are deterministic and free.

```python
schema = gen.generate_from_story("A logistics company …")
misata.save_yaml_schema(schema, "logistics.yaml")
```
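
One way to enforce the "generate once" rule is to call the LLM only when the YAML file does not exist yet. The following is a minimal sketch under that assumption: `ensure_schema_file` is a hypothetical helper, not a misata function, and it takes the generate and save callables as arguments so it makes no claims about misata beyond the two calls shown above.

```python
from pathlib import Path
from typing import Any, Callable


def ensure_schema_file(
    path: str,
    story: str,
    generate: Callable[[str], Any],
    save: Callable[[Any, str], None],
) -> bool:
    """Generate and save a schema only if `path` does not exist yet.

    Returns True if the LLM was called, False if the committed file was
    already present and nothing was regenerated.
    """
    target = Path(path)
    if target.exists():
        # A committed schema is already there: skip the LLM call entirely.
        return False
    schema = generate(story)
    save(schema, str(target))
    return True
```

With the objects from the snippet above, this would be called as `ensure_schema_file("logistics.yaml", story, gen.generate_from_story, misata.save_yaml_schema)`, where `story` is the same prompt text. Once the file is committed, later runs skip the LLM call.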