Structured Data
Find structure in data
From the chatlas docs:
- Article summaries: Extract key points from lengthy reports or articles to create concise summaries for decision-makers.
- Entity recognition: Identify and extract entities such as names, dates, and locations from unstructured text to create structured datasets.
- Sentiment analysis: Extract sentiment scores and associated entities from customer reviews or social media posts to gain insights into public opinion.
- Classification: Classify text into predefined categories, such as spam detection or topic classification.
- Image/PDF input: Extract data from images or PDFs, such as tables or forms, to automate data entry processes.
Components
- chatlas: Chat.extract_data() method
- pydantic: data model from BaseModel, with optional Field descriptions
Simple example
import chatlas as ctl
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

chat = ctl.ChatOpenAI()
chat.extract_data(
    "My name is Susan and I'm 13 years old",
    data_model=Person,
)

Output:
{"name": "Susan", "age": 13}
Add descriptions
- Field(): add a description to each field of the model
- Type hint with None: allow a field to be optional
import chatlas as ctl
from pydantic import BaseModel, Field

class Person(BaseModel):
    """A person"""
    name: str = Field(description="Name")
    age: int = Field(description="Age, in years")
    hobbies: list[str] | None = Field(
        description="List of hobbies. Should be exclusive and brief."
    )

chat = ctl.ChatAnthropic()  # changed to Anthropic
chat.extract_data(
    "My name is Susan and I'm 13 years old",
    data_model=Person,
)

Article summary
Demo Chatlas docs: https://posit-dev.github.io/chatlas/structured-data/article-summary.html
Entity recognition
- If you want a pandas DataFrame as the output:
- You need to create a row-wise spec of the data (a model for one row)
- Then create a second model that holds a list of those rows
Demo Chatlas docs: https://posit-dev.github.io/chatlas/structured-data/entity-recognition.html
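The row-wise pattern above can be sketched as follows. The model names (NamedEntity, NamedEntities), the field names, and the `extracted` dict are illustrative assumptions. The dict is a hand-written stand-in for what `chat.extract_data(text, data_model=NamedEntities)` might return, so no LLM call is made here.

```python
import pandas as pd
from pydantic import BaseModel, Field


class NamedEntity(BaseModel):
    """Spec for one row of the final table (hypothetical fields)"""

    name: str = Field(description="The extracted entity name")
    kind: str = Field(description="Entity type, e.g. person, location, organization")


class NamedEntities(BaseModel):
    """Wrapper model holding the list of rows"""

    entities: list[NamedEntity] = Field(description="All extracted entities")


# Hand-written stand-in for an extraction result, not real model output
extracted = {
    "entities": [
        {"name": "Susan", "kind": "person"},
        {"name": "New York", "kind": "location"},
    ]
}

# Optional sanity check that the stand-in matches the wrapper model
NamedEntities.model_validate(extracted)

# A list of row dicts converts directly into a DataFrame, one row per entity
df = pd.DataFrame(extracted["entities"])
print(df)
```

The wrapper model is what gets passed as `data_model`. The row model only describes a single record.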
Sentiment analysis
Demo Chatlas docs: https://posit-dev.github.io/chatlas/structured-data/sentiment-analysis.html
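For sentiment scores, a data model can also state the allowed range of each score. A minimal sketch, assuming three hypothetical score fields and a hand-written `result` dict standing in for an extraction result. The range check shown is plain pydantic validation run locally, not a chatlas feature.

```python
from pydantic import BaseModel, Field, ValidationError


class Sentiment(BaseModel):
    """Sentiment scores for one piece of text (hypothetical fields)"""

    positive_score: float = Field(ge=0.0, le=1.0, description="Positive sentiment, 0.0 to 1.0")
    negative_score: float = Field(ge=0.0, le=1.0, description="Negative sentiment, 0.0 to 1.0")
    neutral_score: float = Field(ge=0.0, le=1.0, description="Neutral sentiment, 0.0 to 1.0")


# Hand-written stand-in for an extraction result, not real model output
result = {"positive_score": 0.1, "negative_score": 0.7, "neutral_score": 0.2}

sentiment = Sentiment.model_validate(result)
print(sentiment.negative_score)

# A score outside the declared range is rejected by pydantic
try:
    Sentiment.model_validate({**result, "positive_score": 1.5})
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```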
Classification
Demo Chatlas docs: https://posit-dev.github.io/chatlas/structured-data/classification.html
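For predefined categories such as spam detection, one option is to restrict a field to a fixed set of labels with `typing.Literal`. A minimal sketch, assuming a hypothetical SpamCheck model and label names. It only inspects the pydantic schema and does not call an LLM.

```python
from typing import Literal

from pydantic import BaseModel, Field


class SpamCheck(BaseModel):
    """Classification of one message (hypothetical model)"""

    label: Literal["spam", "not_spam"] = Field(description="Predicted category")
    reason: str = Field(description="One short sentence explaining the label")


schema = SpamCheck.model_json_schema()

# The allowed categories appear in the schema as an enum
print(schema["properties"]["label"]["enum"])

# Only the listed labels pass validation
check = SpamCheck(label="spam", reason="Asks for bank details")
print(check.label)
```

Listing the categories in the type, rather than only in the prompt, keeps the set of possible labels in one place.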
Multi-modal input
- Images
- PDFs
Demo Chatlas docs: https://posit-dev.github.io/chatlas/structured-data/multi-modal.html
Multi-modal input: Images
Demo: https://github.com/chendaniely/nydsaic2025-llm/blob/main/code/04-structured/02-image.py
Multi-modal input: PDF
Demo: https://github.com/chendaniely/nydsaic2025-llm/blob/main/code/04-structured/03-pdf.py