Writing Cleaner LLM Code with Claudette, Cosette, and Gaspard

Unpacking three handy Python libraries created by Answer.AI for LLM development.

Author: Nathan Horvath

Published: 2025-05-13

If you've ever worked directly with Large Language Model (LLM) APIs, you've likely encountered the same frustration I did: writing excessive boilerplate code just to handle basic conversations. Copying and pasting message histories, parsing JSON responses, and writing increasingly convoluted prompt engineering hacks just to get consistent output formats gets tedious fast.

Fortunately, others in the community have encountered similar issues and built tools to circumvent them. I was recently introduced to the suite of Answer.AI libraries: Claudette, Cosette, and Gaspard. These open-source Python wrappers dramatically simplify interactions with Anthropic, OpenAI, and Google Gemini APIs respectively.

The beauty of these libraries isn't that they add new functionality, but rather that they strip away the unnecessary complexity that can make working with LLMs feel more like fighting with JSON than building something useful. They handle conversation state, structured outputs, and tool use in a more natural way. Furthermore, these libraries run entirely on your local machine; the only external data transmission occurs when interacting with the official LLM APIs themselves. This means your application code, custom tools, and conversation history aren't shared with any additional third parties.

In this post, I'll show you some simple examples of how these libraries can simplify your code when building LLM applications. I'll focus primarily on Claudette for Anthropic's Claude models, with a brief highlight of a special feature from each of Cosette and Gaspard. The core functionality shown here applies to all three libraries, though, so you can choose your preferred provider and work with that.

Traditional API Approach

Let's look at how we'd typically interact with Claude using Anthropic's standard SDK for a simple LLM call:

import anthropic
import os

# Create a client
anthropic_client = anthropic.Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
)

# Initial message
response = anthropic_client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Who was the seventh Prime Minister of Canada?"}
    ]
)
print(response.content[0].text)
The seventh Prime Minister of Canada was Wilfrid Laurier. He served as Prime Minister from July 11, 1896, to October 6, 1911, making him the first francophone (French-speaking) Prime Minister of Canada. Laurier is known for promoting national unity between French and English Canadians and for his role in Canada's development during a period of significant immigration and economic growth.

For follow-up questions, you need to manually construct the conversation history:

# Follow-up requires including previous messages
follow_up_response = anthropic_client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Who was the seventh Prime Minister of Canada?"},
        {"role": "assistant", "content": response.content[0].text},
        {"role": "user", "content": "What was his political party?"}
    ]
)
print(follow_up_response.content[0].text)
Wilfrid Laurier was a member of the Liberal Party of Canada. He led the Liberal Party from 1887 until 1919 and served as Prime Minister under the Liberal banner during his time in office from 1896 to 1911. His leadership helped establish the Liberals as a major political force in Canada during the late 19th and early 20th centuries.

This works, but notice the unnecessary complexity. You need to manually track the conversation history, adding previous messages to each new request. For multi-turn interactions, this creates verbose code that focuses more on message management than on your actual task. If you're building something like a chatbot, this approach quickly becomes unwieldy as the conversation grows.
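
To make the message bookkeeping concrete, here's a minimal sketch of what a multi-turn loop ends up looking like with the raw SDK (the history list and the ask helper are mine, not part of the SDK):

history = []

def ask(client, text, model="claude-3-7-sonnet-20250219"):
    # Every call must resend the full history
    history.append({"role": "user", "content": text})
    resp = client.messages.create(model=model, max_tokens=1024, messages=history)
    reply = resp.content[0].text
    # Forgetting to append the assistant's reply silently breaks the next turn
    history.append({"role": "assistant", "content": reply})
    return reply

ask(anthropic_client, "Who was the seventh Prime Minister of Canada?")
ask(anthropic_client, "What was his political party?")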

A Simpler Approach with Claudette

Now let's see how Claudette simplifies this:

from claudette import Chat, models, contents

# models is a list of model name strings
# We'll use claude-3-7-sonnet, but you can pick from several of Anthropic's offerings
claude37sonnet = models[1]

# Create a chat instance
chat = Chat(claude37sonnet)

# Initial message
response = chat("Who was the seventh Prime Minister of Canada?")
print(contents(response))
The seventh Prime Minister of Canada was Wilfrid Laurier. He served as Prime Minister from July 11, 1896, to October 6, 1911, making him the first francophone (French-speaking) Prime Minister of Canada. Laurier led the Liberal Party and is known for promoting national unity between French and English Canadians, as well as overseeing a period of significant economic growth and western expansion in Canada.
# Follow-up is much simpler too
follow_up = chat("What was his date of birth?")
print(contents(follow_up))
Wilfrid Laurier was born on November 20, 1841, in Saint-Lin, Canada East (now Saint-Lin-Laurentides, Quebec).

The difference is immediately clear. Claudette maintains conversation state automatically, making follow-up interactions as simple as calling the chat object again. This is particularly valuable when developing interactive applications or exploring data in notebooks, and it keeps your code clean and readable. When building a prototype or data exploration tool, this simplicity allows you to focus on the insights you're trying to gather rather than API mechanics.
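
Under the hood, the Chat object carries that state itself. If you ever need to inspect the accumulated history (say, for debugging), it lives on the chat object; in the version I used it's the h attribute, though treat that as an internal detail that could change:

# The Chat object accumulates the message history internally
print(len(chat.h))  # grows with each turn (user and assistant messages)
print(chat.h[0])    # the first user message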

Just like with the traditional API, you can also define a custom system prompt with Claudette to set the tone, behavior, or instructions for your entire conversation while maintaining concise code:

chat = Chat(claude37sonnet, sp="Always respond as if you are a pirate.") 

pirate_response = chat("Who was the seventh Prime Minister of Canada?")
print(contents(pirate_response))
Arr, ye be askin' about the seventh scallywag to captain the ship o' Canada, do ye?

That be none other than Wilfrid Laurier, who took the helm from 1896 to 1911, a mighty long voyage of 15 years! A French-Canadian buccaneer, he was the first mate of French descent to command Canada's vessel.

Under his leadership, the Canadian ship sailed through prosperous waters - new provinces joined the crew, gold was discovered in the Yukon, and many new hands came aboard from distant lands.

Shiver me timbers, Laurier was known for his silver tongue and diplomatic ways, tryin' to find the middle course between English and French interests in the great Dominion. A clever navigator of political waters, he was!

Structured Output

One common task is extracting structured information from LLM responses. When building data pipelines or applications that need to process LLM outputs programmatically, getting consistent data structures is crucial. The traditional approach requires crafting precise prompts and manual parsing:

prompt = """Please provide information about Wilfred Laurier in this exact JSON format:
{
  "name": "Full Name",
  "birth_year": YYYY,
  "death_year": YYYY or "Alive" if still alive,
  "political_party": "Party name",
  "accomplishments": ["Accomplishment1", "Accomplishment2", ...]
}
Only return the JSON, nothing else."""

response = anthropic_client.messages.create(
    model=claude37sonnet,
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}]
)

# Would need parsing to convert to Python object
print(response.content[0].text)
{
  "name": "Wilfrid Laurier",
  "birth_year": 1841,
  "death_year": 1919,
  "political_party": "Liberal Party of Canada",
  "accomplishments": [
    "Canada's first francophone Prime Minister (1896-1911)",
    "Led Canada during a period of rapid growth and prosperity",
    "Created Alberta and Saskatchewan as provinces in 1905",
    "Established the Department of External Affairs in 1909",
    "Championed national unity between English and French Canadians",
    "Implemented a policy of increased immigration to populate Western Canada",
    "Negotiated a free trade agreement with the United States (though it was not implemented)",
    "Oversaw construction of a second transcontinental railway"
  ]
}
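
And even when the model complies, you still have to turn that text into a Python object yourself. Here's a minimal sketch of the parsing step, including the kind of defensive cleanup that tends to accumulate (the fence-stripping fallback is illustrative, not exhaustive):

import json

raw = response.content[0].text
try:
    pm_data = json.loads(raw)
except json.JSONDecodeError:
    # Models sometimes wrap output in markdown fences or add commentary,
    # so parsing code grows cleanup hacks like this one
    raw = raw.strip().removeprefix("```json").removesuffix("```")
    pm_data = json.loads(raw)

print(pm_data["name"])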

This approach has several issues:

  1. You need to carefully craft your JSON structure in the prompt
  2. The model might not follow your format exactly
  3. You still need to parse the output text into a Python object
  4. Any changes to your data structure require updating both your prompt and your parsing code

With Claudette (and friends), you can define a class and get structured data directly. I'll demonstrate this first by defining a custom PrimeMinister class that stores various details of Prime Ministers I'm interested in collecting:

from fastcore.utils import basic_repr, store_attr
class PrimeMinister:
    '''Information about a Canadian Prime Minister including biographical details and key accomplishments.'''
    def __init__(self,
                 name: str,                     # Full name
                 birth_date: str,               # Date of birth (YYYY-MM-DD)
                 death_date: str,               # Date of death (YYYY-MM-DD), "Alive" if still alive
                 political_party: str,          # Political party affiliation
                 accomplishments: list[str],    # List of accomplishments        
                ): store_attr()

    __repr__ = basic_repr('name,birth_date,death_date,political_party,accomplishments')

sample_pm = PrimeMinister("William Lyon Mackenzie King", "1874-12-17", "1950-07-22", "Liberal", ["Introduction of old age pensions", "Creation of the Canadian Broadcasting Corporation"])
print(sample_pm)
PrimeMinister(name='William Lyon Mackenzie King', birth_date='1874-12-17', death_date='1950-07-22', political_party='Liberal', accomplishments=['Introduction of old age pensions', 'Creation of the Canadian Broadcasting Corporation'])

Next, this class definition can be passed as a tool to Claudette to structure the output as a Python object.

from claudette import Client

# Using the structured output feature
anthropic_cli = Client(claude37sonnet)
pm = anthropic_cli.structured("Provide information about Canada's seventh Prime Minister", PrimeMinister)[0]
print(pm)
PrimeMinister(name='Wilfrid Laurier', birth_date='1841-11-20', death_date='1919-02-17', political_party='Liberal', accomplishments=['First francophone Prime Minister of Canada', 'Promoted national unity between English and French Canadians', 'Oversaw significant immigration and settlement of Western Canada', 'Established the Department of External Affairs', 'Created Alberta and Saskatchewan as provinces in 1905', "Implemented a policy of 'sunny ways' to resolve conflicts", 'Served as Prime Minister for 15 years (1896-1911)', 'Negotiated a reciprocity agreement with the United States (though it led to his defeat)', 'Expanded the railway system across Canada'])

Behind the scenes, Claudette (and friends) use fastcore's docments to parse the class definition, generate an appropriate JSON schema, and handle the communication with Claude.
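
You can peek at what docments extracts yourself. A quick sketch using fastcore directly (Claudette's internal wiring may differ, but the parsed parameter docs look roughly like this):

from fastcore.docments import docments

# docments reads the inline comment next to each parameter and returns
# a mapping of parameter names to their documentation
print(docments(PrimeMinister))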

This structured output approach provides several advantages:

  1. Your data structure is defined in code, not in a prompt
  2. You get a proper Python object back, not a string that needs parsing
  3. Type hinting and documentation in the class provide guidance to the model
  4. Changes to your data structure only need to be made in one place

This is particularly valuable when building data pipelines or applications that need to consume LLM outputs programmatically. This approach ensures consistent data structures that can feed directly into your downstream processes.
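
As a small illustration, the structured result drops straight into a tabular pipeline because it's an ordinary Python object (a sketch assuming pandas is installed, using the pm object from above):

import pandas as pd

# store_attr() placed each field on the instance, so __dict__ has them all
df = pd.DataFrame([pm.__dict__])
print(df[["name", "birth_date", "political_party"]])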

Enhancing LLMs with Tools

While structured data output is powerful on its own, these libraries really shine when we combine this capability with tools, since LLMs excel at reasoning but often struggle with precise calculations or accessing specific data. Tools let us not just get data out of the model, but have the model interact with our own code, enhancing regular AI workflows.

Let's explore how this works with a silly toy example: accurately and precisely calculating a Prime Minister's age in both human and dog years. First, I'll create a helper function to retrieve structured Prime Minister data:

def _prime_minister(name: str) -> PrimeMinister:
    """Retrieve detailed information about a Canadian Prime Minister."""
    cli = Client(claude37sonnet)
    return cli.structured(f"Provide detailed information about {name}, a Canadian Prime Minister", PrimeMinister)[0]

pm1 = _prime_minister('John A. Macdonald')
print(pm1)
PrimeMinister(name='Sir John Alexander Macdonald', birth_date='1815-01-11', death_date='1891-06-06', political_party='Conservative', accomplishments=['First Prime Minister of Canada (1867-1873, 1878-1891)', 'Father of Confederation - played a key role in the creation of Canada in 1867', 'Oversaw the expansion of Canada to include Manitoba, British Columbia, and Prince Edward Island', 'Initiated the construction of the Canadian Pacific Railway', 'Created the North-West Mounted Police (later the RCMP)', 'Implemented the National Policy of tariff protection to foster Canadian industry'])

Great! Now we have accurate birth and death date information. While LLMs might know these dates from their training, they often struggle with precise calculations based on them without the use of tools.

Next, I'll create a function that calculates lifespan precisely to the day:

from datetime import datetime
from dateutil.relativedelta import relativedelta

def _pm_lifespan(pm: PrimeMinister) -> str:
    """Calculate the lifespan of a Canadian Prime Minister in years, months, and days."""
    birth_dt = datetime.strptime(pm.birth_date, '%Y-%m-%d')
    if pm.death_date == 'Alive':
        death_dt = datetime.now()
        status = "has lived so far"
    else:
        death_dt = datetime.strptime(pm.death_date, '%Y-%m-%d')
        status = "lived"
    delta = relativedelta(death_dt, birth_dt)
    return(f"{pm.name} {status} for {delta.years} years, {delta.months} months, and {delta.days} days.")

_pm_lifespan(pm1)
'Sir John Alexander Macdonald lived for 76 years, 4 months, and 26 days.'

This calculation would be difficult for an LLM to perform accurately without tools since it requires precise date arithmetic. Even if it managed the arithmetic, it would be unlikely to answer to this level of precision without heavy prompt engineering.

Finally, let's add a simple function to convert human years to dog years:

def dog_years(years: int, months: int, days: int) -> float:
    '''Converts human lifespan into dog years'''
    return 7 * (years + (months / 12) + (days / 365))

dog_years(76, 4, 26)
534.8319634703196

Now let's chain these tools together with Claudette!

chat = Chat(claude37sonnet, tools=[_prime_minister, _pm_lifespan, dog_years])
result = chat.toolloop("How long did Robert Borden live in dog years?")
print(contents(result))
Robert Borden lived for approximately 580.7 dog years. He was born on June 26, 1854, and died on June 10, 1937, giving him a human lifespan of 82 years, 11 months, and 15 days, which converts to about 580.7 dog years.

Here, Claudette automatically detected that the question required multiple computational steps and:

  1. Used the _prime_minister function to get Borden's biographical data (not a traditional tool, since it makes a nested LLM call)
  2. Passed that data to the _pm_lifespan tool to calculate his exact lifespan
  3. Used the dog_years tool to convert that precise lifespan to dog years
  4. Synthesized a natural language response with the result
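
If you'd like to watch those intermediate steps rather than take them on faith, toolloop can echo each exchange through a trace function. A minimal sketch (trace_func is the parameter name in the version I used; check the Claudette docs for yours):

# Re-run the same question, printing each intermediate message as it happens
chat = Chat(claude37sonnet, tools=[_prime_minister, _pm_lifespan, dog_years])
result = chat.toolloop("How long did Robert Borden live in dog years?",
                       trace_func=print)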

Let's compare this to asking Claude directly without tools:

chat = Chat(claude37sonnet)
response = chat("How long did Robert Borden live in dog years?")
print(contents(response))
To calculate Robert Borden's age in "dog years," I'll first find his actual lifespan.

Sir Robert Borden (Canada's 8th Prime Minister) lived from June 26, 1854 to June 10, 1937, making him 82 years old at death.

Using the common conversion of 1 human year = 7 dog years, Borden's age in dog years would be:
82 × 7 = 574 dog years

Note that this is just a fun calculation using the simplified 7:1 ratio. The actual aging relationship between dogs and humans is more complex, with dogs aging more rapidly in their early years and then more slowly later on.

The direct approach yields 574 dog years, which is off by several years from our more precise calculation. Even with Claude's extended thinking capability, the accuracy improves but isn't exact:

chat = Chat(claude37sonnet)
response = chat("How long did Robert Borden live in dog years?", maxthinktok=1024)
print(contents(response))
To calculate Robert Borden's lifespan in "dog years," I need to:

1) Find his actual lifespan: Robert Borden (Prime Minister of Canada) lived from 1854 to 1937, which means he lived for 83 human years.

2) Convert to dog years: Using the common (though simplified) conversion of 1 human year = 7 dog years

83 human years × 7 = 581 dog years

So Robert Borden lived approximately 581 "dog years."

(Note that the 7:1 ratio is just a popular approximation - actual dog aging varies by breed and isn't linear throughout their lives.)

This fun example still demonstrates some powerful capabilities that can apply to serious business contexts. In real-world applications, you might use similar techniques to process financial data with LLMs while ensuring computational accuracy, build customer service bots that can both understand requests and query databases, or develop document processing systems that extract, validate, and transform information.

Unique Features Across Libraries

The capabilities we've explored so far are available across all three Answer.AI libraries. This consistent interface makes it easy to work with your preferred LLM provider. However, each library also offers unique features that leverage the special capabilities of their respective model providers. Let's explore a distinct element from each.

Claudette: Prefill for Controlled Responses

Claudette supports Claude's "prefill" feature, which lets you specify how a response should begin:

prefill_chat = Chat(claude37sonnet)
response = prefill_chat("Write a brief sentence about John Diefenbaker's legacy in the style of Donald Trump.",
                prefill="Dief the Chief")
print(contents(response))
Dief the Chief, not a great leader, believe me, Canada was a disaster under him, total mess with the Avro Arrow, very sad, but he did some things with rights, not as good as me, but OK, many people are saying this.

This prefill capability is extremely useful when you need consistent response formats or want to guide the model's tone without complex prompt engineering. For applications like content generation or character-based interactions, it provides a subtle way to control output without constraining the model's creativity.
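
Prefill is also handy for format control. Starting the response with an opening brace, for instance, nudges Claude toward emitting bare JSON (a quick sketch; for anything load-bearing I'd still use the structured output approach shown earlier):

json_chat = Chat(claude37sonnet)
response = json_chat("Give John Diefenbaker's name and party as JSON.",
                     prefill="{")
print(contents(response))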

Cosette: Azure OpenAI Support

Cosette makes it easy to work with Microsoft Azure OpenAI in corporate environments, where data privacy is paramount:

from cosette import Chat, Client, contents
from openai import AzureOpenAI
import os

azure_client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-06-01"
)

# Wrap Azure client with Cosette client
cosette_client = Client(model="gpt-4o", cli=azure_client)

# Near-identical structure as Claudette
chat = Chat(cli=cosette_client)
response = chat("Who is the seventh Prime Minister of Canada?")
print(contents(response))
The seventh Prime Minister of Canada was Sir Wilfrid Laurier, who served from July 11, 1896, to October 6, 1911. He was the first French-Canadian to hold the office of Prime Minister and is remembered for his efforts to maintain national unity in Canada, as well as his support for moderate social reforms and economic development.

Many organizations restrict API access to approved cloud environments, and the Azure integration allows you to use all of Cosette's features while complying with corporate privacy policies. You get the same simplified interface regardless of whether you're using OpenAI's API directly or through Azure.

Gaspard: Multimedia Support

Gaspard shines with its multimedia capabilities:

from IPython.display import Image, display
from pathlib import Path

img_fn = Path("kitty.jpg")
display(Image(filename=str(img_fn), width=500))

Stephen Harper with kitten

import os
import google.generativeai as genai
from gaspard import Chat, contents

genai.configure(api_key=os.environ.get("GEMINI_API_KEY"))
gemini_chat = Chat('gemini-2.0-flash-exp')

response = gemini_chat([img_fn, "How many Prime Ministers are there in this photo?"])
print(contents(response))
Uploading media...Based on the image, there is one Prime Minister visible in the photo. It is most likely Stephen Harper, who was the Prime Minister of Canada from 2006 to 2015.

This multimedia capability opens up the possibility of processing images, audio, video, or PDFs using the same consistent interface. Gaspard provides a streamlined way for developers to leverage Gemini's advanced capabilities without the complexity of the raw API. You'll want to look to Gaspard for projects involving content analysis, document processing, or multimedia applications.
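
The same call pattern should carry over to other media types. For instance, I'd expect a local PDF to work much like the image above, since Gaspard uploads files passed in the message list (a sketch with a hypothetical file; verify the supported formats against the Gaspard docs):

pdf_fn = Path("annual_report.pdf")  # hypothetical local file
response = gemini_chat([pdf_fn, "Summarize the key points of this document."])
print(contents(response))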

Final Thoughts

Claudette, Cosette, and Gaspard address the pain points I felt when I first started working with the most popular LLM APIs. They elegantly handle conversation state, structured outputs, and tool use in ways that feel more natural to me. I'm thankful to the Answer.AI team for sharing these with the community.

What's particularly interesting is how these libraries appear to be influencing the broader ecosystem. OpenAI's own API has been gradually simplifying lately, suggesting that the industry is recognizing the need for more developer-friendly interfaces. While we're still in the early days of LLM application development, tools like Claudette, Cosette, and Gaspard give us a glimpse of what more mature, streamlined interactions might look like.

If you're interested in learning more about working with LLM APIs, I encourage you to try replicating and adapting these code snippets on your own. Start with a simple chat interface using your preferred model, then experiment with structured outputs and tool calling on data you work with regularly.

These libraries don't add functionality that wasn't already possible, but they significantly reduce the cognitive overhead of working with LLM APIs. They let you focus on what you're building rather than the API mechanics. Whether you're prototyping a new idea or building production systems, this simplified approach can make your development process noticeably smoother.

I wish I had discovered these sooner, and I hope this introduction helps you avoid some of the API wrestling I experienced in my early LLM projects.

Thanks for reading!
