LLMs & the Model Family

Role of LLMs in agents

The LLM is the brain of the agent. It:

Interprets the user's goal
Decides which tool to use (and with what arguments)
Synthesizes observations into a final answer
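The three responsibilities above can be sketched as a small loop. This is a minimal illustration, not a definitive implementation: `fake_llm`, `run_agent`, the `calculator` tool, and the JSON action format are all hypothetical names and conventions invented for this example, and `fake_llm` is a scripted stand-in for a real model call.

```python
import json


def calculator(expression: str) -> str:
    """Toy tool (hypothetical): add two integers written as 'a+b'."""
    left, right = expression.split("+")
    return str(int(left) + int(right))


TOOLS = {"calculator": calculator}


def fake_llm(messages):
    """Scripted stand-in for a real LLM call; returns a JSON action string."""
    observations = [
        m["content"] for m in messages if m["content"].startswith("Observation: ")
    ]
    if not observations:
        # No observation yet: decide which tool to use and with what arguments.
        return json.dumps({"tool": "calculator", "args": {"expression": "2+2"}})
    # An observation is available: synthesize it into a final answer.
    result = observations[-1][len("Observation: "):]
    return json.dumps({"final_answer": f"2+2 = {result}"})


def run_agent(goal, llm, tools, max_steps=5):
    """Loop: ask the LLM, run the chosen tool, feed the observation back."""
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        decision = json.loads(llm(messages))
        if "final_answer" in decision:
            return decision["final_answer"]
        observation = tools[decision["tool"]](**decision["args"])
        messages.append({"role": "assistant", "content": json.dumps(decision)})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    raise RuntimeError("Agent did not finish within max_steps")


answer = run_agent("What is 2+2?", fake_llm, TOOLS)
print(answer)
```

Swapping `fake_llm` for a function that calls a real model leaves the loop unchanged, which is why the LLM is passed in as a parameter here.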

Model families

Family	Examples	Notes
Open-weight	Llama 3, Mistral, Qwen	Run locally or on HF Inference
Proprietary	GPT-4o, Claude 3.5	API access only
Code-specialized	DeepSeek-Coder, StarCoder	Better for code tools

Special tokens

Chat models use special tokens to mark where each conversation turn begins and ends and which role is speaking. Example (Llama 3 format, shown with line breaks for readability):

<|begin_of_text|>
<|start_header_id|>system<|end_header_id|>
You are a helpful assistant.<|eot_id|>
<|start_header_id|>user<|end_header_id|>
What is 2+2?<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

Agents inject tool schemas and observations into these structured turns.
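As a rough sketch of how such a prompt could be assembled, the function below renders a list of chat messages into a Llama 3 style string. The function name `format_llama3_prompt`, the `calculator_schema` dictionary, and the choice to pass the observation back as a user turn are assumptions made for illustration. In practice the tokenizer's own chat template would normally do this step.

```python
import json


def format_llama3_prompt(messages):
    """Render chat messages into a Llama 3 style prompt string."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the next reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


# Hypothetical tool schema, injected into the system turn as JSON.
calculator_schema = {
    "name": "calculator",
    "description": "Evaluate an arithmetic expression.",
    "parameters": {"expression": "string"},
}

messages = [
    {
        "role": "system",
        "content": "You can call this tool:\n" + json.dumps(calculator_schema),
    },
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": 'calculator(expression="2+2")'},
    # Assumption: the tool result is fed back as a user turn.
    {"role": "user", "content": "Observation: 4"},
]

prompt = format_llama3_prompt(messages)
print(prompt)
```

Each turn ends with `<|eot_id|>`, and the trailing assistant header is left open so that generation continues from there.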

Accessing models via HF Hub

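from huggingface_hub import InferenceClient

# Client bound to one hosted model on the Hub.
# Gated models such as Llama 3 may require an access token.
client = InferenceClient(model="meta-llama/Meta-Llama-3-8B-Instruct")

# Send a chat-style request; max_tokens caps the length of the reply.
response = client.chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=256,
)

# The generated text is in the first choice's message.
print(response.choices[0].message.content)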
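For experiments it can help to wrap the call in a small helper and try it offline first. The sketch below is one possible shape: `ask` and `EchoClient` are hypothetical names, and `EchoClient` is only a stub that mimics the response shape used above (`choices[0].message.content`), so the example runs without network access or a token.

```python
from types import SimpleNamespace


def ask(client, user_text, system_prompt=None, max_tokens=256):
    """Build the message list, call the client, and return the reply text."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    response = client.chat_completion(messages=messages, max_tokens=max_tokens)
    return response.choices[0].message.content


class EchoClient:
    """Offline stub (hypothetical): echoes the last message back."""

    def chat_completion(self, messages, max_tokens):
        text = f"echo: {messages[-1]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


reply = ask(EchoClient(), "Hello!", system_prompt="You are a helpful assistant.")
print(reply)
```

Passing a real `InferenceClient` instead of `EchoClient()` should work the same way, since `ask` only relies on the `chat_completion` call shown above.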

Notes

Add your own notes and experiments here.