Welcome to My AI Engineering Blog
Hello, and welcome.
I'm Etienne Tovimafa, an AI Engineer specializing in agentic AI systems, and this is where I document my journey building intelligent agents—from autonomous task decomposition to multi-agent orchestration, from tool-using LLMs to production agent deployments.
This blog exists at the intersection of research and reality. It's where agent architectures meet production constraints, where LLM capabilities collide with real-world complexity, and where theoretical agentic frameworks transform into systems that autonomously solve complex problems.
The Agentic Revolution
We're witnessing a fundamental shift in AI engineering. We've moved beyond simple prompt-response systems to building AI agents—autonomous systems that can:
- Reason and plan multi-step solutions to complex problems
- Use tools and APIs to gather information and take actions
- Maintain memory across conversations and sessions
- Decompose tasks into manageable subtasks
- Self-correct when plans fail or outputs are wrong
- Collaborate with other specialized agents
This isn't just about better prompts or fine-tuned models. It's about building systems that exhibit agency—the ability to perceive, decide, and act autonomously to achieve goals.
And that's where my passion lies.
What I Actually Build
Currently, I work as an AI Engineer designing and deploying intelligent agentic systems. Here's what that looks like in practice:
1. Autonomous AI Agents
The core of my work is building AI agents that can reason, plan, and execute complex tasks without constant human intervention.
I don't just build chatbots that respond to prompts. I build agents that:
- Break down complex goals into executable subtasks (task decomposition)
- Plan action sequences using frameworks like ReAct (Reasoning + Acting), Plan-and-Execute, and Reflexion (a minimal ReAct-style loop is sketched after this list)
- Use external tools: Search engines, APIs, databases, calculators, code interpreters
- Maintain context across multi-turn interactions using various memory strategies
- Self-evaluate their outputs and retry when necessary
- Learn from feedback to improve over time
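To make the reasoning loop concrete, here's a minimal ReAct-style sketch. It's illustrative only: `call_llm` is a hypothetical placeholder for a real model call, and the two tools are stubs.

```python
# Minimal ReAct-style loop (illustrative sketch, not production code).
import json

TOOLS = {
    "search": lambda query: f"(stub) results for: {query}",
    "calculator": lambda expression: str(eval(expression)),  # demo only; never eval untrusted input
}

def call_llm(prompt: str) -> dict:
    """Hypothetical LLM call; expected to return either
    {"thought", "action", "action_input"} or {"final_answer"}."""
    raise NotImplementedError

def react_agent(goal: str, max_steps: int = 5) -> str:
    scratchpad = []  # working memory: thought / action / observation triples
    for _ in range(max_steps):
        step = call_llm(f"Goal: {goal}\nScratchpad: {json.dumps(scratchpad)}")
        if "final_answer" in step:
            return step["final_answer"]
        tool = TOOLS.get(step["action"])
        observation = tool(step["action_input"]) if tool else f"Unknown tool: {step['action']}"
        scratchpad.append({**step, "observation": observation})
    return "Stopped after max_steps without a final answer"
```

The production versions add output validation, retries, and cost controls, but the loop itself stays this small.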
Real-world example: I've built document processing agents that autonomously extract structured data from unstructured PDFs—they decide which extraction strategy to use (OCR, text parsing, vision models), validate outputs, retry on failures, and orchestrate multiple specialized models based on document type.
2. Multi-Agent Orchestration
Single agents are powerful, but multi-agent systems can tackle far more complex problems.
I design architectures where multiple specialized agents collaborate to solve problems (a bare-bones coordinator-worker sketch follows the list):
- Coordinator agents that delegate tasks to specialist agents
- Specialist agents with domain-specific expertise (data extraction, validation, generation)
- Critic agents that evaluate and improve outputs from generator agents
- Memory agents that manage shared context across the agent network
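Stripped of any framework, the core coordinator-worker pattern fits in a few lines of plain Python. The class names and the naive routing rule below are illustrative placeholders, not any framework's actual API.

```python
# Coordinator-worker pattern in plain Python (illustrative sketch).

class SpecialistAgent:
    def __init__(self, name: str, handler):
        self.name = name
        self.handler = handler  # callable implementing the specialist's skill

    def run(self, task: str) -> str:
        return self.handler(task)

class CoordinatorAgent:
    def __init__(self, specialists: dict):
        self.specialists = specialists
        self.shared_memory = []  # context shared across the agent network

    def route(self, task: str) -> str:
        # Naive keyword routing; a real coordinator would ask an LLM to pick the specialist.
        key = "extract" if "cv" in task.lower() else "generate"
        result = self.specialists[key].run(task)
        self.shared_memory.append(result)
        return result

pipeline = CoordinatorAgent({
    "extract": SpecialistAgent("extractor", lambda t: f"structured data from: {t}"),
    "generate": SpecialistAgent("generator", lambda t: f"recommendation based on: {t}"),
})
print(pipeline.route("Parse this CV and extract skills"))
```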
Frameworks I work with:
- LangGraph for building stateful multi-agent workflows
- CrewAI for role-based agent orchestration
- AutoGen for conversational multi-agent systems
- Custom orchestration patterns using LangChain
Real-world example: Built a recruitment system where one agent extracts candidate data from CVs, another agent searches databases for matching opportunities, a third agent generates personalized recommendations, and a coordinator agent orchestrates the entire pipeline—all running autonomously in production.
3. Tool-Using LLMs & Function Calling
Modern LLMs can call functions and use tools, but designing reliable tool-using agents is an art.
I build agents that:
- Select the right tool from dozens of available options
- Handle tool failures gracefully with retries and fallbacks (sketched after this list)
- Chain tool calls to complete multi-step tasks
- Validate tool outputs before using them in reasoning
- Integrate external APIs: Search engines, databases, web scraping, code execution
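To illustrate the retries-and-fallbacks point, here's a small dispatcher sketch; `flaky_api` and `local_index` are hypothetical tools, not real integrations.

```python
# Tool dispatch with retries and a fallback (illustrative sketch).
import time

def call_tool_with_retry(tool, args: dict, retries: int = 3, backoff: float = 1.0):
    """Call a tool, retrying with exponential backoff before giving up."""
    for attempt in range(retries):
        try:
            return tool(**args)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)  # 1s, 2s, 4s, ...

def run_tool(primary, fallback, args: dict):
    """Prefer the primary tool; fall back to a cheaper or local alternative if it keeps failing."""
    try:
        return call_tool_with_retry(primary, args)
    except Exception:
        return call_tool_with_retry(fallback, args)

# Hypothetical tools: a flaky external API and a local cached index.
def flaky_api(query: str) -> str:
    raise TimeoutError("upstream search timed out")

def local_index(query: str) -> str:
    return f"(cached) results for: {query}"

print(run_tool(flaky_api, local_index, {"query": "agent memory papers"}))
```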
Challenges I've solved:
- Tool selection ambiguity (when should the agent use Tool A vs. Tool B?)
- Error handling (what happens when an API call fails mid-task?)
- Prompt injection attacks (ensuring tool use is safe and controlled)
- Cost optimization (reducing unnecessary tool calls)
4. Agent Memory & Context Management
A truly intelligent agent needs memory—the ability to remember past interactions, learn from them, and use that knowledge in future decisions.
I implement various memory architectures (a minimal sketch follows the list):
- Short-term memory: Conversation buffers and sliding windows
- Long-term memory: Vector stores for semantic memory retrieval
- Working memory: Scratch pads for intermediate reasoning steps
- Entity memory: Tracking entities (users, documents, events) across sessions
- Episodic memory: Remembering past task executions and outcomes
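As a rough sketch of how short-term and long-term memory can coexist in one agent, here's a toy implementation; the `embed` function is a stand-in for a real embedding model and vector store (e.g. Chroma or FAISS).

```python
# Short-term + long-term memory sketch (illustrative; embed() is a toy stand-in).
from collections import deque
import math

def embed(text: str) -> list:
    """Stand-in embedding: normalized character histogram. Use a real embedding model in practice."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class AgentMemory:
    def __init__(self, window: int = 10):
        self.short_term = deque(maxlen=window)  # sliding window of recent turns
        self.long_term = []                     # (embedding, text) pairs for semantic recall

    def remember(self, text: str) -> None:
        self.short_term.append(text)
        self.long_term.append((embed(text), text))

    def recall(self, query: str, k: int = 3) -> list:
        q = embed(query)
        scored = sorted(
            self.long_term,
            key=lambda item: -sum(a * b for a, b in zip(q, item[0])),  # cosine on unit vectors
        )
        return [text for _, text in scored[:k]]

memory = AgentMemory()
memory.remember("User struggles with French past tense")
print(memory.recall("past tense mistakes"))
```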
Real-world example: Built a conversational agent for language learning that remembers user mistakes across sessions, adapts exercises based on past performance, and retrieves relevant examples from previous conversations—all while maintaining conversation context within LLM token limits.
5. RAG Systems as Agent Tools
Retrieval-Augmented Generation (RAG) isn't a standalone system in my work—it's a tool that agents use to access knowledge.
I build RAG systems that agents can invoke:
- Semantic search with vector databases (Pinecone, Chroma, FAISS)
- Hybrid search combining keyword and vector retrieval
- Reranking and relevance filtering
- Citation tracking and fact-checking
- Multi-hop retrieval for complex queries
But here's the key difference: The agent decides when and how to use RAG. It's not "user query → retrieve → generate." It's "agent analyzes task → decides RAG is needed → constructs retrieval query → evaluates results → decides next action."
Real-world example: Built a document Q&A agent that doesn't blindly retrieve for every question. It first assesses if the question requires external knowledge, constructs optimized retrieval queries, evaluates result quality, and decides whether to retrieve more, generate an answer, or ask the user for clarification.
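A compressed sketch of that control flow is below. Everything here is a toy stand-in: the tiny knowledge base, the retrieval heuristic, and the `generate` stub would all be LLM calls and a real vector store in practice.

```python
# Agent-controlled retrieval loop (illustrative sketch with toy stand-ins).

KNOWLEDGE_BASE = {
    "refund policy": "Refunds are issued within 14 days of purchase.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def needs_retrieval(question: str) -> bool:
    # Stand-in: in practice, ask the LLM whether external knowledge is required.
    return any(key in question.lower() for key in KNOWLEDGE_BASE)

def retrieve(query: str) -> list:
    # Stand-in for semantic / hybrid search against a vector store.
    return [text for key, text in KNOWLEDGE_BASE.items() if key in query.lower()]

def good_enough(passages: list) -> bool:
    # Stand-in: in practice, ask the LLM whether the passages support an answer.
    return len(passages) > 0

def generate(question: str, context: list) -> str:
    # Stand-in for the LLM generation step.
    return f"Answer to '{question}' using {len(context)} passage(s)."

def answer(question: str, max_hops: int = 3) -> str:
    if not needs_retrieval(question):
        return generate(question, context=[])   # answer from parametric knowledge
    passages = []
    query = question
    for _ in range(max_hops):                   # multi-hop retrieval
        passages += retrieve(query)
        if good_enough(passages):
            return generate(question, context=passages)
        query = question + " (broader)"         # stand-in for query refinement
    return "I couldn't find enough evidence; could you clarify the question?"

print(answer("What is the refund policy?"))
```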
6. Production Agent Deployment
Building an agent that works in a notebook is one thing. Deploying it to production is another.
I handle:
- API design for agent systems (synchronous vs. asynchronous endpoints)
- State management for long-running agent tasks
- Error handling and retries for unreliable LLM outputs
- Cost optimization (caching, smaller models for subtasks, prompt compression)
- Latency optimization (parallel tool calls, streaming responses)
- Safety and monitoring (guardrails, output validation, logging)
- Scaling agents to handle thousands of concurrent requests
Infrastructure I use (a minimal async endpoint sketch follows the list):
- FastAPI for agent APIs
- Celery for asynchronous agent tasks
- Redis for agent state management
- Docker for containerization
- LangSmith and Weights & Biases for agent monitoring
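For the asynchronous-endpoint shape specifically, a deliberately small FastAPI sketch looks like this; the in-memory dict stands in for Redis, and `run_agent` is a placeholder for the actual agent loop (which would normally run on a Celery worker).

```python
# Minimal async agent API sketch with FastAPI (illustrative; a production system
# would persist task state in Redis and run the agent via Celery workers).
import uuid
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
TASKS = {}  # stand-in for Redis: task_id -> {"status", "result"}

def run_agent(task_id: str, goal: str) -> None:
    """Placeholder for the real agent loop (planning, tool calls, validation)."""
    TASKS[task_id] = {"status": "done", "result": f"completed goal: {goal}"}

@app.post("/agent/tasks")
def submit_task(goal: str, background_tasks: BackgroundTasks) -> dict:
    task_id = str(uuid.uuid4())
    TASKS[task_id] = {"status": "running", "result": None}
    background_tasks.add_task(run_agent, task_id, goal)  # don't block the request
    return {"task_id": task_id}

@app.get("/agent/tasks/{task_id}")
def get_task(task_id: str) -> dict:
    return TASKS.get(task_id, {"status": "unknown"})
```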
Notable Agentic Projects
Autonomous Document Processing Pipeline
Built an end-to-end agent system that processes invoices, receipts, and forms:
- Planner agent: Analyzes document type and selects extraction strategy
- OCR agent: Handles scanned documents with Tesseract and vision models
- Extraction agent: Uses specialized models for different document types
- Validation agent: Verifies extracted data against business rules
- Correction agent: Fixes errors or requests human intervention
Key achievement: 92% automation rate, reducing manual processing time by 80%.
ModoBot - Agentic Moderation System (2023)
Built an intelligent moderation agent for Telegram communities:
- Classifier agent: Custom toxicity detection model (TensorFlow/Keras)
- Decision agent: Determines appropriate sanctions based on severity and history
- Action agent: Executes moderation actions (warnings, bans, message deletion)
- Learning agent: Collects feedback to improve future decisions
Key learnings: Agents must balance precision (avoiding false positives) with recall (catching actual toxicity), especially when actions have real consequences.
Multi-Agent Recommendation System
Built a collaborative agent system for candidate-job matching:
- Parsing agent: Extracts structured data from CVs
- Search agent: Queries Neo4j graph database for matches
- Ranking agent: Scores candidates based on relevance
- Generation agent: Creates personalized recommendations
- Coordinator agent: Orchestrates the entire pipeline
Key achievement: Reduced recommendation latency from 15 seconds to 3 seconds through agent parallelization and caching strategies.
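One way to get that kind of speedup in Python is to fan out independent agent calls with `asyncio.gather`; the scoring coroutine below is a hypothetical stand-in, not the production code.

```python
# Scoring independent candidates concurrently (illustrative sketch).
import asyncio

async def score_candidate(candidate_id: str, job_id: str) -> tuple:
    await asyncio.sleep(1)  # stand-in for an LLM or model scoring call
    return candidate_id, 0.87

async def rank_candidates(candidate_ids: list, job_id: str) -> list:
    # Candidates are scored independently, so the calls run in parallel:
    # total latency is roughly one scoring call instead of len(candidate_ids) calls.
    scores = await asyncio.gather(*(score_candidate(c, job_id) for c in candidate_ids))
    return sorted(scores, key=lambda pair: -pair[1])

print(asyncio.run(rank_candidates(["cand-1", "cand-2", "cand-3"], "job-42")))
```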
My Technical Stack (Agentic Focus)
Agent Frameworks:
- LangChain / LangGraph (multi-agent workflows)
- LlamaIndex (data agents)
- CrewAI (role-based agents)
- AutoGen (conversational agents)
- Custom agent architectures in Python
LLM Tools:
- OpenAI GPT-4, GPT-4o (function calling, reasoning)
- Anthropic Claude (long context, tool use)
- Open-source LLMs (Llama, Mistral for cost optimization)
- Hugging Face Transformers (fine-tuning specialized models)
Agent Memory & Tools:
- Vector databases: Pinecone, Chroma, FAISS
- Graph databases: Neo4j for knowledge graphs
- Redis for agent state and caching
- PostgreSQL for structured data
Infrastructure:
- FastAPI for agent APIs
- Docker for containerization
- Celery for async agent tasks
- LangSmith for agent observability
Core Python Libraries:
- PyTorch / TensorFlow (custom model training)
- Pandas / NumPy (data processing)
- LangChain ecosystem
What You'll Find on This Blog
This blog focuses on practical agentic AI engineering—the real challenges, trade-offs, and solutions when building production agent systems.
Core Topics
1. Agentic Architectures
- ReAct (Reasoning + Acting) agents
- Plan-and-Execute patterns
- Reflexion and self-correction
- Multi-agent orchestration
- Agent communication protocols
2. Tool Use & Function Calling
- Designing reliable tool-using agents
- Error handling and retries
- Tool selection strategies
- Building custom tools for agents
- Safety and sandboxing
3. Agent Memory Systems
- Short-term vs. long-term memory
- Vector-based semantic memory
- Entity tracking across sessions
- Memory compression strategies
- Balancing memory size vs. context window
4. Multi-Agent Systems
- Coordinator-worker patterns
- Specialist agent design
- Agent communication and state sharing
- Consensus and voting mechanisms
- Debugging multi-agent systems
5. Production Agent Engineering
- API design for agents
- Cost optimization strategies
- Latency reduction techniques
- Agent monitoring and observability
- Safety guardrails and validation
6. RAG for Agents
- When agents should use retrieval
- Hybrid search strategies
- Multi-hop retrieval
- Citation and fact-checking
- RAG cost optimization
7. LLM Fine-Tuning for Agents
- Training specialized agent models
- LoRA and QLoRA for efficiency
- Instruction tuning for tool use
- Distillation for faster agents
- Evaluating fine-tuned agents
8. Document Intelligence
- OCR agents for scanned documents
- Multi-modal document understanding
- Layout analysis and table extraction
- Agent-based PDF processing pipelines
My Writing Philosophy
Every article follows these principles:
- Agent-centric thinking: How do you design systems that exhibit agency?
- Production-first: Real constraints, real costs, real trade-offs
- Honest about failures: What didn't work and why
- Working code: GitHub repos with reproducible examples
- Deep technical dives: No hand-waving, no "simply use X"
I write the articles I wish existed when I was learning to build agents. No surface-level tutorials. No toy examples that break in production. No solutions that cost $1000/day to run.
Who This Is For
This blog is for:
- AI Engineers building production agent systems
- LLM practitioners moving beyond simple chatbots
- Backend developers integrating agentic AI into applications
- Researchers translating agent papers into working systems
- Anyone curious about building truly autonomous AI systems
If you're tired of "build a chatbot in 10 minutes" tutorials and want to understand what it takes to build real agentic systems at scale, you're in the right place.
What I'm Currently Exploring
- Advanced agent architectures: Tree-of-Thoughts, Graph-of-Thoughts reasoning
- Agent evaluation: How do you measure agent performance reliably?
- Multi-modal agents: Agents that combine vision, text, and code execution
- Self-improving agents: Systems that learn from their mistakes
- Agent security: Preventing prompt injection and jailbreaks in production
Upcoming Articles
Here's what's coming in the next few months:
- Building a ReAct Agent from Scratch: Complete implementation with tool use, reasoning traces, and error handling
- Multi-Agent Orchestration Patterns: Coordinator-worker, specialist collaboration, and consensus mechanisms
- Agent Memory Architectures: Implementing short-term, long-term, and episodic memory for production agents
- Production Agent Deployment: API design, cost optimization, latency reduction, and monitoring
- LLM Function Calling Deep Dive: How to build reliable tool-using agents, handle errors, and optimize tool selection
- RAG as an Agent Tool: When agents should retrieve, how to construct queries, and multi-hop retrieval strategies
Let's Connect
I love discussing agentic AI, sharing knowledge, and collaborating on hard problems. Reach out if you:
- Have questions about building agent systems
- Want to discuss agentic architecture challenges
- Have feedback on an article
- Are working on multi-agent systems and want to share learnings
- Just want to connect with another agent engineer
Find me here:
- LinkedIn: linkedin.com/in/etiennetovi
- GitHub: github.com/abiotov
- Email: abiodouneti@gmail.com
I'm also open to:
- Speaking at AI/ML meetups and conferences
- Contributing to open-source agent frameworks
- Collaborating on research or technical writing
- Mentoring engineers entering agentic AI
Final Thoughts
We're at the beginning of the agentic AI revolution. LLMs are powerful, but they're just one component. The real magic happens when you combine reasoning, planning, tool use, memory, and multi-agent collaboration into systems that can autonomously solve complex real-world problems.
Building these systems is hard. Agents hallucinate. Tools fail. Costs spiral. Plans go wrong. But when you get it right—when an agent autonomously completes a task you thought would require human intelligence—it's incredibly rewarding.
This blog is my way of sharing what I've learned building production agent systems. The architectures that work. The patterns that scale. The mistakes I've made. The trade-offs I've navigated.
Whether you're debugging why your agent keeps calling the wrong tool, figuring out how to reduce agent latency, or trying to orchestrate multiple agents without chaos, I hope you'll find something useful here.
Thank you for reading, and welcome to the journey.
Let's build intelligent agents together.
Etienne Tovimafa
AI Engineer | Building autonomous AI agents that reason, plan, and act
Published: January 17, 2025