Welcome to My AI Engineering Blog
Hello, and welcome.
I'm Etienne Tovimafa, an AI Engineer specializing in agentic AI systems, and this is where I document my journey building intelligent agents—from autonomous task decomposition to multi-agent orchestration, from tool-using LLMs to production agent deployments.
This blog exists at the intersection of research and reality. It's where agent architectures meet production constraints, where LLM capabilities collide with real-world complexity, and where theoretical agentic frameworks transform into systems that autonomously solve complex problems.
The Agentic Revolution
We're witnessing a fundamental shift in AI engineering. We've moved beyond simple prompt-response systems to building AI agents—autonomous systems that can:
- Reason and plan multi-step solutions to complex problems
- Use tools and APIs to gather information and take actions
- Maintain memory across conversations and sessions
- Decompose tasks into manageable subtasks
- Self-correct when plans fail or outputs are wrong
- Collaborate with other specialized agents
This isn't just about better prompts or fine-tuned models. It's about building systems that exhibit agency—the ability to perceive, decide, and act autonomously to achieve goals.
And that's where my passion lies.
What I Actually Build
Currently, I work as an AI Engineer designing and deploying intelligent agentic systems. Here's what that looks like in practice:
1. Autonomous AI Agents
The core of my work is building AI agents that can reason, plan, and execute complex tasks without constant human intervention.
I don't just build chatbots that respond to prompts. I build agents that:
- Break down complex goals into executable subtasks (task decomposition)
- Plan action sequences using frameworks like ReAct (Reasoning + Acting), Plan-and-Execute, and Reflexion (a minimal ReAct-style loop is sketched after this list)
- Use external tools: Search engines, APIs, databases, calculators, code interpreters
- Maintain context across multi-turn interactions using various memory strategies
- Self-evaluate their outputs and retry when necessary
- Learn from feedback to improve over time
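To make the reasoning loop concrete, here's a minimal ReAct-style sketch. It's illustrative only: `call_llm` is a hypothetical placeholder for a real model call, and the two tools are stubs.

```python
# Minimal ReAct-style loop (illustrative sketch, not production code).
import json

TOOLS = {
    "search": lambda query: f"(stub) results for: {query}",
    "calculator": lambda expression: str(eval(expression)),  # demo only; never eval untrusted input
}

def call_llm(prompt: str) -> dict:
    """Hypothetical LLM call; expected to return either
    {"thought", "action", "action_input"} or {"final_answer"}."""
    raise NotImplementedError

def react_agent(goal: str, max_steps: int = 5) -> str:
    scratchpad = []  # working memory: thought / action / observation triples
    for _ in range(max_steps):
        step = call_llm(f"Goal: {goal}\nScratchpad: {json.dumps(scratchpad)}")
        if "final_answer" in step:
            return step["final_answer"]
        tool = TOOLS.get(step["action"])
        observation = tool(step["action_input"]) if tool else f"Unknown tool: {step['action']}"
        scratchpad.append({**step, "observation": observation})
    return "Stopped after max_steps without a final answer"
```

The production versions add output validation, retries, and cost controls, but the loop itself stays this small.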
Real-world example: I've built document processing agents that autonomously extract structured data from unstructured PDFs—they decide which extraction strategy to use (OCR, text parsing, vision models), validate outputs, retry on failures, and orchestrate multiple specialized models based on document type.
2. Multi-Agent Orchestration
Single agents are powerful, but multi-agent systems can tackle far more complex problems.
I design architectures where multiple specialized agents collaborate to solve problems (a bare-bones coordinator-worker sketch follows the list):
- Coordinator agents that delegate tasks to specialist agents
- Specialist agents with domain-specific expertise (data extraction, validation, generation)
- Critic agents that evaluate and improve outputs from generator agents
- Memory agents that manage shared context across the agent network
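Stripped of any framework, the core coordinator-worker pattern fits in a few lines of plain Python. The class names and the naive routing rule below are illustrative placeholders, not any framework's actual API.

```python
# Coordinator-worker pattern in plain Python (illustrative sketch).

class SpecialistAgent:
    def __init__(self, name: str, handler):
        self.name = name
        self.handler = handler  # callable implementing the specialist's skill

    def run(self, task: str) -> str:
        return self.handler(task)

class CoordinatorAgent:
    def __init__(self, specialists: dict):
        self.specialists = specialists
        self.shared_memory = []  # context shared across the agent network

    def route(self, task: str) -> str:
        # Naive keyword routing; a real coordinator would ask an LLM to pick the specialist.
        key = "extract" if "cv" in task.lower() else "generate"
        result = self.specialists[key].run(task)
        self.shared_memory.append(result)
        return result

pipeline = CoordinatorAgent({
    "extract": SpecialistAgent("extractor", lambda t: f"structured data from: {t}"),
    "generate": SpecialistAgent("generator", lambda t: f"recommendation based on: {t}"),
})
print(pipeline.route("Parse this CV and extract skills"))
```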
Frameworks I work with:
- LangGraph for building stateful multi-agent workflows
- CrewAI for role-based agent orchestration
- AutoGen for conversational multi-agent systems
- Custom orchestration patterns using LangChain
Real-world example: Built a recruitment system where one agent extracts candidate data from CVs, another agent searches databases for matching opportunities, a third agent generates personalized recommendations, and a coordinator agent orchestrates the entire pipeline—all running autonomously in production.
3. Tool-Using LLMs & Function Calling
Modern LLMs can call functions and use tools, but designing reliable tool-using agents is an art.
I build agents that:
- Select the right tool from dozens of available options
- Handle tool failures gracefully with retries and fallbacks (sketched after this list)
- Chain tool calls to complete multi-step tasks
- Validate tool outputs before using them in reasoning
- Integrate external APIs: Search engines, databases, web scraping, code execution
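To illustrate the retries-and-fallbacks point, here's a small dispatcher sketch; `flaky_api` and `local_index` are hypothetical tools, not real integrations.

```python
# Tool dispatch with retries and a fallback (illustrative sketch).
import time

def call_tool_with_retry(tool, args: dict, retries: int = 3, backoff: float = 1.0):
    """Call a tool, retrying with exponential backoff before giving up."""
    for attempt in range(retries):
        try:
            return tool(**args)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)  # 1s, 2s, 4s, ...

def run_tool(primary, fallback, args: dict):
    """Prefer the primary tool; fall back to a cheaper or local alternative if it keeps failing."""
    try:
        return call_tool_with_retry(primary, args)
    except Exception:
        return call_tool_with_retry(fallback, args)

# Hypothetical tools: a flaky external API and a local cached index.
def flaky_api(query: str) -> str:
    raise TimeoutError("upstream search timed out")

def local_index(query: str) -> str:
    return f"(cached) results for: {query}"

print(run_tool(flaky_api, local_index, {"query": "agent memory papers"}))
```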
Challenges I've solved:
- Tool selection ambiguity (when should the agent use Tool A vs. Tool B?)
- Error handling (what happens when an API call fails mid-task?)
- Prompt injection attacks (ensuring tool use is safe and controlled)
- Cost optimization (reducing unnecessary tool calls)
4. Agent Memory & Context Management
A truly intelligent agent needs memory—the ability to remember past interactions, learn from them, and use that knowledge in future decisions.
I implement various memory architectures (a minimal sketch follows the list):
- Short-term memory: Conversation buffers and sliding windows
- Long-term memory: Vector stores for semantic memory retrieval
- Working memory: Scratch pads for intermediate reasoning steps
- Entity memory: Tracking entities (users, documents, events) across sessions
- Episodic memory: Remembering past task executions and outcomes
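As a rough sketch of how short-term and long-term memory can coexist in one agent, here's a toy implementation; the `embed` function is a stand-in for a real embedding model and vector store (e.g. Chroma or FAISS).

```python
# Short-term + long-term memory sketch (illustrative; embed() is a toy stand-in).
from collections import deque
import math

def embed(text: str) -> list:
    """Stand-in embedding: normalized character histogram. Use a real embedding model in practice."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class AgentMemory:
    def __init__(self, window: int = 10):
        self.short_term = deque(maxlen=window)  # sliding window of recent turns
        self.long_term = []                     # (embedding, text) pairs for semantic recall

    def remember(self, text: str) -> None:
        self.short_term.append(text)
        self.long_term.append((embed(text), text))

    def recall(self, query: str, k: int = 3) -> list:
        q = embed(query)
        scored = sorted(
            self.long_term,
            key=lambda item: -sum(a * b for a, b in zip(q, item[0])),  # cosine on unit vectors
        )
        return [text for _, text in scored[:k]]

memory = AgentMemory()
memory.remember("User struggles with French past tense")
print(memory.recall("past tense mistakes"))
```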
Real-world example: Built a conversational agent for language learning that remembers user mistakes across sessions, adapts exercises based on past performance, and retrieves relevant examples from previous conversations—all while maintaining conversation context within LLM token limits.
5. RAG Systems as Agent Tools
Retrieval-Augmented Generation (RAG) isn't a standalone system in my work—it's a tool that agents use to access knowledge.
I build RAG systems that agents can invoke:
- Semantic search with vector databases (Pinecone, Chroma, FAISS)
- Hybrid search combining keyword and vector retrieval
- Reranking and relevance filtering
- Citation tracking and fact-checking
- Multi-hop retrieval for complex queries
But here's the key difference: The agent decides when and how to use RAG. It's not "user query → retrieve → generate." It's "agent analyzes task → decides RAG is needed → constructs retrieval query → evaluates results → decides next action."
Real-world example: Built a document Q&A agent that doesn't blindly retrieve for every question. It first assesses if the question requires external knowledge, constructs optimized retrieval queries, evaluates result quality, and decides whether to retrieve more, generate an answer, or ask the user for clarification.
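A compressed sketch of that control flow is below. Everything here is a toy stand-in: the tiny knowledge base, the retrieval heuristic, and the `generate` stub would all be LLM calls and a real vector store in practice.

```python
# Agent-controlled retrieval loop (illustrative sketch with toy stand-ins).

KNOWLEDGE_BASE = {
    "refund policy": "Refunds are issued within 14 days of purchase.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def needs_retrieval(question: str) -> bool:
    # Stand-in: in practice, ask the LLM whether external knowledge is required.
    return any(key in question.lower() for key in KNOWLEDGE_BASE)

def retrieve(query: str) -> list:
    # Stand-in for semantic / hybrid search against a vector store.
    return [text for key, text in KNOWLEDGE_BASE.items() if key in query.lower()]

def good_enough(passages: list) -> bool:
    # Stand-in: in practice, ask the LLM whether the passages support an answer.
    return len(passages) > 0

def generate(question: str, context: list) -> str:
    # Stand-in for the LLM generation step.
    return f"Answer to '{question}' using {len(context)} passage(s)."

def answer(question: str, max_hops: int = 3) -> str:
    if not needs_retrieval(question):
        return generate(question, context=[])   # answer from parametric knowledge
    passages = []
    query = question
    for _ in range(max_hops):                   # multi-hop retrieval
        passages += retrieve(query)
        if good_enough(passages):
            return generate(question, context=passages)
        query = question + " (broader)"         # stand-in for query refinement
    return "I couldn't find enough evidence; could you clarify the question?"

print(answer("What is the refund policy?"))
```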
6. Production Agent Deployment
Building an agent that works in a notebook is one thing. Deploying it to production is another.
I handle:
- API design for agent systems (synchronous vs. asynchronous endpoints)
- State management for long-running agent tasks
- Error handling and retries for unreliable LLM outputs
- Cost optimization (caching, smaller models for subtasks, prompt compression)
- Latency optimization (parallel tool calls, streaming responses)
- Safety and monitoring (guardrails, output validation, logging)
- Scaling agents to handle thousands of concurrent requests
Infrastructure I use (a minimal async endpoint sketch follows the list):
- FastAPI for agent APIs
- Celery for asynchronous agent tasks
- Redis for agent state management
- Docker for containerization
- LangSmith and Weights & Biases for agent monitoring
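For the asynchronous-endpoint shape specifically, a deliberately small FastAPI sketch looks like this; the in-memory dict stands in for Redis, and `run_agent` is a placeholder for the actual agent loop (which would normally run on a Celery worker).

```python
# Minimal async agent API sketch with FastAPI (illustrative; a production system
# would persist task state in Redis and run the agent via Celery workers).
import uuid
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
TASKS = {}  # stand-in for Redis: task_id -> {"status", "result"}

def run_agent(task_id: str, goal: str) -> None:
    """Placeholder for the real agent loop (planning, tool calls, validation)."""
    TASKS[task_id] = {"status": "done", "result": f"completed goal: {goal}"}

@app.post("/agent/tasks")
def submit_task(goal: str, background_tasks: BackgroundTasks) -> dict:
    task_id = str(uuid.uuid4())
    TASKS[task_id] = {"status": "running", "result": None}
    background_tasks.add_task(run_agent, task_id, goal)  # don't block the request
    return {"task_id": task_id}

@app.get("/agent/tasks/{task_id}")
def get_task(task_id: str) -> dict:
    return TASKS.get(task_id, {"status": "unknown"})
```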
Notable Agentic Projects
Autonomous Document Processing Pipeline
Built an end-to-end agent system that processes invoices, receipts, and forms:
- Planner agent: Analyzes document type and selects extraction strategy
- OCR agent: Handles scanned documents with Tesseract and vision models
- Extraction agent: Uses specialized models for different document types
- Validation agent: Verifies extracted data against business rules
- Correction agent: Fixes errors or requests human intervention
Key achievement: 92% automation rate, reducing manual processing time by 80%.
ModoBot - Agentic Moderation System (2023)
Built an intelligent moderation agent for Telegram communities:
- Classifier agent: Custom toxicity detection model (TensorFlow/Keras)
- Decision agent: Determines appropriate sanctions based on severity and history
- Action agent: Executes moderation actions (warnings, bans, message deletion)
- Learning agent: Collects feedback to improve future decisions
Key learnings: Agents must balance precision (avoiding false positives) with recall (catching actual toxicity), especially when actions have real consequences.
Multi-Agent Recommendation System
Built a collaborative agent system for candidate-job matching:
- Parsing agent: Extracts structured data from CVs
- Search agent: Queries Neo4j graph database for matches
- Ranking agent: Scores candidates based on relevance
- Generation agent: Creates personalized recommendations
- Coordinator agent: Orchestrates the entire pipeline
Key achievement: Reduced recommendation latency from 15 seconds to 3 seconds through agent parallelization and caching strategies.
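One way to get that kind of speedup in Python is to fan out independent agent calls with `asyncio.gather`; the scoring coroutine below is a hypothetical stand-in, not the production code.

```python
# Scoring independent candidates concurrently (illustrative sketch).
import asyncio

async def score_candidate(candidate_id: str, job_id: str) -> tuple:
    await asyncio.sleep(1)  # stand-in for an LLM or model scoring call
    return candidate_id, 0.87

async def rank_candidates(candidate_ids: list, job_id: str) -> list:
    # Candidates are scored independently, so the calls run in parallel:
    # total latency is roughly one scoring call instead of len(candidate_ids) calls.
    scores = await asyncio.gather(*(score_candidate(c, job_id) for c in candidate_ids))
    return sorted(scores, key=lambda pair: -pair[1])

print(asyncio.run(rank_candidates(["cand-1", "cand-2", "cand-3"], "job-42")))
```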
My Technical Stack (Agentic Focus)
Agent Frameworks:
- LangChain / LangGraph (multi-agent workflows)
- LlamaIndex (data agents)
- CrewAI (role-based agents)
- AutoGen (conversational agents)
- Custom agent architectures in Python
LLM Tools:
- OpenAI GPT-4, GPT-4o (function calling, reasoning)
- Anthropic Claude (long context, tool use)
- Open-source LLMs (Llama, Mistral for cost optimization)
- Hugging Face Transformers (fine-tuning specialized models)
Agent Memory & Tools:
- Vector databases: Pinecone, Chroma, FAISS
- Graph databases: Neo4j for knowledge graphs
- Redis for agent state and caching
- PostgreSQL for structured data
Infrastructure:
- FastAPI for agent APIs
- Docker for containerization
- Celery for async agent tasks
- LangSmith for agent observability
Core Python Libraries:
- PyTorch / TensorFlow (custom model training)
- Pandas / NumPy (data processing)
- LangChain ecosystem
What You'll Find on This Blog
This blog focuses on practical agentic AI engineering—the real challenges, trade-offs, and solutions when building production agent systems.
Core Topics
1. Agentic Architectures
- ReAct (Reasoning + Acting) agents
- Plan-and-Execute patterns
- Reflexion and self-correction
- Multi-agent orchestration
- Agent communication protocols
2. Tool Use & Function Calling
- Designing reliable tool-using agents
- Error handling and retries
- Tool selection strategies
- Building custom tools for agents
- Safety and sandboxing
3. Agent Memory Systems
- Short-term vs. long-term memory
- Vector-based semantic memory
- Entity tracking across sessions
- Memory compression strategies
- Balancing memory size vs. context window
4. Multi-Agent Systems
- Coordinator-worker patterns
- Specialist agent design
- Agent communication and state sharing
- Consensus and voting mechanisms
- Debugging multi-agent systems
5. Production Agent Engineering
- API design for agents
- Cost optimization strategies
- Latency reduction techniques
- Agent monitoring and observability
- Safety guardrails and validation
6. RAG for Agents
- When agents should use retrieval
- Hybrid search strategies
- Multi-hop retrieval
- Citation and fact-checking
- RAG cost optimization
7. LLM Fine-Tuning for Agents
- Training specialized agent models
- LoRA and QLoRA for efficiency
- Instruction tuning for tool use
- Distillation for faster agents
- Evaluating fine-tuned agents
8. Document Intelligence
- OCR agents for scanned documents
- Multi-modal document understanding
- Layout analysis and table extraction
- Agent-based PDF processing pipelines
My Writing Philosophy
Every article follows these principles:
- Agent-centric thinking: How do you design systems that exhibit agency?
- Production-first: Real constraints, real costs, real trade-offs
- Honest about failures: What didn't work and why
- Working code: GitHub repos with reproducible examples
- Deep technical dives: No hand-waving, no "simply use X"
I write the articles I wish existed when I was learning to build agents. No surface-level tutorials. No toy examples that break in production. No solutions that cost $1000/day to run.
Who This Is For
This blog is for:
- AI Engineers building production agent systems
- LLM practitioners moving beyond simple chatbots
- Backend developers integrating agentic AI into applications
- Researchers translating agent papers into working systems
- Anyone curious about building truly autonomous AI systems
If you're tired of "build a chatbot in 10 minutes" tutorials and want to understand what it takes to build real agentic systems at scale, you're in the right place.
What I'm Currently Exploring
- Advanced agent architectures: Tree-of-Thoughts, Graph-of-Thoughts reasoning
- Agent evaluation: How do you measure agent performance reliably?
- Multi-modal agents: Agents that combine vision, text, and code execution
- Self-improving agents: Systems that learn from their mistakes
- Agent security: Preventing prompt injection and jailbreaks in production
Upcoming Articles
Here's what's coming in the next few months:
- Building a ReAct Agent from Scratch: Complete implementation with tool use, reasoning traces, and error handling
- Multi-Agent Orchestration Patterns: Coordinator-worker, specialist collaboration, and consensus mechanisms
- Agent Memory Architectures: Implementing short-term, long-term, and episodic memory for production agents
- Production Agent Deployment: API design, cost optimization, latency reduction, and monitoring
- LLM Function Calling Deep Dive: How to build reliable tool-using agents, handle errors, and optimize tool selection
- RAG as an Agent Tool: When agents should retrieve, how to construct queries, and multi-hop retrieval strategies
Let's Connect
I love discussing agentic AI, sharing knowledge, and collaborating on hard problems. Reach out if you:
- Have questions about building agent systems
- Want to discuss agentic architecture challenges
- Have feedback on an article
- Are working on multi-agent systems and want to share learnings
- Just want to connect with another agent engineer
Find me here:
- LinkedIn: linkedin.com/in/etiennetovi
- GitHub: github.com/abiotov
- Email: abiodouneti@gmail.com
I'm also open to:
- Speaking at AI/ML meetups and conferences
- Contributing to open-source agent frameworks
- Collaborating on research or technical writing
- Mentoring engineers entering agentic AI
Final Thoughts
We're at the beginning of the agentic AI revolution. LLMs are powerful, but they're just one component. The real magic happens when you combine reasoning, planning, tool use, memory, and multi-agent collaboration into systems that can autonomously solve complex real-world problems.
Building these systems is hard. Agents hallucinate. Tools fail. Costs spiral. Plans go wrong. But when you get it right—when an agent autonomously completes a task you thought would require human intelligence—it's incredibly rewarding.
This blog is my way of sharing what I've learned building production agent systems. The architectures that work. The patterns that scale. The mistakes I've made. The trade-offs I've navigated.
Whether you're debugging why your agent keeps calling the wrong tool, figuring out how to reduce agent latency, or trying to orchestrate multiple agents without chaos, I hope you'll find something useful here.
Thank you for reading, and welcome to the journey.
Let's build intelligent agents together.
Etienne Tovimafa
AI Engineer | Building autonomous AI agents that reason, plan, and act
Published: January 17, 2025