Pierre Kasparian · AI & Data freelancer
Guides

Step-by-step tutorials for implementing concrete AI solutions.

AI agents · RAG production · vector database · Elasticsearch · LLM pipeline
Persistent AI Agent Memory with Elasticsearch

Architecture for multi-index AI agent memory on Elasticsearch: 3 memory types, hybrid retrieval, multi-tenant isolation with document-level security (DLS), and a recall@10 (R@10) of 0.89. Production guide.

June 19, 2026 · 7 min read

LLM · GDPR-compliant LLM deployment · LLM hosting Europe · AI cost optimization · LLM orchestration Python
LLM Inference Engineering: Optimize Latency and Costs

How LLM inference engineering works in production: prefill, decode, batching, quantization, and when to self-host to control costs and stay GDPR-compliant.

June 16, 2026 · 8 min read

AI agent · LLM evaluation · LLM orchestration Python · fine-tuning · RAG production
Evaluating an AI Agent in Production: the Semantic Judge

How to evaluate AI agent quality in production with a fine-tuned semantic judge: LangChain + Fireworks method, 100x cheaper than frontier LLMs.

June 16, 2026 · 7 min read

AI agents · Opik · observability · LLM orchestration · production
AI Agent Harness: How to Make It Self-Repairing

Opik connects traces, automatic diagnosis and test loops: every production incident becomes a permanent regression test. Python examples.

June 12, 2026 · 7 min read

local LLM · GDPR · open source LLM GDPR compliant · LLM hosting Europe · LLM pipeline Python freelance
Local LLMs and GDPR: agentic coding without data leaks

Local LLMs or Mistral cloud for GDPR-compliant agentic coding: comparing sovereign architectures, practical guide with Ollama, LM Studio and VS Code.

June 5, 2026 · 7 min read

RAG · RAG chatbot custom documents · GDPR-compliant RAG production · LLM pipeline Python freelance · retrieval augmented generation
Indexing images in a RAG pipeline: production guide

Images in a RAG pipeline: index-time captioning, junk image filtering, separate chunks. Results: 1-6% overhead instead of 27-51%.

June 5, 2026 · 7 min read

RAG · evaluation · production · GDPR-compliant RAG production · LLM
7 Advanced Metrics to Evaluate Your RAG in Production

Standard metrics miss up to 40% of RAG errors in production. Discover 7 advanced evaluation techniques to detect hidden accuracy gaps in your pipeline.

May 28, 2026 · 7 min read

RAG · chunking · NLP · LLM pipeline Python freelance · LangChain
RAG Chunking: 4 Strategies to Maximize Retrieval Precision

Fixed-size, recursive, semantic or agentic: comparing 4 RAG chunking strategies with code examples and production recommendations.

May 28, 2026 · 8 min read

Python · PDF · LiteParse · RAG · GDPR
LiteParse v2.0: Local PDF Extraction Without LLM or Cloud

LiteParse v2.0 parses PDFs and Office documents locally, without LLMs or cloud APIs. GDPR-compliant RAG pipelines in Python, JS, or Rust.

May 28, 2026 · 7 min read

LLM · Mistral · LLM pipeline Python freelance · LLM cost optimization for SMBs · sovereign AI Europe freelance
Dynamic LLM Routing: Cut Costs, Reduce Downtime

Routing across Mistral Small/Medium/Large based on token volume and server load can cut LLM costs by 10x with no quality loss. Here is the playbook.

May 28, 2026 · 9 min read

RAG · reranker · cross-encoder · GDPR-compliant RAG production · LLM pipeline Python freelance
Boosting a RAG with a Cross-Encoder Reranker

A cross-encoder reranker improves RAG precision without changing your retriever. Cohere Rerank, local hosting options, Python examples.

May 28, 2026 · 7 min read

Python · PDF · PyMuPDF · RAG · NLP
Parsing PDF documents with PyMuPDF in Python

A complete PyMuPDF (fitz) tutorial: text extraction, metadata, images, and structured blocks from PDFs. Perfect for building a RAG pipeline.

May 27, 2026 · 10 min read

LLM · GDPR · Compliance · Sovereign AI · EU hosting
Integrating an LLM without violating GDPR: 2025 guide

Complete guide for EU companies: which GDPR articles apply to LLMs, why the Cloud Act is a problem, and which architectures keep you compliant.

January 15, 2025 · 8 min read