S 01 - Manifesto

Preamble

A working notebook kept by a developer who treats artificial intelligence as an engineering material - not a promise.

Software that
reasons for itself.

I ship AI into production web apps - for teams that need to automate, scale, and deliver smarter interfaces. End to end, from embedding to pixel.

Base
Barcelona / Paris
Practice
Full-Stack x AI
Since
2019
Engagements
Retainer / fixed
fig. 01 - Retrieval-augmented inference (illustrative schematic)

  1. 00 Query - fn. 00 - user context
  2. 01 Embedder - fn. 01 (3072d) - text-embedding-3
  3. 02 Retrieve - fn. 02 (Postgres) - pgvector, hybrid + rerank
  4. 03 Reason - fn. 03 (reason + act) - Claude, JSON tools
  5. 04 Verify - fn. 04 (regressions) - Zod, evals + LLM judge
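The five stages of the schematic can be sketched as one typed function chain. This is a toy illustration only: the stage names mirror the schematic, but every signature and stand-in body below is hypothetical, not the production code.

```typescript
// Toy sketch of the five-stage pipeline: Query -> Embed -> Retrieve -> Reason -> Verify.
// Real versions would call an embedding API, Postgres/pgvector, and an LLM.

type Doc = { id: string; text: string; score: number };

// 00 Query - capture the user's question
const query = (text: string) => ({ text });

// 01 Embedder - stand-in for text-embedding-3 (3072 dims in the schematic)
async function embed(text: string): Promise<number[]> {
  // hypothetical: fold characters into a tiny vector instead of calling an API
  const v = new Array(8).fill(0);
  for (let i = 0; i < text.length; i++) v[i % 8] += text.charCodeAt(i);
  return v;
}

// 02 Retrieve - stand-in for pgvector hybrid search + rerank
async function retrieve(_vector: number[], corpus: Doc[]): Promise<Doc[]> {
  return [...corpus].sort((a, b) => b.score - a.score).slice(0, 3);
}

// 03 Reason - stand-in for an LLM call with JSON tools
async function reason(question: string, docs: Doc[]): Promise<string> {
  return `Answer to "${question}" grounded in ${docs.length} documents.`;
}

// 04 Verify - stand-in for schema validation + evals
function verify(answer: string): string {
  if (answer.length === 0) throw new Error("empty answer");
  return answer;
}

async function pipeline(question: string, corpus: Doc[]): Promise<string> {
  const q = query(question);
  const vec = await embed(q.text);
  const docs = await retrieve(vec, corpus);
  return verify(await reason(q.text, docs));
}
```

The point of the shape, not the bodies: each stage has one typed input and one typed output, so any stage can be swapped or evaluated in isolation.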

React - Next.js - TypeScript - Node - Python - PostgreSQL - pgvector - Docker - AWS - OpenAI - Anthropic - LangChain - RAG - Three.js - Tailwind - Observability - Evals - Stripe
S 02 - Practice

Three practices,
one maker.

I work at the intersection of three disciplines that rarely talk to each other enough: full-stack engineering, model integration, and product experience. My role is to hold them together - a RAG pipeline that lands in 300 ms, an interface that makes the model's decision legible, an architecture that holds up on a Monday morning. Software that reasons is only useful when it keeps its promises in your customer's hand.

Catalogue of instruments

Frontend
React - Next.js - TypeScript - Tailwind - Three.js
Backend
Node - Express - Python - FastAPI - PostgreSQL - Mongo
AI
OpenAI - Anthropic - LangChain - pgvector - RAG - Evals
Ops
Docker - AWS - CI/CD - Observability - Stripe
S 03 - Case studies

Five systems,
put to work.

Flagship
00

AI reconciliation - freelance client work

Map Align - a 5-act pipeline

Reconcile volatile external sources (websites, maps, PDFs) with an internal database of thousands of physical spaces, without the AI ever hallucinating deletions. The deterministic engine decides; the LLM reviews and can only downgrade.

Stack - TanStack - Drizzle + Postgres - OpenAI - Zod - Turborepo

5
pipeline stages
0.72
match threshold
~3300
lines of core code
01 Extract
02 Unify
03 Compare
04 Apply
05 Review
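The "engine decides, the LLM can only downgrade" rule is easy to enforce in code. A minimal sketch, assuming illustrative tier names (the real Map Align schema is not shown here):

```typescript
// Confidence tiers for a proposed change, lowest first. Names are
// illustrative stand-ins for the real pipeline's decision levels.
const TIERS = ["skip", "needs_review", "auto_apply"] as const;
type Tier = (typeof TIERS)[number];

const rank = (t: Tier) => TIERS.indexOf(t);

// The deterministic engine decides the tier; the LLM review pass may
// only lower it (auto_apply -> needs_review -> skip), never raise it.
// Deletions are not a tier at all, so the model cannot introduce one.
function applyReview(engineTier: Tier, reviewTier: Tier): Tier {
  return rank(reviewTier) < rank(engineTier) ? reviewTier : engineTier;
}
```

Because the clamp is a pure function outside the model, no prompt change can widen the model's authority.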
01

Legaltech - production

Verixa

Legal-document checker wired into Legifrance and Judilibre. Structured extraction, citation verification, alerts on outdated case law. Built for firms that can't afford to be wrong.

Stack - Next.js - FastAPI - pgvector - Anthropic - RAG

>= 90%
citation accuracy
< 500 ms
p50 latency
continuous
automated evaluations
eval score - 90 days
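Citation verification of the Verixa kind is, at its core, a deterministic lookup plus a freshness flag. A minimal sketch, with a hypothetical citation format and in-memory index standing in for the Legifrance/Judilibre lookups:

```typescript
// Deterministic citation check: unknown citations and superseded case
// law both raise alerts. The index shape is hypothetical.

type CaseRecord = { id: string; supersededBy?: string };

function checkCitation(
  citation: string,
  index: Map<string, CaseRecord>
): "ok" | "unknown" | "outdated" {
  const record = index.get(citation);
  if (!record) return "unknown";              // alert: citation not found
  if (record.supersededBy) return "outdated"; // alert: newer case law exists
  return "ok";
}
```

Rules like this sit before any LLM judgment: a firm that can't afford to be wrong gets hard checks first, qualitative scoring second.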
02

Ops - B2B SaaS (client engagement)

PMS - project management, rewired

Project management whose reporting writes itself: weekly digest, drift detection, client brief. The team stays in their tool, the board gets a readable brief every Monday.

Stack - Next.js - Node - Postgres - OpenAI - Stripe

12+
AI features integrated
RBAC
fine-grained permissions
auto
weekly digests
digests shipped - 19 weeks
03

Mobile + Admin - client work

DEFIM - maritime learning

An exam-prep platform for maritime certifications: an iOS/Android app and a web admin, designed and shipped end to end. Timed quizzes, mock exams, rich pedagogical content, and a drag-and-drop admin workflow to orchestrate the whole thing.

Stack - Expo / React Native - Next.js - Supabase - dnd-kit - Zod

2 apps
mobile + admin
iOS/Android
native platforms
GDPR
auto account deletion
admin - catalogue v1

Methodology and references available on request - contact@pauldosser.fr.

S 04 - Working log

Seven years of writing,
by hand.

  1. 2023 - present

    Freelance - Full-Stack x AI

    AI integration and product engineering engagements for scale-ups and studios: audit, prototype, production rollout, team support.

    Barcelona / Paris
  2. 2024 - present

    Team Lead - Two.Zero

    Leading a product and engineering team: scoping, architecture, delivery, technical coaching. Focus on AI integration and production reliability.

    Barcelona
  3. 2025 (6 months)

    Full-Stack Developer - NTT DATA

    Public Procurement Data Space (PPDS) project for the European Commission: SPARQL queries, Virtuoso knowledge graphs, front-end and back-end work supporting public data accessibility and transparency.

    Barcelona - hybrid
  4. 2022 - 2024

    Full-Stack Lecturer - Epitech

    Teaching modern web architecture, supervising student projects, engineering practice.

    Barcelona
  5. 2019 - 2022

    Developer - CertiPair

    Certification platform: product, API, client-facing interfaces. First scaling experience, first real production incidents.

    Paris
S 05 - Method

Three acts. No magic.

I. Frame

Audit and hypothesis

I sit inside your product for two to three days. I leave with a testable hypothesis, a named risk, a demo that's doable in two weeks.

  • Product / data workshops
  • Risk map
  • Eval plan
II. Make

Build and evaluate

Short sprints, weekly demos, automated evaluations from the first commit. We measure before celebrating - and we keep the numbers.

  • RAG / tool pipelines
  • Guardrails and observability
  • Eval-driven iteration
III. Ship

Into production

Deployment, monitoring, handover to your team - or ongoing support. The goal: the system holds on a Monday morning without me.

  • CI/CD and infrastructure
  • Dashboards and alerts
  • Team handover
A good AI integrator spends 70% of their time on what surrounds the model - data, guardrails, interface, evaluation. The rest is a matter of taste.
- Paul Dosser, working notebook, 2026
S 05b - Frequently asked - ref. FAQ.01

What people ask first

Frequently asked.

The five questions that come up most often before a first call. If yours is missing, write directly - I always reply.

Do you ship production RAG systems?
Yes. I design and ship full RAG pipelines - pgvector hybrid retrieval with reranking, typed tools with Zod, automated evaluations and tracing via Langfuse - for European clients since 2023. Every project lands with its eval suite, not just a demo that works on Monday.
Are you available for freelance work in Barcelona or remote?
Yes. Based between Barcelona and Paris, I work freelance for clients in France, Spain, and the wider EU, on-site or remote. I take on scoped full-time engagements, fixed-price projects, or a few days per week on retainer, depending on the contract.
Which LLM providers do you integrate in practice?
Mostly Claude (Anthropic) and OpenAI models for reasoning and structured generation, with text-embedding-3 for embeddings. I pick by use case: reliable function-calling, latency, cost per token, data compliance. The stack is provider-agnostic - the same pipeline can swap from one to another by reconfiguring an adapter.
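"Swap providers by reconfiguring an adapter" comes down to coding against one interface. A minimal sketch, with stand-in adapters in place of the real Anthropic/OpenAI SDK wrappers:

```typescript
// Provider-agnostic adapter: the pipeline depends only on LlmAdapter,
// so switching providers is configuration, not a rewrite.
// The interface and adapter bodies here are illustrative.

interface LlmAdapter {
  name: string;
  complete(prompt: string): Promise<string>;
}

// Stand-ins; real adapters would wrap the vendor SDKs.
const claudeAdapter: LlmAdapter = {
  name: "anthropic",
  complete: async (p) => `[claude] ${p}`,
};
const openaiAdapter: LlmAdapter = {
  name: "openai",
  complete: async (p) => `[gpt] ${p}`,
};

function pickAdapter(provider: string): LlmAdapter {
  const adapters: Record<string, LlmAdapter> = {
    anthropic: claudeAdapter,
    openai: openaiAdapter,
  };
  const adapter = adapters[provider];
  if (!adapter) throw new Error(`unknown provider: ${provider}`);
  return adapter;
}
```

The selection criteria listed above (function-calling reliability, latency, cost, compliance) then live in configuration, not scattered through the codebase.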
How do you evaluate AI system quality in production?
I treat prompts like code: versioned eval sets, LLM judges for qualitative metrics, deterministic rules for hard constraints (citations, formats, required fields), and continuous comparison through Langfuse. Every prompt or model change runs through the suite before it reaches production, which kills silent regressions.
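The gate described above - deterministic rules first, judge score second - can be sketched in a few lines. The regex, field names, and threshold below are illustrative assumptions, not the real suite:

```typescript
// Eval gate: a prompt or model change ships only if every case passes
// the hard constraint AND the average judge score clears the bar.

type EvalCase = { output: string; judgeScore: number }; // judgeScore in [0, 1]

// Hard rule stand-in: the output must cite at least one article.
const hasCitation = (s: string) => /\bart\.\s*\d+/i.test(s);

function gate(cases: EvalCase[], minJudgeAvg = 0.8): boolean {
  // deterministic rules: every output must satisfy the hard constraint
  if (!cases.every((c) => hasCitation(c.output))) return false;
  // qualitative bar: average LLM-judge score must clear the threshold
  const avg = cases.reduce((s, c) => s + c.judgeScore, 0) / cases.length;
  return avg >= minJudgeAvg;
}
```

Running this in CI on a versioned eval set is what "kills silent regressions": a change that degrades either axis never reaches production.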
Which stack do you typically ship on?
TypeScript end to end: Next.js (App Router, React Server Components) on the front, Node or FastAPI on the backend, PostgreSQL with pgvector for the vector store, Zod for AI contracts, and Langfuse for LLM observability. Hosting on Vercel or the client cloud depending on data residency constraints.
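An "AI contract" means model output is untrusted text until it passes a schema. Zod plays that role in the stack above; the hand-rolled stand-in below keeps the sketch dependency-free, and its field names are hypothetical:

```typescript
// AI contract pattern: parse-don't-trust. Malformed JSON or an
// out-of-range field throws before the value reaches application code.

type Brief = { title: string; risk: "low" | "medium" | "high" };

function parseBrief(raw: string): Brief {
  const data = JSON.parse(raw); // throws on malformed JSON
  if (typeof data.title !== "string") throw new Error("title must be a string");
  if (!["low", "medium", "high"].includes(data.risk)) {
    throw new Error("risk must be low | medium | high");
  }
  return { title: data.title, risk: data.risk };
}
```

With Zod, the same contract also yields the static type for free via schema inference, which is why it sits at every model boundary.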
S 06 - Contact

A letter

Let's build
something together.

If you have a product that could reason a little better, a process a model could lift off your team, or an AI hypothesis waiting for proof - write. I answer within 48 hours with an honest, free, and often more useful-than-expected opinion.

- Paul.