Backend AI Engineer
ezAIx, Inc · Thousand Oaks, USA (Remote)
May 2025 – Present
ezAIx is a US-based enterprise AI company trusted by Fortune 500 companies and government agencies, building a suite of production-grade platforms across voice AI (VoiceCoreIQ), Teams observability (TeamsCoreIQ), UC and network monitoring (NetmonIQ), Copilot ROI (CLA), and modular AI agents (ezAgents). I joined as the sole backend engineer and owned the entire server-side architecture of VoiceCoreIQ, the flagship product handling real-time voice, translation across 100+ languages, omnichannel messaging, and intelligent automation at enterprise scale.
System Architecture & Real-Time Presence
- Designed and built VoiceCoreIQ from scratch: a Django ASGI monolith split across 18 internal apps, including call_manager, queue_engine, agent_portal, analytics, omnichannel, rag_service, and billing.
- Built a node-based visual workflow engine for dynamic call routing, IVR, and omnichannel orchestration.
- Implemented all real-time features on Django Channels, with 5 distinct WebSocket consumers handling separate event streams (agent sessions, live transcripts, supervisor feeds, queue updates, voicebot state).
- Engineered a real-time Agent Presence Dashboard powered by a custom ASGI Server-Sent Events (SSE) middleware and Redis Pub/Sub, giving supervisors live visibility into agent states, active calls, and queue metrics.
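The presence fan-out described above can be sketched as follows. This is an illustrative sketch only: `sse_format` follows the standard Server-Sent Events wire format, while `PresenceBroadcaster` uses an in-process subscriber list as a stand-in for the real Redis Pub/Sub channel; the class and method names are hypothetical.

```python
import json

def sse_format(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame: a named event plus a JSON
    payload, terminated by a blank line as the SSE format requires."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

class PresenceBroadcaster:
    """Fans agent-presence updates out to every open supervisor stream.
    In production this sits behind Redis Pub/Sub; here an in-process
    subscriber list stands in for the Redis channel."""

    def __init__(self):
        self.subscribers = []          # one frame list per open SSE connection

    def subscribe(self) -> list:
        """Register a new supervisor stream and return its frame buffer."""
        frames = []
        self.subscribers.append(frames)
        return frames

    def publish(self, agent_id: str, state: str) -> None:
        """Push one presence change to all subscribed streams."""
        frame = sse_format("presence", {"agent": agent_id, "state": state})
        for frames in self.subscribers:
            frames.append(frame)
```

In the real middleware, each subscriber buffer would be drained by an ASGI response generator writing frames to the open HTTP connection.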
Production AI Voicebot & Multilingual Intelligence
- Architected the real-time voicebot by bridging the Azure OpenAI GPT-4o Realtime API with Azure Communication Services (ACS) media streams over WebSocket.
- Implemented real-time PCM audio resampling (16 kHz ↔ 24 kHz), tool calling (order lookup, CSAT collection), configurable personality profiles, and session-level token/cost tracking.
- Integrated Azure OpenAI GPT-4o-mini for real-time sentiment analysis, post-call summarization, and intelligence report emails.
- Built a 3-tier TTS caching strategy (in-memory LRU → Redis → Azure Speech) that cut TTS API costs by 85% and call setup latency by 70%.
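The 3-tier lookup above (local LRU, then shared cache, then synthesis) can be sketched like this. All names are illustrative; a plain dict stands in for Redis, and any callable stands in for the Azure Speech request.

```python
from collections import OrderedDict

class TieredTTSCache:
    """Three-tier lookup for synthesized speech: process-local LRU, then a
    shared cache (Redis in the real system; a dict here), then the expensive
    TTS backend. Sketch only; the production key scheme may differ."""

    def __init__(self, synthesize, shared_cache=None, lru_size=128):
        self.synthesize = synthesize      # expensive call (Azure Speech in prod)
        self.shared = shared_cache if shared_cache is not None else {}
        self.lru = OrderedDict()
        self.lru_size = lru_size
        self.backend_calls = 0            # counts tier-3 misses

    def get(self, text: str, voice: str) -> bytes:
        key = f"{voice}:{text}"
        # Tier 1: process-local LRU
        if key in self.lru:
            self.lru.move_to_end(key)
            return self.lru[key]
        # Tier 2: shared cache (a Redis GET in production)
        audio = self.shared.get(key)
        if audio is None:
            # Tier 3: synthesize, then backfill the shared tier
            audio = self.synthesize(text, voice)
            self.backend_calls += 1
            self.shared[key] = audio      # Redis SET with a TTL in production
        # Backfill tier 1 and evict the least recently used entry if full
        self.lru[key] = audio
        if len(self.lru) > self.lru_size:
            self.lru.popitem(last=False)
        return audio
```

Because common prompts (greetings, IVR menus) repeat across calls, most lookups resolve in tiers 1 or 2, which is what drives the cost and latency reductions claimed above.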
RAG Pipeline & Knowledge Integration
- Built a full document ingestion pipeline: Azure Blob Storage → Azure Document Intelligence (layout + OCR) and Firecrawl → 512-token chunk splitting → llama-text-embed-v2 embeddings → per-tenant Pinecone namespaces.
- Supported PDF, DOCX, PPTX, and plain text; Document Intelligence handled complex tables, headers, and multi-column layouts that naive text extraction would mangle.
- Designed namespace-based multi-tenancy in Pinecone so each client's knowledge base is fully isolated, with no cross-tenant contamination on retrieval.
- Maintained a separate index for semantic search over call transcripts, allowing supervisors to query historical conversations.
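The chunking and tenant-isolation steps in the pipeline above can be sketched as follows. This is a simplified stand-in: whitespace tokens approximate the real tokenizer, the overlap value is illustrative, and the namespace naming scheme is hypothetical.

```python
def chunk_text(text: str, max_tokens: int = 512, overlap: int = 50) -> list:
    """Split extracted document text into overlapping chunks for embedding.
    Whitespace-split words approximate tokens here (illustrative only)."""
    tokens = text.split()
    chunks, start = [], 0
    while start < len(tokens):
        end = min(start + max_tokens, len(tokens))
        chunks.append(" ".join(tokens[start:end]))
        if end == len(tokens):
            break
        start = end - overlap      # overlap preserves context across boundaries
    return chunks

def tenant_namespace(tenant_id: str) -> str:
    """Derive the per-tenant Pinecone namespace. Upserts and queries both pass
    the caller's namespace, so retrieval can never cross tenant boundaries.
    (The "kb-" prefix is a hypothetical naming convention.)"""
    return f"kb-{tenant_id}"
```

In the real pipeline, each chunk would be embedded with llama-text-embed-v2 and upserted into the namespace returned by `tenant_namespace` for the owning client.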
Event-Driven Azure Architecture & Integrations
- Architected a scalable, event-driven system using Azure Event Grid to handle incoming call webhooks, call state changes, and media streams.
- Implemented call recording webhooks that process, encrypt, and store recordings in Azure Blob Storage (AES-256 at rest, TLS 1.2/1.3 in transit).
- Designed human agent handoff with Redis-coordinated stop/lock flags and Celery tasks that cancel active media, stop DTMF recognition, and route calls to the next workflow node while preserving full conversation context.
- Integrated the Microsoft Graph API for presence webhooks, calendar scheduling, and Teams notifications.
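The stop/lock coordination behind the handoff flow can be sketched like this. A dict stands in for Redis (the lock claim maps to SETNX in production), and the class and key names are hypothetical.

```python
class HandoffCoordinator:
    """Coordinates voicebot-to-human handoff. A lock ensures exactly one
    worker performs the handoff; a stop flag tells media and DTMF tasks to
    halt. A dict stands in for Redis (SETNX/GET in production); sketch only."""

    def __init__(self, store=None):
        self.store = store if store is not None else {}

    def request_handoff(self, call_id: str, worker_id: str) -> bool:
        """Atomically claim the handoff (a SETNX in Redis). Returns True for
        the single worker that wins; that worker then cancels active media
        and routes the call to the next workflow node."""
        lock_key = f"handoff_lock:{call_id}"
        if lock_key in self.store:
            return False                          # another worker already won
        self.store[lock_key] = worker_id
        self.store[f"stop:{call_id}"] = True      # signals media tasks to halt
        return True

    def should_stop(self, call_id: str) -> bool:
        """Polled by media-streaming and DTMF tasks on each loop iteration."""
        return self.store.get(f"stop:{call_id}", False)
```

Because the stop flag lives in the shared store, Celery workers on other hosts see it on their next poll and wind down their media tasks without any direct RPC between workers.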
Infrastructure, DevOps & Scale
- Containerized the entire stack with Docker multi-stage builds (a bytecode-only Dockerfile for IP protection) and an nginx reverse proxy with WebSocket/SSE passthrough.
- Designed a multi-tenant MongoDB architecture with the MongoEngine ODM, lazy per-tenant connection aliases, and complex aggregation pipelines.
- Built analytics dashboards covering SLA thresholds, sentiment, multi-question CSAT, agent productivity, and smart disposition, with PDF export via ReportLab.
- Owned the backend architecture end-to-end and participated in client calls as Product Lead to scope features and drive iterative delivery.
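The lazy per-tenant alias pattern mentioned above can be sketched as follows. The register function is injected so the pattern is shown without a live database; in the real stack it would be `mongoengine.register_connection`, and the alias/database naming is hypothetical.

```python
class TenantConnections:
    """Lazily registers one MongoDB connection alias per tenant, so a process
    only sets up aliases for tenants it actually serves. The `register`
    callable is injected (mongoengine.register_connection in the real stack);
    this is an illustrative sketch, not the production implementation."""

    def __init__(self, register, db_prefix: str = "tenant_"):
        self.register = register
        self.db_prefix = db_prefix
        self._aliases = {}            # tenant_id -> registered alias

    def alias_for(self, tenant_id: str) -> str:
        """Return the connection alias for a tenant, registering it on first
        use so no alias exists until a request for that tenant arrives."""
        alias = self._aliases.get(tenant_id)
        if alias is None:
            alias = f"{self.db_prefix}{tenant_id}"
            self.register(alias=alias, db=alias)   # one-time, lazy registration
            self._aliases[tenant_id] = alias
        return alias
```

Documents can then be queried against the returned alias (e.g. via MongoEngine's per-query database switching), keeping each tenant's data in its own database.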
Company Kickoff — Dubai
- Flew to Dubai (Dec 17–22, 2025) for the company kickoff, where selected engineers and founders gathered in person for a week of strategy and planning.
- Worked on-site with the founding team to align on product vision, roadmap priorities, and the coming year's delivery strategy.
Impact & Business Outcomes
- Reduced TTS API costs by 85% through tiered caching, from ~$0.018/call to under $0.003/call at scale.
- Cut call setup latency by 70%: voicebot first response dropped from ~2.8s to under 0.85s.
- The platform processes thousands of calls daily for enterprise clients.