About Kedara
Millions of families struggle to coordinate elder care. We're building an AI-powered Care Coordination System with a personalized, voice-AI Assistant that becomes a trusted partner for caregivers, helping them manage the coordination burden and reduce burnout.
Our founders scaled Speech AI and NLP at MindMeld and BabbleLabs (both acquired by Cisco). We're a small team of product, AI/ML, and senior care experts. You'll be our founding systems engineering lead.
The Role
Design and build the core backend infrastructure for the application and for real-time AI inference, powering our app and its voice-AI conversational assistant. You will own the overall system architecture behind the frontend and the AI orchestration layer, optimizing for latency, cost, security, reliability, and human-like conversational fluidity.
Core Infrastructure
- Partner with frontend, product, and AI/ML leads to translate application requirements into end-to-end system architecture
- Own end-to-end delivery from design through production for reliability and scalability
- Design scalable APIs serving user onboarding, the data layer, identity, and role-based access control
- Build security and compliance: HIPAA-compliant data handling, encryption, access controls, PHI protection
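To make the access-control responsibility above concrete, here is a minimal sketch of a role-based permission check for PHI. The roles, permission names, and mapping are illustrative assumptions, not Kedara's actual model; a real HIPAA deployment would also need audit logging, encryption at rest, and minimum-necessary policies.

```python
from dataclasses import dataclass

# Hypothetical role -> permission mapping (illustrative only).
ROLE_PERMISSIONS = {
    "caregiver": {"read_phi", "write_notes"},
    "family_viewer": {"read_summary"},
    "admin": {"read_phi", "write_notes", "manage_users"},
}

@dataclass
class User:
    user_id: str
    role: str

def can_access(user: User, permission: str) -> bool:
    """Return True only if the user's role grants the requested permission."""
    return permission in ROLE_PERMISSIONS.get(user.role, set())
```

The default of an empty permission set means unknown roles are denied by default, which is the safe failure mode for PHI.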
AI Inference Infrastructure
- Architect real-time media transport layers (e.g., WebRTC or optimized WebSockets) to stream raw audio, video, and image data between the mobile client and the inference engine, enabling ultra-low-latency in-app voice and video interactions
- Integrate and orchestrate models (ASR, TTS, LLMs/SLMs, OCR, etc.) in a multi-modal pipeline (audio/voice, video, images), with an orchestration layer for “stateful agents” that coordinate complex care tasks
- Route inference between on-device models (Speech-to-Text, quick responses) and cloud LLMs (reasoning, RAG) based on latency and cost trade-offs
- Establish observability for inference quality: latency, token usage, cost per request, hallucination detection
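For the media-transport bullet above, low-latency streaming typically frames raw audio chunks with a small binary header before sending them over the wire. A stdlib-only sketch of such framing follows; the field layout is an assumption for illustration, not a Kedara protocol, and a production system would use WebRTC or a negotiated WebSocket subprotocol.

```python
import struct

# Hypothetical frame header: sequence number (uint32), capture timestamp in
# milliseconds (uint64), payload length (uint32), all big-endian.
HEADER = struct.Struct(">IQI")

def pack_frame(seq: int, ts_ms: int, payload: bytes) -> bytes:
    """Prefix an audio chunk with its header for transmission."""
    return HEADER.pack(seq, ts_ms, len(payload)) + payload

def unpack_frame(frame: bytes) -> tuple[int, int, bytes]:
    """Recover (sequence, timestamp, payload) from a received frame."""
    seq, ts_ms, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    return seq, ts_ms, payload
```

Sequence numbers and timestamps let the receiver detect loss and reorder or drop stale frames, which matters more than retransmission when the goal is conversational latency.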
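The on-device vs. cloud routing trade-off described in the bullets above could be sketched roughly as follows. The thresholds, backend names, and request fields are illustrative assumptions, not measured numbers or real model identifiers.

```python
from dataclasses import dataclass

@dataclass
class InferenceRequest:
    text: str
    needs_reasoning: bool   # e.g., RAG or multi-step planning
    latency_budget_ms: int  # how long the caller can wait

# Illustrative latency assumptions, not benchmarks.
ON_DEVICE_LATENCY_MS = 50
CLOUD_LATENCY_MS = 800

def route(req: InferenceRequest) -> str:
    """Pick a backend: on-device for quick responses, cloud for heavy reasoning."""
    if req.needs_reasoning:
        return "cloud-llm"
    if req.latency_budget_ms < CLOUD_LATENCY_MS:
        return "on-device-slm"
    return "cloud-llm"  # budget is generous; prefer the higher-quality model
```

A real router would also fold in per-token cost, device battery, and network conditions, but the shape of the decision stays the same.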
Requirements
- 8+ years building distributed backend systems at scale, especially consumer apps, ideally taken from 0 to 1 in high-ambiguity, high-velocity environments
- Strong proficiency in a systems language (Go, Rust, or C++) and in Python, for rapid AI experimentation and for leading high-performance implementations that deliver conversational fluidity
- Hands-on experience with Kubernetes, Docker, CI/CD pipelines, and GCP or AWS stacks
- Experience with relational (PostgreSQL, MySQL) and NoSQL databases, and with event-driven architectures (e.g., Kafka, Redis Pub/Sub, or AWS Kinesis)
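The event-driven pattern named in the last requirement can be shown with a minimal in-process publish/subscribe sketch; Kafka, Redis Pub/Sub, or Kinesis play this role durably and at scale. Topic names here are hypothetical examples.

```python
from collections import defaultdict
from typing import Any, Callable

class EventBus:
    """Minimal in-process pub/sub: handlers register per topic, publishers fan out."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Any) -> None:
        # Deliver to every subscriber of this topic; unknown topics are a no-op.
        for handler in self._subscribers[topic]:
            handler(event)
```

Decoupling producers from consumers this way is what lets, say, a care-task service emit an event without knowing which notification or analytics services consume it.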