⚑ Next-Generation AI Platform

K8s-Powered LLM Platform
for Enterprise Solutions

TraceMyPods delivers powerful AI capabilities through a secure, scalable Kubernetes-based platform with multiple LLMs and image-generation capabilities.

🧠 10+ AI Models
⚑ 99.9% Uptime
πŸ›‘οΈ 24/7 Support
🎬 Live Demo

See TraceMyPods in Action

Watch how our AI platform transforms your workflow with intelligent automation, seamless integrations, and enterprise-grade performance.

πŸ—οΈ Platform Features

Enterprise-Grade Infrastructure

TraceMyPods combines powerful AI capabilities with enterprise-grade infrastructure

πŸ“¦

Modular Microservices

Fine-grained APIs such as the admin, order, token, ask, and deliver services keep the platform highly modular and maintainable.

🧩

Embeddings & Vector Search

Semantic search powered by Qdrant and custom embeddings from the embedding-api service, enabling real-time retrieval and AI memory.
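The idea behind the embedding-api and Qdrant pairing can be sketched in a few lines: text is turned into vectors, and queries are answered by nearest-neighbour similarity. The toy `embed()` below is a bag-of-words stand-in, not the platform's real embedding model, and the linear scan stands in for Qdrant's indexed search.

```python
# Illustrative sketch of embedding-based semantic search.
# embed() is a hypothetical stand-in for the real embedding-api model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words term-count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, docs: list) -> str:
    # Qdrant performs this nearest-neighbour lookup at scale;
    # here we scan linearly for illustration.
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

docs = [
    "restart a crashed pod",
    "generate an api token",
    "stream events with kafka",
]
best = search("how do I restart my pod", docs)
```

In production the same shape applies: embedding-api produces the vectors, Qdrant stores them and answers the similarity query.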

πŸ•΅οΈ

Observability Stack

Integrated with Prometheus, Grafana, and Loki for deep visibility, tracing, and real-time alerts across all Kubernetes services.

πŸ“¨

Email Sending with Resend

Built-in SMTP support for OTP verification and invoice emails.

πŸ“‘

Event Streaming with Kafka

Reliable real-time messaging and data pipelines between microservices using Apache Kafka integration.
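As a sketch of that pipeline, the snippet below serializes an order-status event the way a producer service might before publishing it to Kafka. The `orders` topic name and the event schema are hypothetical; the producer call itself is guarded so the snippet runs without a broker.

```python
# Sketch of an inter-service event, assuming a JSON message format
# (hypothetical; the platform's actual schema is internal).
import json
import time

def make_order_event(order_id: str, status: str) -> bytes:
    # Serialize an order-status event for a hypothetical "orders"
    # topic consumed downstream (e.g. by the deliver service).
    event = {"order_id": order_id, "status": status, "ts": int(time.time())}
    return json.dumps(event).encode("utf-8")

payload = make_order_event("ord-42", "paid")

if __name__ == "__main__":
    # Publishing requires a running broker and the kafka-python package:
    # from kafka import KafkaProducer
    # producer = KafkaProducer(bootstrap_servers="kafka:9092")
    # producer.send("orders", value=payload)
    pass
```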

πŸ”

Secure Token Authentication

Generate secure tokens for API access with Redis-backed authentication and 1-hour expiry for enhanced security.

🧠

Multiple AI Models

Access a variety of LLMs, from TinyLlama to the more powerful Mistral and CodeLlama, for different use cases and requirements.

πŸ–ΌοΈ

Image Generation

Create AI-generated images from text descriptions with our public API feature, currently in beta.

πŸ”§

Customizable

Easily extendable and customizable to fit your specific needs with a modular architecture.

⚑

High Performance with HPA and GPU

Optimized infrastructure with GPU acceleration for AI models, horizontal pod autoscaling (HPA), and efficient request routing.
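The autoscaling behaviour follows the standard Kubernetes HPA formula: desired replicas = ceil(current replicas × current metric / target metric). The target values below are illustrative, not the platform's actual settings.

```python
# The standard Kubernetes HPA scaling formula (from the Kubernetes docs):
#   desired = ceil(current_replicas * current_metric / target_metric)
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float) -> int:
    return math.ceil(current_replicas * current_metric / target_metric)

# e.g. 2 replicas at 90% average utilisation with a 60% target
# scale up to ceil(2 * 90 / 60) = 3 replicas
scaled_up = desired_replicas(2, 90, 60)
```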

πŸ“Š

Analytics Dashboard

Comprehensive analytics dashboard for monitoring usage, performance, and model interactions.

☁️

Kubernetes-Based

Built on EKS with Istio service mesh for enterprise-grade reliability, scalability, and security.

πŸ€–
TraceMyPods AI Assistant


πŸ’‘ Premium models available for enhanced capabilities:

#mistral #codellama #llama2 #phi
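The hashtags suggest the model is selected by tagging the prompt (e.g. `#mistral explain pods`). The parser below is a guess at that convention for illustration only, not the platform's actual implementation, and the model list is taken from the tags shown above.

```python
# Hypothetical sketch of hashtag-based model selection.
import re

KNOWN_MODELS = {"mistral", "codellama", "llama2", "phi", "tinyllama"}

def parse_prompt(raw: str, default: str = "tinyllama"):
    # Split a leading "#model" tag from the prompt, falling back
    # to a default model when no known tag is present.
    m = re.match(r"#(\w+)\s+(.*)", raw.strip(), re.DOTALL)
    if m and m.group(1).lower() in KNOWN_MODELS:
        return m.group(1).lower(), m.group(2)
    return default, raw.strip()

model, prompt = parse_prompt("#mistral explain kubernetes pods")
```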

🧠 Available Models

Choose Your AI Model

Choose from our selection of powerful AI models to suit your specific needs

πŸ’‘
Smoll
~500 MB · 1 GB RAM

Lightweight model perfect for chatbots with minimal resource requirements.

Free
πŸ› οΈ
Your Custom Model
Custom · Scalable

Yes, we support hosting your custom model.

Contact Sales
🌐
Gemma 2B
2.6 GB · 6–8 GB RAM

Google's open-weight chat-optimized model suitable for small to medium workloads.

Open Source
πŸ¦…
Falcon-RW 1B
~1.3 GB · 4 GB RAM

A small member of the Falcon family, ideal for offline summarization and QA tasks.

Apache 2.0
πŸ§‘β€πŸ’»
Replit Code 3B
3.3 GB · 8–10 GB RAM

Fine-tuned for code generation and completions. Great for coding copilots.

Open Source
🀏
TinyLlama
~1.1 GB · 4 GB RAM

Lightweight model perfect for simple Q&A and chat applications with minimal resource requirements.

$15/month
🧠
Mistral-7B
~4.2 GB · 8–16 GB RAM

Powerful general-purpose model with excellent reasoning capabilities and broad knowledge.

$15/month
πŸ’»
CodeLlama
4.5–10 GB · 16–24 GB RAM

Specialized for code generation and understanding across multiple programming languages.

$10/month
πŸ¦™
LLaMA 2
4.5–40 GB · 16–80 GB RAM

Versatile but resource-heavy model with state-of-the-art performance across various tasks.

$20/month
Ο†
Phi-2
~1.7 GB · 6–8 GB RAM

Efficient and compact model with excellent reasoning capabilities for its size.

$12/month
πŸ—οΈ System Architecture

Robust Infrastructure

TraceMyPods is built on a robust, scalable infrastructure designed for enterprise use

Application Components (Beta)

End User (Browser): Web Interface
Istio Ingress: Traffic Management
Frontend Pod: UI & Token Generation
Ask API: Handles User Prompts
Token API: Manages Session Tokens
Order API: Handles Orders
Admin API: Admin Dashboard
Vector API: Semantic Search Access
Embedding API: Text Vectorization
Deliver API: Handles Deliveries
Redis Pod: Token Storage
MongoDB Pod: Main Database
Qdrant Pod: Vector DB
AI Pod (Ollama): LLM Model Hosting
LLM Models Pod: Model Deployment
Mailpit Pod: Email Testing
Prometheus: Metrics Collection
Grafana: Metrics Visualization
Loki: Log Aggregation
Kafka: Async Messaging