⚑ Next-Generation AI Platform

K8s-Powered LLM Platform
for Enterprise Solutions

TraceMyPods delivers powerful AI capabilities through a secure, scalable Kubernetes-based platform with multiple LLMs and image-generation capabilities.

🧠 10+ AI Models
⚑ 99.9% Uptime
πŸ›‘οΈ 24/7 Support
🎬 Live Demo

See TraceMyPods in Action

Watch how our AI platform transforms your workflow with intelligent automation, seamless integrations, and enterprise-grade performance.

πŸ—οΈ Platform Features

Enterprise-Grade Infrastructure

TraceMyPods combines powerful AI capabilities with enterprise-grade infrastructure

πŸ“¦

Modular Microservices

Fine-grained APIs such as the admin, order, token, ask, and deliver services keep the platform highly modular and maintainable.

🧩

Embeddings & Vector Search

Semantic search powered by Qdrant and custom embeddings from the embedding-api service, enabling real-time retrieval and AI memory.
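The idea behind the embedding-api and Qdrant pairing can be sketched in a few lines: text is turned into vectors, and queries are answered by nearest-neighbour similarity. The toy `embed()` below is a bag-of-words stand-in, not the platform's real embedding model, and the linear scan stands in for Qdrant's indexed search.

```python
# Illustrative sketch of embedding-based semantic search.
# embed() is a hypothetical stand-in for the real embedding-api model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words term-count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, docs: list) -> str:
    # Qdrant performs this nearest-neighbour lookup at scale;
    # here we scan linearly for illustration.
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

docs = [
    "restart a crashed pod",
    "generate an api token",
    "stream events with kafka",
]
best = search("how do I restart my pod", docs)
```

In production the same shape applies: embedding-api produces the vectors, Qdrant stores them and answers the similarity query.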

πŸ•΅οΈ

Observability Stack

Integrated with Prometheus, Grafana, and Loki for deep visibility, tracing, and real-time alerts across all Kubernetes services.

πŸ“¨

Email Sending with Resend

Built-in SMTP support for OTP verification and invoice emails.

πŸ“‘

Event Streaming with Kafka

Reliable real-time messaging and data pipelines between microservices using Apache Kafka integration.
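As a sketch of that pipeline, the snippet below serializes an order-status event the way a producer service might before publishing it to Kafka. The `orders` topic name and the event schema are hypothetical; the producer call itself is guarded so the snippet runs without a broker.

```python
# Sketch of an inter-service event, assuming a JSON message format
# (hypothetical; the platform's actual schema is internal).
import json
import time

def make_order_event(order_id: str, status: str) -> bytes:
    # Serialize an order-status event for a hypothetical "orders"
    # topic consumed downstream (e.g. by the deliver service).
    event = {"order_id": order_id, "status": status, "ts": int(time.time())}
    return json.dumps(event).encode("utf-8")

payload = make_order_event("ord-42", "paid")

if __name__ == "__main__":
    # Publishing requires a running broker and the kafka-python package:
    # from kafka import KafkaProducer
    # producer = KafkaProducer(bootstrap_servers="kafka:9092")
    # producer.send("orders", value=payload)
    pass
```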

πŸ”

Secure Token Authentication

Generate secure tokens for API access with Redis-backed authentication and 1-hour expiry for enhanced security.

🧠

Multiple AI Models

Access a variety of LLMs, from TinyLlama to the more powerful Mistral and CodeLlama, for different use cases and requirements.

πŸ–ΌοΈ

Image Generation

Create AI-generated images from text descriptions with our public API feature, currently in beta.

πŸ”§

Customizable

Easily extendable and customizable to fit your specific needs with a modular architecture.

⚑

High Performance with HPA and GPU

Optimized infrastructure with GPU acceleration for AI models, horizontal pod autoscaling (HPA), and efficient request routing.
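The autoscaling behaviour follows the standard Kubernetes HPA formula: desired replicas = ceil(current replicas × current metric / target metric). The target values below are illustrative, not the platform's actual settings.

```python
# The standard Kubernetes HPA scaling formula (from the Kubernetes docs):
#   desired = ceil(current_replicas * current_metric / target_metric)
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float) -> int:
    return math.ceil(current_replicas * current_metric / target_metric)

# e.g. 2 replicas at 90% average utilisation with a 60% target
# scale up to ceil(2 * 90 / 60) = 3 replicas
scaled_up = desired_replicas(2, 90, 60)
```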

πŸ“Š

Analytics Dashboard

Comprehensive analytics dashboard for monitoring usage, performance, and model interactions.

☁️

Kubernetes-Based

Built on EKS with Istio service mesh for enterprise-grade reliability, scalability, and security.

πŸ€–
TraceMyPods AI Assistant


πŸ’‘ Premium models available for enhanced capabilities:

#mistral #codellama #llama2 #phi
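The hashtags suggest the model is selected by tagging the prompt (e.g. `#mistral explain pods`). The parser below is a guess at that convention for illustration only, not the platform's actual implementation, and the model list is taken from the tags shown above.

```python
# Hypothetical sketch of hashtag-based model selection.
import re

KNOWN_MODELS = {"mistral", "codellama", "llama2", "phi", "tinyllama"}

def parse_prompt(raw: str, default: str = "tinyllama"):
    # Split a leading "#model" tag from the prompt, falling back
    # to a default model when no known tag is present.
    m = re.match(r"#(\w+)\s+(.*)", raw.strip(), re.DOTALL)
    if m and m.group(1).lower() in KNOWN_MODELS:
        return m.group(1).lower(), m.group(2)
    return default, raw.strip()

model, prompt = parse_prompt("#mistral explain kubernetes pods")
```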

🧠 Available Models

Choose Your AI Model

Choose from our selection of powerful AI models to suit your specific needs

πŸ’‘
Smoll
~500 MB · 1 GB RAM

Lightweight model perfect for chatbots with minimal resource requirements.

Free
πŸ› οΈ
Your Custom Model
Custom · Scalable

Yes, we support hosting your custom model.

Contact Sales
🌐
Gemma 2B
2.6 GB · 6–8 GB RAM

Google's open-weight chat-optimized model suitable for small to medium workloads.

Open Source
πŸ¦…
Falcon-RW 1B
~1.3 GB · 4 GB RAM

A small member of the Falcon family, ideal for offline summarization and QA tasks.

Apache 2.0
πŸ§‘β€πŸ’»
Replit Code 3B
3.3 GB · 8–10 GB RAM

Fine-tuned for code generation and completions. Great for coding copilots.

Open Source
🀏
TinyLlama
~1.1 GB · 4 GB RAM

Lightweight model perfect for simple Q&A and chat applications with minimal resource requirements.

$15/month
🧠
Mistral-7B
~4.2 GB · 8–16 GB RAM

Powerful general-purpose model with excellent reasoning capabilities and broad knowledge.

$15/month
πŸ’»
CodeLlama
4.5–10 GB · 16–24 GB RAM

Specialized for code generation and understanding across multiple programming languages.

$10/month
πŸ¦™
LLaMA 2
4.5–40 GB · 16–80 GB RAM

Versatile but resource-heavy model with state-of-the-art performance across various tasks.

$20/month
Ο†
Phi-2
~1.7 GB · 6–8 GB RAM

Efficient and compact model with excellent reasoning capabilities for its size.

$12/month
πŸ—οΈ System Architecture

Robust Infrastructure

TraceMyPods is built on a robust, scalable infrastructure designed for enterprise use

Application Components (Beta)

End User (Browser): Web Interface
Istio Ingress: Traffic Management
Frontend Pod: UI & Token Generation
Ask API: Handles User Prompts
Token API: Manages Session Tokens
Order API: Handles Orders
Admin API: Admin Dashboard
Vector API: Semantic Search Access
Embedding API: Text Vectorization
Deliver API: Handles Deliveries
Redis Pod: Token Storage
MongoDB Pod: Main Database
Qdrant Pod: Vector DB
AI Pod (Ollama): LLM Model Hosting
LLM Models Pod: Model Deployment
Mailpit Pod: Email Testing
Prometheus: Metrics Collection
Grafana: Metrics Visualization
Loki: Log Aggregation
Kafka: Async Messaging