SKfaizan-786/sentinel-devops-agent

Your infrastructure never sleeps. Neither does Sentinel.

License: MIT • Node.js 18+ • Next.js 16 • Kestra • Groq AI

Quick Start • Features • Architecture • Hackathon • Contributing


🏆 Hackathon Achievement

🥇 FEATURED PROJECT at WeMakeDevs AI Agents Assemble

| Metric | Value |
| --- | --- |
| Total Teams Competing | 6,000+ worldwide |
| Countries Represented | 20+ |
| Prize Pool | $15,000 USD |
| Sentinel's Status | ✨ Featured in Top Projects ✨ |

Submitted to: Apertre 3.0 Open Source Programme


🎯 What is Sentinel?

Sentinel is an autonomous AI-powered DevOps agent that transforms infrastructure management from reactive firefighting to proactive, self-healing operations.

Unlike traditional monitoring tools that tell you what broke, Sentinel tells you why it broke and fixes it automatically, without human intervention.

The Problem We Solve

  • ❌ Engineers woken at 3 AM to restart services
  • ❌ Alert fatigue from constant notifications
  • ❌ MTTR (Mean Time To Recovery) measured in hours
  • ❌ Post-mortem blame cycles instead of prevention

Sentinel's Solution

  • ✅ 24/7 autonomous monitoring with 5-second polling
  • ✅ AI-powered root cause analysis (Groq LLaMA 3.3-70B)
  • ✅ Automatic self-healing within 30 seconds
  • ✅ Transparent reasoning for every decision
  • ✅ Cost-optimized: AI only runs when services actually fail

┌─────────────────────────────────┐
│    🛡️ SENTINEL ARCHITECTURE     │
├─────────────────────────────────┤
│                                 │
│   Your Infrastructure           │
│   ┌─────┐  ┌─────┐  ┌─────┐     │
│   │ 🟢  │  │ 🟢  │  │ 🔴  │     │
│   │Auth │  │ Pay │  │Notif│     │
│   └──┬──┘  └──┬──┘  └──┬──┘     │
│      │        │        │        │
│      └────────┼────────┘        │
│               ▼                 │
│        ┌──────────────┐         │
│        │ 🤖 AI Engine │         │
│        │ (Kestra +    │         │
│        │  Groq)       │         │
│        └──────┬───────┘         │
│               ▼                 │
│        ✨ AUTO-HEAL ✨          │
│               ▼                 │
│       All Services Healthy      │
│                                 │
└─────────────────────────────────┘

✨ Key Features

🧠 Intelligence Layer

  • Real-time Root Cause Analysis using LLaMA 3.3-70B
  • Predictive Failure Detection via pattern recognition
  • Cost-Optimized AI: Only invokes LLM when services fail
  • Human-Readable Reports with actionable insights

⚡ Automation Layer

  • 30-Second Kestra Orchestration for workflow automation
  • Autonomous Self-Healing without human approval
  • Multi-Service Monitoring with parallel health checks
  • Intelligent Recovery Workflows tailored to failure types
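
One way to picture "recovery workflows tailored to failure types" is a lookup from an observed failure mode to a healing action. The sketch below is hypothetical: the failure modes mirror those accepted by `sentinel simulate` (`down`, `degraded`, `slow`), but the `ProbeResult` shape, the action names, the latency threshold, and both helper functions are assumptions made for illustration, not Sentinel's actual workflow definitions.

```typescript
// Hypothetical failure modes, mirroring the modes accepted by `sentinel simulate`.
type FailureMode = "down" | "degraded" | "slow";

// Hypothetical healing actions; Sentinel's real workflows may use different ones.
type RecoveryAction = "restart" | "reset-state" | "wait-and-recheck";

// Result of a single health probe (shape assumed for this sketch).
interface ProbeResult {
  reachable: boolean;
  statusCode: number;
  latencyMs: number;
}

// Assumed mapping from failure mode to healing action.
const RECOVERY_PLAN: Record<FailureMode, RecoveryAction> = {
  down: "restart",
  degraded: "reset-state",
  slow: "wait-and-recheck",
};

// Classify a probe; returns null when the service looks healthy.
// The 1000 ms latency threshold is an arbitrary illustrative default.
function classifyFailure(probe: ProbeResult, slowThresholdMs = 1000): FailureMode | null {
  if (!probe.reachable) return "down";
  if (probe.statusCode >= 500) return "degraded";
  if (probe.latencyMs > slowThresholdMs) return "slow";
  return null;
}

// Look up the healing action for a classified failure.
function chooseRecoveryAction(mode: FailureMode): RecoveryAction {
  return RECOVERY_PLAN[mode];
}
```

Keeping the mapping in a single table means a new failure type only needs one extra entry, and the `Record<FailureMode, RecoveryAction>` type makes the compiler flag any mode that lacks an action.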

🎛️ Visibility Layer

  • Real-Time Dashboard with live metrics & service status
  • AI Reasoning Panel showing agent's decision-making
  • Incident Timeline with recovery analytics
  • CLI Tool for power users and developers

🐳 Infrastructure Layer

  • Docker Containerization for easy deployment
  • PostgreSQL State Management via Kestra
  • Webhook-Based Communication between components
  • Scalable Microservices Architecture for production use
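
Because the components talk to each other over webhooks, the receiving side benefits from validating each payload before acting on it. The following is a minimal sketch of such a check. The `IncidentReport` shape and its field names are assumptions for illustration; the real payload that Kestra posts to the backend is described in the project's API docs and may differ.

```typescript
// Hypothetical shape of a report posted to the backend webhook.
interface IncidentReport {
  service: string;
  status: "healthy" | "down";
  rootCause?: string; // present only when an AI analysis was run
  action?: string;    // healing action that was executed, if any
}

// Type guard: accepts only bodies that match the assumed shape.
function isIncidentReport(body: unknown): body is IncidentReport {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  const service = b["service"];
  const status = b["status"];
  const rootCause = b["rootCause"];
  const action = b["action"];
  if (typeof service !== "string" || service.length === 0) return false;
  if (status !== "healthy" && status !== "down") return false;
  if (rootCause !== undefined && typeof rootCause !== "string") return false;
  if (action !== undefined && typeof action !== "string") return false;
  return true;
}
```

A handler would call `isIncidentReport(req.body)` first and reject anything that fails with a 400 response, so malformed or spoofed payloads never reach the dashboard.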

🖼️ System in Action

📊 Real-Time Dashboard

All systems healthy with live metrics and AI reasoning

[Screenshot: Dashboard]

🚨 Detecting Failures

Service down detected, AI analysis triggered

[Screenshot: Failure Detection]

🤖 CLI Power Tool

Developer interface for manual operations

[Screenshot: CLI Tool]


🏗️ Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                           🛡️ SENTINEL STACK                            │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  ┌────────────────┐    ┌────────────────┐    ┌────────────────┐        │
│  │   📱 FRONTEND  │    │  🔧 BACKEND    │    │  🤖 KESTRA     │        │
│  │  Next.js 16    │◄──►│  Express.js    │◄──►│  Orchestrator  │        │
│  │  Port: 3000    │    │  Port: 4000    │    │  Port: 9090    │        │
│  └────────────────┘    └────────┬───────┘    └────────┬───────┘        │
│                                 │                     │                │
│                    ┌────────────┴─────────────────────┘                │
│                    ▼                                                   │
│   ┌─────────────────────────────────────────────────────────────────┐  │
│   │         🐳 DOCKER NETWORK (Services + State)                    │  │
│   ├─────────────────────────────────────────────────────────────────┤  │
│   │                                                                 │  │
│   │  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌─────────────┐   │  │
│   │  │ 🔐 Auth  │   │ 💳 Pay   │   │ 📧 Notif │   │ 🗄️ Postgres │   │  │
│   │  │  :3001   │   │  :3002   │   │  :3003   │   │   :5432     │   │  │
│   │  └──────────┘   └──────────┘   └──────────┘   └─────────────┘   │  │
│   │                                                                 │  │
│   └─────────────────────────────────────────────────────────────────┘  │
│                                                                        │
│                              ┌──────────┐                              │
│                              │ 🧠 GROQ  │                              │
│                              │ LLaMA AI │                              │
│                              └──────────┘                              │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

How It Works

sequenceDiagram
    participant B as Backend<br/>(5s polling)
    participant K as Kestra<br/>(30s cron)
    participant S as Services
    participant AI as Groq<br/>LLaMA 3.3
    participant F as Dashboard

    loop Every 5 Seconds
        B->>S: Health check all services
        S-->>B: Status responses
        B->>F: Broadcast status
    end

    loop Every 30 Seconds
        K->>S: Parallel health checks
        S-->>K: Responses
        
        alt Any Service Down?
            K->>AI: Analyze failure + metrics
            AI-->>K: Root cause + recommendations
            K->>S: Execute healing action
            K->>B: POST webhook with report
            B->>F: Real-time update
        else All Healthy
            K->>B: Send healthy status
        end
    end
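
The 30-second loop in the diagram can be sketched as a single function: check every service in parallel, invoke the AI only if something is down, heal, then report to the backend. This is a hedged illustration rather than the project's Kestra workflow (which is defined in YAML). The `CycleDeps` interface, `runCycle`, and all callback names are hypothetical, and the Groq call and webhook POST are injected as plain functions so the control flow stays visible.

```typescript
type Health = "up" | "down";

// Injected dependencies (all hypothetical names for this sketch).
interface CycleDeps {
  checkHealth: (service: string) => Promise<Health>;
  analyze: (downServices: string[]) => Promise<string>; // stands in for the Groq call
  heal: (service: string) => Promise<void>;
  report: (payload: { healthy: boolean; analysis?: string }) => Promise<void>; // webhook to backend
}

// Runs one scheduled cycle; resolves to true when every service was healthy.
async function runCycle(services: string[], deps: CycleDeps): Promise<boolean> {
  // Parallel health checks, as in the "Parallel health checks" step.
  const statuses = await Promise.all(services.map((s) => deps.checkHealth(s)));
  const down = services.filter((_, i) => statuses[i] === "down");

  if (down.length === 0) {
    // Healthy path: no AI call is made, which keeps LLM cost at zero.
    await deps.report({ healthy: true });
    return true;
  }

  // Failure path: analyze, heal, then report, in the order shown in the diagram.
  const analysis = await deps.analyze(down);
  await Promise.all(down.map((s) => deps.heal(s)));
  await deps.report({ healthy: false, analysis });
  return false;
}
```

Passing the side effects in as dependencies also makes the cycle easy to exercise with stubs, without a running Kestra instance or a Groq API key.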

Recovery Timeline

| Time | Event |
| --- | --- |
| 0s | Service crashes |
| 5s | Backend detects (5-second polling) |
| 30s | Kestra scheduled check runs |
| 32s | AI analyzes root cause |
| 35s | Healing action executes |
| 40s | Service restored ✅ |

Worst-case recovery: ~65 seconds (when a service fails just after a Kestra check has run)


🛠️ Tech Stack

| Layer | Tech | Purpose |
| --- | --- | --- |
| Frontend | Next.js 16, TypeScript, Tailwind CSS, Recharts | Real-time dashboard with glassmorphism UI |
| Backend | Node.js, Express, Axios | Health aggregation, webhook handler, REST API |
| Orchestration | Kestra, YAML workflows, PostgreSQL | Automation, state management, scheduling |
| AI/Intelligence | Groq API (LLaMA 3.3-70B) | Root cause analysis, recommendations |
| Infrastructure | Docker, Docker Compose, 3 Mock Services | Containerization, networking, simulation |
| CLI | Commander.js, Chalk, cli-table3 | Developer interface, chaos testing |

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose (v20+)
  • Node.js 18+ (for local development)
  • PostgreSQL 12+ (required for RBAC system - included in docker-compose)
  • Groq API Key (Free at console.groq.com)

Note: The RBAC system requires PostgreSQL. When using docker-compose up, PostgreSQL is automatically started. For local development, install PostgreSQL separately or use the containerized version.

⚡ One-Command Setup

# Clone the repository
git clone https://github.com/SKfaizan-786/sentinel-devops-agent.git
cd sentinel-devops-agent

# Set up environment variables
cp backend/.env.example backend/.env
# ⚠️  Edit backend/.env and set a strong JWT_SECRET before starting!

# Start the entire stack (includes PostgreSQL for RBAC)
docker-compose up -d

# Initialize RBAC system (first time only)
cd backend
npm install
npm run quick-setup

# That's it! Access at:
# 🌐 Dashboard: http://localhost:3000
# 🤖 Kestra UI: http://localhost:9090
# 📊 Backend API: http://localhost:4000

⚠️ Security Warning: The quick-setup creates a default admin account (admin@example.com / password123) for development only. Change this password before deploying to any production environment!

🔧 Development Setup

# 1. Start infrastructure (includes PostgreSQL for RBAC)
docker-compose up -d kestra postgres auth-service payment-service notification-service

# 2. Set up RBAC system
cd backend
cp .env.example .env
# ⚠️  Edit .env and set JWT_SECRET to a strong random value:
# node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
npm install
npm run quick-setup  # Creates database schema and default admin

# 3. Start backend (in same terminal)
npm start

# 4. Start frontend (in new terminal)
cd ../sentinel-frontend
npm install
npm run dev

# 5. Optional: Install CLI
cd ../cli
npm install
npm link

# Now accessible at:
# Dashboard: http://localhost:3000
# Backend: http://localhost:4000
# Kestra: http://localhost:9090
# PostgreSQL: localhost:5432 (for RBAC)
# CLI: sentinel status

# ⚠️  Default admin credentials (DEVELOPMENT ONLY):
# Email: admin@example.com
# Password: password123
# Change immediately in production!

🖥️ CLI Usage

Sentinel includes a powerful CLI for DevOps engineers:

# View system health
sentinel status

# Simulate failures (chaos testing)
sentinel simulate auth down
sentinel simulate payment degraded
sentinel simulate notification slow

# Trigger manual healing
sentinel heal auth

# Generate AI incident report
sentinel report

Example Output:

$ sentinel status

🛡️  SENTINEL STATUS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  Service              Status    Latency
  ─────────────────────────────────────────
  auth-service         🟢 UP     45ms
  payment-service      🟢 UP     52ms
  notification-service 🟢 UP     38ms

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Last Check: 2026-02-01T10:30:00Z
AI Status: Active & Monitoring

🧪 Test the Auto-Healing

Live Demo Scenario:

# Terminal 1: Watch the dashboard
open http://localhost:3000

# Terminal 2: Crash a service
sentinel simulate auth down

# Watch what happens:
# 1. Dashboard status → RED (within 5s)
# 2. AI panel → "Analyzing..."
# 3. After 30s → Kestra runs + triggers healing
# 4. Within 65s total → Service restored, status → GREEN
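
To turn the demo into a repeatable check, the "watch what happens" step can be scripted as a bounded polling loop. The sketch below is an assumption-laden illustration: `waitForRecovery`, `WaitOptions`, and the `isHealthy` probe are hypothetical names, and in practice the probe would query the backend's status endpoint (see the API docs) rather than a stub. The suggested 65-second timeout and 5-second interval come from the figures quoted in this README.

```typescript
// Default sleep based on setTimeout; can be replaced in tests.
const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

interface WaitOptions {
  timeoutMs: number;  // e.g. 65000, the worst-case recovery figure
  intervalMs: number; // e.g. 5000, matching the backend polling interval
  sleep?: (ms: number) => Promise<void>;
}

// Polls `isHealthy` until it returns true or the timeout budget is used up.
// Resolves to the nominal elapsed time in ms, or null if recovery never happened.
async function waitForRecovery(
  isHealthy: () => Promise<boolean>,
  options: WaitOptions
): Promise<number | null> {
  const sleep = options.sleep ?? defaultSleep;
  const maxAttempts = Math.floor(options.timeoutMs / options.intervalMs) + 1;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (await isHealthy()) return attempt * options.intervalMs;
    // Do not sleep after the final attempt.
    if (attempt < maxAttempts - 1) await sleep(options.intervalMs);
  }
  return null;
}
```

Run it right after `sentinel simulate auth down` with `{ timeoutMs: 65000, intervalMs: 5000 }`; a `null` result means the service did not come back within the advertised window.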

🏆 Hackathon Qualifications

| Track | Achievement |
| --- | --- |
| 🤖 Kestra | Autonomous Kestra orchestration with parallel health checks, conditional AI invocation, and self-healing workflows |
| 💻 Cline | Built with Cline's assistance. Production-ready CLI embodies autonomous developer workflows |
| 🐰 CodeRabbit | AI-powered code reviews on every PR ensure enterprise-grade quality |
| 🌐 Vercel | Real-time dashboard deployed on Vercel with optimized Next.js |

📚 Documentation

| Document | Content |
| --- | --- |
| DOCUMENTATION.md | Complete docs index |
| ARCHITECTURE.md | System design deep-dive |
| DEVELOPMENT.md | Setup & development guide |
| CONTRIBUTING.md | How to contribute |
| SECURITY.md | Security policy & disclosure |
| API.md | REST API reference |
| FAQ.md | 50+ Q&A |
| ROADMAP.md | Future features |

🤝 Contributing

We welcome contributions! Sentinel is open source and beginner-friendly.

# Fork → Clone → Branch → Code → Push → PR
git checkout -b feature/amazing-feature
git commit -m "feat: add amazing feature"
git push origin feature/amazing-feature

See CONTRIBUTING.md and CODE_OF_CONDUCT.md.


📄 License

MIT License - see LICENSE for details.


👥 Team

SKfaizan-786 (@SKfaizan-786): Backend & Orchestration
mdhaarishussain (@mdhaarishussain): Frontend & Dashboard

Built with ❤️ for the WeMakeDevs AI Agents Assemble Hackathon
Featured in the Top Projects (6,000+ teams worldwide)


⭐ Show Your Support

If Sentinel helped you, give us a star! ⭐

Sentinel
"Monitoring that never sleeps"

Made with 🛡️ by the Sentinel Team

About

Sentinel is an autonomous DevOps agent that predicts failures before they happen, auto-heals services, and keeps infrastructure healthy 24/7 using Cline CLI, Kestra AI workflows, and a real-time Vercel dashboard.
