AI-SIEM: Real-Time Threat & Anomaly Detection System

Open-source Security Information and Event Management (SIEM) system for small to medium businesses. Provides real-time threat detection, anomaly analysis, and centralized security monitoring using ELK Stack, Grafana, and machine learning.

🎯 Features

Real-time Log Ingestion: Centralized log collection from multiple sources via Logstash
Elasticsearch Search: Fast indexing and searching of security events
Kibana Visualization: Comprehensive log analysis and visualization
Grafana Dashboards: Advanced security metrics and threat indicators
Anomaly Detection: ML-based detection using Isolation Forest algorithm
Threat Intelligence: Pattern-based threat detection and correlation
Alerting System: Real-time alerts with deduplication and rules engine
Automated Response: Automatic IP/port blocking for detected threats
Containerized Deployment: Docker Compose for easy deployment
100% Open Source: No licensing costs, self-hosted

📋 Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Log Sources                               │
│  (Web Servers, Firewalls, OS Logs, Applications)            │
└──────────────────┬──────────────────────────────────────────┘
                   │
        ┌──────────▼──────────┐
        │    Logstash         │
        │ (Log Processing)    │
        └──────────┬──────────┘
                   │
        ┌──────────▼──────────────────────────┐
        │    Elasticsearch (Data Lake)        │
        │  - siem-logs-* (Raw Events)        │
        │  - siem-threats-* (Detected)        │
        │  - siem-anomalies-* (Anomalies)    │
        │  - siem-alerts-* (Alerts)           │
        └──────────┬──────────────────────────┘
                   │
        ┌──────────┴──────────┬──────────┬──────────┐
        │                     │          │          │
    ┌───▼────┐          ┌─────▼──┐ ┌────▼─────┐ ┌──▼─────────┐
    │ Kibana │          │ Grafana│ │ Anomaly  │ │  Threat    │
    │ (Viz)  │          │(Dash)  │ │Detection │ │ Detection  │
    └────────┘          └────────┘ └──────────┘ └────────────┘
                                         │            │
                        ┌────────────────┼────────────┘
                        │                │
                    ┌───▼────────────────▼────┐
                    │   Redis (Queue/Cache)    │
                    └───┬────────────────┬─────┘
                        │                │
                    ┌───▼────────────────▼────┐
                    │  Alerting Engine         │
                    │ (Rules, Notifications)   │
                    └──────────────────────────┘

📊 Components

Component	Purpose	Technology	Port
Elasticsearch	Data storage & indexing	Elasticsearch 8.11	9200
Kibana	Log visualization	Kibana 8.11	5601
Logstash	Log processing pipeline	Logstash 8.11	5000
Grafana	Security dashboards	Grafana 10.2	3000
Redis	Caching & message queue	Redis 7	6379
Anomaly Detector	ML-based anomaly detection	Python 3.11	Internal
Threat Detector	Pattern-based threat detection	Python 3.11	Internal
Alerting Engine	Alert processing & routing	Python 3.11	Internal
Automated Response	Auto-blocks malicious IPs/ports	Python 3.11	Internal

🚀 Quick Start

Prerequisites

Docker & Docker Compose
4+ GB RAM
20+ GB disk space
Linux/macOS/Windows with Docker Desktop

Installation

Clone the repository, then change into the project directory:

cd /path/to/AI-SIEM

Update credentials (Important for production!):

# Edit docker-compose.yml and update these environment variables:
# - ELASTIC_PASSWORD
# - GF_SECURITY_ADMIN_PASSWORD
# - REDIS_PASSWORD

Create Redis data directory:

mkdir -p redis/data
chmod 755 redis/data

Start all services:

docker-compose up -d

Verify services:

docker-compose ps
# All services should show "Up"

Access Interfaces

Service	URL	Default Credentials
Kibana	http://localhost:5601	elastic / change_me_elastic_password
Grafana	http://localhost:3000	admin / change_me_grafana_password
Elasticsearch	http://localhost:9200	elastic / change_me_elastic_password
Logstash	http://localhost:9600	(No auth)

📝 Configuration

1. Logstash Log Ingestion

Location: logstash/pipeline/main.conf

Configure inputs for your log sources:

input {
  # TCP input
  tcp {
    port => 5000
    codec => json
  }
  
  # File input
  file {
    path => "/var/log/siem/*.log"
  }
}

Supported formats: JSON, Syslog, Plain text, CSV

2. Grafana Dashboards

Location: grafana/dashboards/

Pre-configured dashboards:

Security Threat Overview: Real-time threat statistics
(Add more dashboards in grafana/dashboards/ directory)

3. Alert Rules

Location: alerting/config/alert_rules.yml

Customize alert severity, thresholds, and notification channels.

4. Anomaly Detection

Location: anomaly-detection/main.py

Configuration options:

Contamination rate: Expected fraction of anomalies in the training data (default: 0.10, i.e. 10%)
Retraining interval: How often the model is retrained (default: every 6 hours)
Feature selection: Which fields are extracted from logs and used as model features

🔌 Sending Logs

TCP/JSON Format

echo '{"event_type": "connection_attempt", "src_ip": "192.168.1.100", "dst_ip": "8.8.8.8", "src_port": 54321, "dst_port": 443, "packet_count": 10, "byte_count": 1024, "severity": "low"}' | nc localhost 5000
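
The same event can also be sent from Python. This is a minimal sketch, assuming the Logstash tcp input from the Configuration section is reachable on localhost:5000 and accepts one JSON document per line, as the nc example does. The helper names encode_event and send_event are illustrative and not part of the project.

```python
import json
import socket


def encode_event(event: dict) -> bytes:
    """Serialize one event as a single newline-terminated JSON line."""
    return (json.dumps(event, separators=(",", ":")) + "\n").encode("utf-8")


def send_event(event: dict, host: str = "localhost", port: int = 5000) -> None:
    """Open a TCP connection, send one event, then close the connection."""
    # Assumed endpoint: the Logstash tcp input (port 5000) shown earlier.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(encode_event(event))


# Same fields as the nc example above.
SAMPLE_EVENT = {
    "event_type": "connection_attempt",
    "src_ip": "192.168.1.100",
    "dst_ip": "8.8.8.8",
    "src_port": 54321,
    "dst_port": 443,
    "packet_count": 10,
    "byte_count": 1024,
    "severity": "low",
}
```

Calling send_event(SAMPLE_EVENT) while the stack is running should deliver the event to Logstash. Opening one connection per event keeps the sketch short; a long-lived sender would likely reuse the socket.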

Via Filebeat (Agent-based)

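Install Filebeat on source systems and point it at the SIEM server. Filebeat's Logstash output speaks the Beats protocol, so the pipeline needs a beats input on the target port; the tcp/json input shown under Configuration will not accept these connections.

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/auth.log
    - /var/log/syslog

output.logstash:
  hosts: ["<SIEM-SERVER-IP>:5000"]  # must be a port served by a Logstash beats input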

Via Syslog

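Configure your devices/applications to send syslog to port 5000. The Logstash pipeline needs a matching UDP (or syslog) input on that port; the tcp input shown under Configuration accepts TCP connections only: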

facility: local0
server: <SIEM-SERVER-IP>
port: 5000
protocol: udp

🔍 Detection Engines

Anomaly Detection Engine

Algorithm: Isolation Forest (scikit-learn)

What it detects:

Unusual network traffic patterns
Abnormal packet rates
Unexpected data transfer volumes
Statistical outliers in event characteristics

Process:

Fetches historical logs (24 hours)
Trains model on baseline behavior
Monitors incoming events for deviations
Retrains every 6 hours with fresh data

Configuration: anomaly-detection/main.py
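
The train-then-score process above can be sketched with scikit-learn's IsolationForest. This is an illustration under stated assumptions, not the code in anomaly-detection/main.py: the feature set (packet_count and byte_count, borrowed from the sample JSON event) and the synthetic baseline are hypothetical stand-ins for the real feature extraction and the 24 hours of historical logs.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical feature set: the numeric fields of the sample JSON event.
FEATURES = ("packet_count", "byte_count")


def to_matrix(events: list[dict]) -> np.ndarray:
    """Turn event dicts into an (n_events, n_features) float array."""
    return np.array([[float(e.get(f, 0)) for f in FEATURES] for e in events])


def train(events: list[dict], contamination: float = 0.10) -> IsolationForest:
    """Fit an Isolation Forest on baseline events."""
    model = IsolationForest(contamination=contamination, random_state=42)
    model.fit(to_matrix(events))
    return model


def score(model: IsolationForest, events: list[dict]) -> list[tuple[float, bool]]:
    """Return (anomaly_score, is_anomaly) per event; lower scores are more abnormal."""
    X = to_matrix(events)
    scores = model.score_samples(X)  # negative values, lower = more abnormal
    labels = model.predict(X)        # -1 = anomaly, 1 = normal
    return [(float(s), bool(label == -1)) for s, label in zip(scores, labels)]


# Synthetic baseline (about 10 packets and 1 KB per event) standing in for real logs.
rng = np.random.default_rng(0)
baseline = [
    {"packet_count": int(p), "byte_count": int(b)}
    for p, b in zip(rng.normal(10, 2, 500), rng.normal(1024, 100, 500))
]
model = train(baseline)
typical, burst = score(model, [
    {"packet_count": 10, "byte_count": 1024},           # looks like the baseline
    {"packet_count": 5000, "byte_count": 50_000_000},   # far outside the baseline
])
```

With contamination set to 0.10, roughly 10% of the training events fall below the decision threshold, so the setting directly controls how many alerts the detector raises. Periodic retraining amounts to calling train again on a fresh window of events.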

Threat Detection Engine

Method: Pattern matching and rule-based detection

What it detects:

Brute Force Attacks: 5+ failed logins in 10 minutes
Port Scanning: 20+ unique ports in 5 minutes
DDoS Attacks: 1000+ requests from single IP in 1 minute
Privilege Escalation: Unauthorized elevated access attempts
Malware Signatures: Known malicious file hashes
Data Exfiltration: Suspicious outbound data transfers

Configuration: threat-detection/main.py
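
Threshold rules such as "5+ failed logins in 10 minutes" can be expressed as a sliding-window counter. The sketch below is one possible in-memory formulation, assuming events arrive in timestamp order; the class name SlidingWindowCounter is illustrative, and the real detector in threat-detection/main.py may instead query Elasticsearch for these counts.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class SlidingWindowCounter:
    """Flags a key once it has `threshold` events inside `window`."""

    def __init__(self, threshold: int, window: timedelta) -> None:
        self.threshold = threshold
        self.window = window
        self._events: dict[str, deque] = defaultdict(deque)

    def add(self, key: str, timestamp: datetime) -> bool:
        """Record one event; return True if the key is at or over threshold."""
        times = self._events[key]
        times.append(timestamp)
        # Drop events that have fallen out of the window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold


# Brute force rule from the list above: 5+ failed logins in 10 minutes.
brute_force = SlidingWindowCounter(threshold=5, window=timedelta(minutes=10))

start = datetime(2025, 12, 1, 12, 0, 0)
# Five failed logins from one source IP, one minute apart.
results = [
    brute_force.add("192.168.1.100", start + timedelta(minutes=i))
    for i in range(5)
]
# results == [False, False, False, False, True]
```

The DDoS rule (1000+ requests in 1 minute) fits the same counter with different parameters. The port scanning rule counts unique ports rather than events, so it would need to track a set of distinct ports per source IP instead.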

Alerting Engine

Features:

Real-time alert generation
Alert deduplication (prevent spam)
Rule-based filtering
Redis-based queueing
Extensible notification channels

Configuration: alerting/config/alert_rules.yml
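
Alert deduplication can be illustrated with a small time-to-live cache. This is a minimal in-memory sketch, not the project's implementation: the fingerprint fields (threat_type, src_ip) and the 300-second window are assumptions chosen for the example.

```python
import time
from typing import Optional


class AlertDeduplicator:
    """Suppresses repeat alerts with the same fingerprint within `ttl_seconds`."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._last_sent: dict[tuple, float] = {}

    @staticmethod
    def fingerprint(alert: dict) -> tuple:
        # Hypothetical choice of identifying fields; adjust to the real schema.
        return (alert.get("threat_type"), alert.get("src_ip"))

    def should_send(self, alert: dict, now: Optional[float] = None) -> bool:
        """Return True if no matching alert was sent within the TTL window."""
        now = time.time() if now is None else now
        key = self.fingerprint(alert)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.ttl_seconds:
            return False  # duplicate inside the window: suppress
        self._last_sent[key] = now
        return True


dedup = AlertDeduplicator(ttl_seconds=300)
alert = {"threat_type": "brute_force", "src_ip": "192.168.1.100"}
first = dedup.should_send(alert, now=0)     # new alert: sent
second = dedup.should_send(alert, now=60)   # repeat after 1 minute: suppressed
third = dedup.should_send(alert, now=400)   # window expired: sent again
```

Since the alerting engine already depends on Redis, a shared version of this idea could store the fingerprint as a Redis key with an expiry (SET with the NX and EX options) so that several workers see the same state.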

🛡️ Automated Response System

Method: Firewall-based IP and port blocking using iptables

What it blocks:

Critical/High Severity Threats: Automatically blocks source IPs
  - Critical threats: blocked for 120 minutes
  - High threats: blocked for 60 minutes
Port Scanning Attacks: Temporarily blocks the targeted ports (30 minutes)
DDoS Attacks: Blocks attack source and protects ports

Features:

✅ Automatic IP blocking with iptables
✅ Port-based blocking for service protection
✅ Auto-expiration (blocks automatically unblock after duration)
✅ All actions logged to Elasticsearch
✅ Dry-run mode for testing (no actual blocking)
✅ Real-time monitoring via Redis pub/sub

Configuration:

# In docker-compose.yml:
- AUTO_BLOCK_ENABLED=true    # Enable/disable auto-blocking
- BLOCK_DURATION_MINUTES=60  # Default block duration
- DRY_RUN_MODE=false         # true = log only, false = actually block
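
The interaction between severity, block duration, and dry-run mode can be sketched as follows. This is an illustration only: the function names and the shape of the threat record are assumptions, and the actual logic lives in the automated-response service. The iptables invocation shown (insert a DROP rule for a source address into the INPUT chain) is one common way to block an IP.

```python
import ipaddress
import subprocess
from typing import Optional

# Block durations in minutes, taken from the list above.
BLOCK_MINUTES = {"critical": 120, "high": 60}


def block_duration(severity: str) -> Optional[int]:
    """Return the block duration in minutes, or None if no block applies."""
    return BLOCK_MINUTES.get(severity.lower())


def iptables_block_command(ip: str) -> list[str]:
    """Build the iptables command that drops inbound traffic from `ip`."""
    return ["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"]


def respond(threat: dict, dry_run: bool = True) -> dict:
    """Decide on a block for one threat record and apply it unless dry_run."""
    minutes = block_duration(threat.get("severity", ""))
    if minutes is None:
        return {"action": "none", "ip": threat.get("src_ip")}
    # Validate the address before it reaches a command line (raises ValueError).
    ip = str(ipaddress.ip_address(threat["src_ip"]))
    command = iptables_block_command(ip)
    if not dry_run:
        # Needs root (or NET_ADMIN in a container); raises if iptables fails.
        subprocess.run(command, check=True)
    return {
        "action": "block",
        "ip": ip,
        "minutes": minutes,
        "dry_run": dry_run,
        "command": command,
    }


# Dry run: nothing is executed, the decision is only reported.
decision = respond({"severity": "critical", "src_ip": "203.0.113.7"}, dry_run=True)
```

Running with dry_run=True first, as DRY_RUN_MODE=true does for the service, lets you review which addresses would be blocked before any firewall rule is changed. Removing the rule when the duration expires is not covered by this sketch.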

Testing:

# Run test suite to simulate attacks
python3 test_blocking.py

# Check recent blocking activity in the response service logs
docker logs ai-siem-response --tail 50

# View blocking actions in Elasticsearch
curl -u elastic:change_me_elastic_password "localhost:9200/siem-response-actions-*/_search?pretty"

See: automated-response/README.md for detailed documentation

📊 Elasticsearch Index Lifecycle

Index Naming Convention

siem-logs-YYYY.MM.DD: Raw security events
siem-threats-*: Detected threats
siem-anomalies-*: Detected anomalies
siem-alerts-*: Generated alerts
siem-response-actions-*: Automated response (blocking) actions

Index Management

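Delete old indices to manage disk space. Daily indices are deleted by name, and the command below removes only the index dated exactly 30 days ago, so run it daily (for example from cron) to enforce a 30-day retention:

# Delete the daily index from 30 days ago (GNU date syntax; on macOS use: date -v-30d +%Y.%m.%d)
curl -X DELETE -u elastic:change_me_elastic_password "http://localhost:9200/siem-logs-$(date -d '30 days ago' +%Y.%m.%d)"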
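
To clean up every index past the retention period in one pass, the index names can be filtered by their date suffix. The helper below is a minimal sketch, assuming index names follow the siem-logs-YYYY.MM.DD convention described above; the function name expired_indices is illustrative.

```python
from datetime import date, datetime, timedelta

PREFIX = "siem-logs-"


def expired_indices(names, today: date, retention_days: int = 30) -> list[str]:
    """Return daily index names dated more than `retention_days` before `today`."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in names:
        if not name.startswith(PREFIX):
            continue  # only raw-event indices are handled here
        try:
            day = datetime.strptime(name[len(PREFIX):], "%Y.%m.%d").date()
        except ValueError:
            continue  # skip names that do not end in a valid date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


names = [
    "siem-logs-2025.10.01",
    "siem-logs-2025.11.01",
    "siem-logs-2025.12.01",
    "siem-alerts-2025.01.01",
    "siem-logs-not-a-date",
]
to_delete = expired_indices(names, today=date(2025, 12, 1))
# to_delete == ["siem-logs-2025.10.01"]
```

The list of names could come from the Elasticsearch _cat/indices API, and each returned name can then be passed to a DELETE request like the one above.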

🐛 Troubleshooting

Elasticsearch Health Check

curl http://elastic:change_me_elastic_password@localhost:9200/_cluster/health

View Service Logs

# All services
docker-compose logs -f

# Specific service
docker-compose logs -f anomaly-detector
docker-compose logs -f threat-detector
docker-compose logs -f alerting-engine

Common Issues

Issue	Solution
Port already in use	Change port mapping in `docker-compose.yml`
Out of memory	Increase Docker memory limit, reduce `ES_JAVA_OPTS`
No logs appearing	Check Logstash configuration, verify log format
Slow queries	Indices have grown too large; delete old indices or configure index rollover
Detection not working	Ensure at least 1 hour of data in Elasticsearch

📈 Performance Tuning

Elasticsearch

For SMB environments (1M events/day):

ES_JAVA_OPTS: -Xms512m -Xmx512m  # Adjust heap size
indices.memory.index_buffer_size: 30%

Logstash

pipeline.batch.size: 125        # Increase for throughput
pipeline.batch.delay: 50        # Milliseconds
queue.type: persisted          # Enable persisted queue

🔐 Security Best Practices

Change default credentials immediately:
- Elasticsearch admin password
- Grafana admin password
- Redis password
Network isolation:
- Restrict access to ports 5601, 3000, 9200, and 6379
- Use firewall rules
- Deploy in internal network only
Enable HTTPS:
- Configure Elasticsearch SSL
- Set up Kibana proxy with HTTPS
Regular backups:
- Backup Elasticsearch data
- Export Grafana dashboards
- Version control alert rules

📦 File Structure

AI-SIEM/
├── docker-compose.yml          # Main configuration
├── elasticsearch/
│   ├── config/elasticsearch.yml
│   └── data/                   # Persistent data
├── kibana/
│   └── config/kibana.yml
├── logstash/
│   ├── config/logstash.yml
│   └── pipeline/main.conf      # Log processing rules
├── grafana/
│   ├── config/grafana.ini
│   ├── dashboards/             # Dashboard definitions
│   └── provisioning/           # Automated setup
├── anomaly-detection/
│   ├── Dockerfile
│   ├── main.py                 # Detection engine
│   └── requirements.txt
├── threat-detection/
│   ├── Dockerfile
│   ├── main.py                 # Threat patterns
│   └── requirements.txt
├── alerting/
│   ├── Dockerfile
│   ├── main.py                 # Alert processor
│   ├── config/alert_rules.yml  # Alert configuration
│   └── requirements.txt
├── automated-response/
│   └── README.md               # Automated response documentation
├── logs/                        # Log input directory
├── sample-data/                 # Test data
├── redis/                       # Redis data
└── README.md

🔄 Data Flow Examples

Example 1: Failed Login Detection

1. Server logs failed auth → Logstash ingests
2. Logstash parses & enriches → Elasticsearch indexes
3. Threat detector queries: failed_auth > 5 in 10min
4. Match found → Threat alert generated
5. Alerting engine → Redis pub/sub → Elasticsearch
6. Dashboard updated → Security team notified

Example 2: Anomaly Detection

1. Network traffic logged → Logstash processes
2. Stored in Elasticsearch siem-logs-*
3. Anomaly detector fetches last 24hrs → trains model
4. New logs arrive → scored against model
5. Unusual packet rate detected → anomaly_score = -0.75
6. Alert generated & published → stored in siem-alerts-*
7. Grafana dashboard reflects new alert

🛠 Maintenance

Regular Tasks

Daily:

Monitor dashboard for alerts
Check service health: docker-compose ps

Weekly:

Review logs: docker-compose logs --tail 100
Verify model accuracy (anomaly detection)
Test alerting system

Monthly:

Clean old indices (>90 days)
Review and update threat signatures
Analyze detection accuracy

Scaling for Growth

Single-node to multi-node Elasticsearch:

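Give each node a unique node.name in elasticsearch.yml
Update docker-compose.yml to run multiple Elasticsearch instances
Configure cluster settings (for example discovery.seed_hosts and cluster.initial_master_nodes)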

Increase log retention:

Adjust ILM policies
Add more storage volumes
Upgrade disk space

📚 Additional Resources

📄 License

This project uses open-source components:

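Elasticsearch & Kibana: Dual-licensed under SSPL and Elastic License 2.0 (as of 8.11)
Logstash: Apache License 2.0, with some bundled features under the Elastic License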
Grafana: AGPL License
Python libraries: Various open-source licenses

🤝 Contributing

Contributions welcome! Areas for enhancement:

Additional detection algorithms
Integration with more log sources
Advanced visualization dashboards
Notification channel integrations (Slack, Email)
Custom ML models

📞 Support

For issues or questions:

Check troubleshooting section
Review service logs
Check Elasticsearch health
Verify connectivity between services

Last Updated: December 2025
Version: 1.0.0
