Performance Issues

This guide covers identifying and resolving performance issues in License Monitor and License Server Detail.

Performance Metrics

Key Metrics to Monitor

Metric	Healthy Range	Warning	Critical
API Response Time	< 100ms	100-500ms	> 500ms
CPU Usage	< 50%	50-80%	> 80%
Memory Usage	< 70%	70-90%	> 90%
Error Rate	< 1%	1-5%	> 5%
Active Connections	< 50% of limit	50-80% of limit	> 80% of limit
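The thresholds above can be applied mechanically, for example in a health-check script. The following is a minimal sketch, not part of License Monitor: the metric names and the `classify` helper are hypothetical, and active connections are assumed to be expressed as a percentage of the configured limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float   # value at which the metric leaves the healthy range
    critical: float  # value above which the metric is critical


# Thresholds copied from the table above (metric names are hypothetical).
THRESHOLDS = {
    "api_response_ms": Threshold(100, 500),
    "cpu_percent": Threshold(50, 80),
    "memory_percent": Threshold(70, 90),
    "error_rate_percent": Threshold(1, 5),
    "connections_percent_of_limit": Threshold(50, 80),
}


def classify(metric: str, value: float) -> str:
    """Return 'healthy', 'warning', or 'critical' for a metric reading."""
    t = THRESHOLDS[metric]
    if value > t.critical:
        return "critical"
    if value >= t.warning:
        return "warning"
    return "healthy"
```

For example, `classify("cpu_percent", 65)` falls in the warning band, while a reading of exactly 80 is still a warning because the critical column is a strict "greater than".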

Collecting Metrics

# API response time
curl -w "Time: %{time_total}s\n" -o /dev/null -s \
    http://localhost:8080/api/health

# CPU and memory usage (ps reports CPU averaged over the process lifetime)
ps -C license_monitor -o %cpu= -o %mem= | awk '{print "CPU:", $1"%", "MEM:", $2"%"}'

# Connection count (established TCP connections to the API port)
ss -tn state established '( sport = :8080 )' | tail -n +2 | wc -l

# Detailed metrics endpoint
curl http://localhost:8080/api/metrics

CPU Performance

High CPU Usage

Symptoms: License Monitor uses excessive CPU and the system slows down.

Identify the cause

# Check CPU usage over time (5 batch samples; -d, joins multiple PIDs with commas)
top -b -n 5 -p "$(pgrep -d, license_monitor)"

# Profile with perf (Linux)
perf top -p "$(pgrep -d, license_monitor)"

Check polling interval

# Increase if too frequent
[command_mode]
interval_seconds = 600  # 10 minutes

Check regex complexity

# Simplify regex patterns
[tail_mode]
# Instead of a complex pattern that captures the full timestamp
# regex_pattern = "^(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+).*"
# Use a simpler pattern if matching the leading year is enough
regex_pattern = "^\\d{4}"
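Before changing a pattern, it helps to measure whether the regex is actually the bottleneck. The sketch below times both patterns against one sample line using Python's `re` module as a stand-in. License Monitor's own regex engine may behave differently, and the sample log line is an assumption made for illustration.

```python
import re
import timeit

# Sample log line; the format is an assumption for illustration.
LINE = "2024-01-15T10:30:45.123 lmgrd: license checked out"

# TOML "\\d" becomes the regex \d once the config string is unescaped.
PATTERNS = {
    "complex": r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+).*",
    "simple": r"^\d{4}",
}


def time_pattern(pattern: str, line: str = LINE, runs: int = 100_000) -> float:
    """Return the average time per match attempt in microseconds."""
    compiled = re.compile(pattern)
    total = timeit.timeit(lambda: compiled.match(line), number=runs)
    return total / runs * 1e6


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(f"{name}: {time_pattern(pattern):.3f} us per match")
```

If the two timings are close, the regex is unlikely to be the main cause of high CPU and the polling interval or parse script deserves a look first.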

Check parse script efficiency

# Time the parse script
time (lmstat -a | python3 parse.py)

CPU Optimization

# config.toml optimizations
[command_mode]
interval_seconds = 600        # Reduce polling frequency

[api]
max_connections = 50          # Limit concurrent connections
rate_limit_requests = 60      # Limit request rate

[tail_mode]
batch_size = 50               # Process logs in batches
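To illustrate what a request rate limit such as `rate_limit_requests = 60` does, here is a token bucket sketch. It is one common way to implement rate limiting and is not necessarily how License Monitor does it. The `TokenBucket` class is hypothetical, and the 60-second period is an assumption, since the config comment does not state the time window.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests per `period_seconds`, refilled continuously."""

    def __init__(self, capacity: int, period_seconds: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = capacity / period_seconds  # tokens added per second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the request should be rejected."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `TokenBucket(60, 60.0)`, a burst of 60 requests is accepted immediately, after which roughly one request per second gets through. Rejecting excess requests early is what keeps CPU usage bounded under load.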

Memory Performance

High Memory Usage

Symptoms: Memory usage grows over time until the process is eventually killed for running out of memory (OOM).

Check current memory usage

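# Process memory (RSS and VSZ in KiB)
ps -C license_monitor -o pid,rss,vsz,%mem,cmd

# Detailed memory map (-o picks the oldest matching process)
pmap -x "$(pgrep -o license_monitor)"

# Memory over time (RSS in MB, refreshed every 5 seconds)
watch -n 5 'ps -o rss= -C license_monitor | awk "{print \$1/1024 \" MB\"}"'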

Check for memory leaks

# Monitor RSS growth (KiB, one sample per minute)
while true; do
    ps -o rss= -p "$(pgrep -o license_monitor)"
    sleep 60
done | tee memory.log
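A steady upward trend in memory.log is a stronger leak signal than any single reading. The sketch below fits a least-squares line through the samples and reports the growth rate. It assumes one RSS value in KiB per line, sampled every 60 seconds as in the loop above; the function names are hypothetical.

```python
def parse_rss_log(text: str) -> list[int]:
    """Extract RSS samples (KiB) from the log, one sample per non-empty line."""
    return [int(line.split()[-1]) for line in text.splitlines() if line.strip()]


def growth_kib_per_hour(samples: list[int], interval_seconds: float = 60.0) -> float:
    """Least-squares slope of RSS over time, converted to KiB per hour."""
    n = len(samples)
    if n < 2:
        return 0.0  # a trend needs at least two samples
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope_per_sample = cov / var
    return slope_per_sample * 3600.0 / interval_seconds


if __name__ == "__main__":
    demo = parse_rss_log("1000\n1010\n1020\n1030\n")
    print(f"{growth_kib_per_hour(demo):.0f} KiB/hour")
```

To analyze a real log, pass the file contents to `parse_rss_log`, for example `parse_rss_log(open("memory.log").read())`. A slope that stays positive over several hours of steady load suggests a leak; growth that levels off is more likely a cache warming up.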

Identify large allocations
- WebSocket connection buffers
- Log message queues
- License data cache

Memory Optimization

# config.toml memory optimizations
[api]
max_connections = 50          # Limit connections
message_buffer_size = 100     # Limit buffered messages

[command_mode]
cache_ttl_seconds = 300       # Expire cached data
max_cache_entries = 100       # Limit cache size

# System-level limits (systemd unit file, [Service] section; requires cgroup v2)
[Service]
MemoryMax=512M
MemoryHigh=384M
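The effect of `cache_ttl_seconds` and `max_cache_entries` can be pictured with a small bounded cache. This is a sketch of the general technique under assumed semantics (entries expire after the TTL, and the oldest entry is evicted when the cache is full). The `TTLCache` class is hypothetical and is not License Monitor's implementation.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Bounded cache: entries expire after ttl_seconds; the oldest entry is evicted when full."""

    def __init__(self, ttl_seconds: float = 300, max_entries: int = 100, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self.clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value), oldest insertion first

    def set(self, key, value):
        self._data.pop(key, None)  # re-inserting moves the key to the newest position
        self._data[key] = (self.clock() + self.ttl, value)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the oldest entry

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self.clock() >= expires_at:
            del self._data[key]  # expired: release the entry immediately
            return default
        return value

    def __len__(self):
        return len(self._data)
```

Both limits matter: the TTL bounds how stale the data can get, while the entry limit bounds memory even if many distinct keys arrive within one TTL window.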

Network Performance

Slow API Responses

Measure response times

# Test endpoint latency
for i in {1..10}; do
    curl -w "%{time_total}\n" -o /dev/null -s \
        http://localhost:8080/api/health
done | awk '{sum+=$1} END {print "Avg:", sum/NR, "s"}'
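An average can hide occasional slow requests, so it is worth looking at the tail as well. The sketch below summarizes a list of latency samples in seconds, such as the `%{time_total}` values printed by curl, using a nearest-rank 95th percentile. The `summarize` function is a hypothetical helper.

```python
import statistics


def summarize(latencies_s: list[float]) -> dict[str, float]:
    """Return average, median, p95, and max latency in milliseconds."""
    if not latencies_s:
        raise ValueError("no samples")
    ordered = sorted(latencies_s)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integers.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "avg_ms": statistics.fmean(ordered) * 1000,
        "median_ms": statistics.median(ordered) * 1000,
        "p95_ms": ordered[rank - 1] * 1000,
        "max_ms": ordered[-1] * 1000,
    }


if __name__ == "__main__":
    demo = [0.012, 0.011, 0.013, 0.250, 0.012]
    for name, value in summarize(demo).items():
        print(f"{name}: {value:.1f}")
```

To use it with real data, redirect the curl loop's output to a file instead of piping it to awk, then pass `[float(line) for line in open("latency.txt")]` to `summarize`. With only ten samples the p95 equals the maximum, so collect more samples when tail latency matters.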

Check for network issues

# Network latency
ping -c 10 license-server

# TCP connection stats
ss -s

# Check for dropped packets
netstat -s | grep -i drop

Check upstream latency

# Time license server query
time lmstat -a -c 27000@license-server
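The same measurement can be scripted with a hard timeout, which also shows what a command timeout protects against: a hung upstream query that would otherwise block polling. This is a minimal sketch; `timed_run` is a hypothetical helper, and the demo uses the Python interpreter as a stand-in command so that it runs without a license server.

```python
import subprocess
import sys
import time
from typing import Optional


def timed_run(cmd: list[str], timeout_seconds: float = 60.0) -> tuple[float, Optional[int]]:
    """Run cmd; return (elapsed seconds, exit code), with exit code None on timeout."""
    start = time.perf_counter()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
        code = result.returncode
    except subprocess.TimeoutExpired:
        code = None  # subprocess.run kills the child before raising
    return time.perf_counter() - start, code


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; substitute the lmstat call above.
    elapsed, code = timed_run([sys.executable, "-c", "pass"])
    print(f"exit={code} elapsed={elapsed:.3f}s")
```

Calling `timed_run(["lmstat", "-a", "-c", "27000@license-server"], 60)` would time the real query. If the elapsed time regularly approaches the timeout, the license server or the network path to it is the likely bottleneck rather than License Monitor itself.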

Network Optimization

# config.toml network optimizations
[api]
read_timeout_seconds = 30      # Request read timeout
write_timeout_seconds = 30     # Response write timeout
idle_timeout_seconds = 60      # Idle connection timeout

[command_mode]
timeout_seconds = 60           # Command execution timeout

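# nginx proxy optimizations
upstream license_monitor {
    server 127.0.0.1:8080;
    keepalive 32;              # Pool of idle upstream connections
}

server {
    proxy_connect_timeout 5s;
    proxy_read_timeout 30s;
    proxy_send_timeout 30s;

    # Enable compression
    gzip on;
    gzip_types application/json;

    location / {
        proxy_pass http://license_monitor;
        # Upstream keepalive needs HTTP/1.1 and a cleared Connection header
        # (WebSocket routes need a separate location that forwards Upgrade/Connection)
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}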

Database Performance

Convex Performance

# Check Convex dashboard for:
# - Slow queries
# - High function invocation time
# - Database size

npx convex dashboard

Query Optimization

// Use indexes for common queries
servers: defineTable({...})
    .index("by_active", ["isActive"])
    .index("by_vlan", ["vlan"])

// Query with index
const activeServers = await ctx.db
    .query("servers")
    .withIndex("by_active", q => q.eq("isActive", true))
    .collect();

Load Testing

Basic Load Test

# Using Apache Bench
ab -n 1000 -c 10 -H "X-API-Key: KEY" \
    http://localhost:8080/api/health

# Using wrk
wrk -t4 -c100 -d30s -H "X-API-Key: KEY" \
    http://localhost:8080/api/health

# Using hey
hey -n 1000 -c 10 -H "X-API-Key: KEY" \
    http://localhost:8080/api/health

Interpreting Results

Metric	Good	Investigate
Requests/sec	> 1000	< 100
Avg latency	< 50ms	> 200ms
p99 latency	< 200ms	> 1s
Error rate	0%	> 1%
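The table leaves a gap between the two columns, for example 100 to 1000 requests per second. The sketch below applies the thresholds to a set of results and labels values in that gap "borderline". That label and the metric names are assumptions added for illustration.

```python
# (is_good, needs_investigation) predicates taken from the table above.
CRITERIA = {
    "requests_per_sec": (lambda v: v > 1000, lambda v: v < 100),
    "avg_latency_ms": (lambda v: v < 50, lambda v: v > 200),
    "p99_latency_ms": (lambda v: v < 200, lambda v: v > 1000),
    "error_rate_percent": (lambda v: v == 0, lambda v: v > 1),
}


def verdict(metric: str, value: float) -> str:
    """Return 'good', 'investigate', or 'borderline' for one load test metric."""
    is_good, needs_investigation = CRITERIA[metric]
    if is_good(value):
        return "good"
    if needs_investigation(value):
        return "investigate"
    return "borderline"


def evaluate(results: dict[str, float]) -> dict[str, str]:
    """Apply verdict() to every metric in a results mapping."""
    return {metric: verdict(metric, value) for metric, value in results.items()}


if __name__ == "__main__":
    sample = {"requests_per_sec": 850, "avg_latency_ms": 35,
              "p99_latency_ms": 1400, "error_rate_percent": 0}
    for metric, result in evaluate(sample).items():
        print(f"{metric}: {result}")
```

In the sample, throughput is borderline and p99 latency needs investigation even though the average looks good, which is a typical sign of occasional slow requests rather than a uniformly slow service.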