Health Check API

This document describes the Health Check endpoint for monitoring Link Core service status.

Endpoint

Health Check - GET /health

Health Check

Returns the health status of the Link Core service.

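Endpoint: GET /health (the examples in this document request it under the /v1 base path: https://api.ledgerlink.ai/v1/health)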

Response

Status: 200 OK

{
  "status": "ok",
  "timestamp": "2025-11-20T10:30:00.000Z",
  "service": "link-core",
  "version": "1.0.0"
}

Response Fields

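| Field | Type | Description |
| --- | --- | --- |
| `status` | string | Health status: `ok` or `error` |
| `timestamp` | string | Current server timestamp (ISO 8601) |
| `service` | string | Service name identifier |
| `version` | string | Service version |
| `error` | string | Failure description; appears in the unhealthy response example (see Service Unhealthy) |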
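
A monitoring client may want to validate the payload against the documented fields before trusting it. The following is a minimal sketch in Python, not part of the Link Core API: `HealthReport` and `parse_health` are hypothetical names introduced here, and the check assumes the four core fields are always present as strings.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class HealthReport:
    """Typed view of the /health response body (hypothetical helper)."""
    status: str
    timestamp: datetime
    service: str
    version: str


def parse_health(body: str) -> HealthReport:
    """Parse a /health JSON body and check the documented fields."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("health body must be a JSON object")
    for field in ("status", "timestamp", "service", "version"):
        if not isinstance(data.get(field), str):
            raise ValueError(f"missing or non-string field: {field}")
    if data["status"] not in ("ok", "error"):
        raise ValueError(f"unexpected status: {data['status']}")
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    ts = datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00"))
    return HealthReport(data["status"], ts, data["service"], data["version"])
```

Calling `parse_health` on the example response above yields a `HealthReport` whose `timestamp` is a timezone-aware `datetime`; a body that lacks any of the four fields raises `ValueError`.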

Example

curl -X GET "https://api.ledgerlink.ai/v1/health"

Response:

{
  "status": "ok",
  "timestamp": "2025-11-20T10:30:00.000Z",
  "service": "link-core",
  "version": "1.0.0"
}

Status Codes

| Status Code | Meaning | Description |
| --- | --- | --- |
| `200 OK` | Healthy | Service is operational and accepting requests |
| `503 Service Unavailable` | Unhealthy | Service is down or not ready |

Health Check Response Examples

Service Healthy

curl -X GET "https://api.ledgerlink.ai/v1/health"

Response (200 OK):

{
  "status": "ok",
  "timestamp": "2025-11-20T10:30:00.000Z",
  "service": "link-core",
  "version": "1.0.0"
}

Service Unhealthy

Response (503 Service Unavailable):

{
  "status": "error",
  "timestamp": "2025-11-20T10:30:00.000Z",
  "service": "link-core",
  "version": "1.0.0",
  "error": "Database connection failed"
}
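
The two responses above can be interpreted by combining the HTTP status code with the `status` and `error` fields. This is a minimal sketch in Python under that assumption; `evaluate` is a hypothetical helper, not part of the service.

```python
import json
from typing import Optional, Tuple


def evaluate(status_code: int, body: str) -> Tuple[bool, Optional[str]]:
    """Return (healthy, reason); reason is None when the service is healthy."""
    try:
        data = json.loads(body)
    except ValueError:
        data = {}  # An empty or non-JSON body is treated as "no details".
    if not isinstance(data, dict):
        data = {}
    if status_code == 200 and data.get("status") == "ok":
        return True, None
    # Prefer the service's own "error" message when it supplies one.
    reason = data.get("error") or f"HTTP {status_code}, status={data.get('status')!r}"
    return False, reason
```

Requiring both a 200 code and `"status": "ok"` mirrors the monitoring script later in this document, which checks the two conditions separately.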

Use Cases

1. Service Monitoring

Use health checks for continuous monitoring:

# Poll health endpoint every 30 seconds
watch -n 30 curl -s https://api.ledgerlink.ai/v1/health

2. Load Balancer Health Checks

Configure load balancers to check service health:

# Example: AWS ALB Target Group
HealthCheckEnabled: true
HealthCheckPath: /health
HealthCheckIntervalSeconds: 30
HealthCheckTimeoutSeconds: 5
HealthyThresholdCount: 2
UnhealthyThresholdCount: 3

3. Container Orchestration

Use for Kubernetes liveness and readiness probes:

# Kubernetes Deployment
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5

4. CI/CD Health Verification

Verify deployment health in CI/CD pipelines:

#!/bin/bash
# Wait for service to be healthy after deployment

MAX_ATTEMPTS=30
ATTEMPT=0

while [ $ATTEMPT -lt $MAX_ATTEMPTS ]; do
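  # --max-time keeps a hung connection from stalling the loop; curl reports 000 if the request fails
  RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 https://api.ledgerlink.ai/v1/health)

  if [ "$RESPONSE" -eq 200 ]; then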
    echo "Service is healthy!"
    exit 0
  fi
  
  echo "Attempt $((ATTEMPT+1))/$MAX_ATTEMPTS: Service not ready (HTTP $RESPONSE)"
  sleep 10
  ATTEMPT=$((ATTEMPT+1))
done

echo "Service failed to become healthy"
exit 1

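5. Service Discovery

Register the health check URL alongside the service entry in your service registry (the exact schema depends on the registry):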

{
  "service": "link-core",
  "address": "https://api.ledgerlink.ai",
  "health_check": {
    "url": "https://api.ledgerlink.ai/v1/health",
    "interval": "30s",
    "timeout": "5s"
  }
}

Integration with Monitoring Systems

Prometheus

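Prometheus cannot scrape /health directly: the endpoint returns JSON, not the Prometheus text exposition format. Probe it through an exporter such as the Blackbox exporter and scrape the exporter instead:

# prometheus.yml
scrape_configs:
  - job_name: 'link-core'
    metrics_path: '/probe'
    params:
      module: [http_2xx]   # probe succeeds only on a 2xx response
    scrape_interval: 30s
    static_configs:
      - targets: ['https://api.ledgerlink.ai/v1/health']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: 'blackbox-exporter:9115'   # assumed exporter address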

Datadog

Monitor the health check endpoint with the Datadog HTTP check:

# conf.d/http_check.d/conf.yaml
init_config:

instances:
  - name: link-core
    url: https://api.ledgerlink.ai/v1/health
    timeout: 5
    method: GET
    headers:
      Content-Type: application/json

Uptime Robot / Pingdom

Configure HTTP(S) monitoring:

URL: https://api.ledgerlink.ai/v1/health
Method: GET
Expected Status: 200
Check Interval: 60 seconds
Alert: When status ≠ 200

Health Check Best Practices

1. Include Dependency Checks (Future Enhancement)

Extend health check to verify dependencies:

{
  "status": "ok",
  "timestamp": "2025-11-20T10:30:00.000Z",
  "service": "link-core",
  "version": "1.0.0",
  "dependencies": {
    "database": "ok",
    "redis": "ok",
    "activemq": "ok",
    "wallet-manager": "ok",
    "link-quote": "ok"
  }
}
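
One way to derive the top-level `status` from the `dependencies` map is sketched below in Python. It assumes that any dependency not reporting `ok` makes the whole service unhealthy; that policy, and the function names, are illustrative choices rather than documented Link Core behavior.

```python
from typing import Dict, Tuple


def overall_status(dependencies: Dict[str, str]) -> str:
    """Collapse per-dependency statuses into one top-level status."""
    all_ok = all(state == "ok" for state in dependencies.values())
    return "ok" if all_ok else "error"


def health_response(dependencies: Dict[str, str]) -> Tuple[int, dict]:
    """Return (HTTP status code, body fragment) for the given dependency map."""
    status = overall_status(dependencies)
    code = 200 if status == "ok" else 503
    return code, {"status": status, "dependencies": dict(dependencies)}
```

For example, `health_response({"database": "ok", "redis": "error"})` returns a 503 with `"status": "error"`, matching the status code table earlier in this document.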

2. Lightweight Checks

Health checks should be fast (< 1 second):

Avoid expensive database queries
Don't wait for external service responses
Cache dependency status
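
Caching dependency status could look like the following Python sketch, which reuses the last probe result for a fixed time-to-live. `CachedCheck`, the 10-second default, and the injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class CachedCheck:
    """Caches the result of a slow dependency probe for `ttl` seconds."""

    def __init__(self, probe: Callable[[], str], ttl: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._probe = probe
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = float("-inf")  # Forces a probe on first use.

    def status(self) -> str:
        now = self._clock()
        if now >= self._expires_at:
            # Cache expired: run the real probe once and remember the result.
            self._value = self._probe()
            self._expires_at = now + self._ttl
        return self._value
```

In this sketch the request that finds the cache expired still pays for the probe. Refreshing the cache from a background thread or timer would keep every health request fast, at the cost of more moving parts.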

3. Separate Liveness and Readiness

Consider separate endpoints:

Liveness (/health): Service is running (restart if fails)
Readiness (/ready): Service is ready to accept traffic (remove from load balancer if fails)
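
The split between the two endpoints can be sketched with Python's standard `http.server` module. The `STATE` flag, the `route` function, and the response bodies are assumptions for illustration; port 3000 is taken from the Kubernetes example above.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Tuple

# Flipped to True once startup work (connections, cache warm-up) has finished.
STATE = {"ready": False}


def route(path: str, ready: bool) -> Tuple[int, dict]:
    """Map a request path to (HTTP status code, JSON payload)."""
    if path == "/health":
        # Liveness: the process is up and able to answer at all.
        return 200, {"status": "ok"}
    if path == "/ready":
        # Readiness: only report ok once the service can take traffic.
        if ready:
            return 200, {"status": "ok"}
        return 503, {"status": "error", "error": "not ready"}
    return 404, {"error": "not found"}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        code, payload = route(self.path, STATE["ready"])
        body = json.dumps(payload).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 3000) -> None:
    """Blocks forever, serving both probe endpoints."""
    ThreadingHTTPServer(("", port), ProbeHandler).serve_forever()
```

With this layout a slow startup only fails `/ready`, so an orchestrator holds traffic back without restarting a process that is still initializing.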

4. Version Information

Include version for deployment tracking:

{
  "status": "ok",
  "version": "1.2.3",
  "commit": "abc123def",
  "build": "2025-11-20T08:00:00Z"
}

5. Consistent Format

Use consistent format across all services:

{
  "status": "ok" | "degraded" | "error",
  "timestamp": "ISO 8601",
  "service": "service-name",
  "version": "semver"
}
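
A shared helper is one way to keep the format identical across services. The Python sketch below builds a body in the shape shown above; `health_body` and its `now` parameter are hypothetical, and the timestamp is rendered with millisecond precision to match the examples in this document.

```python
from datetime import datetime, timezone
from typing import Optional

ALLOWED_STATUSES = ("ok", "degraded", "error")


def health_body(service: str, version: str, status: str = "ok",
                now: Optional[datetime] = None) -> dict:
    """Build a health payload in the shared format."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {ALLOWED_STATUSES}")
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # isoformat() renders UTC as "+00:00"; the examples use the "Z" suffix.
    timestamp = moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return {
        "status": status,
        "timestamp": timestamp,
        "service": service,
        "version": version,
    }
```

Passing `now` explicitly makes the helper deterministic in tests; in normal use it defaults to the current UTC time.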

Troubleshooting

Health Check Returns 503

Possible Causes:

Service is starting up
Database connection failed
Critical dependency unavailable
Service crashed or terminated

Actions:

Check service logs
Verify database connectivity
Check resource availability (CPU, memory)
Restart service if necessary

Health Check Times Out

Possible Causes:

Service overloaded
Network issues
Service deadlocked

Actions:

Check CPU/memory usage
Review service logs for errors
Check network connectivity
Consider scaling or restarting

Intermittent Failures

Possible Causes:

Resource exhaustion (memory leaks, file descriptors)
Transient network issues
Dependency instability

Actions:

Monitor resource usage trends
Check dependency health
Review service logs for patterns
Implement retry logic in monitoring
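
Retry logic in a monitor can be as small as the following Python sketch, which only reports a failure after several consecutive failed probes. The function name, the default of three attempts, and the exponential backoff are assumptions chosen for illustration.

```python
import time
from typing import Callable


def check_with_retries(probe: Callable[[], bool], attempts: int = 3,
                       delay: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True if any of `attempts` probes succeeds, otherwise False."""
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except Exception:
            pass  # A raised error (for example a timeout) counts as a failed attempt.
        if attempt < attempts - 1:
            # Exponential backoff between attempts: delay, 2*delay, 4*delay, ...
            sleep(delay * (2 ** attempt))
    return False
```

Alerting only when `check_with_retries` returns `False` filters out single transient failures while still catching a sustained outage within a few seconds.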

Security Considerations

Public vs Private Health Checks

Public Health Check:

Minimal information exposure
No authentication required
Basic status only

{
  "status": "ok"
}

Internal Health Check:

Detailed diagnostics
Requires authentication
Full dependency status

{
  "status": "ok",
  "dependencies": { ... },
  "metrics": { ... }
}
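
The public and internal views can be produced from one detailed report by filtering fields for unauthenticated callers. This is a minimal Python sketch; `render_health`, `PUBLIC_FIELDS`, and the `authenticated` flag are hypothetical, and how a caller is authenticated is outside its scope.

```python
PUBLIC_FIELDS = ("status",)


def render_health(report: dict, authenticated: bool) -> dict:
    """Return the full report to authenticated callers, basic status otherwise."""
    if authenticated:
        return dict(report)
    # Allow-list rather than block-list, so new diagnostic fields stay private by default.
    return {key: report[key] for key in PUBLIC_FIELDS if key in report}
```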

Rate Limiting

Consider rate limiting health check endpoint:

Max 60 requests per minute from single IP

Prevents abuse while allowing legitimate monitoring.
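
A per-IP limit of 60 requests per minute could be enforced with a sliding window, as in the Python sketch below. The class name and the in-memory storage are assumptions; a deployment with several instances would need shared state or a limit enforced at the gateway.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allows at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit: int = 60, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        now = self._clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

A request for which `allow` returns `False` would typically be answered with `429 Too Many Requests`. A monitor polling every 30 seconds stays far below the limit.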

Example Monitoring Script

#!/bin/bash
# monitor-link-core.sh
# Monitors Link Core health and sends alerts

HEALTH_URL="https://api.ledgerlink.ai/v1/health"
ALERT_EMAIL="ops@ledgerlink.ai"
LOG_FILE="/var/log/link-core-health.log"

check_health() {
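  # --max-time bounds a hung request; -w appends the status code on its own line
  RESPONSE=$(curl -s --max-time 10 -w "\n%{http_code}" "$HEALTH_URL")
  HTTP_CODE=$(echo "$RESPONSE" | tail -n1)
  BODY=$(echo "$RESPONSE" | sed '$d')  # everything except the last line (portable, unlike head -n-1)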
  
  echo "$(date -u +"%Y-%m-%dT%H:%M:%SZ") - HTTP $HTTP_CODE - $BODY" >> "$LOG_FILE"
  
  if [ "$HTTP_CODE" != "200" ]; then
    echo "ALERT: Link Core health check failed (HTTP $HTTP_CODE)" | \
      mail -s "Link Core Health Alert" "$ALERT_EMAIL"
    return 1
  fi
  
  STATUS=$(echo "$BODY" | jq -r '.status')
  if [ "$STATUS" != "ok" ]; then
    echo "ALERT: Link Core status is $STATUS" | \
      mail -s "Link Core Health Alert" "$ALERT_EMAIL"
    return 1
  fi
  
  return 0
}

# Run check
check_health
exit $?

Usage:

# Run manually
./monitor-link-core.sh

# Schedule with cron (every 5 minutes)
*/5 * * * * /path/to/monitor-link-core.sh

Related Endpoints

While Link Core currently exposes only a basic health check, consider implementing these additional endpoints:

Detailed Health: GET /health/detailed - Full dependency status
Readiness: GET /ready - Ready to accept traffic
Liveness: GET /alive - Service is running
Metrics: GET /metrics - Prometheus metrics

Need Help? Contact helpdesk@ledgerlink.ai for assistance.
