feat(cmdb-assistant): add data quality validation v1.1.0

Add validation hooks, best practices skill, and new commands to enforce
NetBox data quality standards:

Hooks:
- SessionStart: Test NetBox connectivity, report data quality issues
- PreToolUse: Validate VM/device parameters before create/update

New Commands:
- /cmdb-audit: Data quality analysis (vms, devices, naming, roles)
- /cmdb-register: Register current machine with running applications
- /cmdb-sync: Sync machine state with NetBox, detect drift

Best Practices Skill:
- Dependency order (regions -> sites -> devices -> VMs)
- Site/tenant/platform assignment requirements
- Naming conventions enforcement
- Role consolidation guidance

Updated agent with validation requirements, dependency order checks,
naming convention warnings, and duplicate prevention.

Marketplace: 5.0.0 -> 5.1.0
Plugin: 1.0.0 -> 1.1.0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
2026-01-27 12:27:23 -05:00
parent 7492cfad66
commit a74a048898
14 changed files with 1483 additions and 11 deletions

View File

@@ -0,0 +1,195 @@
---
description: Audit NetBox data quality and identify consistency issues
---
# CMDB Data Quality Audit
Analyze NetBox data for quality issues and best practice violations.
## Usage
```
/cmdb-audit [scope]
```
**Scopes:**
- `all` (default) - Full audit across all categories
- `vms` - Virtual machines only
- `devices` - Physical devices only
- `naming` - Naming convention analysis
- `roles` - Role fragmentation analysis
## Instructions
You are a data quality auditor for NetBox. Your job is to identify consistency issues and best practice violations.
**IMPORTANT:** Load the `netbox-patterns` skill for best practice reference.
### Phase 1: Data Collection
Run these MCP tool calls to gather data for analysis:
```
1. virt_list_vms (no filters - get all)
2. dcim_list_devices (no filters - get all)
3. virt_list_clusters (no filters)
4. dcim_list_sites
5. tenancy_list_tenants
6. dcim_list_device_roles
7. dcim_list_platforms
```
Store the results for analysis.
### Phase 2: Quality Checks
Analyze collected data for these issues by severity:
#### CRITICAL Issues (must fix immediately)
| Check | Detection |
|-------|-----------|
| VMs without cluster | `cluster` field is null AND `site` field is null |
| Devices without site | `site` field is null |
| Active devices without primary IP | `status=active` AND `primary_ip4` is null AND `primary_ip6` is null |
#### HIGH Issues (should fix soon)
| Check | Detection |
|-------|-----------|
| VMs without site | VM has no site (neither direct nor via cluster.site) |
| VMs without tenant | `tenant` field is null |
| Devices without platform | `platform` field is null |
| Clusters not scoped to site | `site` field is null on cluster |
| VMs without role | `role` field is null |
#### MEDIUM Issues (plan to address)
| Check | Detection |
|-------|-----------|
| Inconsistent naming | Names don't match patterns: devices=`{role}-{site}-{num}`, VMs=`{env}-{app}-{num}` |
| Role fragmentation | More than 10 device roles with <3 assignments each |
| Missing tags on production | Active resources without any tags |
| Mixed naming separators | Some names use `_`, others use `-` |
#### LOW Issues (informational)
| Check | Detection |
|-------|-----------|
| Docker containers as VMs | Cluster type is "Docker Compose" - document this modeling choice |
| VMs without description | `description` field is empty |
| Sites without physical address | `physical_address` is empty |
| Devices without serial | `serial` field is empty |
### Phase 3: Naming Convention Analysis
For naming scope, analyze patterns:
1. **Extract naming patterns** from existing objects
2. **Identify dominant patterns** (most common conventions)
3. **Flag outliers** that don't match dominant patterns
4. **Suggest standardization** based on best practices
**Expected Patterns:**
- Devices: `{role}-{location}-{number}` (e.g., `web-dc1-01`)
- VMs: `{prefix}_{service}` or `{env}-{app}-{number}` (e.g., `prod-api-01`)
- Clusters: `{site}-{type}` (e.g., `home-docker`)
### Phase 4: Role Analysis
For roles scope, analyze fragmentation:
1. **List all device roles** with assignment counts
2. **Identify single-use roles** (only 1 device/VM)
3. **Identify similar roles** that could be consolidated
4. **Suggest consolidation** based on patterns
**Red Flags:**
- More than 15 highly specific roles
- Roles with technology in name (use platform instead)
- Roles that duplicate functionality
### Phase 5: Report Generation
Present findings in this structure:
```markdown
## CMDB Data Quality Audit Report
**Generated:** [timestamp]
**Scope:** [scope parameter]
### Summary
| Metric | Count |
|--------|-------|
| Total VMs | X |
| Total Devices | Y |
| Total Clusters | Z |
| **Total Issues** | **N** |
| Severity | Count |
|----------|-------|
| Critical | A |
| High | B |
| Medium | C |
| Low | D |
### Critical Issues
[List each with specific object names and IDs]
**Example:**
- VM `HotServ` (ID: 1) - No cluster or site assignment
- Device `server-01` (ID: 5) - No site assignment
### High Issues
[List each with specific object names]
### Medium Issues
[Grouped by category with counts]
### Recommendations
1. **[Most impactful fix]** - affects N objects
2. **[Second priority]** - affects M objects
...
### Quick Fixes
Commands to fix common issues:
```
# Assign site to VM
virt_update_vm id=X site=Y
# Assign platform to device
dcim_update_device id=X platform=Y
```
### Next Steps
- Run `/cmdb-register` to properly register new machines
- Use `/cmdb-sync` to update existing registrations
- Consider bulk updates via NetBox web UI for >10 items
```
## Scope-Specific Instructions
### For `vms` scope:
Focus only on Virtual Machine checks. Skip device and role analysis.
### For `devices` scope:
Focus only on Device checks. Skip VM and cluster analysis.
### For `naming` scope:
Focus on naming convention analysis across all objects. Generate detailed pattern report.
### For `roles` scope:
Focus on role fragmentation analysis. Generate consolidation recommendations.
## User Request
$ARGUMENTS

View File

@@ -0,0 +1,322 @@
---
description: Register the current machine into NetBox with all running applications
---
# CMDB Machine Registration
Register the current machine into NetBox, including hardware info, network interfaces, and running applications (Docker containers, services).
## Usage
```
/cmdb-register [--site <site-name>] [--tenant <tenant-name>] [--role <role-name>]
```
**Options:**
- `--site <name>`: Site to assign (will prompt if not provided)
- `--tenant <name>`: Tenant for resource isolation (optional)
- `--role <name>`: Device role (default: auto-detect based on services)
## Instructions
You are registering the current machine into NetBox. This is a multi-phase process that discovers local system information and creates corresponding NetBox objects.
**IMPORTANT:** Load the `netbox-patterns` skill for best practice reference.
### Phase 1: System Discovery (via Bash)
Gather system information using these commands:
#### 1.1 Basic Device Info
```bash
# Hostname
hostname
# OS/Platform info
cat /etc/os-release 2>/dev/null || uname -a
# Hardware model (varies by system)
# Raspberry Pi:
cat /proc/device-tree/model 2>/dev/null || echo "Unknown"
# x86 systems:
cat /sys/class/dmi/id/product_name 2>/dev/null || echo "Unknown"
# Serial number
# Raspberry Pi:
cat /proc/device-tree/serial-number 2>/dev/null || cat /proc/cpuinfo | grep Serial | cut -d: -f2 | tr -d ' ' 2>/dev/null
# x86 systems:
cat /sys/class/dmi/id/product_serial 2>/dev/null || echo "Unknown"
# CPU info
nproc
# Memory (MB)
free -m | awk '/Mem:/ {print $2}'
# Disk (GB, root filesystem)
df -BG / | awk 'NR==2 {print $2}' | tr -d 'G'
```
#### 1.2 Network Interfaces
```bash
# Get interfaces with IPs (JSON format)
ip -j addr show 2>/dev/null || ip addr show
# Get default gateway interface
ip route | grep default | awk '{print $5}' | head -1
# Get MAC addresses
ip -j link show 2>/dev/null || ip link show
```
#### 1.3 Running Applications
```bash
# Docker containers (if docker available)
docker ps --format '{"name":"{{.Names}}","image":"{{.Image}}","status":"{{.Status}}","ports":"{{.Ports}}"}' 2>/dev/null || echo "Docker not available"
# Docker Compose projects (check common locations)
find ~/apps /home/*/apps -name "docker-compose.yml" -o -name "docker-compose.yaml" 2>/dev/null | head -20
# Systemd services (running)
systemctl list-units --type=service --state=running --no-pager --plain 2>/dev/null | grep -v "^UNIT" | head -30
```
### Phase 2: Pre-Registration Checks (via MCP)
Before creating objects, verify prerequisites:
#### 2.1 Check if Device Already Exists
```
dcim_list_devices name=<hostname>
```
**If device exists:**
- Inform user and suggest `/cmdb-sync` instead
- Ask if they want to proceed with re-registration (will update existing)
#### 2.2 Verify/Create Site
If `--site` provided:
```
dcim_list_sites name=<site-name>
```
If site doesn't exist, ask user if they want to create it.
If no site provided, list available sites and ask user to choose:
```
dcim_list_sites
```
#### 2.3 Verify/Create Platform
Based on OS detected, check if platform exists:
```
dcim_list_platforms name=<platform-name>
```
**Platform naming:**
- `Raspberry Pi OS (Bookworm)` for Raspberry Pi
- `Ubuntu 24.04 LTS` for Ubuntu
- `Debian 12` for Debian
- Use format: `{OS Name} {Version}`
If platform doesn't exist, create it:
```
dcim_create_platform name=<platform-name> slug=<slug>
```
#### 2.4 Verify/Create Device Role
Based on detected services:
- If Docker containers found → `Docker Host`
- If only basic services → `Server`
- If specific role specified → Use that
```
dcim_list_device_roles name=<role-name>
```
### Phase 3: Device Registration (via MCP)
#### 3.1 Get/Create Manufacturer and Device Type
For Raspberry Pi:
```
dcim_list_manufacturers name="Raspberry Pi Foundation"
dcim_list_device_types manufacturer_id=X model="Raspberry Pi 4 Model B"
```
Create if not exists.
For generic x86:
```
dcim_list_manufacturers name=<detected-manufacturer>
```
#### 3.2 Create Device
```
dcim_create_device
name=<hostname>
device_type=<device_type_id>
role=<role_id>
site=<site_id>
platform=<platform_id>
tenant=<tenant_id> # if provided
serial=<serial>
description="Registered via cmdb-assistant"
```
#### 3.3 Create Interfaces
For each network interface discovered:
```
dcim_create_interface
device=<device_id>
name=<interface_name> # eth0, wlan0, tailscale0, etc.
type=<type> # 1000base-t, virtual, other
mac_address=<mac>
enabled=true
```
**Interface type mapping:**
- `eth*`, `enp*``1000base-t`
- `wlan*``ieee802.11ax` (or appropriate wifi type)
- `tailscale*`, `docker*`, `br-*``virtual`
- `lo` → skip (loopback)
#### 3.4 Create IP Addresses
For each IP on each interface:
```
ipam_create_ip_address
address=<ip/prefix> # e.g., "192.168.1.100/24"
assigned_object_type="dcim.interface"
assigned_object_id=<interface_id>
status="active"
description="Discovered via cmdb-register"
```
#### 3.5 Set Primary IP
Identify primary IP (interface with default route):
```
dcim_update_device
id=<device_id>
primary_ip4=<primary_ip_id>
```
### Phase 4: Container Registration (via MCP)
If Docker containers were discovered:
#### 4.1 Create/Get Cluster Type
```
virt_list_cluster_types name="Docker Compose"
```
Create if not exists:
```
virt_create_cluster_type name="Docker Compose" slug="docker-compose"
```
#### 4.2 Create Cluster
For each Docker Compose project directory found:
```
virt_create_cluster
name=<project-name> # e.g., "apps-hotport"
type=<cluster_type_id>
site=<site_id>
description="Docker Compose stack on <hostname>"
```
#### 4.3 Create VMs for Containers
For each running container:
```
virt_create_vm
name=<container_name>
cluster=<cluster_id>
site=<site_id>
role=<role_id> # Map container function to role
status="active"
vcpus=<cpu_shares> # Default 1.0 if unknown
memory=<memory_mb> # Default 256 if unknown
disk=<disk_gb> # Default 5 if unknown
description=<container purpose>
comments=<image, ports, volumes info>
```
**Container role mapping:**
- `*caddy*`, `*nginx*`, `*traefik*` → "Reverse Proxy"
- `*db*`, `*postgres*`, `*mysql*`, `*redis*` → "Database"
- `*webui*`, `*frontend*` → "Web Application"
- Others → Infer from image name or use generic "Container"
### Phase 5: Documentation
#### 5.1 Add Journal Entry
```
extras_create_journal_entry
assigned_object_type="dcim.device"
assigned_object_id=<device_id>
comments="Device registered via /cmdb-register command\n\nDiscovered:\n- X network interfaces\n- Y IP addresses\n- Z Docker containers"
```
### Phase 6: Summary Report
Present registration summary:
```markdown
## Machine Registration Complete
### Device Created
- **Name:** <hostname>
- **Site:** <site>
- **Platform:** <platform>
- **Role:** <role>
- **ID:** <device_id>
- **URL:** https://netbox.example.com/dcim/devices/<id>/
### Network Interfaces
| Interface | Type | MAC | IP Address |
|-----------|------|-----|------------|
| eth0 | 1000base-t | aa:bb:cc:dd:ee:ff | 192.168.1.100/24 |
| tailscale0 | virtual | - | 100.x.x.x/32 |
### Primary IP: 192.168.1.100
### Docker Containers Registered (if applicable)
**Cluster:** <cluster_name> (ID: <cluster_id>)
| Container | Role | vCPUs | Memory | Status |
|-----------|------|-------|--------|--------|
| media_jellyfin | Media Server | 2.0 | 2048MB | Active |
| media_sonarr | Media Management | 1.0 | 512MB | Active |
### Next Steps
- Run `/cmdb-sync` periodically to keep data current
- Run `/cmdb-audit` to check data quality
- Add tags for classification (env:*, team:*, etc.)
```
## Error Handling
- **Device already exists:** Suggest `/cmdb-sync` or ask to proceed
- **Site not found:** List available sites, offer to create new
- **Docker not available:** Skip container registration, note in summary
- **Permission denied:** Note which operations failed, suggest fixes
## User Request
$ARGUMENTS

View File

@@ -0,0 +1,336 @@
---
description: Synchronize current machine state with existing NetBox record
---
# CMDB Machine Sync
Update an existing NetBox device record with the current machine state. Compares local system information with NetBox and applies changes.
## Usage
```
/cmdb-sync [--full] [--dry-run]
```
**Options:**
- `--full`: Force refresh all fields, even unchanged ones
- `--dry-run`: Show what would change without applying updates
## Instructions
You are synchronizing the current machine's state with its NetBox record. This involves comparing current system state with stored data and updating differences.
**IMPORTANT:** Load the `netbox-patterns` skill for best practice reference.
### Phase 1: Device Lookup (via MCP)
First, find the existing device record:
```bash
# Get current hostname
hostname
```
```
dcim_list_devices name=<hostname>
```
**If device not found:**
- Inform user: "Device '<hostname>' not found in NetBox"
- Suggest: "Run `/cmdb-register` to register this machine first"
- Exit sync
**If device found:**
- Store device ID and all current field values
- Fetch interfaces: `dcim_list_interfaces device_id=<device_id>`
- Fetch IPs: `ipam_list_ip_addresses device_id=<device_id>`
Also check for associated clusters/VMs:
```
virt_list_clusters # Look for cluster associated with this device
virt_list_vms cluster=<cluster_id> # If cluster found
```
### Phase 2: Current State Discovery (via Bash)
Gather current system information (same as `/cmdb-register`):
```bash
# Device info
hostname
cat /etc/os-release 2>/dev/null || uname -a
nproc
free -m | awk '/Mem:/ {print $2}'
df -BG / | awk 'NR==2 {print $2}' | tr -d 'G'
# Network interfaces with IPs
ip -j addr show 2>/dev/null || ip addr show
# Docker containers
docker ps --format '{"name":"{{.Names}}","image":"{{.Image}}","status":"{{.Status}}"}' 2>/dev/null || echo "[]"
```
### Phase 3: Comparison
Compare discovered state with NetBox record:
#### 3.1 Device Attributes
| Field | Compare |
|-------|---------|
| Platform | OS version changed? |
| Status | Still active? |
| Serial | Match? |
| Description | Keep existing |
#### 3.2 Network Interfaces
| Change Type | Detection |
|-------------|-----------|
| New interface | Interface exists locally but not in NetBox |
| Removed interface | Interface in NetBox but not locally |
| Changed MAC | MAC address different |
| Interface type | Type mismatch |
#### 3.3 IP Addresses
| Change Type | Detection |
|-------------|-----------|
| New IP | IP exists locally but not in NetBox |
| Removed IP | IP in NetBox but not locally (on this device) |
| Primary IP changed | Default route interface changed |
#### 3.4 Docker Containers
| Change Type | Detection |
|-------------|-----------|
| New container | Container running locally but no VM in cluster |
| Stopped container | VM exists but container not running |
| Resource change | vCPUs/memory different (if trackable) |
### Phase 4: Diff Report
Present changes to user:
```markdown
## Sync Diff Report
**Device:** <hostname> (ID: <device_id>)
**NetBox URL:** https://netbox.example.com/dcim/devices/<id>/
### Device Attributes
| Field | NetBox Value | Current Value | Action |
|-------|--------------|---------------|--------|
| Platform | Ubuntu 22.04 | Ubuntu 24.04 | UPDATE |
| Status | active | active | - |
### Network Interfaces
#### New Interfaces (will create)
| Interface | Type | MAC | IPs |
|-----------|------|-----|-----|
| tailscale0 | virtual | - | 100.x.x.x/32 |
#### Removed Interfaces (will mark offline)
| Interface | Type | Reason |
|-----------|------|--------|
| eth1 | 1000base-t | Not found locally |
#### Changed Interfaces
| Interface | Field | Old | New |
|-----------|-------|-----|-----|
| eth0 | mac_address | aa:bb:cc:00:00:00 | aa:bb:cc:11:11:11 |
### IP Addresses
#### New IPs (will create)
- 192.168.1.150/24 on eth0
#### Removed IPs (will unassign)
- 192.168.1.100/24 from eth0
### Docker Containers
#### New Containers (will create VMs)
| Container | Image | Role |
|-----------|-------|------|
| media_lidarr | linuxserver/lidarr | Media Management |
#### Stopped Containers (will mark offline)
| Container | Last Status |
|-----------|-------------|
| media_bazarr | Exited |
### Summary
- **Updates:** X
- **Creates:** Y
- **Removals/Offline:** Z
```
### Phase 5: User Confirmation
If not `--dry-run`:
```
The following changes will be applied:
- Update device platform to "Ubuntu 24.04"
- Create interface "tailscale0"
- Create IP "100.x.x.x/32" on tailscale0
- Create VM "media_lidarr" in cluster
- Mark VM "media_bazarr" as offline
Proceed with sync? [Y/n]
```
**Use AskUserQuestion** to get confirmation.
### Phase 6: Apply Updates (via MCP)
Only if user confirms (or `--full` specified):
#### 6.1 Device Updates
```
dcim_update_device
id=<device_id>
platform=<new_platform_id>
# ... other changed fields
```
#### 6.2 Interface Updates
**For new interfaces:**
```
dcim_create_interface
device=<device_id>
name=<interface_name>
type=<type>
mac_address=<mac>
enabled=true
```
**For removed interfaces:**
```
dcim_update_interface
id=<interface_id>
enabled=false
description="Marked offline by cmdb-sync - interface no longer present"
```
**For changed interfaces:**
```
dcim_update_interface
id=<interface_id>
mac_address=<new_mac>
```
#### 6.3 IP Address Updates
**For new IPs:**
```
ipam_create_ip_address
address=<ip/prefix>
assigned_object_type="dcim.interface"
assigned_object_id=<interface_id>
status="active"
```
**For removed IPs:**
```
ipam_update_ip_address
id=<ip_id>
assigned_object_type=null
assigned_object_id=null
description="Unassigned by cmdb-sync"
```
#### 6.4 Primary IP Update
If primary IP changed:
```
dcim_update_device
id=<device_id>
primary_ip4=<new_primary_ip_id>
```
#### 6.5 Container/VM Updates
**For new containers:**
```
virt_create_vm
name=<container_name>
cluster=<cluster_id>
status="active"
# ... other fields
```
**For stopped containers:**
```
virt_update_vm
id=<vm_id>
status="offline"
description="Container stopped - detected by cmdb-sync"
```
### Phase 7: Journal Entry
Document the sync:
```
extras_create_journal_entry
assigned_object_type="dcim.device"
assigned_object_id=<device_id>
comments="Device synced via /cmdb-sync command\n\nChanges applied:\n- <list of changes>"
```
### Phase 8: Summary Report
```markdown
## Sync Complete
**Device:** <hostname>
**Sync Time:** <timestamp>
### Changes Applied
- Updated platform: Ubuntu 22.04 → Ubuntu 24.04
- Created interface: tailscale0 (ID: X)
- Created IP: 100.x.x.x/32 (ID: Y)
- Created VM: media_lidarr (ID: Z)
- Marked VM offline: media_bazarr (ID: W)
### Current State
- **Interfaces:** 4 (3 active, 1 offline)
- **IP Addresses:** 5
- **Containers/VMs:** 8 (7 active, 1 offline)
### Next Sync
Run `/cmdb-sync` again after:
- Adding/removing Docker containers
- Changing network configuration
- OS upgrades
```
## Dry Run Mode
If `--dry-run` specified:
- Complete Phase 1-4 (lookup, discovery, compare, diff report)
- Skip Phase 5-8 (no confirmation, no updates, no journal)
- End with: "Dry run complete. No changes applied. Run without --dry-run to apply."
## Full Sync Mode
If `--full` specified:
- Skip user confirmation
- Update all fields even if unchanged (force refresh)
- Useful for ensuring NetBox matches current state exactly
## Error Handling
- **Device not found:** Suggest `/cmdb-register`
- **Permission denied on updates:** Note which failed, continue with others
- **Cluster not found:** Offer to create or skip container sync
- **API errors:** Log error, continue with remaining updates
## User Request
$ARGUMENTS