
Signal Configuration

Standardize Signal Naming

Use consistent signal names across all systems. When every system uses “battery_voltage” for the same measurement, you can create one trigger that applies uniformly. Inconsistent naming requires duplicate triggers and complicates management.
Good Examples:
  • battery_voltage (not batt_v, voltage_battery, bat_volt)
  • motor_temperature (not temp_motor, motor_temp, temperature_motor)
  • obstacle_distance (not dist_obstacle, obstacle_dist)
Convention Recommendations:
  • Use lowercase with underscores
  • Start with the component, then the metric (e.g., battery_voltage, motor_current)
  • Use full words, not abbreviations
  • Document your naming convention and enforce it across all systems
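To make the convention enforceable rather than aspirational, a lightweight check can run wherever new signals are registered. Below is a minimal Python sketch, assuming signal names arrive as plain strings; the regex, the list of disallowed abbreviations, and the function name are illustrative, not part of any specific product API.

```python
import re

# Convention: lowercase words separated by underscores, component first, then metric.
SIGNAL_NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)+$")

# Common abbreviations to reject in favor of full words (extend as needed).
DISALLOWED_FRAGMENTS = {"batt", "temp", "dist", "volt"}

def validate_signal_name(name: str) -> list[str]:
    """Return a list of convention violations for a proposed signal name."""
    problems = []
    if not SIGNAL_NAME_PATTERN.match(name):
        problems.append("use lowercase words separated by underscores, e.g. battery_voltage")
    for fragment in set(name.split("_")) & DISALLOWED_FRAGMENTS:
        problems.append(f"'{fragment}' looks abbreviated; use the full word")
    return problems

print(validate_signal_name("battery_voltage"))  # [] - conforms to the convention
print(validate_signal_name("batt_v"))           # flags 'batt' as an abbreviation
```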

Choose Appropriate Signal Types

Match signal types to your data:
  • NUMBER: Continuous values (voltage, temperature, distance)
  • STRING: Text states (error messages, statuses)
  • BOOLEAN: Binary conditions (door_open, emergency_stop)
Using the correct type ensures proper threshold comparisons and data visualization.
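Where signal definitions live in code or configuration, recording the type alongside the name keeps comparisons honest. A small sketch, assuming a simple in-house registry; the SignalType enum and the example entries are illustrative.

```python
from enum import Enum

class SignalType(Enum):
    NUMBER = "number"    # continuous values: thresholds compare numerically
    STRING = "string"    # text states: conditions match exact values
    BOOLEAN = "boolean"  # binary conditions: conditions check true/false

# Example registry pairing each signal with the type its data actually has.
SIGNAL_TYPES = {
    "battery_voltage":   SignalType.NUMBER,
    "motor_temperature": SignalType.NUMBER,
    "error_message":     SignalType.STRING,
    "door_open":         SignalType.BOOLEAN,
    "emergency_stop":    SignalType.BOOLEAN,
}
```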

Trigger Configuration

Start Conservative, Then Tune

Begin with conservative trigger thresholds that catch genuine issues without generating excessive false positives.
Initial Configuration Process:
  1. Set thresholds based on manufacturer specifications or safety limits
  2. Deploy to production and monitor for 1-2 weeks
  3. Review event frequency and patterns
  4. Identify false positives (events that didn’t require action)
  5. Identify false negatives (missed conditions discovered through other means)
  6. Adjust thresholds based on actual system behavior
Example - Battery Monitoring:
  • Week 0: Set low battery threshold at 12.0V (manufacturer spec: 11.0V minimum)
  • Week 2 Review: 50 events generated, but only 5 required intervention
  • Adjustment: Lower threshold to 11.5V to reduce false positives while maintaining safety margin
  • Week 4 Review: 8 events generated, all requiring intervention ✓
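The week-2 review in this example can be made repeatable by computing the action rate from the event log and treating a low rate as a cue to tighten the threshold. A minimal sketch, assuming each event record carries a boolean noting whether it required intervention; the field name and the 50% target are assumptions.

```python
def review_trigger(events: list[dict], target_action_rate: float = 0.5) -> str:
    """Summarize a review period and suggest a tuning direction."""
    if not events:
        return "No events fired; check whether the threshold is too conservative."
    actionable = sum(1 for e in events if e.get("required_intervention"))
    action_rate = actionable / len(events)
    summary = f"{len(events)} events, {actionable} actionable ({action_rate:.0%})"
    if action_rate < target_action_rate:
        return summary + ": threshold looks too sensitive, consider tightening."
    return summary + ": threshold looks healthy."

# Week 2 from the example above: 50 events, only 5 requiring intervention.
week2 = [{"required_intervention": i < 5} for i in range(50)]
print(review_trigger(week2))
```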

Account for Environmental Variations

System behavior changes with environmental conditions. Adjust your monitoring configuration accordingly.
Seasonal Adjustments:
  • Winter: Battery voltage drops faster in cold temperatures. Increase low battery thresholds slightly (e.g., from 11.5V to 11.8V) to provide more warning time.
  • Summer: Higher ambient temperatures mean motors and batteries run hotter. Adjust overheating thresholds to account for seasonal baselines while still catching genuine issues.
Operational Context:
  • Peak Hours: Systems under heavy load may exhibit different normal ranges
  • Off-Peak Hours: Idle time thresholds may need different values
  • Maintenance Windows: Temporarily disable non-critical triggers during scheduled maintenance
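Seasonal adjustments can be encoded as data so a scheduled configuration update picks the right value automatically. A hedged sketch, assuming Northern Hemisphere seasons and the example voltages above; the lookup function and season boundaries are illustrative.

```python
from datetime import date

# Low-battery thresholds in volts, by season.
LOW_BATTERY_THRESHOLDS = {
    "winter": 11.8,   # cold weather: voltage sags faster, warn earlier
    "summer": 11.5,
    "default": 11.5,
}

def low_battery_threshold(today: date) -> float:
    """Return the seasonal low-battery threshold for the given date."""
    if today.month in (12, 1, 2):
        return LOW_BATTERY_THRESHOLDS["winter"]
    if today.month in (6, 7, 8):
        return LOW_BATTERY_THRESHOLDS["summer"]
    return LOW_BATTERY_THRESHOLDS["default"]

print(low_battery_threshold(date(2024, 1, 15)))  # 11.8
```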

Use Multiple Conditions for Precision

Combine multiple signal conditions to create more precise triggers that reduce false positives; an evaluation sketch follows the examples below.
Example - Battery Aging Detection:
Instead of a single condition:
  • charge_cycles > 800 (too many false positives on healthy batteries)
Use combined conditions:
  • charge_cycles > 800 AND battery_voltage < 12.0V (indicates genuine degradation)
Example - Performance Issues:
Instead of:
  • task_duration > 300 seconds (may be normal for complex tasks)
Use:
  • task_duration > 300 seconds AND error_count > 0 (indicates actual problem)
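A minimal sketch of how combined (AND) conditions might be represented and evaluated, assuming the latest signal values are available as a flat dict; the condition tuples and helper function are illustrative, not a specific product schema.

```python
from operator import gt, lt

# Each condition: (signal_name, comparison, threshold). All must hold (AND).
BATTERY_AGING_CONDITIONS = [
    ("charge_cycles",   gt, 800),
    ("battery_voltage", lt, 12.0),
]

def trigger_matches(signals: dict[str, float], conditions) -> bool:
    """Return True only when every condition holds for the current signal values."""
    return all(compare(signals[name], threshold)
               for name, compare, threshold in conditions)

healthy  = {"charge_cycles": 850, "battery_voltage": 12.6}
degraded = {"charge_cycles": 850, "battery_voltage": 11.8}
print(trigger_matches(healthy,  BATTERY_AGING_CONDITIONS))  # False: voltage still fine
print(trigger_matches(degraded, BATTERY_AGING_CONDITIONS))  # True: both conditions hold
```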

Deployment Strategies

Validate Before Wide Rollout

Test new triggers on a small subset of systems before deploying them fleet-wide.
Gradual Rollout Process:
  1. Phase 1 - Pilot (5-10% of systems):
    • Apply new trigger to one zone or system type
    • Monitor for 24-48 hours
    • Verify trigger behavior matches expectations
  2. Phase 2 - Review:
    • Check for false positives (unnecessary events)
    • Check for false negatives (missed conditions)
    • Adjust thresholds if needed
  3. Phase 3 - Full Deployment:
    • Roll out to remaining systems once validated
    • Monitor closely for the first week
  4. Phase 4 - Optimize:
    • Fine-tune based on fleet-wide patterns
This approach prevents disruption from misconfigured triggers while allowing real-world validation.

Use Labels for Targeted Deployment

Organize your systems using labels to enable targeted trigger deployment and analysis.
Organizational Labels (for filtering and analysis):
  • zone=picking / zone=packing / zone=shipping
  • model=amr_v2 / model=amr_v3
  • shift=day / shift=night
  • environment=indoor / environment=outdoor
Operational Labels (for maintenance tracking):
  • battery_replaced=2024-06-15
  • last_maintenance=2024-11-28
  • firmware_version=2.1.3
Trigger Labels (for notification management):
  • category=battery_health / category=navigation / category=performance
  • severity=critical / severity=high / severity=medium
  • action_required=immediate / action_required=schedule_maintenance
Label Best Practices:
  • Use consistent key names across all systems
  • Keep values simple and searchable
  • Document your labeling convention
  • Update operational labels as systems change
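Labels also make targeted deployment scriptable. A small sketch of label-based selection, assuming labels are stored as key=value pairs per system; the system IDs and data structures are illustrative.

```python
systems = {
    "amr-012": {"zone": "picking",  "model": "amr_v2", "environment": "indoor"},
    "amr-047": {"zone": "packing",  "model": "amr_v3", "environment": "indoor"},
    "amr-103": {"zone": "shipping", "model": "amr_v3", "environment": "outdoor"},
}

def select_systems(labels: dict, **required) -> list[str]:
    """Return system IDs whose labels match every required key=value pair."""
    return [system_id for system_id, system_labels in labels.items()
            if all(system_labels.get(k) == v for k, v in required.items())]

# Target a pilot trigger at indoor amr_v3 units only.
print(select_systems(systems, model="amr_v3", environment="indoor"))  # ['amr-047']
```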

Predictive Maintenance Strategies

Battery Degradation Monitoring

Instead of waiting for batteries to fail unexpectedly, configure triggers that detect degradation patterns early.
Early Warning Trigger: Monitor battery health indicators (voltage combined with charge cycle count). When a battery has accumulated significant charge cycles AND voltage is dropping, generate a maintenance notification weeks before actual failure; a configuration sketch in code follows the list below.
Example Configuration:
  • Condition 1: charge_cycles > 800
  • Condition 2: battery_voltage < 12.0V
  • Priority: Medium (not urgent, but needs scheduling)
  • Recording: Capture 0 seconds before, 10 seconds after (minimal data needed)
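A hedged sketch of this configuration expressed as data, useful for keeping trigger definitions in version control before creating them in whatever tool you use; the field names are illustrative, not a specific product schema.

```python
battery_degradation_trigger = {
    "name": "battery_degradation_early_warning",
    "conditions": [  # all conditions must hold (AND)
        {"signal": "charge_cycles",   "operator": ">", "value": 800},
        {"signal": "battery_voltage", "operator": "<", "value": 12.0},
    ],
    "priority": "medium",  # needs scheduling, not an emergency
    "recording_window": {"before_s": 0, "after_s": 10},  # pattern marker, minimal data
    "labels": {"category": "battery_health", "action_required": "schedule_maintenance"},
}
```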
Maintenance Workflow:
  1. Week 0: Trigger matches degradation pattern → Event generated with medium priority
  2. Week 1: Operations reviews event → Schedules maintenance during next planned downtime
  3. Week 4: Battery replaced during scheduled window, system immediately returns to service
  4. Result: Zero unplanned downtime, optimized battery lifecycle, predictable maintenance costs

Performance Anomaly Detection

Configure triggers to identify systems performing differently from normal baselines; a sketch of the consecutive-match rule follows the example configuration.
Outlier Detection Approach:
  1. Establish baseline performance (e.g., average task completion time is 4.2 minutes)
  2. Configure triggers that fire when individual systems exceed acceptable variance (e.g., consistently taking 5.0+ minutes per task, about 20% slower)
  3. Set threshold to trigger only after pattern confirms (e.g., 3 consecutive slow tasks, not just one)
Example Configuration:
  • Condition: task_completion_time > (fleet_average * 1.2)
  • Requires: 3 consecutive matches to avoid false positives from occasional complex tasks
  • Priority: Medium (requires investigation, not immediate action)
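A minimal sketch of the consecutive-match rule, assuming per-task completion times arrive as a stream and the fleet average is computed elsewhere; the class name and the 1.2 factor mirror the example above and are illustrative.

```python
from collections import deque

class OutlierDetector:
    """Fire only after N consecutive tasks exceed the fleet average by a margin."""

    def __init__(self, fleet_average_s: float, factor: float = 1.2, consecutive: int = 3):
        self.threshold_s = fleet_average_s * factor
        self.recent = deque(maxlen=consecutive)

    def observe(self, task_duration_s: float) -> bool:
        """Record one task duration; return True when the slow pattern is confirmed."""
        self.recent.append(task_duration_s > self.threshold_s)
        return len(self.recent) == self.recent.maxlen and all(self.recent)

detector = OutlierDetector(fleet_average_s=4.2 * 60)   # 4.2-minute fleet baseline
for duration_s in [310, 320, 250, 315, 330, 340]:      # seconds per task
    if detector.observe(duration_s):
        print(f"Outlier pattern confirmed at {duration_s}s")  # fires only on the last task
```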
Investigation Workflow: When performance outlier events occur, review associated recordings to understand what’s different. Common root causes include:
  • Mechanical wear (wheels, motors, sensors)
  • Software configuration drift
  • Environmental factors (floor condition, lighting)
  • Workload imbalance (assigned more difficult routes)
This targeted investigation approach focuses attention on systems that actually need it, rather than requiring constant manual monitoring.

Component Lifecycle Tracking

Track usage patterns to schedule proactive replacements before components reach end-of-life.
Examples:
  • Motor Bearings: Track operating hours and vibration levels
  • Sensors: Monitor calibration drift over time
  • Belts/Chains: Track tension variations and operating hours
Configuration Pattern: Combine usage metrics with performance indicators to predict component failure:
  • operating_hours > 5000 AND vibration_level > baseline * 1.5
  • calibration_error > 5% AND days_since_calibration > 180
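A short sketch of the pattern, assuming per-component usage and performance readings are gathered into one record; the field names and thresholds mirror the bullets above and are illustrative.

```python
def needs_proactive_replacement(component: dict) -> bool:
    """Combine a usage metric with a performance indicator before flagging a component."""
    worn_bearing = (component.get("operating_hours", 0) > 5000
                    and component.get("vibration_level", 0)
                        > component.get("vibration_baseline", 0) * 1.5)
    drifting_sensor = (component.get("calibration_error_pct", 0) > 5
                       and component.get("days_since_calibration", 0) > 180)
    return worn_bearing or drifting_sensor

motor = {"operating_hours": 5400, "vibration_level": 0.9, "vibration_baseline": 0.5}
print(needs_proactive_replacement(motor))  # True: high hours AND elevated vibration
```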

Event Management

Establish Review Routines

Create a structured process for reviewing and acting on events to ensure they drive action rather than accumulating as noise.
Daily Reviews:
  • Review all critical and high-priority events
  • Respond to immediate issues
  • Verify resolution of previous day’s critical events
Weekly Reviews:
  • Analyze event patterns across systems
  • Identify recurring problems
  • Adjust trigger thresholds if needed
  • Review false positive/negative rates
Monthly Reviews:
  • Review predictive maintenance queue
  • Schedule upcoming maintenance interventions
  • Assess overall fleet health trends
  • Update trigger configurations based on seasonal changes

Configure Appropriate Notification Channels

Match notification methods to event priority; a small routing sketch follows the lists below.
Critical Events (require immediate action):
  • Slack with @mentions
  • Email to multiple recipients
  • In-app notifications
High Priority Events (require attention within hours):
  • Slack notifications
  • Email to operations team
  • In-app alerts
Medium/Low Priority Events (review during regular monitoring):
  • Email notifications
  • In-app only
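A hedged sketch of priority-based routing, assuming notifications are dispatched by your own glue code rather than a built-in feature; the channel names are placeholders.

```python
# Map event priority to the channels that should receive it.
NOTIFICATION_ROUTES = {
    "critical": ["slack_with_mentions", "email_oncall_group", "in_app"],
    "high":     ["slack", "email_operations", "in_app"],
    "medium":   ["email_operations", "in_app"],
    "low":      ["in_app"],
}

def route_event(priority: str) -> list[str]:
    """Return the notification channels for an event priority, defaulting to in-app only."""
    return NOTIFICATION_ROUTES.get(priority, ["in_app"])

print(route_event("critical"))  # ['slack_with_mentions', 'email_oncall_group', 'in_app']
```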

Use Recording Windows Strategically

Configure recording windows to capture relevant context without excessive data storage; a preset sketch follows these examples.
Short Events (battery low, sensor spike):
  • Record 60 seconds before, 30 seconds after
  • Captures immediate context without excessive data
Long Events (overheating, stuck detection):
  • Record 300 seconds (5 minutes) before, 600 seconds (10 minutes) after
  • Captures progression and aftermath for root cause analysis
Predictive Maintenance (early warnings):
  • Record 0 seconds before, 10 seconds after
  • Minimal recording needed since event marks a pattern, not an incident
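These presets are easy to keep consistent if they live in one place. A small sketch, assuming the window is expressed as seconds before and after the trigger; the category names and structure are illustrative.

```python
# Seconds of data to capture around each event, by event category.
RECORDING_WINDOWS = {
    "short_event":            {"before_s": 60,  "after_s": 30},   # battery low, sensor spike
    "long_event":             {"before_s": 300, "after_s": 600},  # overheating, stuck detection
    "predictive_maintenance": {"before_s": 0,   "after_s": 10},   # pattern marker only
}

def recording_window(category: str) -> dict:
    """Look up the capture window for an event category, defaulting to the short-event preset."""
    return RECORDING_WINDOWS.get(category, RECORDING_WINDOWS["short_event"])

print(recording_window("long_event"))  # {'before_s': 300, 'after_s': 600}
```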

Continuous Improvement

Monitor Trigger Effectiveness

Track metrics to assess how well your triggers are working; a sketch for computing them per trigger follows the lists below.
Key Metrics:
  • Event Frequency: How often does each trigger fire?
  • Action Rate: What percentage of events require intervention?
  • False Positive Rate: Events that didn’t require action
  • False Negative Rate: Issues discovered outside trigger system
  • Response Time: Time from event generation to resolution
Optimization Signals:
  • High event frequency + low action rate = Threshold too sensitive
  • Low event frequency + high false negatives = Threshold too conservative
  • Inconsistent response times = Priority levels may need adjustment
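A sketch of computing these metrics per trigger from an exported event log, assuming each event records whether it required action and how long resolution took; the field names are hypothetical.

```python
from statistics import mean

def trigger_effectiveness(events: list[dict]) -> dict:
    """Compute frequency, action rate, and mean response time for one trigger's events."""
    if not events:
        return {"event_count": 0}
    actionable = [e for e in events if e.get("required_action")]
    response_times = [e["resolution_minutes"] for e in events if "resolution_minutes" in e]
    return {
        "event_count": len(events),
        "action_rate": len(actionable) / len(events),
        "false_positive_rate": 1 - len(actionable) / len(events),
        "mean_response_minutes": mean(response_times) if response_times else None,
    }
```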

Document Configuration Decisions

Maintain documentation explaining your trigger configurations.
What to Document:
  • Why specific thresholds were chosen
  • Environmental or operational context
  • Historical adjustments and their rationale
  • Known edge cases or limitations
This documentation helps:
  • New team members understand existing configuration
  • Troubleshoot unexpected behavior
  • Make informed adjustments over time
  • Replicate successful patterns across systems

Learn from Incidents

When issues occur, review whether your monitoring could have caught them earlier.
Post-Incident Review Questions:
  • Did an existing trigger fire? If so, was it acted upon promptly?
  • If no trigger fired, could one have been configured to detect this?
  • Were there warning signs in the data that weren’t monitored?
  • Would different thresholds or conditions have provided earlier warning?
Use these insights to continuously refine your monitoring strategy.

Common Pitfalls to Avoid

Alert Fatigue

Problem: Too many low-value alerts cause teams to ignore notifications.
Solutions:
  • Start with fewer, high-confidence triggers
  • Tune thresholds to reduce false positives
  • Use appropriate priority levels
  • Consolidate similar alerts
  • Regularly review and disable ineffective triggers

Over-Monitoring

Problem: Creating triggers for every possible metric generates noise without value.
Solutions:
  • Focus on conditions that require action
  • Ask “If this triggers, what would we do?” before creating a trigger
  • Monitor outcomes that matter, not just metrics
  • Consolidate related conditions into single triggers

Under-Monitoring

Problem: Critical conditions go undetected because no trigger was configured to catch them.
Solutions:
  • Review incidents to identify missed monitoring opportunities
  • Implement comprehensive coverage for safety-critical systems
  • Use gradual rollout to test new trigger ideas
  • Balance with alert fatigue concerns

Static Configuration

Problem: Triggers configured once and never adjusted despite changing conditions.
Solutions:
  • Schedule regular trigger reviews
  • Adjust for seasonal variations
  • Update as systems age or workloads change
  • Respond to operational feedback
