Observability is the ability to understand and analyze what is happening inside a system based on the information available externally.

  • Enables system operators to gain insights into how a system is functioning
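  • Observability is not the same as monitoring: monitoring watches for known, predefined conditions, while observability lets operators investigate behavior they did not anticipate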
  • Uses data from logs, metrics, traces, and events
  • Logs – include application logs and system logs
  • Logs – provide insight into system behavior and help identify errors and issues
  • Metrics – track specific aspects of the system’s performance or behavior over time
  • Metrics – CPU, memory, response time, error rates, health status, etc.
  • Traces – capture the end-to-end flow of requests through a distributed system
  • Traces – use timestamps recorded at different points along a request's path
  • Traces – help identify bottlenecks and performance issues
  • Observability should be part of every architectural design
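
To make the three signal types above concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not a recommended implementation: the names `handle_request`, `span`, `metrics`, `spans`, and `slowest_span` are hypothetical, the "service" is a single function, and the in-memory dictionary and list stand in for whatever metrics and tracing backend a real system would use.

```python
import logging
import time
import uuid
from collections import defaultdict
from contextlib import contextmanager

# Logs: timestamped records of discrete events.
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("demo")

# Metrics: numeric values tracked over time (hypothetical in-memory store).
metrics = defaultdict(float)

# Traces: finished spans, each with start and end timestamps (hypothetical store).
spans = []


@contextmanager
def span(trace_id, name):
    """Record start/end timestamps for one step of a request."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append({"trace_id": trace_id, "name": name,
                      "start": start, "end": time.perf_counter()})


def handle_request(payload):
    """Hypothetical request handler that emits all three signals."""
    trace_id = uuid.uuid4().hex          # ties logs and spans to one request
    metrics["requests_total"] += 1
    try:
        with span(trace_id, "handle_request"):
            with span(trace_id, "validate"):
                if "user" not in payload:
                    raise ValueError("missing user")
            with span(trace_id, "process"):
                time.sleep(0.01)         # stand-in for real work
        log.info("request ok trace_id=%s", trace_id)
    except ValueError as exc:
        metrics["errors_total"] += 1
        log.error("request failed trace_id=%s error=%s", trace_id, exc)
    return trace_id


def slowest_span(trace_id):
    """Return the name of the slowest child step in a trace (the bottleneck)."""
    children = [s for s in spans
                if s["trace_id"] == trace_id and s["name"] != "handle_request"]
    return max(children, key=lambda s: s["end"] - s["start"])["name"]
```

Calling `handle_request({"user": "a"})` and then `handle_request({})` produces one info log and one error log, raises the request and error counters, and leaves spans that `slowest_span` can compare by duration. Putting the same `trace_id` in both the log line and the spans is what lets an operator move from a failing request in the logs to its timing breakdown.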

