
Release Management

Root causes that are triggered by a code change or a new release.

Root Causes


CPU Congested Caused By Code Changes

After a version upgrade, application containers experience high CPU usage, leading to performance degradation or unresponsiveness. This issue impacts the system's ability to handle requests effectively, potentially causing downtime or delays for end users.
High CPU usage post-upgrade typically stems from changes in the application code, dependencies, or configurations. Common causes include:

  • Inefficient code introduced in the new version (for example, infinite loops, unoptimized algorithms).
  • Increased resource demands from new features or changes in workload patterns.
  • Memory leaks causing excessive garbage collection or other inefficiencies in runtime environments like Java or Node.js.
  • Container CPU limits that are set too low for the new version, so the container is throttled.
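
To tell throttling caused by a low CPU limit apart from genuinely inefficient code, it can help to check how often the container was throttled. The following is a minimal sketch, assuming the container runs under Linux cgroup v2, where the `cpu.stat` file exposes `nr_periods` and `nr_throttled` counters. The function name `throttled_ratio` and the sample values are hypothetical.

```python
def throttled_ratio(cpu_stat_text: str) -> float:
    """Return the fraction of scheduler periods in which the cgroup was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        # No CPU limit configured, or no periods elapsed yet.
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Hypothetical contents of a cgroup v2 cpu.stat file (illustrative values only).
SAMPLE_CPU_STAT = """usage_usec 9000000
user_usec 7000000
system_usec 2000000
nr_periods 400
nr_throttled 100
throttled_usec 2500000
"""

ratio = throttled_ratio(SAMPLE_CPU_STAT)
```

A ratio that rises sharply after the upgrade while the CPU limit is unchanged suggests the new version needs more CPU than the limit allows. A low ratio alongside high usage points instead at the code itself.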

Memory Failure Caused By Code Changes

Memory failures after a code change can cause containers to crash or degrade performance, resulting in errors for end users or failed service requests. These issues occur when newly introduced code leads to unexpected increases in memory usage, triggering out-of-memory (OOM) errors and destabilizing the system.
The root cause is often linked to recent code modifications that introduce memory leaks, inefficient algorithms, or increased resource demands. These changes may cause the application running inside the container to consume more memory than its allocated limit. When the container exceeds this limit, the system triggers an OOM event, terminating the process and causing service disruptions. This is particularly likely if memory usage grows gradually, such as with leaks, or spikes during certain operations introduced by the new code.
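
The distinction between gradual growth and sudden spikes can be illustrated with a rough trend estimate. This is a sketch under stated assumptions: memory samples are taken at a fixed interval, growth is approximately linear, and the function name `samples_until_limit` is hypothetical. It fits a least-squares line through the samples and extrapolates to the container's memory limit.

```python
def samples_until_limit(samples_mib, limit_mib):
    """Estimate how many more sampling intervals remain before usage reaches the limit.

    Returns None when there are too few samples or usage is not growing.
    """
    n = len(samples_mib)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples_mib) / n
    # Ordinary least-squares slope, using the sample index as the x value.
    covariance = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples_mib))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    slope = covariance / variance
    if slope <= 0:
        return None
    return max(0.0, (limit_mib - samples_mib[-1]) / slope)


# Illustrative values: usage grows by 10 MiB per sample toward a 200 MiB limit.
remaining = samples_until_limit([100, 110, 120, 130], limit_mib=200)
```

A steady positive slope that first appears after the release is consistent with a leak. A spike tied to a specific operation would not fit a line well and needs to be correlated with request or job activity instead.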


Frequent Crash Failure Caused By Code Changes

One or more containers of a workload crash frequently with a non-zero exit code after a version upgrade. This disrupts the application's functionality, leading to downtime or degraded performance depending on the workload design. The issue likely stems from changes introduced in the new version that either exacerbate existing problems or introduce new incompatibilities.
The non-zero exit code indicates abnormal termination, and the timing of the crashes after the upgrade points to factors such as:

  • New bugs introduced in the updated code, including unhandled exceptions, invalid logic, or runtime errors.
  • Incompatible configurations that no longer match the updated application’s requirements (for example, new required environment variables).
  • Changes in dependencies, such as a library update causing compatibility issues or stricter API validation.
  • External dependencies (for example, databases or third-party APIs) whose behavior no longer aligns with the updated application.
  • Resource constraints aggravated by the updated version's increased resource demands or changes in workload patterns.
  • Health check behavior changes leading to premature or unnecessary container restarts.
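
The exit code itself often narrows down which of these factors applies. The sketch below relies on the common convention that an exit code above 128 means the process was terminated by signal number `code - 128`. The helper name `describe_exit_code` and the small signal table are illustrative, not part of any product API.

```python
# Common Linux signal numbers, listed here only for illustration.
SIGNAL_NAMES = {
    6: "SIGABRT",
    9: "SIGKILL",
    11: "SIGSEGV",
    15: "SIGTERM",
}


def describe_exit_code(code: int) -> str:
    """Give a rough interpretation of a container exit code."""
    if code == 0:
        return "clean exit"
    if code > 128:
        signum = code - 128
        name = SIGNAL_NAMES.get(signum, "unknown signal")
        return f"terminated by signal {signum} ({name})"
    return f"application exited with error code {code}"


for code in (1, 137, 139, 143):
    print(code, "->", describe_exit_code(code))
```

Exit code 137 (SIGKILL) is frequently seen when the kernel OOM killer or a failed health check terminates the container, while low codes such as 1 usually come from the application itself, for example an unhandled exception or a missing environment variable. These are heuristics: an application can also choose to exit with any code directly.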

Frequent Memory Failure Caused By Code Changes

The application is running out of memory after a version upgrade, leading to crashes, degraded performance, or instability. This impacts availability and user experience, often requiring container restarts or manual intervention to restore functionality. The issue is likely tied to changes in the updated version that increase memory usage or introduce inefficiencies.
Post-upgrade, out-of-memory (OOM) errors are typically caused by:

  • New memory leaks in the updated code, where objects are not properly released.
  • Increased memory consumption from new features, changes in algorithms, or handling of larger data sets.
  • Updated dependencies introducing higher memory overhead.
  • Improper memory configuration, such as memory limits that are too restrictive for the updated workload.
  • Workload changes, like higher traffic or larger input data, which were not anticipated during the upgrade.
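
To locate a leak introduced by new code, one approach is to compare allocation snapshots taken before and after exercising the suspect path. The sketch below assumes a Python application and uses the standard-library `tracemalloc` module. Other runtimes have their own heap profilers, and the `leaked` list only simulates objects that are never released.

```python
import tracemalloc


def top_growth(before, after, limit=3):
    """Return the source lines whose allocated size grew the most between snapshots."""
    stats = after.compare_to(before, "lineno")
    return [stat for stat in stats if stat.size_diff > 0][:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulated leak: 1000 buffers of 1 KiB that stay referenced.
leaked = [bytes(1024) for _ in range(1000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

growth = top_growth(before, after)
for stat in growth:
    print(stat)
```

Running the same comparison against the previous release under the same workload helps separate a real leak from an expected increase due to new features or larger data sets.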

DB Connections Congested Caused By Code Changes

After a version upgrade, the client-side database connection pool becomes exhausted: all available connections are in use, so new database queries cannot be executed. This can cause application requests to hang or fail, impacting user experience and potentially leading to downtime for database-dependent features.
Exhaustion occurs when the application needs more concurrent connections than the pool's configured maximum. Common contributing factors include:

  • Long-running queries that hold connections for extended periods.
  • Connection leaks where connections are not properly closed or returned to the pool after use.
  • High traffic or spikes in concurrent requests exceeding the pool capacity.
  • Improper pool size configuration for the workload or database limits.
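
Connection leaks in particular are often introduced by a code path that skips the release step when an error occurs. The following sketch uses a toy pool (the class `BoundedPool` and its string "connections" are hypothetical stand-ins for a real driver's pool) to show how a context manager guarantees that a connection returns to the pool even when the query raises.

```python
import queue
from contextlib import contextmanager


class BoundedPool:
    """Toy fixed-size pool; strings stand in for real database connections."""

    def __init__(self, size: int):
        self._idle = queue.Queue(maxsize=size)
        for i in range(size):
            self._idle.put(f"conn-{i}")

    @contextmanager
    def connection(self, timeout: float = 0.1):
        # Raises queue.Empty if no connection becomes idle within the timeout,
        # which is how pool exhaustion surfaces in this sketch.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the caller raised.
            self._idle.put(conn)

    def idle_count(self) -> int:
        return self._idle.qsize()


pool = BoundedPool(size=2)
try:
    with pool.connection() as conn:
        raise RuntimeError("simulated query failure")
except RuntimeError:
    pass
```

After the simulated failure, both connections are idle again. Code that acquires a connection without an equivalent `finally` step would leave the pool one connection short on every error.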

Slow Database Query Caused By Code Changes

After a version upgrade, the application experiences slow database queries, which can make it a slow consumer for downstream processing and can lead to resource starvation. The condition affects the performance of individual instances, particularly when query execution times become excessively long.
Slow database queries mean that interactions with the database take longer than expected, which degrades overall system responsiveness. The effect is evaluated per instance: the longer queries run, the more likely further performance degradation becomes. Once query execution times exceed acceptable thresholds, resource starvation is nearly certain to follow. Even below those thresholds, delayed database interactions can contribute to slow consumer behavior, compounding the impact on system performance.
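
One way to confirm that a release introduced slow queries is to time each query against a threshold and compare the results across versions. The sketch below uses the standard-library `sqlite3` module purely so that it is self-contained. The helper name `timed_query` and the 0.5 second threshold are assumptions, not values taken from this page.

```python
import sqlite3
import time


def timed_query(conn, sql, params=(), threshold_s=0.5):
    """Run a query and report its duration and whether it exceeded the threshold."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed, elapsed > threshold_s


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)", [(10.0,), (25.5,)])

rows, elapsed, is_slow = timed_query(conn, "SELECT id, total FROM orders ORDER BY id")
print(f"{len(rows)} rows in {elapsed:.6f}s, slow={is_slow}")
```

Queries flagged as slow only in the new version are candidates for a missing index, a changed query shape, or an N+1 access pattern introduced by the code change.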


Inefficient Garbage Collection Caused By Code Changes

After a version upgrade, the garbage collector runs frequently, leading to performance degradation or crashes. This issue is likely caused by changes in the application code or dependencies that increase memory usage or introduce inefficiencies, such as a higher allocation rate.
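
Collector activity can be measured directly to compare releases. The sketch below assumes a CPython application and uses the standard `gc.callbacks` hook, which is invoked with a phase of "start" or "stop" around each collection. JVM and Node.js runtimes expose comparable data through their own GC logs and metrics. The class name `GcTimer` is hypothetical.

```python
import gc
import time


class GcTimer:
    """Count garbage collections and accumulate the time spent in them."""

    def __init__(self):
        self.collections = 0
        self.total_pause_s = 0.0
        self._started_at = None

    def __call__(self, phase, info):
        if phase == "start":
            self._started_at = time.perf_counter()
        elif phase == "stop" and self._started_at is not None:
            self.total_pause_s += time.perf_counter() - self._started_at
            self.collections += 1
            self._started_at = None


timer = GcTimer()
gc.callbacks.append(timer)
try:
    gc.collect()  # Force one collection so the sketch has something to measure.
finally:
    gc.callbacks.remove(timer)

print(f"collections={timer.collections} total_pause={timer.total_pause_s:.6f}s")
```

A marked increase in collections or total pause time per unit of work after the upgrade supports the diagnosis that the new code allocates more aggressively.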


Inefficient Locking Caused By Code Changes

After a version upgrade, the application experiences frequent lock contention, leading to performance degradation or crashes. This issue is likely caused by changes in the application code or dependencies that hold locks longer, acquire them more often, or otherwise introduce inefficiencies.
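
Contention shows up as time spent waiting to acquire a lock rather than time spent doing work. As a minimal sketch, assuming a Python application using `threading`, a lock can be wrapped to record how long callers wait for it. The class name `TimedLock` and the worker function are hypothetical.

```python
import threading
import time


class TimedLock:
    """Lock wrapper that records total wait time across all acquisitions."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total_wait_s = 0.0
        self.acquisitions = 0

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        # The counters are updated while holding the lock, so no extra
        # synchronization is needed for them.
        self.total_wait_s += time.perf_counter() - start
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


lock = TimedLock()


def worker():
    with lock:
        time.sleep(0.01)  # Simulated work inside the critical section.


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"acquisitions={lock.acquisitions} total_wait={lock.total_wait_s:.4f}s")
```

If the average wait per acquisition grows after a release, the new code likely enlarged a critical section or added a hot path that shares the lock.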


Java Heap Congested Caused By Code Changes

After a version upgrade, the Java heap is frequently congested, leading to performance degradation or crashes. This issue is likely caused by changes in the application code or dependencies that increase memory usage or introduce inefficiencies.


Redis Connections Congested Caused By Code Changes

After a version upgrade, the Redis connection pool is frequently congested, leading to performance degradation or crashes. This issue is likely caused by changes in the application code or dependencies that increase Redis usage or introduce inefficiencies.