Production Hot Fix Strategies

When fixing a production issue, there are several strategies to consider. The choice of strategy depends on the specific circumstances, including the urgency of the issue, the complexity of the fix, and the potential risks involved.

I suggest considering the strategies in the following order of priority:

  1. Roll Forward
  2. Roll Back
  3. Revert
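
The priority order above can be sketched as a small selection helper. This is only an illustration, not a prescribed tool: the `Strategy` enum and the `choose_strategy` function are hypothetical names, and judging which strategies are viable for a given incident is left to the criteria described for each strategy.

```python
from enum import Enum
from typing import Iterable, Optional


class Strategy(Enum):
    ROLL_FORWARD = "roll forward"
    ROLL_BACK = "roll back"
    REVERT = "revert"


# Suggested order of preference, highest priority first.
PRIORITY = (Strategy.ROLL_FORWARD, Strategy.ROLL_BACK, Strategy.REVERT)


def choose_strategy(viable: Iterable[Strategy]) -> Optional[Strategy]:
    """Return the highest-priority strategy among those judged viable."""
    viable_set = set(viable)
    for strategy in PRIORITY:
        if strategy in viable_set:
            return strategy
    return None  # nothing viable: look for another mitigation
```

If rolling forward is ruled out but the other two remain viable, `choose_strategy([Strategy.REVERT, Strategy.ROLL_BACK])` returns `Strategy.ROLL_BACK`.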

Roll Forward

Roll forward is a strategy where, instead of returning to a previous known-good state when a production issue occurs, you fix the problem and deploy a new version that includes the fix. This approach aligns well with Continuous Integration and Continuous Delivery practices because it:

  • Maintains the forward momentum of development
  • Keeps the codebase moving in a single direction
  • Avoids complex merge conflicts that can occur with reverts
  • Reinforces the team's ability to deploy quickly and confidently

When to use Roll Forward

Roll forward is most effective when:

  • You have a robust CI/CD pipeline that allows quick deployments
  • The fix is well understood and can be implemented quickly
  • The current production issue is not causing critical data loss or security vulnerabilities
  • You have good monitoring and can verify the fix works
  • Your team has strong testing practices to validate the fix

When to avoid Roll Forward

Roll forward may not be the best choice when:

  • The production issue is causing active data corruption or security breaches
  • The root cause is not well understood
  • The fix requires extensive changes or testing
  • You don't have confidence in your deployment process
  • Time to resolution is critical and a known good state exists

In these cases, rolling back or reverting might be more appropriate strategies to quickly restore service stability.
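
The "avoid" list above can be turned into a triage checklist. The following is a minimal sketch under stated assumptions: the `Incident` fields are hypothetical flags that a team would fill in by hand during an incident, and the function simply mirrors the bullets rather than measuring anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Incident:
    # Hypothetical triage flags, answered by the on-call engineer.
    active_data_corruption_or_breach: bool
    root_cause_understood: bool
    fix_is_small: bool
    deployment_trusted: bool
    known_good_version_available: bool
    time_critical: bool


def roll_forward_blockers(incident: Incident) -> List[str]:
    """List the reasons from the 'avoid' list that apply to this incident."""
    blockers = []
    if incident.active_data_corruption_or_breach:
        blockers.append("active data corruption or security breach")
    if not incident.root_cause_understood:
        blockers.append("root cause not understood")
    if not incident.fix_is_small:
        blockers.append("fix requires extensive changes or testing")
    if not incident.deployment_trusted:
        blockers.append("no confidence in the deployment process")
    if incident.time_critical and incident.known_good_version_available:
        blockers.append("time-critical and a known-good state exists")
    return blockers


def can_roll_forward(incident: Incident) -> bool:
    """Roll forward only when none of the blockers apply."""
    return not roll_forward_blockers(incident)
```

Returning the list of blockers, rather than a bare yes or no, gives the team a record of why rolling forward was rejected.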

Roll Back

Roll back is a strategy where you return to a previous known-good version of your application when a production issue occurs, by redeploying an earlier version that was working correctly. Rolling back is:

  • Quick to execute, since the version is already built and tested
  • Low risk, as you're returning to a known stable state
  • An immediate relief from the current production issue
  • A good temporary solution while a proper fix is developed

When to use Roll Back

Roll back is most effective when:

  • There's an urgent need to restore service stability
  • The previous version is known to work correctly
  • The issue is causing significant user impact
  • You need time to properly investigate and fix the root cause
  • You have a reliable way to roll back without data loss
  • Your deployment process supports quick version switching

When to avoid Roll Back

Roll back may not be appropriate when:

  • Database schema changes prevent backward compatibility
  • The previous version has known critical issues
  • Important user data would be lost in the process
  • The issue exists in multiple previous versions
  • New features that users depend on would be lost
  • Integration with third-party services requires the current version

In these scenarios, rolling forward with a fix or finding alternative mitigation strategies might be more suitable.
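
Two of the conditions above, a known-good previous version and database schema compatibility, lend themselves to an automated check before rolling back. The sketch below assumes each release records the range of schema versions it can run against; the `Release` fields and `pick_rollback_target` are hypothetical names, and real deployments may track compatibility differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Release:
    version: str      # release label (hypothetical)
    min_schema: int   # oldest database schema this build can run against
    max_schema: int   # newest database schema this build can run against
    known_good: bool  # ran in production without critical issues


def pick_rollback_target(
    history: Sequence[Release], current_schema: int
) -> Optional[Release]:
    """Pick the most recent earlier release that is known good and
    compatible with the schema currently in the database.

    `history` is ordered oldest to newest; the last entry is the
    release that is currently failing.
    """
    for release in reversed(history[:-1]):
        compatible = release.min_schema <= current_schema <= release.max_schema
        if release.known_good and compatible:
            return release
    return None  # no safe target: consider rolling forward instead
```

A `None` result corresponds to the "avoid" cases: every earlier release is either known to be broken or cannot run against the current schema.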

Revert

Revert is a strategy where you undo specific changes by creating a new commit that reverses the problematic code changes. This approach involves using version control (like Git) to negate the effects of previous commits. Reverting is:

  • Precise since it targets specific changes
  • Traceable as it maintains a clear history
  • Safe as it adds a new commit instead of rewriting the commit history
  • Collaborative as it works well with team workflows
  • Reversible if the revert itself needs to be undone

When to use Revert

Revert is most effective when:

  • You can identify the exact changes causing the issue
  • Multiple releases have happened since the problematic change
  • You need to maintain a clear audit trail
  • The changes are isolated and don't affect other features
  • You want to preserve the ability to re-apply the changes later
  • Rolling back would undo too many good changes

When to avoid Revert

Revert may not be appropriate when:

  • The issue spans multiple interconnected changes
  • The original changes have complex dependencies
  • Reverting would create merge conflicts
  • The problem requires new code rather than removing code
  • Time is extremely critical (rolling back might be faster)
  • The changes involve complex database migrations

In these cases, creating a new fix or using other mitigation strategies might be more appropriate.
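
With Git, a revert is typically performed with `git revert`, which creates a new commit that undoes an earlier one. The sketch below wraps that command; the function names, the `repo_dir` parameter, and the example commit hash are assumptions for illustration, and a real workflow would usually push the revert through the normal review and deployment pipeline.

```python
import subprocess
from typing import List, Optional


def build_revert_command(commit: str, mainline: Optional[int] = None) -> List[str]:
    """Build a `git revert` invocation for one commit.

    `--no-edit` keeps Git's default revert message. `-m N` is required when
    reverting a merge commit; it names the parent to treat as the mainline.
    """
    command = ["git", "revert", "--no-edit"]
    if mainline is not None:
        command += ["-m", str(mainline)]
    command.append(commit)
    return command


def revert_commit(
    commit: str, repo_dir: str = ".", mainline: Optional[int] = None
) -> None:
    """Create a new commit that undoes `commit`.

    Raises subprocess.CalledProcessError if Git reports an error,
    for example when the revert produces conflicts.
    """
    subprocess.run(build_revert_command(commit, mainline), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # "abc1234" is a placeholder hash, not a real commit.
    print(" ".join(build_revert_command("abc1234")))
```

Reverting the revert commit later re-applies the original change, which is what keeps this strategy reversible.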