Avoiding Duplicate Operations: How We Solved a Race Condition in Our Payment System

How we used optimistic locking to prevent duplicate operations and maintain data accuracy under high concurrency.

Protize Team
#engineering #concurrency #optimistic-locking #distributed-systems #backend

When two processes (the webhook handler and the status-check job) handled the same failed task simultaneously, users sometimes had their account credits restored twice. This post explains how our job processing system ran into this concurrency issue and how we fixed it using optimistic locking.


💥 The Problem

At first glance, our task processing flow looked simple:

  1. A user submits a task request.
  2. We forward the request to our external processing service.
  3. The service sends a webhook with the final status (success or failed).
  4. Separately, we run a scheduled status-check job to catch missed webhooks.
  5. If a task fails → we restore the deducted credits to the user’s account.

But occasionally, both the webhook and the job would detect a failure at almost the same moment, and both triggered a credit restoration. The result? The user’s account was credited twice 💸💸.


This race condition wasn’t frequent, but in systems that track account balances, even one duplicate restoration is unacceptable.
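The double restoration described above can be reproduced in miniature. The following is an illustrative sketch, not our production code: it assumes an in-memory SQLite database as a stand-in for the real one, and the starting balance of 900 and the helper name `restore_credits_unguarded` are hypothetical.

```python
import sqlite3

# Hypothetical in-memory stand-in for the accounts table (no version column yet).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
# Assumed starting point: 100 credits were already deducted for the task.
conn.execute("INSERT INTO accounts (id, balance) VALUES (123, 900)")
conn.commit()


def restore_credits_unguarded(conn, account_id, amount):
    """Add credits back with no check that another handler already did so."""
    conn.execute(
        "UPDATE accounts SET balance = balance + ? WHERE id = ?",
        (amount, account_id),
    )
    conn.commit()


# Both handlers observe the same failed task, and both restore the credits.
restore_credits_unguarded(conn, 123, 100)  # webhook handler
restore_credits_unguarded(conn, 123, 100)  # status-check job

balance_after = conn.execute(
    "SELECT balance FROM accounts WHERE id = 123"
).fetchone()[0]
print(balance_after)  # 1100, although only 100 credits were owed
```

Nothing in the unguarded update distinguishes the first restoration from the second, which is why the fix has to live in the update itself.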


⚙️ The Solution: Optimistic Locking

Instead of introducing complex distributed locks, we implemented a lightweight optimistic locking mechanism at the account record level.

How It Works

Optimistic locking assumes that conflicts are rare but possible. It works like this:

  • Each account row has a version number (an integer column).
  • When a process wants to update the account balance, it includes the version number it last read as a condition of the update.
  • The update query succeeds only if the version hasn’t changed.
  • If another process already modified the account, the current update fails gracefully. The caller then logs the conflict, or re-checks whether the restoration is still owed before retrying, instead of applying a duplicate restoration.

Example schema:

ALTER TABLE accounts ADD COLUMN version INT DEFAULT 0 NOT NULL;

Example update query:

-- 5 is the version this process read earlier; the row changes only if it is still current.
UPDATE accounts
SET balance = balance + 100, version = version + 1
WHERE id = 123 AND version = 5;

If no row is affected (because the version has changed), the application knows another process updated the account first, so it can skip the restoration instead of applying it a second time.
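On the application side, the affected-row check might look like the sketch below. It is a minimal illustration that assumes Python's built-in `sqlite3` module in place of the production database; the helpers `read_version` and `restore_credits` and the seed values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL, "
    "version INTEGER NOT NULL DEFAULT 0)"
)
# Assumed seed row: balance after the deduction, currently at version 5.
conn.execute("INSERT INTO accounts (id, balance, version) VALUES (123, 900, 5)")
conn.commit()


def read_version(conn, account_id):
    """Return the account's current version number."""
    row = conn.execute(
        "SELECT version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


def restore_credits(conn, account_id, amount, expected_version):
    """Apply the restoration only if the row still has the version we read.

    Returns True when the update was applied, False on a version conflict.
    """
    cur = conn.execute(
        "UPDATE accounts SET balance = balance + ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (amount, account_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1


# Both handlers read the version before either one writes.
webhook_version = read_version(conn, 123)
job_version = read_version(conn, 123)

webhook_applied = restore_credits(conn, 123, 100, webhook_version)
job_applied = restore_credits(conn, 123, 100, job_version)

final_balance, final_version = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 123"
).fetchone()
print(webhook_applied, job_applied, final_balance, final_version)
```

The guard depends on the losing handler treating a conflict as a signal to stop or to re-check the task. A blind retry that simply re-reads the version would pass the check and credit the account again.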


🔑 Benefits

Implementing optimistic locking brought immediate benefits:

  • 🧩 Prevents duplicate operations and incorrect balances, even under high concurrency.
  • ⚡ No heavy database locks, so it is safe for distributed systems and background jobs.
  • Easy to implement for any entity where state accuracy is critical.
  • 🤝 Works well with redundant paths, such as a webhook plus scheduled-job architecture.

🧪 Testing the System

Before deploying, we stress-tested the new logic to ensure it handled real-world concurrency.

Test Scenarios:

  • Simulate concurrent task failures from both webhook and job processes using load-testing tools like JMeter or Artillery.
  • Only one restoration update should succeed; others should detect a version conflict.
  • Check account balances after each test; they must remain consistent.

Expected Outcome:

  • ✅ One restoration succeeds.
  • ⚠️ The second detects a version conflict and exits cleanly.
  • 💰 Account balance remains correct.

🧭 Lessons Learned

  1. Race conditions don’t always appear in development, but they are always possible in distributed systems.
  2. Optimistic locking provides a clean, scalable safeguard without holding locks that slow down normal operations.
  3. Monitoring and observability are just as important: logs must clearly show conflicts and retries.
  4. Small design improvements like this can prevent significant data integrity issues and improve user trust.
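To make the third lesson concrete, here is one way a conflict could be logged so that it is easy to search for and count. This is a hedged sketch using Python's standard `logging` module; the logger name, the field names, and the `log_restoration_attempt` helper are all hypothetical, and the in-memory handler exists only so the example can show its own output.

```python
import logging

logger = logging.getLogger("credit_restoration")  # hypothetical logger name
logger.setLevel(logging.INFO)


class ListHandler(logging.Handler):
    """Collects formatted messages in memory so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname} {record.getMessage()}")


captured = ListHandler()
logger.addHandler(captured)


def log_restoration_attempt(source, task_id, account_id, expected_version, applied):
    """Write one line per attempt; version conflicts are logged at WARNING level."""
    fields = "source=%s task=%s account=%s expected_version=%s"
    args = (source, task_id, account_id, expected_version)
    if applied:
        logger.info("restoration applied " + fields, *args)
    else:
        logger.warning("version conflict, restoration skipped " + fields, *args)


log_restoration_attempt("webhook", "task-1", 123, 5, applied=True)
log_restoration_attempt("status-check-job", "task-1", 123, 5, applied=False)

for line in captured.messages:
    print(line)
```

Keeping the same key=value fields on both the success and the conflict line makes it straightforward to pair the two attempts for a given task when investigating an incident.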

Final Thoughts

In systems that manage account state, concurrency control is a core part of correctness. By adding a version-based optimistic lock to our account updates, we eliminated duplicate operations without adding latency or operational complexity.

- Protize Engineering Team
