It was a Tuesday. ₹47 lakh worth of orders had been placed that morning during a flash sale. Payments were going through cleanly — Razorpay returning 200s, the orders landing in the database, confirmation emails firing. Everything looked perfect.
Then, at 2:17 PM, the first support ticket came in.
"I was charged but my order shows pending."
By 2:45 PM there were eleven of them. The team pulled up the logs. Webhooks had been firing fine. But the handler — the FastAPI endpoint that listened for payment.captured events — had crashed silently on eleven requests because of a Pydantic validation error on an edge case in the card metadata. The gateway had retried. The handler crashed again. After three attempts, Razorpay gave up. Eleven payments: captured, charged, invisible.
The reconciliation job? They'd meant to build it. It was on the backlog.
That outage cost four hours of engineering time, two angry customer service calls, and — this is the part nobody talks about — the permanent distrust of eleven customers who'd been charged money for nothing. Some of them never came back.
The entire thing was testable. Nobody had tested it.
The Lie Your Current Tests Tell You
Here's what most Python payment test suites look like:
def test_payment_success(client):
    response = client.post("/checkout", json={"amount": 9900})
    assert response.status_code == 200
    assert response.json()["status"] == "success"
That test is lying to you. It's mocking the gateway, bypassing the webhook handler, ignoring the database write, and skipping every failure mode your production system will actually encounter.
Payment flows are not request-response. They are three-step asynchronous systems: your server initiates a charge, the gateway processes it, then the gateway calls you back via webhook to tell you what happened. The webhook is where fulfilment happens. The webhook is where bugs live. And the webhook is exactly what most test suites never touch.
What you need is a way to trigger deterministic payment outcomes — including the ones that only happen in production — and receive real signed webhooks against real handler code. That's what MockCard does.
Let's build a proper test suite from the ground up.
Setting Up
pip install httpx pytest pytest-asyncio fastapi uvicorn
# conftest.py
import os

import httpx
import pytest

MOCKCARD_BASE_URL = "https://mockcard.io"
# Read the key from the environment so it never lands in version control
MOCKCARD_API_KEY = os.getenv("MOCKCARD_API_KEY", "your_api_key_here")


@pytest.fixture(scope="session")
def mc():
    """MockCard client, shared across all tests."""
    with httpx.Client(
        base_url=MOCKCARD_BASE_URL,
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": MOCKCARD_API_KEY,
        },
        timeout=30.0,
    ) as client:
        yield client
One fixture, one real HTTP client. No mocks, no patches. The whole point is that your tests exercise real code paths.
The Basics — But Done Right
Before you test failure, verify your success path actually stores the right transaction reference.
# tests/test_payment_flows.py
import pytest

# `db` and `create_order` are not defined in this post: they stand in for
# your own database fixture and checkout helper.


def test_successful_payment_stores_transaction_id(mc, db):
    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": "success",
        "amount": 9900,
        "currency": "inr",
    })
    assert res.status_code == 201
    data = res.json()

    # payment_intent_id is what you store in your DB
    # It matches webhook_event.data.object.id when the webhook fires
    payment_id = data["payment_intent_id"]
    assert payment_id.startswith("pi_mock_")

    # Simulate your checkout creating an order
    order_id = create_order(amount=9900, payment_id=payment_id)
    order = db.orders.get(order_id)
    assert order.payment_id == payment_id
    assert order.status == "pending_webhook"
    # NOT "fulfilled": fulfilment happens in the webhook handler
Notice the assertion at the end: status is pending_webhook, not fulfilled. If your checkout route marks an order as complete before the webhook arrives, you have a bug. A lot of teams have this bug.
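One way to make that rule hard to break is to record which component is allowed to move an order into each status. The following is a minimal sketch under assumptions: the `ALLOWED_SOURCES` table, the `set_status` helper, and the dict-shaped order are all hypothetical and only illustrate the idea that checkout may create a pending order but may never fulfil one.

```python
# Which component may move an order into each status (illustrative policy,
# not taken from any gateway SDK).
ALLOWED_SOURCES = {
    "pending_webhook": {"checkout"},
    "fulfilled": {"webhook", "reconciliation"},
    "cancelled": {"webhook", "reconciliation"},
}


def set_status(order: dict, new_status: str, source: str) -> None:
    """Update order["status"], refusing transitions from the wrong component."""
    allowed = ALLOWED_SOURCES.get(new_status, set())
    if source not in allowed:
        raise PermissionError(f"{source!r} may not set status {new_status!r}")
    order["status"] = new_status
```

With a guard like this, a checkout route that tries to mark an order as fulfilled raises immediately instead of shipping the bug.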
Every Decline, Parametrized
@pytest.mark.parametrize("scenario,expected_code,expected_decline", [
    ("insufficient_funds", "card_declined", "insufficient_funds"),
    ("do_not_honor", "card_declined", "do_not_honor"),
    ("expired_card", "expired_card", "expired_card"),
    ("incorrect_cvv", "incorrect_cvc", "incorrect_cvc"),
])
def test_decline_error_surface(mc, scenario, expected_code, expected_decline):
    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": scenario,
        "amount": 9900,
        "currency": "inr",
    })
    assert res.status_code == 402
    error = res.json()["error"]
    assert error["code"] == expected_code
    assert error["decline_code"] == expected_decline
    # Your UI layer reads this and shows the right message to the user
    assert len(error["message"]) > 10
Run this and you have a contract: for each decline, your code receives an unambiguous error code, no parsing required. If your gateway changes its error format, these tests will catch it before your users do.
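To show what the UI layer might do with that contract, here is a minimal sketch. It assumes the error shape asserted above (`code`, `decline_code`); the `USER_MESSAGES` table, the `FALLBACK_MESSAGE` constant, the `user_message` helper, and the wording of the messages are hypothetical.

```python
# Hypothetical mapping from decline codes to customer-facing copy.
USER_MESSAGES = {
    "insufficient_funds": "Your card has insufficient funds. Try another card.",
    "do_not_honor": "Your bank declined the payment. Contact your bank or try another card.",
    "expired_card": "Your card has expired. Check the expiry date or use another card.",
    "incorrect_cvc": "The security code is incorrect. Check it and try again.",
}
FALLBACK_MESSAGE = "Your payment could not be completed. Please try again."


def user_message(error: dict) -> str:
    """Pick copy by decline_code first, then by code, then fall back."""
    for key in (error.get("decline_code"), error.get("code")):
        if key in USER_MESSAGES:
            return USER_MESSAGES[key]
    return FALLBACK_MESSAGE
```

Looking up `decline_code` before `code` matters because several declines share the generic `card_declined` code, and the fallback keeps an unfamiliar code from surfacing as a raw error string.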
Webhook Handling — The Part That Actually Matters
This is the code most people get wrong. The webhook handler is not just an endpoint that receives JSON. It needs to:
- Verify the HMAC signature before doing anything
- Check for duplicate event IDs before processing
- Handle the event atomically — or retry-safe if not
# app/webhooks.py
import hashlib
import hmac
import json

from fastapi import FastAPI, HTTPException, Request

from app.db import db

app = FastAPI()
WEBHOOK_SECRET = "your_webhook_secret"


@app.post("/webhook/stripe")
async def handle_stripe_webhook(request: Request):
    raw_body = await request.body()
    signature = request.headers.get("x-mockcard-signature", "")

    # Step 1: verify signature before touching the payload
    expected = "sha256=" + hmac.new(
        WEBHOOK_SECRET.encode(), raw_body, hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise HTTPException(status_code=400, detail="Invalid signature")

    event = json.loads(raw_body)

    # Step 2: idempotency check (gateways retry, you must not double-process)
    if await db.events.exists(event["id"]):
        return {"received": True}

    # Step 3: handle the event
    if event["type"] == "payment_intent.succeeded":
        payment_id = event["data"]["object"]["id"]
        await db.orders.fulfill(payment_id)
    elif event["type"] == "payment_intent.canceled":
        payment_id = event["data"]["object"]["id"]
        await db.orders.cancel(payment_id)

    # Step 4: mark as processed AFTER successful handling
    await db.events.mark_processed(event["id"])
    return {"received": True}
Now test it:
# tests/test_webhook_handler.py
import hashlib
import hmac
import json

import pytest
from httpx import ASGITransport, AsyncClient

from app.webhooks import app

WEBHOOK_SECRET = "your_webhook_secret"


def sign(body: bytes) -> str:
    return "sha256=" + hmac.new(
        WEBHOOK_SECRET.encode(), body, hashlib.sha256
    ).hexdigest()


@pytest.mark.asyncio
async def test_webhook_signature_rejected_if_tampered():
    # Recent httpx releases dropped AsyncClient(app=...); use ASGITransport instead
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        payload = json.dumps({"id": "evt_test", "type": "payment_intent.succeeded"}).encode()
        res = await client.post(
            "/webhook/stripe",
            content=payload,
            headers={"X-MockCard-Signature": "sha256=fakesignature"},
        )
        assert res.status_code == 400


@pytest.mark.asyncio
async def test_webhook_idempotent_on_duplicate(db):
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        payload = json.dumps({
            "id": "evt_already_processed",
            "type": "payment_intent.succeeded",
            "data": {"object": {"id": "pi_mock_abc123"}},
        }).encode()

        # Simulate: event was already processed
        await db.events.mark_processed("evt_already_processed")

        res = await client.post(
            "/webhook/stripe",
            content=payload,
            headers={"X-MockCard-Signature": sign(payload)},
        )

        # Should return 200, not 500, and NOT fulfill the order again
        assert res.status_code == 200
        orders_fulfilled = await db.orders.fulfillment_count("pi_mock_abc123")
        assert orders_fulfilled == 0
If your handler doesn't have an idempotency check, the second test fails: the replayed event fulfils the order again, so the count comes back as 1 instead of 0. Fix it before your gateway does it to you.
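One caveat on the handler itself: checking `db.events.exists` and calling `mark_processed` later are two separate steps, so two deliveries that arrive at the same moment can both pass the check. A minimal sketch of one way to close that gap is to let a primary-key constraint decide, so that claiming an event is a single atomic insert. The standard library's `sqlite3` is used here purely for illustration; the `processed_events` table and the `make_store` and `claim_event` helpers are assumptions, not part of this post's `db` layer.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory store with one row per processed event id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    return conn


def claim_event(conn: sqlite3.Connection, event_id: str) -> bool:
    """Return True only for the first caller to claim this event id."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # another delivery already claimed it
```

The trade-off is that claiming before handling leaves an event marked as claimed if the handler crashes halfway. Where the database allows it, put the claim and the fulfilment in the same transaction; otherwise a reconciliation job has to pick up the stragglers.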
The Chaos Suite — Where Real Bugs Live
This is where MockCard Pro earns its keep. The following four scenarios are not theoretical. They happen in production at scale, they are nearly impossible to reproduce manually, and they are completely invisible to standard sandbox testing.
1. The Duplicate Webhook (simulate_race)
Real gateways fire the same webhook twice when your server doesn't respond fast enough. Both deliveries have the same event.id. Without deduplication, you charge the user once and fulfil twice.
import asyncio
from collections import Counter


@pytest.mark.asyncio
async def test_duplicate_webhook_never_fulfils_twice(mc, webhook_server, db):
    fulfil_count = Counter()

    async def handler(event: dict):
        event_id = event["id"]
        if await db.events.exists(event_id):
            return  # already processed, idempotent
        payment_id = event["data"]["object"]["id"]
        fulfil_count[payment_id] += 1
        await db.orders.fulfill(payment_id)
        await db.events.mark_processed(event_id)

    webhook_server.on_event = handler

    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": "success",
        "amount": 9900,
        "currency": "inr",
        "simulate_race": True,  # Pro: fires webhook twice, same event id
        "webhook_url": webhook_server.url,
    })
    assert res.status_code == 201
    payment_id = res.json()["payment_intent_id"]

    await asyncio.sleep(4)  # wait for both deliveries

    # This assertion is the whole test
    assert fulfil_count[payment_id] == 1, (
        f"Order fulfilled {fulfil_count[payment_id]} times. "
        "Your handler is not idempotent."
    )
If this fails, you have a real bug. A bug that will double-ship orders, double-add loyalty points, or double-send gift cards during your next high-traffic event.
2. 3DS Abandonment (3ds_abandoned)
User starts 3DS, sees the OTP screen, closes the tab. Most systems leave the order in pending indefinitely. payment_intent.canceled fires immediately — your handler must act on it.
@pytest.mark.asyncio
async def test_3ds_abandoned_cancels_order(mc, db):
    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": "3ds_abandoned",  # Pro: fires payment_intent.canceled
        "amount": 9900,
        "currency": "inr",
    })
    assert res.status_code == 402
    webhook_event = res.json()["webhook_event"]
    assert webhook_event["type"] == "payment_intent.canceled"

    payment_id = webhook_event["data"]["object"]["id"]
    order_id = await db.orders.create(payment_id=payment_id, status="pending_3ds")

    # Simulate your webhook handler receiving the cancel
    await handle_payment_cancelled(payment_id)

    order = await db.orders.get(order_id)
    assert order.status == "cancelled", (
        f"Order status is '{order.status}', but it should be 'cancelled'. "
        "Left in 'pending', this order will block inventory forever."
    )
Most teams discover this bug when a product goes out of stock because 200 phantom orders from abandoned 3DS flows are holding the inventory.
3. The Limbo State (limbo)
This one is silent. Payment is captured. Your server returns 201. The user is charged. But the webhook never arrives — your endpoint was down, behind a firewall, or returned 5xx on all three delivery attempts. The order sits in pending forever.
@pytest.mark.asyncio
async def test_limbo_state_caught_by_reconciliation(mc, webhook_server, db):
    """
    Simulates a captured payment with no webhook delivery.
    Your reconciliation job must find it.
    """
    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": "limbo",  # Pro: 201 returned, webhook silenced
        "amount": 9900,
        "currency": "inr",
        "webhook_url": webhook_server.url,
    })
    assert res.status_code == 201
    payment_id = res.json()["payment_intent_id"]

    order_id = await db.orders.create(
        payment_id=payment_id,
        status="pending_webhook",
    )

    # Wait: the webhook will never come
    await asyncio.sleep(5)
    assert webhook_server.received_count == 0  # confirmed: silent

    # Your reconciliation job runs
    await reconcile_captured_payments()

    order = await db.orders.get(order_id)
    assert order.status in ("fulfilled", "flagged_for_review"), (
        f"Order is still '{order.status}' after reconciliation. "
        "Without a reconciliation job, this is lost revenue: "
        "customer charged, order never fulfilled."
    )
The limbo scenario is the one that sends you a support ticket six hours after the customer was charged. It's the refund you give because you can't prove the order was or wasn't fulfilled. It's the chargeback you lose because you have no audit trail.
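What the reconciliation job does is left open above, so here is a minimal sketch of its core loop under stated assumptions. The `Order` dataclass, the `reconcile` function, the `fetch_status` callable (standing in for whatever payment-lookup call your gateway offers), the `"captured"` and `"failed"` status strings, and the 15-minute grace period are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable


@dataclass
class Order:
    payment_id: str
    status: str
    created_at: datetime


def reconcile(
    orders: Iterable[Order],
    fetch_status: Callable[[str], str],
    now: datetime,
    grace: timedelta = timedelta(minutes=15),
) -> list[Order]:
    """Resolve orders whose webhook is overdue; return the ones changed."""
    changed = []
    for order in orders:
        if order.status != "pending_webhook" or now - order.created_at < grace:
            continue  # already resolved, or the webhook may still arrive
        status = fetch_status(order.payment_id)
        if status == "captured":
            order.status = "fulfilled"
        elif status == "failed":
            order.status = "cancelled"
        else:
            order.status = "flagged_for_review"  # unknown: let a human look
        changed.append(order)
    return changed
```

Passing `fetch_status` and `now` in as arguments keeps the loop testable without a network call or a real clock, and the list of changed orders gives you the start of an audit trail.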
4. The Latency Storm (latency)
Webhook delivery is delayed 9 seconds, close enough to a 10-second timeout that any extra processing time tips the request over and the connection closes. MockCard retries, and this time it gets through. If the first attempt was in fact processed, your handler now sees the same event twice, as two separate deliveries with different request timestamps. An idempotency check on event ID still protects you, but only if you actually have one.
@pytest.mark.asyncio
async def test_late_webhook_processed_exactly_once(mc, webhook_server, db):
    processed_events = set()

    async def handler(event: dict):
        event_id = event["id"]
        if event_id in processed_events:
            return  # duplicate, discard
        processed_events.add(event_id)
        await db.orders.fulfill(event["data"]["object"]["id"])

    webhook_server.on_event = handler

    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": "latency",  # Pro: webhook delayed 9 s
        "amount": 9900,
        "currency": "inr",
        "webhook_url": webhook_server.url,
    })
    assert res.status_code == 201
    payment_id = res.json()["payment_intent_id"]

    await asyncio.sleep(20)  # wait for delayed delivery + retry window

    fulfillment_count = await db.orders.fulfillment_count(payment_id)
    assert fulfillment_count == 1, (
        "Order was fulfilled more than once due to webhook retry. "
        "Increase your webhook endpoint timeout or add idempotency."
    )
Testing for Razorpay
If you're integrating with Razorpay, your error handling code looks completely different. The error envelope, the event names, the signature header — all of it is Razorpay's wire format. Pass gateway: razorpay and MockCard switches the entire response format.
import hashlib
import hmac


def verify_razorpay_signature(raw_body: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


@pytest.mark.parametrize("scenario,expected_reason", [
    ("insufficient_funds", "insufficient_balance"),
    ("do_not_honor", "card_declined"),
    ("expired_card", "card_expired"),
    ("incorrect_cvv", "incorrect_cvc"),
])
def test_razorpay_decline_format(mc, scenario, expected_reason):
    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": scenario,
        "gateway": "razorpay",  # full Razorpay wire format
        "amount": 9900,
        "currency": "inr",
    })
    assert res.status_code == 402
    error = res.json()["error"]

    # Razorpay error format, completely different from Stripe
    assert error["code"] in ("BAD_REQUEST_ERROR", "GATEWAY_ERROR")
    assert error["reason"] == expected_reason
    assert error["source"] in ("customer", "internal")
    assert "payment_id" in error["metadata"]
    assert error["metadata"]["payment_id"].startswith("pay_mock_")
def test_razorpay_webhook_format_and_signature(mc):
    res = mc.post("/api/v1/generate", json={
        "brand": "visa",
        "scenario": "success",
        "gateway": "razorpay",
    })
    assert res.status_code == 201
    data = res.json()

    # payment_intent_id uses Razorpay's prefix
    assert data["payment_intent_id"].startswith("pay_mock_")

    # Webhook event uses Razorpay's structure
    event = data["webhook_event"]
    assert event["event"] == "payment.captured"  # not "payment_intent.succeeded"
    assert "payload" in event  # not "data.object"
    payment = event["payload"]["payment"]["entity"]
    assert payment["status"] == "captured"
    assert payment["card"]["network"] == "Visa"  # not "brand"
When you test both gateways with the same scenarios, you'll often find bugs in one path that weren't in the other — because the parsing code was written for only one format and silently fails on the other.
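One way to keep the two formats from drifting apart is to translate both into a single internal shape at the edge, and to fail loudly when an envelope is not recognised. The sketch below only covers the event names that appear in this post. The `PaymentEvent` type, the `normalize` function, and the two lookup tables are hypothetical, and it assumes the Razorpay payment entity carries an `id` field.

```python
from typing import Optional, TypedDict


class PaymentEvent(TypedDict):
    gateway: str
    kind: str  # "succeeded" or "canceled"
    payment_id: str


# Only event names that appear in this post; extend for your integration.
STRIPE_KINDS = {
    "payment_intent.succeeded": "succeeded",
    "payment_intent.canceled": "canceled",
}
RAZORPAY_KINDS = {"payment.captured": "succeeded"}


def normalize(event: dict) -> Optional[PaymentEvent]:
    """Map either wire format to one shape; None for event names we ignore."""
    if "type" in event:  # Stripe-style envelope
        gateway, kind = "stripe", STRIPE_KINDS.get(event["type"])
        payment_id = event.get("data", {}).get("object", {}).get("id")
    elif "event" in event:  # Razorpay-style envelope
        gateway, kind = "razorpay", RAZORPAY_KINDS.get(event["event"])
        entity = event.get("payload", {}).get("payment", {}).get("entity", {})
        payment_id = entity.get("id")  # assumption: the entity has an "id"
    else:
        raise ValueError("unrecognised webhook envelope")
    if kind is None:
        return None  # a known envelope, but an event we do not act on
    if not payment_id:
        raise ValueError(f"{gateway} event is missing a payment id")
    return {"gateway": gateway, "kind": kind, "payment_id": payment_id}
```

Raising on a malformed event, rather than returning quietly, turns the silent parsing failure described above into a test failure.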
CI Integration
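# conftest.py additions
import os
import pytest


def pytest_collection_modifyitems(config, items):
    # pytest.skip() belongs inside tests, not pytest_configure; mark items instead
    if os.getenv("MOCKCARD_API_KEY"):
        return
    skip = pytest.mark.skip(reason="MOCKCARD_API_KEY not set: skipping payment integration tests")
    for item in items:
        if "payment" in item.keywords or "chaos" in item.keywords:
            item.add_marker(skip)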
# pytest.ini
[pytest]
markers =
    payment: marks tests as payment integration tests (require MockCard API key)
    chaos: marks tests as chaos scenario tests (require Pro plan)
asyncio_mode = auto
# .github/workflows/payment-tests.yml
- name: Run payment integration tests
  env:
    MOCKCARD_API_KEY: ${{ secrets.MOCKCARD_API_KEY }}
    WEBHOOK_SECRET: ${{ secrets.WEBHOOK_SECRET }}
  run: pytest tests/test_payment_flows.py -v -m "payment or chaos"
These tests run in CI on every push. The chaos suite catches regressions before they ship. No sandbox account. No manual webhook replay. No staging environment that doesn't match production.
The Uncomfortable Part
Every scenario in this post maps to a real production incident somewhere. The duplicate webhook double-fulfils orders during high-traffic sales. The 3DS abandonment leaves phantom pending orders. The limbo state means customers get charged with nothing to show for it. The latency storm turns a brief gateway slowdown into duplicate fulfilments across your entire order queue.
The teams that caught these bugs in testing weren't smarter than the teams that didn't. They just had tools that could actually simulate the failure.
Your payment code isn't done when the success case works. It's done when the failure cases are tested too.
Run the tests. Fix what breaks. Sleep better.
Run Your First Chaos Test in 5 Minutes
MockCard is free to start. Sign up, get an API key, and the entire standard suite — success, all declines, 3DS challenge, network timeout — works immediately with no configuration.
The chaos scenarios (`simulate_race`, `3ds_abandoned`, `limbo`, `latency`) are Pro. They are also the ones that will take down your payment flow in production. If you ship payment code without testing them, you are betting your users' money that those bugs don't exist.
They exist.
Get your free API key → — No credit card. No sandbox. No waiting.
Upgrade to Pro → — Run the chaos suite. Sleep through your next sale.