Getting Started with Keploy
Aspiring DevOps Engineer | CI/CD | Docker | Kubernetes | AWS | Terraform | Cloud Enthusiast | Automating Workflows & Scaling Infrastructure
Everyone knows testing is important, but only a few enjoy working on it. Let’s be honest: writing and maintaining tests takes time, discipline, and constant updates as the codebase changes. Often the hardest part is not writing the code itself but figuring out what to test and how to keep tests in sync with real application behavior.
While exploring Keploy, I wanted to understand whether testing could be driven by how an application is actually used in real time.
This blog is a walkthrough of what I built, what Keploy captured, and what I learned in the process.
What Keploy Is and What It Actually Does
Keploy is a testing tool that generates test cases by recording real API traffic and then replaying it later to detect changes in application behavior.
When an application receives HTTP requests from a browser or frontend, Keploy observes each request and its response. These real interactions are stored as test cases. Later, during replay, Keploy sends the same requests again and compares the new responses with the recorded ones.
In simple terms:
Record mode captures how the application behaves in reality.
Replay mode verifies if the application still behaves the same way.
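To make the two modes concrete, here is a minimal Python sketch of the record-and-replay idea. It is not Keploy's implementation: `record`, `replay`, `handle_v1`, and `handle_v2` are hypothetical names, and a plain function stands in for the HTTP application.

```python
from typing import Callable

Request = tuple[str, str]      # (method, path)
Response = tuple[int, str]     # (status code, body)
Handler = Callable[[Request], Response]


def record(handler: Handler, requests: list[Request]) -> list[tuple[Request, Response]]:
    """Send each request to the handler and store the observed response."""
    return [(req, handler(req)) for req in requests]


def replay(handler: Handler, recorded: list[tuple[Request, Response]]) -> list[Request]:
    """Re-send the recorded requests and return those whose response changed."""
    return [req for req, expected in recorded if handler(req) != expected]


def handle_v1(req: Request) -> Response:
    # The behavior that exists while recording.
    return (200, '{"status": "ok"}')


def handle_v2(req: Request) -> Response:
    # A later version whose body differs from what was recorded.
    return (200, '{"status": "healthy"}')


recorded = record(handle_v1, [("GET", "/health")])
unchanged = replay(handle_v1, recorded)   # same behavior, so no mismatches
changed = replay(handle_v2, recorded)     # changed behavior, so one mismatch
```

The point of the sketch is that the expected responses come from observed behavior, not from hand-written assertions.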
The Project I Integrated Keploy With
To understand how Keploy works, I chose a minimal backend service instead of a complex application.
The goal was to clearly observe how Keploy captures traffic and turns it into test cases.
I built a small Flask-based API with three endpoints:
GET /health - a basic health check to verify the service is running
POST /items - to add an item
GET /items - to retrieve all items

The API works with JSON data and stores items in memory. This was intentional. Using in-memory state helped reveal interesting behavior during test replay.
Running Keploy in Record Mode
Once the service was ready, the next step was to run it through Keploy in record mode.
Keploy runs alongside the application rather than inside it. So, I started the Flask application using Keploy’s CLI in record mode.

This is what happened during recording:

Keploy started the application process
It positioned itself in the request-response path
Every incoming request and outgoing response was observed and stored
From my point of view as the user, the application behaved exactly the same. Nothing felt different. Requests were sent the usual way, through a browser or PowerShell, and the application responded as expected. But Keploy was silently watching in the background.

This separation between the application and the testing tool is important. It means Keploy can be added to an existing application without modifying the application code or introducing test-specific logic.
Capturing Real Traffic and Generated Test Cases
Once Keploy was running in record mode, I sent POST requests to create new items and GET requests to fetch them back.
For every request, Keploy captured:
the HTTP method and endpoint
request headers and body
response status code
response headers and body
Once recording was stopped, Keploy generated a few test case files in YAML format. Each file represented a real interaction with the API, including both the request and the expected response.
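As a mental model, the captured fields listed above can be pictured as a small data structure. The sketch below is only an illustration: the class and field names are hypothetical and do not reflect Keploy's actual YAML schema.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class RecordedRequest:
    method: str
    path: str
    headers: dict[str, str] = field(default_factory=dict)
    body: str = ""


@dataclass
class RecordedResponse:
    status_code: int
    headers: dict[str, str] = field(default_factory=dict)
    body: str = ""


@dataclass
class RecordedCase:
    """One captured interaction: the request plus the response to expect."""
    name: str
    request: RecordedRequest
    response: RecordedResponse


case = RecordedCase(
    name="add-item",
    request=RecordedRequest(
        method="POST",
        path="/items",
        headers={"Content-Type": "application/json"},
        body='{"name": "pen"}',
    ),
    response=RecordedResponse(status_code=201, body='{"name": "pen"}'),
)

# A plain nested dict like this is what a YAML serializer would write out.
as_dict = asdict(case)
```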

These test cases were a direct representation of how the application had actually been used. That made them easy to understand.

Because the demo application had no external systems such as databases or third-party APIs, Keploy generated only test cases and no mocks. This clarified that mocks are created only for external dependencies, not for internal application logic.
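The demo had no external dependency, so the following is purely illustrative. It sketches the general idea of mocking an outbound call by recording its result once and serving the saved result during replay. `RecordingProxy` and `lookup_price` are hypothetical names, and this is not how Keploy implements mocks internally.

```python
from typing import Callable


class RecordingProxy:
    """Wraps an external call: records real results, or replays saved ones."""

    def __init__(self, real_call: Callable[[str], str], mode: str):
        self.real_call = real_call
        self.mode = mode                  # "record" or "replay"
        self.mocks: dict[str, str] = {}

    def __call__(self, key: str) -> str:
        if self.mode == "record":
            result = self.real_call(key)  # reach the real dependency
            self.mocks[key] = result      # keep the result as a mock
            return result
        return self.mocks[key]            # replay: the dependency is not called


calls = []


def lookup_price(sku: str) -> str:
    # Hypothetical third-party call; `calls` tracks how often it is reached.
    calls.append(sku)
    return "4.99"


proxy = RecordingProxy(lookup_price, mode="record")
recorded_price = proxy("pen")

proxy.mode = "replay"
replayed_price = proxy("pen")
```

After the replay call, `calls` still holds a single entry, which shows that the dependency was contacted only while recording.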
Replay Mode and a Failing Test
After recording, I ran Keploy in replay mode.
During replay:
Keploy restarted the application
Sent the same recorded requests again
Compared the new responses with the previously captured ones.
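The comparison step can be pictured with a small helper that reports which parts of a response differ. This is a simplified sketch under assumptions: `diff_response` is a hypothetical function, responses are modeled as plain dicts, and Keploy's real comparison logic is not reproduced here.

```python
import json


def diff_response(expected: dict, actual: dict) -> dict:
    """Return {field: (expected, actual)} for every field that differs."""
    differences = {}
    if expected["status_code"] != actual["status_code"]:
        differences["status_code"] = (expected["status_code"], actual["status_code"])
    # Compare bodies as parsed JSON so whitespace and key order do not matter.
    expected_body = json.loads(expected["body"])
    actual_body = json.loads(actual["body"])
    if expected_body != actual_body:
        differences["body"] = (expected_body, actual_body)
    return differences


recorded = {"status_code": 200, "body": '[{"name": "pen"}]'}
replayed_same = {"status_code": 200, "body": '[ {"name": "pen"} ]'}
replayed_changed = {"status_code": 200, "body": "[]"}

no_diff = diff_response(recorded, replayed_same)       # empty: responses agree
body_diff = diff_response(recorded, replayed_changed)  # reports the body change
```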
Most of the test cases passed successfully. One test case failed.

The failure was not caused by Keploy but by the application design itself.
Since the API stored data in memory, restarting the application changed the internal state. As a result, the replayed response didn’t exactly match the original recorded response.

This was an important learning moment. Keploy exposed non-deterministic behavior instead of hiding it. In real-world systems, externalized state and mocks are important for reliable test replays.
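The sketch below shows one way such a mismatch can arise and why restoring state from outside the process helps. It is an illustration under assumptions, not a reproduction of the recorded session: `ItemStore` is a hypothetical stand-in for the API's in-memory list, and creating a new instance models a restart.

```python
class ItemStore:
    """Stand-in for the API's in-memory list; a new instance models a restart."""

    def __init__(self, initial=None):
        self.items = list(initial or [])

    def add(self, name):
        self.items.append(name)

    def get_items(self):
        # Mirrors GET /items: a status code plus the current contents.
        return (200, list(self.items))


# Recording session: an item already existed when GET /items was captured.
live = ItemStore()
live.add("pen")
recorded = live.get_items()

# Replay after a restart: the in-memory list starts empty, so the body differs.
restarted = ItemStore()
replay_without_state = restarted.get_items()

# Replay with the state restored from outside the process (hypothetical seed).
seeded = ItemStore(initial=["pen"])
replay_with_state = seeded.get_items()

mismatch = replay_without_state != recorded   # True: the test would fail
match = replay_with_state == recorded         # True: the test would pass
```

The application code is identical in both replays; only the starting state differs, which is why the failure points at design rather than at the tool.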
What I Learned About How Keploy Works
Working through this setup helped me build a clear mental model of Keploy.
Keploy doesn’t change application code or inject test logic into it. Instead, it works as an external observer that captures reality first and verifies it later. The separation between record and replay modes makes it easier to detect and understand failures, because differences are surfaced explicitly rather than buried inside test abstractions.
Also, using an in-memory, stateful application highlighted why deterministic behavior matters. Keploy doesn’t attempt to fix such behavior; instead, it reveals it, which is exactly what a testing tool should do.
Final Thoughts
This exercise helped me see Keploy not just as a testing tool, but as a way to observe and understand real application behavior.
By generating tests from actual traffic, Keploy brings testing closer to how software is being used in real time. Instead of guessing edge cases, developers can validate real interactions and detect changes early.
Using a minimal API kept the focus on behavior rather than complexity. It made it easier to see where traffic-based testing works well and how application design choices affect test reliability.
Good developer tools don’t just automate work; they improve understanding. Keploy does that by turning real usage into concrete test cases, which ultimately helps teams ship changes with more confidence.

