GEARS, API Testing - Waiting, Receiving, & Assertions
Waiting, Receiving, and Assertions work closely together, and there isn’t that much to say about each outside of CHAIN & REPEAT. Because of this, it became clear that it would be better to write a single, longer article about them instead of separate shorter ones.
Waiting
The waiting part is mostly handled by the tool, framework, or programming language, unless we add an explicit delay. The rule of thumb they all follow is ‘once you have the response from the Server, give it back’.
Responses can carry a lot of information that tools and frameworks hide, or don’t present in the main window. However, what is needed most of the time is the status code and the returned data in JSON form.
In theory, these two should be very straightforward and present no major variations. In the wild, though, things are never as they seem, because of how the implementation was done or whatever oddities the FE framework imposed on the team.
A detail worth remembering: even though the wait lasts until the Server responds, that time includes whatever the Server spends processing, fetching, or transforming the data (or results) we have asked for.
Most implementations take the above into account, but there are a few cases for which we have to keep our eyes open: sorted/unsorted lists, searches, data fetches that involve a lot of tables or entities. And, the most wicked of all, (sub)structures that can grow without constraints.
Structures (or substructures) that keep adding rows rarely occur during normal system operation. With automation, however, they can start appearing here and there, since we are repeating the same set of actions frequently. These aren’t a problem in any sense other than that the server will take longer and longer to respond.
To avoid the above, simply keep an eye on the executions and watch out for abnormal increases in response time. The usual culprit tends to be a log that is saved every time an action is carried out: a success/failure entry, or a historic record of something done by someone.
TIP: When automating anything that fetches lists, use the paging option if available and set it low, unless the test requires large numbers. If paging is not available, maybe it can be suggested as an enhancement; otherwise, spend the time finding the right data that will not bring back hundreds of records.
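A minimal sketch of what keeping page sizes small can look like. The endpoint and the paging parameter names (page, pageSize) are assumptions, since every API names these differently.

```javascript
// Hypothetical endpoint and paging parameters; adjust to whatever the API exposes.
async function fetchSmallPage() {
  const url = new URL('https://api.example.com/v1/transactions');
  url.searchParams.set('page', '1');      // first page only
  url.searchParams.set('pageSize', '10'); // keep the result set small unless the test needs volume

  const response = await fetch(url);      // built-in fetch in Node 18+ and browsers
  const body = await response.json();
  console.log(`Fetched ${body.transactions?.length ?? 0} records`);
}

fetchSmallPage();
```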
Receiving
Once the waiting is over, we get a response. Most resolve around the same concept behind JavaScript’s async/await, that is, success or error. The first case only means that things went well with our request: we either get the data or, when there is nothing to return, empty results or ‘not found’ messages.
Error usually means something else happened while processing the endpoint, and this is where things get tricky with some implementations. We can have errors where the processing failed and this is not reported properly. The infrastructure or server failed, and sometimes that can’t be dealt with elegantly. Or a BE dependency failed and it wasn’t, or couldn’t be, handled correctly.
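As a rough sketch of the outcomes described above, a request tends to land in one of three places: data, nothing found, or an error. The endpoint and the status handling here are assumptions, not a prescription.

```javascript
// Hypothetical endpoint; shows the success / empty / error split described above.
async function getItem(id) {
  try {
    const response = await fetch(`https://api.example.com/v1/items/${id}`);

    if (response.ok) {
      return await response.json();   // the request went well and we get the data
    }
    if (response.status === 404) {
      return null;                    // the request went well, but nothing was found
    }
    // Anything else: the endpoint had trouble processing the request.
    throw new Error(`Unexpected status ${response.status}`);
  } catch (err) {
    // Infrastructure or network failures end up here and can rarely be handled elegantly.
    console.error('Request failed:', err.message);
    throw err;
  }
}
```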
The difference between a 500 and a 400 shouldn’t be considered trivial, as it can mean a world of saved time when debugging, or when hundreds of alerts are suddenly triggered out of nowhere.
Most implementations know when to use either to externalize where the problem could be: not finding data is within the scope of the 400s, while the database not responding should be a 500. But, as we all know, the reality of development can complicate things.
Error messages and their source can be hard to propagate in code. More so as we nest down the road into additional asynchronous calls against other endpoints, gRPC, database connections, external services, and so on. Sometimes there is a hiccup as the code works through its chain of callbacks, and a database problem turns into a generic lost connection.
Of course, in testing or automation we don’t have to worry too much about any of this. Except when every oddity is a 500 and every failure in BE processing or data fetching is a plain 400. Then it might be a good time to ask how the team handles these, or to suggest we start improving our use of status codes.
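If the team has agreed on where each family of codes applies, a check like the following can keep an eye on it. This is a Postman test-script sketch, and the expectation that a missing record stays in the 400s is an assumption about that agreement.

```javascript
// Postman test script: a lookup that finds nothing should stay in the 400s,
// not surface as a generic 500. Adjust the expected range to the team's conventions.
pm.test('Missing record is reported as a client-side miss, not a server failure', () => {
  pm.expect(pm.response.code).to.be.within(400, 499);
});
```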
When everything works, there are no errors, we get the proper status code, and a non-empty JSON as part of the response. We have reached the part where we have to understand what the data means, what it brings, and what we can do with it. That is the realm of ASSERTIONS and CHAIN & REPEAT.
As a final note for this part: implementations are not free of their own idiosyncrasies. The inevitable ‘we have always done it that way’. Or the fact that the next developer(s) in a rotation will simply follow whatever made sense to the previous group, since trying to refactor or adopt new best practices and guidelines is not always easy, or quick.
In that regard, two common pain points are the inappropriate (or incomplete) use of status codes and the inconsistency of response structures. These two are worth looking into a bit more, since there is a high chance a particular system has them in one place or another.
Pain Point: Status Code
A status code that does not match the actual state in which the endpoint finished. For example, we get a 200 and the response carries the infamous { "error": "server failure" }, or something along those lines.
Other examples include situations in which, for some reason, a proper code wasn’t used and we have to rely on the Response body in hopes of getting more details of what happened: 200/{ "unauthorized" }, 200/{ "error": "item not found" }, and so on.
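A quick way to catch that pattern in automation is to assert that a 200 never hides an error in the body. This is a Postman test-script sketch, and the ‘error’ field name is an assumption about what the implementation leaks.

```javascript
// Postman test script: flag the '200 but the body says error' pattern.
pm.test('A 200 response does not hide an error in the body', () => {
  pm.expect(pm.response.code).to.equal(200);
  const body = pm.response.json();
  pm.expect(body).to.not.have.property('error'); // assumed field name; adjust as needed
});
```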
In some cases, this is due to having to deliver something quickly: dealing with exceptions, or specific codes for them, adds logic and development time. Another reason can be that the BE left validation to the FE, either by agreement, by need, or because it is easier/faster.
When to raise a flag or a bug depends a lot on the team or project agreements. It is better to ask, coordinate, and understand what is going on before attempting to change (or get a fix for) an ‘it works’ type of dependency that is used across the whole system.
Pain Point: Inconsistent Response Structures
This is one we see often, and it can slow down learning about the system. There can be many reasons why it occurs so frequently, but it should still be avoided. A common example is endpoints in the same service each returning their own structure: { <transaction fields> }, { transaction: {} }, { transactions: [{}] }, and so on.
Sometimes the reason behind it can be understood: developers want to highlight the type of structure they are sending back. But it adds unnecessary processing when we have to deal with direct objects, arrays, arrays within fields, and so on. In Automation Testing in particular, this can break generalization or require adding conditions to handle the exceptions.
However, what works best, once we know the different patterns of the structures we use, is to preprocess toward homogenization. This is very easy to do in JS, as the language doesn’t care about the structure itself, allowing us to add or remove elements as needed. We can transform something into an array or a nested structure with a single line of code.
Once the structure is standardized, we can proceed with the Assertions or additional processing without covering extra cases. The only detail to be careful with is to remember the transformation when it persists and the structure needs to be handed over. Also, keep in mind this works only in the automation code.
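A minimal sketch of that homogenization, assuming the transaction shapes listed above; the function and field names are illustrative, not part of any real API.

```javascript
// Whatever shape comes back, hand the rest of the code a { transactions: [...] } structure.
function normalizeTransactions(payload) {
  if (Array.isArray(payload)) return { transactions: payload };                        // bare array
  if (payload && Array.isArray(payload.transactions)) return payload;                  // already the target shape
  if (payload && payload.transaction) return { transactions: [payload.transaction] };  // named single object
  return { transactions: [payload] };                                                  // plain object
}
```

From here on, every assertion or CHAIN step can assume the same structure and skip the exception handling.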
Postman: The use of scripts can create very interesting opportunities for preprocessing the Response. This way the tests themselves can work on a generalized structure rather than add unnecessary logic to handle (simple) variations. It can also be a great way to pass data and extra fields to the Visualization.
For example, say we are working with two endpoints, one of which returns a plain transaction {} and the other a named field with an array, { transactions: [{}] }. It is very easy to transform the first into the second, so that our tests deal with the transactions array all the time rather than handle exceptions for one or the other.
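In a Postman test script, that transformation might look like the sketch below. The variable name and the decision to stash the result for later requests are assumptions about how the collection is organized.

```javascript
// Postman test script for the endpoint that returns a plain transaction {}.
const body = pm.response.json();
const normalized = Array.isArray(body.transactions)
  ? body                        // already the { transactions: [...] } shape
  : { transactions: [body] };   // plain transaction object, wrap it

// Stash it so later requests, tests, or the Visualizer can reuse the same structure.
pm.collectionVariables.set('lastTransactions', JSON.stringify(normalized));
```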
An added bonus is that whatever we find useful during exploration and experimentation can be carried over to any framework that uses JavaScript. This applies to data that is ‘the same structure(s) presented in different ways’; there are times when, even if the structure looks similar, it represents something else entirely. That could be a bug, or simply a less-than-ideal solution.
Assertions
Assertions come from tools and frameworks; programming languages by themselves rarely offer them in a form suited for testing. Most things JS-based will use common libraries such as Mocha (as the runner) and Chai (for the assertions), with a BDD-esque approach (those expect().to.have.something() patterns).
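For reference, this is roughly what those patterns look like in a Mocha/Chai test; the endpoint, the fields, and the CommonJS setup are assumptions.

```javascript
// Mocha + Chai sketch (assumes Node 18+ for fetch and Chai 4 for require()).
const { expect } = require('chai');

describe('GET /v1/transactions', () => {
  it('returns a non-empty list with the expected fields', async () => {
    const response = await fetch('https://api.example.com/v1/transactions');
    expect(response.status).to.equal(200);

    const { transactions } = await response.json();
    expect(transactions).to.be.an('array').that.is.not.empty;
    expect(transactions[0]).to.have.property('id');
  });
});
```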
The biggest problem with what we see in this part is that most testers go atomic: they check the data type and some additional attributes like length, ranges, lists of allowed terms, and so on. Not entirely our fault, as the features and examples in tools/frameworks focus exactly on those.
One step further is to check the JSON Schema, but that is essentially doing atomic in bulk mode. It can save a lot of repetition and code, though. When doing this, aim for same-level fields or individual substructures; that is, don’t verify the whole JSON at once. Doing it at block level is much simpler, and more effective when going over repeating substructures. Follow the KISS principle.
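A block-level sketch of that idea using the Ajv validator; the schema itself is an assumption about what a single line item looks like, and a real schema should carry the system’s own restrictions.

```javascript
// Block-level JSON Schema check with Ajv (npm install ajv).
const Ajv = require('ajv');
const ajv = new Ajv();

const lineItemSchema = {
  type: 'object',
  required: ['id', 'amount', 'currency'],
  properties: {
    id: { type: 'string' },
    amount: { type: 'number', minimum: 0 },
    currency: { type: 'string', enum: ['USD', 'EUR'] }
  },
  additionalProperties: false
};

const validateLineItem = ajv.compile(lineItemSchema);

// Cycle over the repeating substructure instead of validating the whole JSON at once.
function allLineItemsValid(lineItems) {
  return lineItems.every((item) => validateLineItem(item));
}
```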
I haven’t found a better approach to this yet. However, the rule is to get thinking and be creative. Analyze each structure and find the fields or groups of fields that hold business rules. Those that keep information. Those that have something that can give us insights into the state of the system and the data it holds.
Other fields to keep in mind are those that the FE assumes to have restrictions (string length, number ranges, number of items). Or those that the FE restricts and then passes to the BE. In the latter case, we want to confirm the endpoint doesn’t allow strange stuff to happen when working without the controls imposed by the UI.
There is a big ‘it depends’ here since not all Software is the same, not all tech stacks work the same way. Each solution has been implemented differently, and each business domain has its own way to look at their data.
Knowing the business helps a lot, and so does experience. However, there are common examples such as order details, line items, transactions, day schedules, logs, historic records, tracking inventory, and so on. How to test those, in a way that we are indeed verifying something, depends a lot on how an implementation presents them in JSON form.
The pattern in this case is to use a function that tests the data, or a subset of it, and returns a boolean value. That is what becomes the actual test in the tool or framework: we check that the data, or a subset, is good (as per our logic), or we fail it.
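A sketch of that pattern, using a made-up business rule (line items should add up to the order total) and hypothetical field names; the surrounding pm.test works the same way in any Chai-based framework.

```javascript
// The business logic lives in a plain function that returns a boolean.
function orderTotalsAreConsistent(order) {
  const sum = order.lineItems.reduce((acc, item) => acc + item.amount, 0);
  return Math.abs(sum - order.total) < 0.005; // allow for floating point noise
}

// The boolean is what becomes the actual test (here, a Postman test script).
pm.test('Line items add up to the order total', () => {
  pm.expect(orderTotalsAreConsistent(pm.response.json())).to.be.true;
});
```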
Alas, this can get complex once we learn of all the exceptional states the data might be in. There is nothing to do here other than code iteratively as we learn, or keep it simple by aiming for the ‘most common case(s)’. And stay aware of when something is an ‘expected failure’ versus a true issue.
Comparing two data sources that should return the same (sub)structure is obvious, but still worth doing, especially if we are getting that data as we go along with CHAIN. An exception is when we are sure the origin for both is the same table. This simple check can be doubly important when using NoSQL.
Anywhere we can use JavaScript, we have the advantage that JSON can be manipulated in many ways and as needed. This allows us to isolate nested data, split it, join it, sanitize it, and everything in between before actually passing it to the check-the-logic portion of the code.
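As a sketch of that comparison, sanitize both records first (the fields dropped here are assumptions about what legitimately differs) and then do a deep compare with Node’s built-in assert.

```javascript
const assert = require('assert');

// Drop the fields that legitimately differ between sources (ids, timestamps).
function sanitize(record) {
  const { id, createdAt, updatedAt, ...rest } = record;
  return rest;
}

// True when both sources carry the same business data after sanitizing.
function sameBusinessData(fromSourceA, fromSourceB) {
  try {
    assert.deepStrictEqual(sanitize(fromSourceA), sanitize(fromSourceB));
    return true;
  } catch {
    return false;
  }
}
```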
As a quick recap: instead of atomic data checks, go with JSON Schema validations and include as many restrictions as possible (ranges, lower and upper limits, valid attributes, formats). Do it by block, and cycle over the (sub)structures when needed.
With everything else, think and brainstorm with the team, the devs, and the PO (or validate your findings with them). What is worth testing in the data? What can it tell us about the state of the system? How and where can it help us discover anomalies with user input, UI data entry, FE calls, logic, transformations, and so on?
Most things can be automated as long as they are not too time-consuming, don’t get overly complex, or don’t involve too many endpoints. Bonus points if what we code helps with getting a good check of the heartbeat of the system. The only warning here is ‘don’t overdo it’: once enough tests are in place, more will start to overlap without giving any additional value. Yep, ‘it depends’.
Postman: during exploration and experimentation, or if we use it as our main testing tool, the Visualization feature can come in very handy to display important data in convenient ways, so that anomalies stand out. By using standard HTML and CSS, with the added extension of Handlebars templating, there is no limit on how to present a JSON.
The template is essentially a string that gets parsed, and special fields get replaced with real data. This means we can make it even more dynamic by aggregating things as needed: include or exclude parts of it, process headers or columns, add defaults or descriptions depending on the data we will pass.
As much as the HTML design can be anything, the principles of KISS and YAGNI apply here. Display data that tells something. Display rows that give some form of information. Add labels and tags that make the data clearer. And don’t be shy about preprocessing the JSON before passing it to the Postman visualization method.
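A small sketch of that flow, preprocessing the JSON and then handing it to pm.visualizer.set(); the columns, field names, and the ‘flag negative amounts’ rule are assumptions made only to show the idea.

```javascript
// Postman test script for the Visualize tab.
const body = pm.response.json();
const rows = (body.transactions || []).map((t) => ({
  id: t.id,
  amount: t.amount,
  note: t.amount < 0 ? 'CHECK' : ''   // make anomalies stand out
}));

const template = `
  <table>
    <tr><th>ID</th><th>Amount</th><th>Note</th></tr>
    {{#each rows}}
      <tr><td>{{id}}</td><td>{{amount}}</td><td>{{note}}</td></tr>
    {{/each}}
  </table>
`;

pm.visualizer.set(template, { rows });
```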
TIP: Unless you really know that your data behaves in the same way as someone else’s, do not assume they are equivalent, or even mappable. Every solution has its own idiosyncrasies. This is one of those cases in which copy/pasting someone else’s similar code might prove more time-consuming than helpful. But, yeah, ‘it depends’.
***
Each article and part should stand on its own, but sometimes it is better to know the whole story. So far, that narrative has covered the following articles.

Where it all began:
API Testing: Overview of an Approach https://www.linkedin.com/pulse/api-testing-overview-approach-leonardo-antezana-dzokc/

The GEARS series:
GEARS, API Testing: Introduction https://www.linkedin.com/pulse/api-testing-approach-introducing-gears-leonardo-antezana-t3jce/
GEARS, API Testing: Authentication & Authorization https://www.linkedin.com/pulse/gears-api-testing-authentication-authorization-leonardo-antezana-jqhmc/
GEARS, API Testing: Payload & Sending https://www.linkedin.com/pulse/gears-api-testing-payload-sending-leonardo-antezana-wn2yc/
GEARS, API Testing - Waiting, Receiving, & Assertions https://www.linkedin.com/pulse/gears-api-testing-waiting-receiving-assertions-leonardo-antezana-e4izc/
GEARS, API Testing - Chain & Repeat https://www.linkedin.com/pulse/gears-api-testing-chain-repeat-leonardo-antezana-jbjrc/