Streamlining requirements engineering & requirements documentation with NLP & AI
Natural language processing (NLP), a specialty field of artificial intelligence (AI), is being applied to a growing range of use cases: making legal and medical research more efficient, automatically extracting and retrieving information and insights, and quietly simplifying people's daily lives behind the scenes.
NLP is a subfield of AI that can be applied wherever natural language text occurs. In recent years, NLP has undergone a decisive transformation through a family of new algorithms called transformer models, among them the well-known BERT (Bidirectional Encoder Representations from Transformers). Transformers combine some of the best features of previous neural network architectures into a composition of several networks working together, bringing out their best characteristics while making up for their shortcomings, and have produced unprecedented results in language technology.
Requirements engineering, the discipline concerned with managing engineering requirements, is still mostly manual work that can be tedious, endless, and prone to errors. This impedes project velocity by forcing people to invest too much of their project time in demanding quantitative tasks, instead of focusing their efforts on the less time-consuming qualitative tasks that serve as a much better measure of a project's progress.
In this article we will explore how NLP can be applied effectively during the design phase of requirements engineering, and how it can save businesses many hours that can be allocated to other tasks, increasing project velocity and mitigating the risk of failure.
Transformer models such as the aforementioned BERT, OpenAI's GPT-3, or Google's T5 are the state of the art of machine learning for NLP, kicking off what some describe as "the golden age of NLP", of similar importance to how ImageNet elevated computer vision to the next level. Natural languages are characterized by multiple types of ambiguity; teaching a machine how to parse them is therefore a challenging task.
Tonality, several distinct kinds of syntactic and terminological ambiguity, unclear sentence-level semantics, lack of structure, and scientific collocations are all factors that make natural language hard for a computer to process. Around 80% of the data available on the web is unstructured (much of it natural language text): a goldmine of information and untapped potential. To utilize this information, NLP algorithms are needed to extract the data, structure it, and identify its semantics in order to gather information and insights from text. NLP is the programmatic component of the larger field of computational linguistics, which in turn draws on traditional linguistics (the study of language) and computer science.
Requirements engineering (RE) is a field concerned with defining, documenting, and managing engineering requirements during the design phase of engineering projects, commonly in systems and software engineering. The artifacts involved range from requirements databases managed by tools such as IBM Rational DOORS to specification documents, the latter usually written in unstructured natural language text. In practice there is a mix of unstructured, semi-structured, and fully structured documentation. Requirements can be contextually related to one another: one requirement may further specify, block, or enclose another.
There are also different requirement types: software requirements, system requirements, stakeholder requirements, or test requirements. There are often links between requirements of different types; for instance, a software requirement can be linked to the test requirement that verifies it. Establishing meaningful connections between these requirements is essential for the system or software being designed, but also for the project's velocity and productivity.
Connecting requirements is usually a human endeavour, and a time-consuming, frustrating one. There is also much room for error, even for domain experts linking requirements, especially when labelling the relations between them. Assigning relations (such as 'blocks', 'encloses', or 'specifies') to pairs of linked requirements is not always straightforward and is known to be a cognitively challenging task. The lines between relation types can be blurry, and it takes the combined effort of several domain experts to guarantee a high labelling standard when managing and linking requirements semantically, as in the small sketch below.
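To make this concrete, here is a minimal sketch of how such typed links between requirements could be represented as a graph, using the networkx library; the requirement IDs, texts, and the 'tests' relation label are illustrative assumptions, not a fixed schema.

```python
# A minimal sketch of a requirements graph with typed relations (networkx).
# Requirement IDs, texts, and relation labels here are illustrative only.
import networkx as nx

g = nx.DiGraph()
g.add_node("SYS-12", text="The system shall log every login attempt.", kind="system")
g.add_node("SW-34", text="The auth module shall write login events to the audit log.", kind="software")
g.add_node("TEST-7", text="Verify that failed logins appear in the audit log.", kind="test")

# Typed, directed links between requirements
g.add_edge("SYS-12", "SW-34", relation="specifies")
g.add_edge("TEST-7", "SW-34", relation="tests")

for src, dst, attrs in g.edges(data=True):
    print(f"{src} -{attrs['relation']}-> {dst}")
```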
In supervised learning this is considered a labelling task, and the choice of how to label a specific entry is rarely straightforward: even experienced domain experts may run into decision-making issues during the labelling process. Assigning the task to only one domain expert would reduce the ground truth data to the subjective judgment of a single person, shaped by cognitive biases we may not even be aware of. That is not what we want when conducting knowledge and information management of sensitive engineering data. Teams that manage information and knowledge this way run an elevated risk of distorting information that needs to be precise, clean, and reliable enough to be used for training a machine learning model, for instance.
For engineers it is essential to have precise knowledge of all the requirements for implementing a system or software, but acquiring it can be very cumbersome: requirements documents are extensive, complex sources of information. Wouldn't it be much better to have a helper system that can parse all those documents and create a semantic representation that is both machine-readable and more digestible for project members? The result will certainly still have to be checked by domain experts, but far less time will have to be invested into managing requirements altogether. This can save a lot of people a lot of trouble. Then again, creating such a solution is a challenge of its own.
In NLP, named entities in text (people, organizations, products, or other domain-specific entity types) are found by training a model on a task called named entity recognition (NER). These named entities are then connected by training another NLP model on relation extraction, one of the core challenges of NLP. Two connected entities and their relation can then be stored as an information triplet, with the entities as nodes and the relation as an edge. This technique can be used to bootstrap a graph representation of the information found in text, transforming textual information into a graphical representation in an end-to-end fashion with minimal human intervention.
NLP usually applies this approach at the term level, for example creating the triplet Tim Cook (PERSON) – CEO_of – Apple (ORG) from the sentence 'Tim Cook is the CEO of Apple.' The tags 'PERSON' and 'ORG' are automatically assigned to the entities in the sentence by an NLP model that has been trained to perform named entity recognition with a given label set.
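As a hedged illustration, a sketch of the NER half of this pipeline with spaCy's pretrained English model might look as follows; the hard-coded 'CEO_of' relation stands in for a trained relation extraction model, which is a separate component not shown here.

```python
# Sketch: extract named entities with spaCy, then form a naive triplet.
# The relation label is hard-coded for illustration; a real pipeline would
# predict it with a dedicated relation extraction model.
import spacy

nlp = spacy.load("en_core_web_sm")  # pretrained pipeline (must be installed)
doc = nlp("Tim Cook is the CEO of Apple.")

ents = [(ent.text, ent.label_) for ent in doc.ents]
print(ents)  # e.g. [('Tim Cook', 'PERSON'), ('Apple', 'ORG')]

# Store the two connected entities alongside their relation as a triplet
if len(ents) == 2:
    triplet = (ents[0], "CEO_of", ents[1])
    print(triplet)
```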
In requirements engineering, however, our problem looks different: we often deal with entire sentences being the smallest possible information unit, usually a brief textual description of a software or system requirement. For many use cases, we do not want to decompose that sentence.
Although it makes sense for an NLP system to tag every word in a sentence with its linguistic features, requirements engineering is about establishing connections between whole sentences, or requirement descriptions, which is a fundamentally different task from a machine learning point of view. Many of the standard NLP approaches are therefore not going to help us here.
With transformer models, we can take a pretrained model that already has a fairly comprehensive understanding of its input text, use it for inference on domain-specific text such as requirements or legal documents, and let it predict a numerical representation of the incoming text. These numerical representations, called embeddings, are vectors that can be used for more advanced AI tasks.
My findings at Itemis AG suggest that even in many domain-specific scenarios such as requirements engineering, these models perform quite well at representing the semantics of the input data.
A defining characteristic of transformer models is that they can be further fine-tuned for more specific downstream NLP tasks, such as question answering (used in chatbots and search), text summarization, text generation, next-word prediction, or machine translation. This approach of adding a task-specific supervised stack on top of the output of a task-agnostic, unsupervised stack is known as transfer learning.
In our case, however, the typical downstream tasks used with transformer models cannot directly do what we intend to do. We also cannot create a preliminary model for named entity recognition, as we are treating entire sentences as single entities (each sentence being the description of a specific engineering requirement), a rather unusual way of representing data or information in NLP as we know it.
What we can do instead is develop a new type of standalone classifier, one that utilizes the data representation inferred by the pretrained transformer model and handles our custom-defined downstream task with a separate, custom model rather than a fine-tuned transformer. This gives us more decision-making power regarding the nature of the model. All we need is the output of the transformer model on our input data, fed as input to our own model.
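A minimal sketch of what such a standalone classifier could look like in PyTorch, assuming precomputed pairs of sentence embeddings and an illustrative relation label set (the 768-dimension size and hidden width are assumptions):

```python
# Sketch: a standalone classifier head over frozen transformer embeddings.
# It takes a pair of precomputed sentence embeddings and predicts a
# relation label; dimensions and labels are illustrative assumptions.
import torch
import torch.nn as nn

RELATIONS = ["specifies", "blocks", "encloses", "unrelated"]

class RelationClassifier(nn.Module):
    def __init__(self, emb_dim=768, hidden=256, n_labels=len(RELATIONS)):
        super().__init__()
        # Input: the two sentence embeddings concatenated -> 2 * emb_dim
        self.net = nn.Sequential(
            nn.Linear(2 * emb_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden, n_labels),
        )

    def forward(self, emb_a, emb_b):
        return self.net(torch.cat([emb_a, emb_b], dim=-1))

model = RelationClassifier()
a, b = torch.randn(4, 768), torch.randn(4, 768)  # a batch of 4 embedding pairs
logits = model(a, b)                             # shape: (4, len(RELATIONS))
```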
As mentioned, transformers create word embeddings (word vectors) that represent the words of the input text with a state-of-the-art, context-sensitive approach, mapping them into a vector space. But we want to represent entire sentences as vectors, not individual words. It turns out we can derive a summary sentence vector from all the word vectors within a sentence, accurately representing the sentence, in this case an engineering requirement, as a vector that can be repurposed for many future deep learning endeavours.
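One common way to do this, sketched below with the Hugging Face transformers library, is attention-mask-aware mean pooling over the word vectors; the choice of bert-base-uncased and the example requirement texts are assumptions for illustration.

```python
# Sketch: turn word-level transformer outputs into one sentence vector per
# requirement via mean pooling that ignores padding tokens.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

requirements = [
    "The system shall log every login attempt.",
    "Failed logins shall be written to the audit log.",
]
batch = tokenizer(requirements, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    token_embs = model(**batch).last_hidden_state  # (batch, seq_len, hidden)

# Average the word vectors of each sentence, skipping padding positions
mask = batch["attention_mask"].unsqueeze(-1)       # (batch, seq_len, 1)
sentence_embs = (token_embs * mask).sum(1) / mask.sum(1)
print(sentence_embs.shape)  # torch.Size([2, 768])
```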
One of the key innovations of models like BERT is that instead of randomly generating initial vectors for every input instance and using these to train a classifier, we start from a much more accurate, pretrained representation, improving the performance of the classifier trained on top. Transformer models are also more dynamic and context-sensitive than preceding deep learning models such as RNNs, and their input embeddings store more information about the text they represent than word2vec's static embeddings, which is why these vectors are commonly referred to as contextual word embeddings.
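A small sketch can make the "contextual" part tangible: the vector for the same word differs with its surrounding sentence, unlike a static word2vec vector. The sentences and the single-token lookup below are illustrative assumptions.

```python
# Sketch: the contextual vector for "bank" depends on its sentence,
# unlike word2vec's single static vector per word.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence, word):
    batch = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state[0]  # (seq_len, hidden)
    tokens = tokenizer.convert_ids_to_tokens(batch["input_ids"][0])
    return hidden[tokens.index(word)]  # vector at the word's token position

v_river = word_vector("he sat by the river bank .", "bank")
v_money = word_vector("she deposited money at the bank .", "bank")
print(torch.cosine_similarity(v_river, v_money, dim=0).item())  # < 1.0
```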
Once the system has computed the embeddings for our requirements texts, we calculate semantic similarity scores between each vector pair, combining every vector with every other vector except itself.
Assume we have a database with ten thousand requirements. Without AI, engineers would have to connect or group requirements by hand. Now we have a system that computes the embeddings for each requirement, stacks them, and, by means of efficient matrix multiplication, computes a tensor holding the semantic similarity scores for all combinations of embedding pairs.
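As a sketch, using the sentence embeddings from the pooling step above, the all-pairs scores can be computed in a single matrix multiplication and the best-matching pairs ranked directly:

```python
# Sketch: score all requirement pairs at once. L2-normalize the stacked
# embeddings, multiply the matrix with its transpose to get cosine
# similarities, mask the diagonal (self-pairs), and rank the top pairs.
import torch
import torch.nn.functional as F

# sentence_embs: (n_requirements, hidden), e.g. from the pooling sketch above
embs = F.normalize(sentence_embs, dim=1)
scores = embs @ embs.T              # (n, n) cosine similarity matrix
scores.fill_diagonal_(-1.0)         # exclude each vector paired with itself

# Rank pairs by similarity (each pair appears twice in a symmetric matrix)
values, flat_idx = torch.topk(scores.flatten(), k=4)
pairs = [(i.item() // scores.size(1), i.item() % scores.size(1)) for i in flat_idx]
print(list(zip(pairs, values.tolist())))
```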
Just think about combining ten thousand unique engineering requirements pairwise by hand: arduous, time-consuming manual labour. No human could connect all requirements, calculate a similarity score for each pair to decide which requirements may or may not belong together contextually, and then rank them by score, all within a matter of minutes!
As mentioned, we can represent the output similarity scores as tensors, using PyTorch's tensor data structure, then extract the value belonging to each pair and store pair and score together as a unified piece of information. We keep the individual scores as single-value tensors rather than converting them to plain numerical types such as float, to avoid losing precision during type conversion of high-precision values.
Using robust PyTorch tensors keeps our calculations robust. For exceptionally large requirements databases, the calculations may take a while depending on the data, so we store our vectors to compute them only once, on a CUDA-enabled GPU if necessary. The NLP system we are building, which uses a conglomerate of AI algorithms across its components, also performs well on a CPU, depending on how much client data it needs to process.
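A small sketch of this compute-once caching and device-selection logic; the file path is illustrative, and compute_embeddings is a hypothetical stand-in for whichever embedding step is used (e.g. the pooling sketch above):

```python
# Sketch: compute embeddings once, cache them to disk, and load onto a
# CUDA GPU when available. compute_embeddings is a hypothetical helper.
import os
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
cache_path = "requirement_embeddings.pt"  # illustrative path

if os.path.exists(cache_path):
    sentence_embs = torch.load(cache_path, map_location=device)
else:
    sentence_embs = compute_embeddings(requirements).to(device)  # hypothetical
    torch.save(sentence_embs.cpu(), cache_path)  # cache for future runs
```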
It's an exciting journey to build a system that can read all your documents and represent the key information as a digestible data structure of interconnected information pieces: something like an auto-generated, interactive mind map, which could even be visualized, where things are logically connected to one another. Far better than scouring endless requirements documents, doing tons of work just so your team can start on the actual project tasks in the first place.
If this article has piqued your interest, feel free to reach out to us and we will evaluate how NLP can add value to your business!
Also stay tuned for our future posts, where we will talk more about NLP, including posts aimed at business professionals, painting a big picture of NLP for anyone unfamiliar with the engineering concepts mentioned here.
If you enjoyed this article or want to share your opinion, leave us a holler in the comments section. Also make sure to smash that like button!
We will keep you posted on our current NLP endeavours.