An important challenge to be explored further is the set of issues around analysis, scalability, transparency and quality assurance of systems that continuously base their decisions on statistical estimates and runtime data. In the following two Sections, we will discuss our approaches and ideas in this direction.
55.3 Get to Know Your Software: Non‐Functional Requirements for Adaptive Systems
In the previous Section, we discussed how to shift some activities traditionally placed at design time in the software development process towards runtime. We showed how planning can be used to make a software product adhere to specified functional requirements during runtime. However, software development is also concerned with non‐functional requirements (NFRs), i. e., properties like performance, scalability or robustness, whose fulfillment can be just as crucial for many software applications.
55.3.1 NFR Assessment
In the traditional software development process, non‐functional requirements are assessed on a per‐domain basis by respective experts. For a web shop application, e. g., performance can be defined as the average time to respond to requests on the web shop’s site or as the maximum number of users the web shop can serve concurrently. For factory automation software, performance might be defined as the achievable throughput of parts or as the maximum time it takes to implement a desired change of production in the factory setup. Even these quickly sketched examples make clear that there often is an intricate trade‐off between multiple non‐functional requirements: a factory may be able to improve its reaction time to changes by keeping a relatively large amount of equipment on stand‐by, but this decision may in turn diminish the throughput the factory can provide when no change is required at all.
Thus, even though non‐functional properties are often sufficiently general in their definition to apply to a wide array of systems built for various functional purposes, the exact measurements of interest need to be suitably engineered to match the problem at hand [8]. To this end, the product owner (or whoever is responsible for the requirement specification in the employed development process) typically defines a series of use cases related to non‐functional properties. These could read like this:
“When 50 or fewer users are putting an order into the web shop at the same time, the maximum response time for the server to any of these users must be at most 500 milliseconds.” (Performance)
“When orders to the smart factory have not changed for more than one hour, the throughput should be at least 100 items per hour.” (Performance)
“When a machine in the smart factory crashes, it should take at most 15 minutes for the factory to reconfigure so that the faulty machine is compensated for.” (Robustness)
As the number of specific use cases increases, it becomes increasingly probable that there exist situations where conflicts between different requirements may arise. A prioritization of the defined use cases is thus an intrinsic part of a specification of non‐functional requirements.
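One way to make such use cases operational is to encode them as machine‐checkable predicates with attached priorities. The following Python sketch is purely illustrative: the metric names, thresholds and the priority scheme are our own assumptions, not part of any specific framework.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class NFRUseCase:
    """One measurable non-functional use case (all names illustrative)."""
    name: str
    category: str                    # e.g. "Performance", "Robustness"
    priority: int                    # lower number = more important
    check: Callable[[dict], bool]    # metrics snapshot -> fulfilled?

# Hypothetical encodings of the three use cases quoted above.
use_cases = [
    NFRUseCase("web shop response time", "Performance", 1,
               lambda m: m["concurrent_users"] > 50
                         or m["max_response_ms"] <= 500),
    NFRUseCase("steady-state throughput", "Performance", 2,
               lambda m: m["minutes_since_order_change"] <= 60
                         or m["items_per_hour"] >= 100),
    NFRUseCase("reconfiguration after crash", "Robustness", 1,
               lambda m: not m["machine_crashed"]
                         or m["reconfig_minutes"] <= 15),
]

def violated(metrics: dict) -> list:
    """Return all violated use cases, most important first, so that
    conflicts can be resolved according to the prioritization."""
    return sorted((u for u in use_cases if not u.check(metrics)),
                  key=lambda u: u.priority)
```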
In today’s software, these kinds of engineering decisions are made by experienced developers with detailed knowledge of the software’s working domain and of the likelihood of various environmental events and the adaptations they entail. In future software applications, such as industry 4.0 setups, decisions of similar magnitude may be imposed onto some online adaptation mechanism: as intelligent software may deem it necessary to re‐organize a smart factory’s configuration to respect a change in functional requirements, the importance and meaning of several non‐functional requirements defined for the previous system configuration may change significantly. As a consequence, the process of NFR engineering, i. e., ensuring the fulfillment of non‐functional requirements, needs to be performed at runtime or, more specifically, any time autonomous adaptation may occur.
55.3.2 Building Models for NFRs
This argument presupposes a strong form of adaptation which may, among other things, introduce structural changes to the system and may be based on goal changes completely unpredicted at design time. For simpler systems, which can only make very limited decisions, i. e., adapt in a few dimensions statically specified at design time, it is often possible to prove NFR properties statically at design time by simply iterating over all system configurations that may occur. In this case, NFR engineering can be performed quite similarly to how it is executed for classical, non‐adaptive systems. However, for modern applications like industry 4.0 settings, checking against all possible configurations a system may adopt during its runtime is not feasible.
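For such statically bounded systems, the exhaustive design‐time check is straightforward. A minimal sketch, assuming the adaptation dimensions are fixed and a satisfies_nfrs oracle (e. g., backed by analysis or simulation) is available:

```python
import itertools

# Hypothetical adaptation dimensions fixed at design time.
DIMENSIONS = {
    "worker_threads": [2, 4, 8],
    "cache_policy": ["lru", "fifo"],
    "replicas": [1, 2, 3],
}

def all_configurations():
    """Enumerate every configuration the system may ever adopt."""
    keys = list(DIMENSIONS)
    for values in itertools.product(*DIMENSIONS.values()):
        yield dict(zip(keys, values))

def statically_verified(satisfies_nfrs) -> bool:
    """Feasible only while the space stays small: here 3 * 2 * 3 = 18
    configurations; industry 4.0 decision spaces explode beyond this."""
    return all(satisfies_nfrs(cfg) for cfg in all_configurations())
```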
Nonetheless, any small change in a system’s configuration may have a detrimental impact on non‐functional properties. For every configuration in the decision space of the autonomous software and every NFR of interest for the system under test, assessing the degree to which that NFR is fulfilled yields a separate measurement point. When performing autonomous adaptation in the system under test, we thus need to consider that every small adaptive change can cause our NFR measurements to yield different values. The resulting relation between system changes and the consequential degree of NFR fulfillment is also called the NFR landscape of a given system.
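In code, the NFR landscape can be pictured as a mapping from (configuration, NFR) pairs to measured degrees of fulfillment. The sketch below is only a conceptual illustration; measure stands in for whatever analysis or simulation produces the actual measurement.

```python
def nfr_landscape(configurations, nfrs, measure):
    """One measurement point per configuration and per NFR of interest;
    measure(cfg, nfr) returns the degree of fulfillment, e.g. in [0, 1]."""
    return {(cfg_id, nfr): measure(cfg, nfr)
            for cfg_id, cfg in configurations.items()
            for nfr in nfrs}
```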
Most non‐functional properties are hard to predict analytically without actually running the software. The availability of a powerful simulation tool is thus of central importance for testing and measuring non‐functional properties without threatening the effectiveness of the real system at work. Fortunately, the need for high‐accuracy simulations of industrial hardware has been recognized in various areas of engineering, and adequately comprehensive simulations (while still a subject of current research) are being made available to system builders [9, 10]. Still, which configurations to simulate is yet another decision that has to be made automatically at runtime if the whole process of NFR engineering is to be executed online.
The main focus of automated NFR engineering can thus be described as discovering the most relevant use cases to test the system’s current behavior against. The term “most relevant” is tricky to define adequately, though. Simply speaking, we are mostly interested in the extreme cases, i. e., use cases where non‐functional properties are fulfilled to the least extent as well as those that yield the best results; but we also want to achieve good coverage of all possible cases, i. e., exemplary cases which encompass many different scenarios. For the first part, assessing NFRs is not unlike a standard optimization problem and can be tackled using the same techniques, such as meta‐heuristic search [11] employing, e. g., evolutionary algorithms [12]. For the second part, we need to estimate which regions of the search space of test cases are of more interest than others, but then search them in a balanced manner so that the search process does not converge towards just a few regions (or even a single region) of interest. It is thus important to value a test case not only by how badly the system under test handles it, but also by how many other test cases with similar outcomes exist and how different its setup is from test cases already found when producing a test suite. Coming up with a proper metric that respects all of these aspects is one of the key steps towards automated NFR engineering that is suitable for autonomous systems.
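As one hedged sketch of such a metric, the fragment below combines a badness score (how poorly the system under test handles a case) with a novelty term (mean distance to the nearest cases already kept), in the style of novelty search; the weighting, parameters and callable interfaces are all assumptions.

```python
def novelty(case, kept_cases, dist, k=5):
    """Mean distance to the k nearest already-kept test cases."""
    if not kept_cases:
        return 1.0
    nearest = sorted(dist(case, other) for other in kept_cases)[:k]
    return sum(nearest) / len(nearest)

def build_test_suite(sample_case, badness, dist, budget=1000, keep=50):
    """Probabilistic search over the NFR landscape: retain test cases
    that are both extreme (high badness) and diverse (high novelty).
    The 0.5 weighting of novelty against badness is an assumption."""
    kept = []                                    # list of (case, score)
    for _ in range(budget):
        case = sample_case()                     # e.g. mutate a kept case
        score = badness(case) + 0.5 * novelty(
            case, [c for c, _ in kept], dist)
        kept = sorted(kept + [(case, score)],
                      key=lambda e: -e[1])[:keep]
    return [c for c, _ in kept]
```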
In any case, more resources spent on NFR evaluation are likely to improve the results of automated NFR engineering, as the process described above can be likened to a probabilistic search algorithm on the NFR landscape. However, computation time is in most cases a very limited resource, especially when we need to (re‑)evaluate NFRs online while the system continues its normal operations. It is thus beneficial to guide the search through the NFR landscape towards regions that are more likely to become relevant with respect to the system’s actions and their planned outcomes. Doing so requires a tight integration of online planning, NFR testing and NFR engineering into the system’s online adaptation cycle [10, 13, 14]. In return, it allows developers to continuously monitor and update the degree of fulfillment of NFRs in an ever‐changing system.
55.4 Quality Assurance in Smart Factories
The vision of a smart factory is a production plant which organizes its production processes on its own. In doing so, the production processes are automatically adapted to current production orders, resource availability or other changing requirements. For example, in case of a machine breakdown the system re‐organizes itself automatically in such a way that the work‐pieces are redirected to another machine and production does not have to be stopped. Another use case is that the system adapts its production dynamically to the current production orders: after recognizing that, e. g., the demand for washing machines has increased significantly, the system stops producing dishwashers and re‐organizes itself so that washing machines can be produced. In contrast to classic manufacturing factories, the system makes this decision autonomously, without any human interaction.
This “smartness” is made possible by the fact that the process flow for the production of a work‐piece is not explicitly fixed in advance. Instead, a kind of “recipe” is given to the system, which just lists the necessary steps, but not the explicit machines or stations where the individual actions have to take place. The system then decides at runtime how to put these “recipes” into practice. The advantage of this approach is that, for these decisions, the system can take the current situation into account (e. g., current production orders or current resource availability), and can thus optimize the process flow accordingly.
In other words, the way of producing a workpiece is no longer limited to explicitly predefined process flows. In fact, there are not even limitations regarding the production of certain workpieces, as long as the corresponding “recipe” can be put into practice using the system’s components.
One example where the “smartness” of a factory can really increase efficiency is so‐called lot size one production. In lot size one production, every workpiece is produced just once, i. e., no two workpieces share exactly the same desired properties. Using the aforementioned “recipe”‐based approach, the system just receives a list of the necessary steps for each workpiece. Based on these, the system autonomously defines the explicit work flow without human interaction. Since this planning phase takes place at runtime, the resulting process can be optimized with respect to the current requirements.
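The following sketch illustrates the recipe idea under stated assumptions: station names and capabilities are hypothetical, and a real smart factory would replace the greedy binding with an optimizing planner.

```python
recipe = ["drill", "weld", "paint", "inspect"]   # one lot-size-one order

stations = {                                     # capability catalogue
    "S1": {"drill", "weld"},
    "S2": {"weld", "paint"},
    "S3": {"paint", "inspect"},
}

def plan(recipe, stations, available):
    """Bind each recipe step to a capable, currently available station."""
    flow = []
    for step in recipe:
        candidates = sorted(s for s in available if step in stations[s])
        if not candidates:
            raise RuntimeError(f"no available station offers {step!r}")
        flow.append((step, candidates[0]))       # greedy choice
    return flow

# If S2 breaks down, re-planning with the remaining stations reroutes
# welding to S1 and painting to S3 without stopping production:
print(plan(recipe, stations, available={"S1", "S3"}))
```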
However, regarding the task of quality assurance in such systems, this new paradigm also poses new challenges. The fact that, in theory, every possible path through the production plant, using any combination of machines or components, can be put into practice is particularly challenging. Depending on the complexity of the system and the number of its components, the number of possible work flows grows exponentially: a recipe with ten steps, each executable on five alternative stations, already permits 5^10, i. e., almost ten million, distinct flows. Obviously, beyond a certain number, not all of them can be tested in advance. Thus, the classical approach of testing at design time to assure quality at runtime no longer works. To assure quality anyway, new methods have to be applied at runtime which, on the one hand, detect misbehavior immediately and, on the other hand, provide information about possible root causes. Based on this information, the system can be analyzed, potential weak points can be detected, and quality can thus be assured.
In the next Section, we list concrete challenges and corresponding requirements which have to be taken into account when developing such new quality assurance methods for smart factories.
55.4.1 Challenges
Basically, there are three main challenges when developing quality assurance methods for smart factories: Volume of data, distribution of data, and volume of possible process flows.
Volume of Data
Essential for all the “smart” decisions is that the system knows as much as possible about its current state, its nearby environment, the current tasks and the optimization requirements. The first key challenge is thus to handle all this information. The problem is not to transmit and store the mass of data; the real challenge is to extract the relevant information. In addition, since most of the data is provided just‐in‐time by sensors which monitor the relevant units of the system, the incoming data also has to be processed in real time.
As a consequence, quality assurance methods in smart factories have to be able to process incoming data streams in real‐time. Otherwise, important information cannot be extracted and is not available for the evaluation of the system’s behavior.
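As an illustration of such one‐pass processing, the sketch below folds every sensor reading into running statistics the moment it arrives, so no raw data needs to be buffered for a later batch run; the field names and the alarm threshold are assumptions.

```python
class RunningStats:
    """Incremental mean and peak; updated in O(1) per reading."""
    def __init__(self):
        self.n, self.mean, self.peak = 0, 0.0, float("-inf")

    def push(self, x: float):
        self.n += 1
        self.mean += (x - self.mean) / self.n    # Welford-style update
        self.peak = max(self.peak, x)

def monitor(readings, alarm_at=80.0):
    """Consume a live stream and raise alarms immediately."""
    stats = RunningStats()
    for value in readings:                       # e.g. a sensor stream
        stats.push(value)
        if value > alarm_at:                     # react now, not later
            yield ("alarm", value, stats.mean)
```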
Distribution of Data
As already mentioned, most of the data used in smart factories is provided by sensors attached to relevant units in the system. Depending on the spatial dimensions of the production plant, these sensors are more or less widely spread. To communicate their measurements, they are connected (e. g., via WLAN) and form a network in which messages can be exchanged. However, in order to avoid unnecessary communication overhead in the network, as many data processing tasks as possible should be worked on in a distributed manner. This means that not all relevant information is first sent to one central unit which then processes all data. Instead, at least partial results are pre‐calculated by individual sensors or small sensor groups and then combined into the end result.
Regarding quality assurance methods for smart factories, this ability to work in a distributed manner is the second requirement, which derives from the fact that smart factories are sensor networks.
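A minimal sketch of this pre‐aggregation pattern, with hypothetical values: each sensor group ships only a small partial summary, and the central unit merges the summaries instead of collecting raw readings.

```python
def local_summary(readings):
    """Computed on the sensor node or group itself."""
    return {"n": len(readings), "sum": sum(readings), "max": max(readings)}

def merge(summaries):
    """Computed centrally from the small partial results."""
    summaries = list(summaries)
    return {"n": sum(s["n"] for s in summaries),
            "sum": sum(s["sum"] for s in summaries),
            "max": max(s["max"] for s in summaries)}

groups = [[20.1, 20.4], [19.8, 21.0, 20.2], [22.5]]   # readings per group
total = merge(local_summary(g) for g in groups)
mean = total["sum"] / total["n"]   # identical to the centralized result
```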
Volume of Possible Process Flows
Since production in a smart factory is no longer limited to explicitly predefined process flows, the third key challenge is to handle the volume of possible process flows. With regard to quality assurance, this is the most critical aspect: in contrast to classic quality assurance procedures, in smart factories it is no longer possible to test all work flows in advance. To assure quality anyway, suitable methods have to evaluate the behavior of the system at runtime. In addition, these methods should provide further information which can help to detect the root cause of misbehavior of the system. Based on this analysis, countermeasures can be taken to avoid future faults and thus assure quality.
To evaluate the behavior of the system at runtime, it is essential to know all allowed process flows. Since these can be very numerous, suitable compression methods are needed: the idea is to cover all possible work flows without storing each one explicitly. Thus, when developing quality assurance methods for smart factories, constructing such compressed representations of possible work flows is the third challenge.
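One plausible form of such a compressed representation, offered here as an assumption rather than the chapter’s prescribed method, is a finite automaton over production steps: the transition graph encodes every allowed flow without enumerating them, and an observed flow is validated by walking the edges.

```python
TRANSITIONS = {                       # state -> {step: next state}
    "start":   {"drill": "drilled"},
    "drilled": {"weld": "welded", "paint": "painted"},
    "welded":  {"paint": "painted"},
    "painted": {"inspect": "done"},
}

def is_allowed(flow, transitions=TRANSITIONS,
               start="start", accept="done") -> bool:
    """Check an observed process flow against the compressed model."""
    state = start
    for step in flow:
        state = transitions.get(state, {}).get(step)
        if state is None:
            return False              # deviation: flag as misbehavior
    return state == accept

assert is_allowed(["drill", "weld", "paint", "inspect"])
assert is_allowed(["drill", "paint", "inspect"])
assert not is_allowed(["weld", "drill"])
```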
55.4.2 A Summary on Quality Assurance in Smart Factories
Smart factories are the next generation of manufacturing. The key feature of this new paradigm is that process flows are not predefined in advance. Instead, the system decides autonomously at runtime, depending on the current state and requirements, how to put the given “recipe” into practice. However, this approach poses new challenges for quality assurance; as a consequence, the quality assurance procedure has to be shifted to runtime. Accordingly, in this Section we listed three main challenges and resulting requirements which have to be considered when developing quality assurance methods that can be applied at runtime: first, the methods have to process the incoming data streams in real‐time; second, the processing tasks must be executable in a distributed manner; third, to handle the volume of possible process flows, suitable compression techniques are required which build an adequate representation of all possible process flows.
A summary of the aforementioned challenges and resulting requirements for quality assurance methods in smart factories is given in Table 55.1.

Table 55.1 Challenges and resulting requirements for quality assurance methods in smart factories

Challenges                          Resulting Requirements
Volume of Data                      Real‐Time
Network Structure                   Distributed
Volume of Possible Process Flows    Compressing
55.5 Conclusion & Outlook
In this Chapter, we discussed the potential and challenges of systems with the ability to autonomously decide about their configuration and behavior. To this end, the efficient transformation of available runtime data into evaluations of alternatives is of key importance. However, this new behavioral freedom yields new challenges for system analysis, scalability, transparency and quality assurance. We propose to move classic design time quality assurance activities, such as non‐functional requirement engineering and system testing, to runtime in order to cope with highly volatile system environments and requirements.
Autonomous systems are built to react to changes in requirements on a much smaller time scale than is typically possible when humans are involved. This feature alone enables new possibilities for products and services, e. g., “lot size one” production or coordinating production processes with the current market value of required goods. It is enabled by human experts defining relatively abstract goals for autonomous systems to strive after instead of engineering the complex process in detail. Once the adopted observation, control and safety techniques prove to work for these kinds of scenarios, one can think of entrusting autonomous systems with increasingly broadly defined tasks and thus transferring growing amounts of business logic into the system’s requirements specification. Eventually, system autonomy typical of industry 4.0 may become a feature that affects not only production lines but also the way management decisions are made, or even what is considered a management decision in the first place.