According to The Mythical Man-Month, the essence of software is a construct of interlocking abstract concepts.1 Complexity, conformity, changeability and invisibility are inherent properties of that essence. Given the radical differences between software development and other engineering disciplines, complexity and changeability assume greater importance. The very process of creating software makes it prone to complexity. Two main sources of this complexity are the functional and structural changes that surface as the system is developed.
The Theory of Inventive Problem Solving (TRIZ) is a large collection of empirical methods discovered and invented through comprehensive studies of millions of patents and other inventions for problem formulation and possible solution directions.2 One of the pillars of TRIZ is the quest for ideality. TRIZ forces problem solvers to define the ideal system, which is defined as function achieved without resources or harm. It is extremely useful when it is known what function the system being designed needs to perform. This is fairly easy for a system whose functionality, once defined, will not change – a hardware system is an example.
By contrast, a software system is developed in an evolutionary way – no one knows the final fine-grained functionality of the system up front. A company starts with one idea, and the system typically ends up looking like something else. In this scenario, what is the ideal system? Should the structure of the system be considered rather than just its function?
Many different metrics exist for measuring software complexity. Researchers have measured software complexity from the perspective of code complexity, from the point of view of coupling and cohesion, and even as the disorder in the code, defined as software entropy.3 The literature also studies the relationship between software complexity and software reliability.5,7 The relationship between software complexity and the interacting processes of coupling and cohesion has been studied explicitly.4 The system complexity estimator (SCE) and system change impact model (SCIM) quantify the structural complexity of a software system.16
The author proposes that the ideality of system structure is just as important a constituent in the design of a system as the achievement of function, perhaps reflecting the wisdom of the ancients, who proclaimed, "It is not only the end but the means as well that matters." By "means," the author is referring to the structure of the system. The author further proposes that ideality should be congruent with simplicity, i.e., the least complex software system. This article describes experiments with this line of thinking in a specific scenario – the development of an identity security software system.
In a software system, complexity emanates from the unstructured nature of the software, the gap between actual requirements and those specified, the gap between requirements specification and design, and the gap between design and implementation (the actual code written).6
The structure of the software system is composed of multiple elements joined together to provide system-level functionality. The elements can be functions, independent modules, procedures, classes and objects. Their interaction is based on the content that they transfer to each other while the software system is executed. Content may be simple data, data structure, control information, functions or programs. Standard design practice dictates that such coupling should be minimized. Further, each element of the software system should be as cohesive as possible. Complexity emanates from a lack of cohesion in each module and the strength of coupling between various modules.9,10
Modularity is central to the design and development of software. Modular systems incorporate collections of abstractions in which each functional abstraction, each data abstraction and each control abstraction handles a local aspect of the problem being solved.9,10 The coupling-cohesion criteria mean that the system is structured to maximize the cohesion of elements in each module and to minimize coupling between modules.
Coupling refers to the degree of interdependence among the components of a software system. Good software should obey the principle of low coupling. The cohesion of a module is defined as a quality attribute that seeks to measure the singleness of purpose of a module. Cohesion seeks to maximize the connections within a module. Composite module cohesion has been quantitatively defined.11
A measure of complexity called software entropy takes into account the disorder in the code.3 The disorder in the system depends upon the lack of cohesion in the modules, level of coupling between modules and complexity of the modules. The Software Engineering Institute (SEI) offers the maintainability index (MI), which states that a program's maintainability is calculated using a combination of widely-used and commonly-available measures.8,12 Taking a cue from social network analysis, one source defines the system complexity estimator as a measure of complexity of the system.16 This is an integrated metric that combines coupling and cohesion of various components of a system.
Software system complexity is defined as a measure of the non-cohesion of a system's constituent modules and the interdependencies of modules.16 This is closer to the design guideline of minimum coupling and maximum cohesion.9,10 The SCE computes overall complexity of the system using the centrality measures typically used in social network analysis to identify the relative importance of different actors based on their connectivity with the rest of the network.6,13
The SCE starts with the definition of an ideal software system: "A system with completely independent elements (modules) where each module performs a single function is the least complex architecture – this is the ideal architecture for a system. In such an ideal architecture/design the system complexity is minimized." Ideally a module should perform only a single function.
Further, the SCE identifies two levels of complexity – one at the element/module level and another at the level of interdependencies among elements. Non-cohesion is measured by the cardinality of functions performed by each module: the more functions a particular module performs, the less cohesive it is. The second level of complexity is the interdependencies among system elements. There are two kinds of interdependency: 1) how much the module depends on the system for its functioning and 2) how much the system depends on the module for its functioning. As an example, the dependency matrix in Table 1 describes a four-module software system.
Table 1: System Dependency Matrix

|    | A1  | A2  | A3  | A4  |
|----|-----|-----|-----|-----|
| A1 | 1.0 | 0.5 | 0.0 | 0.0 |
| A2 | 1.0 | 1.0 | 0.8 | 0.0 |
| A3 | 0.0 | 0.5 | 1.0 | 0.0 |
| A4 | 0.2 | 0.0 | 1.0 | 1.0 |
Once the dependency matrix (D) and the number of functions performed by each module are obtained, a system complexity matrix (X) can be constructed.2 Each element of X, i.e., xij is computed as:
x_ij = d_ij × H_j

where d_ij is the element in row i, column j of matrix D; H_j is the non-cohesion of module j, which equals the number of functions performed by j; and x_ij is the element in row i, column j of matrix X. The system complexity matrix (SCM) for the example above is shown in Table 2.
Table 2: System Complexity Matrix

|              | A1  | A2  | A3  | A4  |
|--------------|-----|-----|-----|-----|
| Non-cohesion | 3   | 2   | 1   | 6   |
| A1           | 3.0 | 1.0 | 0.0 | 0.0 |
| A2           | 3.0 | 2.0 | 0.8 | 0.0 |
| A3           | 0.0 | 1.0 | 1.0 | 0.0 |
| A4           | 0.6 | 0.0 | 1.0 | 6.0 |
The overall system complexity (W) is the sum of all elements of the matrix; the elements of Table 2 total 19.4. To find the relative contribution of each module to the overall complexity, the dependencies must be examined in more detail.
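The construction of X and the total W can be reproduced in a few lines of plain Python – a sketch using the values of Tables 1 and 2 (variable names D, H, X and W follow the article's notation):

```python
# Dependency matrix D from Table 1 (rows and columns: A1..A4).
D = [
    [1.0, 0.5, 0.0, 0.0],
    [1.0, 1.0, 0.8, 0.0],
    [0.0, 0.5, 1.0, 0.0],
    [0.2, 0.0, 1.0, 1.0],
]
# Non-cohesion H_j: the number of functions performed by module j.
H = [3, 2, 1, 6]

# System complexity matrix: x_ij = d_ij * H_j.
X = [[D[i][j] * H[j] for j in range(4)] for i in range(4)]

# Overall system complexity W is the sum of all elements of X.
W = sum(sum(row) for row in X)
print(X[0])         # first row of Table 2: [3.0, 1.0, 0.0, 0.0]
print(round(W, 1))  # 19.4
```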
Two kinds of dependency mapping yield two corresponding indices, called the module dependency on the system index (MDSI) and the system dependency on the module index (SDMI). The MDSI for each module is given by the corresponding element of the normalized eigenvector associated with the principal eigenvalue of the SCM; the SDMI is obtained from the same computation applied to the transpose of the SCM. The MDSI values for the X matrix are given in Table 3, and the SDMI values obtained from the transposed matrix are shown in Table 4.
Table 3: Normalized Eigenvector Corresponding to Principal Eigenvalue

|    | A1   | A2   | A3   | A4   | MDSI |
|----|------|------|------|------|------|
| A1 | 0.45 | 0.25 | 0.00 | 0.00 | 0.18 |
| A2 | 0.45 | 0.50 | 0.29 | 0.00 | 0.31 |
| A3 | 0.00 | 0.25 | 0.36 | 0.00 | 0.15 |
| A4 | 0.09 | 0.00 | 0.36 | 1.00 | 0.36 |
Table 4: Relative Contribution of Modules to System Complexity

|    | MDSI | SDMI | Average (r) | Module Complexity (r × W) |
|----|------|------|-------------|---------------------------|
| A1 | 0.18 | 0.34 | 0.26        | 4.97                      |
| A2 | 0.31 | 0.27 | 0.29        | 5.66                      |
| A3 | 0.15 | 0.19 | 0.17        | 3.34                      |
| A4 | 0.36 | 0.20 | 0.28        | 5.43                      |
Table 4 shows that the average of MDSI and SDMI gives the relative contribution (r) of a given module to overall system complexity. Multiplying r by the overall system complexity (W) yields the module complexity. Modules A1, A2, A3 and A4 contribute 4.97, 5.66, 3.34 and 5.43 to the overall complexity, respectively.
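The values in Tables 3 and 4 can be reproduced with the column-normalize-and-row-average procedure familiar from the AHP, a standard approximation of the principal eigenvector. A sketch in plain Python, with the matrix taken from Table 2 (this procedure, rather than a literal eigen-decomposition, is what matches the tabulated numbers):

```python
# System complexity matrix X from Table 2 (rows and columns: A1..A4).
X = [
    [3.0, 1.0, 0.0, 0.0],
    [3.0, 2.0, 0.8, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.6, 0.0, 1.0, 6.0],
]
W = sum(sum(row) for row in X)  # overall system complexity, 19.4

def priority_vector(M):
    """Normalize each column to sum to 1, then average across each row –
    the AHP-style approximation of the normalized principal eigenvector."""
    n = len(M)
    col = [sum(M[i][j] for i in range(n)) for j in range(n)]
    return [sum(M[i][j] / col[j] for j in range(n)) / n for i in range(n)]

def transpose(M):
    return [list(c) for c in zip(*M)]

mdsi = priority_vector(X)             # module's dependency on the system
sdmi = priority_vector(transpose(X))  # system's dependency on the module

for name, m, s in zip(["A1", "A2", "A3", "A4"], mdsi, sdmi):
    r = (m + s) / 2                   # average contribution (Table 4)
    print(f"{name}: MDSI={m:.2f} SDMI={s:.2f} complexity={r * W:.2f}")
```

For this matrix the loop reproduces Table 4: A1 contributes 4.97, A2 5.66, A3 3.34 and A4 5.43.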
A radar plot or Kiviat chart showing the complexity contribution of each module provides a system complexity map of the software design, which is useful for finding complexity imbalances in the software system. The SCE-SCIM framework has been used to describe an approach to robust inventive software design that combines the analytic hierarchy process (AHP), TRIZ and the design structure matrix (DSM) as an integrated framework.14,15
Akiva is a data scrambling tool for masking data in enterprise database applications. It is designed to "de-identify" personal and sensitive data for use in a variety of situations such as software development, implementation, testing and outsourcing. It allows the creation of disguised copies of production databases, provides realistic and fully functional databases without compromising privacy, and offers an additional level of data protection beyond firewalls and encryption. The masking algorithms that Akiva implements include blank out, replacement, generic shuffle, SSN generation, scrambling and LUHN (see Table 5).
The web-based application has been developed using the J2EE framework. The masking algorithms have been implemented using PL/SQL procedures in Oracle.
The existing Akiva design has an estimated 8,000 lines of code, with 18 modules performing a total of 54 unique functions; the average number of functions per module is three. The SCE established that the system complexity was 88.7. (The system complexity map is shown in Figure 1.)
[Figure 1: System complexity map of the existing design]
The blue area highlights the complexity imbalance created by the module "Masking," which contributes the most to the overall complexity. In the ideal system, the complexity should be equivalent to the number of functions being performed, which is 54.
The team brainstormed to look at alternative designs for minimizing the system complexity. Three alternative designs evolved as shown in Figures 2, 3 and 4. The first has 22 modules, the second has 39 modules and the third has 42 modules.
[Figures 2, 3 and 4: The three alternative designs]
As the number of modules increased, the functions per module decreased – 3, 1.3 and 1.2, respectively. Compared to the existing design, this was definitely an improvement. However, the overall complexity of the product actually increased in all three design options because of the increased number of couplings among the modules – to 102.3, 174 and 154.6, respectively.
The team brainstormed further to look at ways of reducing the coupling. It hit upon the idea of a router, which had been suggested earlier during the discussions but not pursued. In networking, a router is a device (usually hardware but sometimes software) that determines where a given packet should head next, based on its knowledge of the state of the network at any given moment. By applying this concept, the team derived a design (shown in Figure 5) with 36 modules performing 45 functions, which equals 1.3 functions per module. The overall complexity decreased to 81 from 89 in the original design. This is a much cleaner design and easier to maintain.
[Figure 5: The final evolved design with the router]
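The router concept can be sketched as a single dispatch point: instead of every caller coupling to every masking module, callers couple only to the router, which looks up the masking function registered for a field. A minimal hypothetical sketch – the function and field names are illustrative, not Akiva's actual API:

```python
# Hypothetical single-field router. Each masking algorithm is a
# single-function module; callers depend only on route().

def blank_out(value):
    """Erase the field entirely."""
    return ""

def replacement(value):
    """Replace the value with a same-length filler."""
    return "X" * len(str(value))

# Routing table: field name -> masking function. Adding an algorithm
# means registering it here, without touching any caller.
ROUTES = {
    "ssn": replacement,
    "notes": blank_out,
}

def route(field, value):
    """Dispatch a field value to its registered masking algorithm."""
    return ROUTES[field](value)

print(route("ssn", "123-45-6789"))  # every character replaced with "X"
```

The design choice mirrors the networking analogy: each algorithm module stays single-function (maximally cohesive), and the only coupling in the system runs through one routing table.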
The most useful result, however, is a reduction in the lines of code from 7,964 to 3,866 – more than 50 percent. Table 5 shows the lines of code in the modules of the original design and those in the final evolved design.
Table 5: A Comparison of Lines of Code

| Algorithm           | Before | After  |
|---------------------|--------|--------|
| Blank out           | 562    | 109    |
| Replacement         | 575    | 122    |
| Generic shuffle     | 1,435  | 211    |
| SSN generator       | 1,812  | 542    |
| Scrambling          | 2,453  | 974    |
| LUHN                | 1,127  | 168    |
| Log                 | 0      | 150    |
| Single field router | 0      | 1,590  |
| TOTAL               | 7,964  | 3,866  |
| DIFFERENCE          | –      | -4,098 |
Table 6 summarizes the system complexity analysis of the existing design, the alternative design options and the final evolved design. The final evolved design is not only closer to the design guideline of highly cohesive modules but is also coupled only to the extent needed.
Table 6: A Comparison of Design Evaluations

|                      | Number of Functions | Number of Modules | Average Functions/Module | Complexity |
|----------------------|---------------------|-------------------|--------------------------|------------|
| Existing Design      | 54                  | 18                | 3.0                      | 88.7       |
| Design Option 1      | 66                  | 22                | 3.0                      | 102.3      |
| Design Option 2      | 51                  | 39                | 1.3                      | 174.0      |
| Design Option 3      | 51                  | 42                | 1.2                      | 154.6      |
| Final Evolved Design | 45                  | 36                | 1.3                      | 81.2       |
Figure 6 plots system complexity for all of the designs. The evolved Akiva design has the least complexity. The bubble size of each design option indicates the average cohesion, defined as the number of functions per module.
[Figure 6: System complexity of all design options]
Using the ideality concept from TRIZ, the author proposes that it makes more sense to look at structural ideality than at achievement of function alone for software systems. The system complexity estimator was used to evaluate various design alternatives and evolve a final software system that was closer to ideality. This approach not only produced a more robust and maintainable software product, it reduced the code size by more than half. This is a highly desirable result, since the demands on software development productivity are intensifying. Further, the SCE framework can be used to minimize the complexity of non-software products as well.
Note: This paper was originally presented at The Altshuller Institute's TRIZCON2008.
Navneet Bhushan is the founder/director of an innovation co-creating firm, Crafitti Consulting Pvt Ltd. He has worked for close to two decades in managing and developing IT, innovation and productivity solutions, in both large commercial and government organizations. He is the principal author of Strategic Decision Making – Applying the Analytic Hierarchy Process, published by Springer, UK, 2004. His current research interests include complexity, open innovation and globalization. He is a visiting faculty member at Welingkar School of Business Management. Contact Navneet Bhushan at navneet.bhushan (at) crafitti.com or visit http://www.crafitti.com.