Tuesday, April 23, 2013

Component Interaction Diagrams


If you are going to work with other people to build something, you need a way to communicate clearly about what you are building.  From ages past, the clearest way to do that has been to draw pictures.  With software, we do the same thing.  Most tools and methodologies have their own techniques, diagrams, and types of illustrations that are central to their documentation approach.  I'm familiar with most of them.  Over the years, I've refined the best parts of each to construct a set of diagrams that follows the Key Principles and allows me to Move Quickly In the Dark.

The first is called a Component Interaction Diagram, and the second is a Process Flow Diagram.  In reality, all of the documentation formats across the various methodologies and approaches have their strengths and weaknesses and are proposed by smart people for varying reasons.  With a Component Interaction Diagram (CID) and corresponding Process Flow Diagram (PFD), we attempt to focus the diagrams so that they provide the maximum value with the least effort and stay relevant for the longest amount of time to the widest possible audience.  Rather than try to justify the format up front, I'm going to explain how you create them.  As we discuss each aspect of the diagrams and the guidelines for them, the reasons for each will become apparent.  If they don't, perhaps you'll find clear reasons why you prefer whatever documentation you have chosen.

Component Interaction Diagram
A Component Interaction Diagram (CID) has a singular purpose.  It depicts the components utilized in a particular solution scope and the points of interaction between them.  How it does that, the information that can be layered on top of it or derived from it, and the variety of ways that it can be utilized are all secondary considerations.  The way in which we meet the primary purpose is what will allow us to use it to maximum advantage for the widest audiences.  So as we go through the guidelines and process of creation, consider all the downstream impacts and you'll understand why it requires such rigid precision.  You'll also uncover areas where you can forgo precision or formality, and the consequences of choosing to deviate.  In many cases, you may be perfectly happy with reaping only some of the benefits, a decision which would merit following an abbreviated process.

Okay, with the disclaimers and background out of the way, let us begin with the guidelines.

  • A given CID should have a clear, identified and immutable scope that is independent of time.
  • Use symbols to represent the components in scope for a given diagram.
  • Use lines to represent interactions between the components in a given diagram.
  • The consuming or external components are placed in the left most region in a given diagram.
  • The persistent storage or most granular processes are placed in the right most region in a given diagram.
  • Do not depict containment.
  • Do not depict state, sequence, or directionality.
  • Do not depict the flow of data or process logic.
  • Environmental or grouping boundaries are an accepted practice.
  • Use different symbols to represent components of different types.
  • The set of symbols in use should be consistent across a given set of diagrams.
  • Symbols should not contain other symbols.

  • Every component should only exist once in a given diagram.
  • Every component should have a unique identifying label in a given diagram.
  • Classes, tables, and other structures containing state are represented as separate storage components.
  • Methods, functions, and other processing constructs are represented as separate process components.
  • There should be a minimal number of formats for lines in a given diagram.
  • There should only be a single line between any two components in a given diagram.
  • Every interaction line should connect exactly two components in a given diagram.
  • Interaction lines should not have labels, but line format may indicate classification.
  • Utilize call-outs to describe details in common language about a specific component or specific interaction.
  • Utilize note boxes to provide context for a set of components or to describe the interaction semantics.
  • Utilize note boxes to provide rationale for the approach, usage recommendations, or exception semantics.
  • Always provide a legend for symbols and line formats if there are multiple.
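
Several of the structural guidelines above can be checked mechanically. As a rough sketch only, assuming a hypothetical in-memory model (the `Diagram` class, its method names, and the sample component labels are invented for illustration and do not come from any tool), a CID can be treated as a set of uniquely labeled components joined by undirected lines. The sketch reads "exactly two components" as two distinct components, which is an interpretation rather than something the guidelines state.

```python
from dataclasses import dataclass, field


@dataclass
class Diagram:
    """Hypothetical in-memory model of a CID: labeled components, undirected lines."""
    scope: str
    components: dict = field(default_factory=dict)   # label -> symbol type
    interactions: set = field(default_factory=set)   # frozenset({label_a, label_b})

    def add_component(self, label: str, kind: str) -> None:
        # Every component exists once and carries a unique label.
        if label in self.components:
            raise ValueError(f"component {label!r} already exists on this diagram")
        self.components[label] = kind

    def add_interaction(self, a: str, b: str) -> None:
        # Every line connects exactly two components that are on the diagram.
        if a == b:
            raise ValueError("an interaction must connect two distinct components")
        for label in (a, b):
            if label not in self.components:
                raise ValueError(f"unknown component {label!r}")
        line = frozenset((a, b))  # unordered pair, so no directionality is stored
        if line in self.interactions:
            raise ValueError(f"a line between {a!r} and {b!r} already exists")
        self.interactions.add(line)


# Illustrative usage with made-up component names.
cid = Diagram(scope="Order entry")
cid.add_component("Web client", "consumer")
cid.add_component("Order service", "process")
cid.add_component("Orders table", "storage")
cid.add_interaction("Web client", "Order service")
cid.add_interaction("Order service", "Orders table")
```

Storing each line as an unordered pair means a second line between the same two components is rejected regardless of the order in which they are named, which matches the single-line and no-directionality guidelines.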
Since the guidelines are fairly rigid, let's discuss the intent and reasoning for the common areas of concern.

A diagram should have a particular scope independent of any particular processing state.  Ensuring that every component shows up only once on the diagram allows the diagram to serve as an inventory.  Excluding state and sequencing allows us to track completion against the inventory independent of the orthogonal or cross-cutting nature of the components.  Simply put, a component is complete for a given diagram when it satisfies the interactions present on that diagram.  Enforcing that each component shows up only once also allows for an accurate depiction of multiple dependencies and ensures that polymorphic or iteratively developed components and functionality libraries are properly decomposed.
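
One way to read the completion rule is as a set comparison. The sketch below is illustrative only: the function name, the representation of interactions as pairs of labels, and the sample components are all assumptions made for the example. A component counts as complete once every interaction line touching it on the diagram has been satisfied.

```python
def complete_components(components, interactions, satisfied):
    """Return the components whose every interaction on the diagram is satisfied.

    components   -- iterable of unique component labels (the inventory)
    interactions -- iterable of (label_a, label_b) pairs, one per line on the diagram
    satisfied    -- iterable of (label_a, label_b) pairs that have been implemented
    """
    # Lines have no direction, so compare them as unordered pairs.
    lines = {frozenset(pair) for pair in interactions}
    done = {frozenset(pair) for pair in satisfied}
    pending = lines - done
    return {
        label
        for label in components
        if not any(label in line for line in pending)
    }


# Made-up inventory: a shared library appears once, with two dependents.
inventory = ["Web client", "Batch job", "Pricing library", "Prices table"]
lines = [
    ("Web client", "Pricing library"),
    ("Batch job", "Pricing library"),
    ("Pricing library", "Prices table"),
]
implemented = [("Pricing library", "Web client"), ("Pricing library", "Prices table")]

finished = complete_components(inventory, lines, implemented)
```

Because the shared library appears only once, both of its dependents show up as lines on the same symbol, and in this example it stays incomplete until the interaction with the batch job is also satisfied.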

Adding state, sequence or flow to a diagram requires the introduction of time, which modifies the scope. The nature of state, sequence and flows means the diagram would not have a clearly delineated scope. Introducing time information to a CID requires that the user understand the particular pre- and post-conditions to validate the scope of the diagram. This will inevitably complicate the diagram, often requiring multiple diagrams and forcing the audience to make assumptions about the nature of the components or interactions.  All of these side effects will allow a single diagram to have multiple interpretations which can all appear accurate.  For these reasons and others, this information should be represented separately in a Process Flow Diagram using the symbols and components from this diagram.

For a particular set of diagrams to be consumed easily by a variety of audiences, the information needs to be presented consistently.  Positioning components consistently within a diagram therefore makes it possible to compare diagrams and to follow interactions through symbols across different diagrams.

Components on a diagram need to stand independently so that their attributes can be granularly managed for inventory, tracking, and validation.  Containment makes calculation and attribute management very difficult and can introduce artificial assumptions about scope.  Decomposition becomes unwieldy and harder to validate with the introduction of containment.  The use of bounding boxes or shaded backgrounds for grouping is an accepted alternative but should be used sparingly to reduce complexity and ease consumption.

The two major classifications of components that are appropriate on a CID are components for storage and for processing.  Decomposing processing components is usually self-explanatory; the only challenge is finding the appropriate level at which to stop.  Reasons to decompose below the assembly or interface boundary include tracking the contributing teams, the use of different skill sets, or when implementation is iterative.  Consider that every decomposition increases the barriers to construction and consumption.

Deciding which storage components to decompose can be challenging. As a general rule of thumb, only persistent or shared data structures need to be present on the diagram as storage components.  For example, a class that is used to transmit data across an interface isn't appropriate because it is transient (not persisted) and is only accessed by the components independently.  Alternatively, an in-memory class that manages thread state and is monitored by a controlling process is shared but not persistent and therefore still meets the criteria for placement on the diagram.  A file which receives log updates or a table which is updated as the outcome of a process are both persistent and therefore meet the criteria for use on the diagram.  In any case, all data structures for which you desire tracking can be included. Transient structures are strongly discouraged because of the complications involved with fitting them into the diagram.  Again, the trick is to balance the desire to track at the most granular level against the ease of constructing and consuming these diagrams.
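
The rule of thumb for storage components can be written as a simple predicate. This is a sketch under stated assumptions: the `DataStructure` type, its two flags, and the function name are hypothetical, and the `shared` values for the log file and the table are guesses made for the example, since the text only says they are persistent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStructure:
    """Hypothetical description of a candidate storage component."""
    name: str
    persistent: bool  # survives beyond the process that writes it
    shared: bool      # accessed jointly by more than one component


def belongs_on_diagram(candidate: DataStructure) -> bool:
    # Rule of thumb from the text: persistent or shared structures qualify.
    return candidate.persistent or candidate.shared


# The four cases discussed above, with assumed flag values.
candidates = [
    DataStructure("Interface message class", persistent=False, shared=False),
    DataStructure("Thread state class", persistent=False, shared=True),
    DataStructure("Log file", persistent=True, shared=False),
    DataStructure("Results table", persistent=True, shared=False),
]
storage_components = [c.name for c in candidates if belongs_on_diagram(c)]
```

Only the interface message class is filtered out here. The predicate captures the default; any structure you want to track can still be added by hand, at the cost of a busier diagram.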

Since my examples are generally scrubbed for particular purposes, you'll just have to contact me if you'd like one.