How can we entice the developers of applications involved in a workflow process to implement them in a flexible fashion?
In particular, how can we design the data that flows between the workflow applications so that developers have to implement the applications in a flexible fashion?
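An airline schedules future flights by assigning the following data: the route that will be flown, the plane that will fly the route, and the crew that will staff the plane.
The airline has three corresponding applications for computing the data: a route scheduler, a plane scheduler, and a crew scheduler.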
In the past, business requirements dictated that flight scheduling occur in a certain order: the route was scheduled first, then the plane was assigned, and lastly the plane was staffed with crew:
An XML document would flow from one application to the next. Each application was responsible for "filling in" its part of the flight data. For example, the crew scheduler would examine the XML document that it received from the plane scheduler to see what plane is being used, and given that data it would look inside its database to determine what crew has experience with that airplane.
To support the business requirements, the data is structured in a nested fashion. Here's how the XML document looks after all three applications have inserted their data:
<?xml version="1.0" encoding="UTF-8"?>
<Flight_Schedule>
    <Route>
        <!-- data about the route that will be flown -->
        <Plane>
            <!-- data about the plane that will fly the route -->
            <Crew>
                <!-- data about the crew that will staff the plane -->
            </Crew>
        </Plane>
    </Route>
</Flight_Schedule>
The crew data is nested inside the plane data, which is nested inside the route data.
When the plane scheduler receives the XML document it fills in its data inside the <Route> element. Likewise, when the crew scheduler receives the XML document it fills in its data inside the <Plane> element.
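The fill-in sequence described above can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration: the element names come from the document above, but the function names and the attribute values (id, type, team) are made up for the example and are not part of the airline's actual design.

```python
import xml.etree.ElementTree as ET

def route_scheduler(flight):
    # Hypothetical route data; the real content is not specified here.
    route = ET.SubElement(flight, "Route")
    route.set("id", "BOS-LAX")
    return flight

def plane_scheduler(flight):
    # The plane data goes inside <Route>, so <Route> must already exist.
    route = flight.find("Route")
    plane = ET.SubElement(route, "Plane")
    plane.set("type", "B737")
    return flight

def crew_scheduler(flight):
    # The crew data goes inside <Plane>, so <Route> and <Plane> must exist.
    plane = flight.find("Route/Plane")
    crew = ET.SubElement(plane, "Crew")
    crew.set("team", "T-7")
    return flight

# The document flows from one application to the next in a fixed order.
flight = ET.Element("Flight_Schedule")
for application in (route_scheduler, plane_scheduler, crew_scheduler):
    flight = application(flight)

print(ET.tostring(flight, encoding="unicode"))
```

Note that each function can only locate its insertion point if the previous application has already run. The order is built into the paths "Route" and "Route/Plane".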
Suppose that as the XML document is flowing through the applications we intercept it at some arbitrary point. Given the way the data is designed, we can be certain that:
Thus, the nested data design imposes dependencies on the data and on the order in which the data is filled in: the crew data is filled in after the plane data, which is filled in after the route data.
The data design is imposing an order on application processing!
Data dependencies can result in application dependencies!
If I am the application developer for the crew scheduler, then I may assume that the input document will always contain route data and plane data. So, I hardcode my application to assume that all data (route and plane) is available to make my decision on crew staffing. The case of making a decision on crew staffing based on partial information (such as knowing the route but not the plane) is never considered.
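Here is a rough sketch of what such a hardcoded crew scheduler might look like in Python, and what happens when it receives a document without plane data. The lookup table, the function name, and the type and team attributes are hypothetical stand-ins for the crew scheduler's database and data.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the crew scheduler's database.
CREWS_BY_PLANE = {"B737": "T-7"}

def hardcoded_crew_scheduler(flight):
    # Assumes <Route> and <Plane> are always present in the input.
    plane = flight.find("Route/Plane")
    plane_type = plane.get("type")  # AttributeError when plane is None
    crew = ET.SubElement(plane, "Crew")
    crew.set("team", CREWS_BY_PLANE[plane_type])
    return flight

# Complete input: the assumption holds and the crew is assigned.
complete = ET.fromstring(
    '<Flight_Schedule><Route><Plane type="B737"/></Route></Flight_Schedule>'
)
hardcoded_crew_scheduler(complete)

# Partial input (route known, plane not yet known): the application fails.
partial = ET.fromstring("<Flight_Schedule><Route/></Flight_Schedule>")
try:
    hardcoded_crew_scheduler(partial)
    failed = False
except AttributeError:
    failed = True

print("failed on partial input:", failed)
```

The failure is not a bug in the usual sense. The nested design told the developer that the partial case could never happen, so no code was written for it.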
Nested data designs can lead to inflexible workflow applications.
New business requirements have led the airline to require its applications be more flexible. For example:
Applications must be able to make decisions based on imperfect or incomplete information.
Rather than a rigid sequence of applications, the airline desires that application processing occur in any order:
To avoid data dependencies, and thus application dependencies, keep the data flat:
<?xml version="1.0" encoding="UTF-8"?>
<Flight_Schedule>
    <Route>
        <!-- data about the route that will be flown -->
    </Route>
    <Plane>
        <!-- data about the plane that will fly the route -->
    </Plane>
    <Crew>
        <!-- data about the crew that will staff the plane -->
    </Crew>
</Flight_Schedule>
Make each element (<Route>, <Plane>, and <Crew>) optional. In doing so, no application can assume what will be in its input, and each must necessarily be designed in a flexible fashion.
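As an illustration, a crew scheduler written against the flat design might look something like the following Python sketch. It uses whatever data happens to be present and leaves the document untouched when it has nothing to go on. The two lookup tables, the function name, and the id, type, and team attributes are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the crew scheduler's database.
CREWS_BY_PLANE = {"B737": "T-7"}
CREWS_BY_ROUTE = {"BOS-LAX": "T-3"}

def flexible_crew_scheduler(flight):
    # Every element is optional, so check for each one before using it.
    if flight.find("Crew") is not None:
        return flight  # crew already assigned, nothing to do

    plane = flight.find("Plane")
    route = flight.find("Route")

    team = None
    if plane is not None:
        # Best case: pick a crew with experience on this plane.
        team = CREWS_BY_PLANE.get(plane.get("type"))
    if team is None and route is not None:
        # Partial information: fall back to a crew that knows the route.
        team = CREWS_BY_ROUTE.get(route.get("id"))

    if team is not None:
        crew = ET.SubElement(flight, "Crew")
        crew.set("team", team)
    # With no usable data the document passes through unchanged.
    return flight

route_only = ET.fromstring('<Flight_Schedule><Route id="BOS-LAX"/></Flight_Schedule>')
plane_only = ET.fromstring('<Flight_Schedule><Plane type="B737"/></Flight_Schedule>')
empty = ET.fromstring("<Flight_Schedule/>")

for document in (route_only, plane_only, empty):
    flexible_crew_scheduler(document)
    print(ET.tostring(document, encoding="unicode"))
```

Because the application never relies on another application having run first, it can be placed anywhere in the workflow, or run more than once as more data becomes available.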
Remove the dependencies in the data, and it will increase the flexibility of the applications!
How you design your data can have a big impact on the flexibility of the workflow applications that process the data.
Data with dependencies will likely produce inflexible workflow applications.
Hierarchically nested data designs impose dependencies on the data. Avoid nested data designs for workflow processing.
Data without dependencies will likely produce flexible applications.
Flat data designs impose no dependencies on the data. Use flat data designs for workflow processing.
Last Updated: February 11, 2008