The IT Manager’s Quest for Sleep
This is not a story of vagaries and euphemisms. Unfortunately, it is also a story told all too often. It starts with anguish and suffering. It may have a happy ending; it may not. But it’s all about you, the IT Manager.
You know the story. For all we know, you may be in the midst of it. As a development manager or IT manager, you get the late-night call: job “XYZ” is running too long and is impacting multiple jobs. In a bank, the dreaded words are “we cannot post.” Other enterprises have different phrases or different events (e.g., “we cannot close the month”), but in each case the words elicit a sense of dread. You’re not going to get any sleep, you’re going to have to rouse a legion of people from a sound sleep, and you’re looking forward to a litany of “root cause” meetings and incident reports in the coming days. What happened? You may have just experienced:
- Unexpected downtime, possibly caused by human error (i.e., one of your staff)
- Your operations staff complaining they have no control over production processing, making recovery difficult
- A Monday morning “surprise” caused by weekend errors
- Delays and errors in a mission-critical job stream, with no one notified