The new Corrective Maintenance Report database

Background

By corrective maintenance (CM) reporting we mean a system for tracking event- or change-oriented interventions on systems belonging to the site's installations, primarily the instruments or data systems. These interventions are carried out by the NSA team's operators or other members of operations and operations support, and occasionally by instrument mentors and other associated ARM-wide teams. While a number of tools exist to track and plan interventions relating to computer, data and communications systems, as well as engineering changes and projects overall, day-to-day CM reporting for the ARM sites, including unforeseen outage interventions and checks, has historically been managed by each site's operational team. Local tools have been developed that aim to match the particular needs and capacities of each team.

From a systems management perspective, CM reporting, being change-oriented, is no substitute for ARM-wide tools such as OSS (primarily a configuration status management & tracking system). Its main functions are two-fold:

  • Track the Who?, What?, When? and Why? for each intervention on any instrument to support troubleshooting and leave a clear and easily consultable record
  • Serve as input material for ARM-wide tracking such as OSS (a configuration management system) and DQPR (a limited issue tracking system) in order to make it easier to keep these systems up to date.

Not all OSS events will originate from the CM reporting system (some may be linked to any other project or issue management system in use within ARM), and conversely a large number of CM reports will never give rise to OSS events. Indeed, a typical CM report that is linked to an issue will arise before the issue is sufficiently well characterized to be tracked in more formal and detailed systems. In other cases, no impact on data quality or availability arose from the issue, or the question of whether such impact arose can only be answered by monitoring the status of the system or instrument for several days after the intervention.

To fulfill these objectives, the CM reporting system must be lightweight and easy to use, so that local operators are encouraged to provide as much useful information as they have at their disposal. Usability improvements should translate directly into more and better-quality CM reports.

The existing solution

For the NSA site, a database front-end based on the commercial database FileMaker Pro was developed. This platform has been found to present a number of shortcomings, the most salient of which are:

  • Somewhat disorganized and cluttered user interface
  • Slowness, especially when accessed via the internet (in particular from Barrow)
  • High need for manual intervention for download and synchronization of local (Barrow) instance with central (Fairbanks) master instance
  • Quality of data is low - duplicate IDs, overwritten fields, and little control over available field values
  • Based on proprietary technology that is not easy to interface with

The new solution

Given the above situation, the reporting system is being re-developed based on widely used open-source web development tools and a MySQL database backend.

Ownership and management

The new CM report database is owned and developed by the UAF team in close collaboration with the operational team(s) as well as overall ARM software infrastructure teams. Hosting on central infrastructure (DMF) should be considered.

Specs and requirements

TODO: Upload specs document

High-level functional requirements

The following is a list of high-level requirements that may or may not be implemented in any given version. The features provided on individual version pages are meant as a more detailed, version-specific and partial list.

  • Create, edit, retrieve [operators] and delete [admins] records for corrective maintenance interventions carried out by the operations team that runs the NSA facilities in Barrow, AK and elsewhere
  • Interrogate database to build simple reports
  • Search for individual reports by (old) report ID as well as operator, instrument, location and date range.
  • Authenticate users, and automatically pre-populate data entry fields from user profile
  • Extend functionality of existing DB regarding:
    • Automation of tasks (such as sending email to a distribution list)
    • Image/file upload attached to a CM report
    • Indicate follow-up actions such as “request enabling collection/ingest” or “create OSS entry”
    • Track edits to a report including time and author of edits
  • Enable multiple instances. This includes:
    • Automatic synchronization (for example, daily) in both directions with sane handling of version conflict and unique identifiers
    • In a later version, an entirely stand-alone client application that can function on a non-network connected laptop
  • Enable interoperability with ARM-wide tools. In a first approximation, this may mean:
    • Provide stable URLs for each CM record, derivable from historical CM report IDs, so that each report can be referenced from other systems (such as OSS, but also weekly reports, email…)
    • Mark a report as requiring an OSS event, and link up with OSS event ID, name of OSS reviewer etc.
    • Design CM reporting system's instruments/components in a way that keeps them compatible with OSS instruments/components
    • Reference identifiers from other ARM-wide tracking systems, such as DQPR, ECR, BCR, EWO…
  • In a second phase, build interoperable services. This requires collaboration between the owners of the CM system, DQPR, OSS and possibly other system owners, such as SGP
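
Two of the requirements above, tracking edits to a report and deriving a stable URL from the historical report ID, can be illustrated with a minimal sketch. It is plain Python (3.7 or later, for dataclasses) rather than Django, and every class name, field name and the URL pattern are assumptions made for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Edit:
    """One change to a report: who changed which field, and when."""
    author: str
    timestamp: datetime
    field_name: str
    old_value: str
    new_value: str


@dataclass
class CMReport:
    # All field names are hypothetical, chosen only for this sketch.
    legacy_id: str      # report ID carried over from the old database
    operator: str
    instrument: str
    description: str
    needs_oss_event: bool = False
    history: list = field(default_factory=list)

    def update(self, author, field_name, new_value, when=None):
        """Change one field and append an Edit entry to the history."""
        old_value = getattr(self, field_name)
        if old_value == new_value:
            return  # nothing changed, so nothing to record
        setattr(self, field_name, new_value)
        self.history.append(Edit(
            author=author,
            timestamp=when or datetime.now(timezone.utc),
            field_name=field_name,
            old_value=str(old_value),
            new_value=str(new_value),
        ))

    def stable_path(self):
        """URL path that depends only on the historical report ID."""
        return "/cm/report/{}/".format(self.legacy_id)
```

Because the path depends only on the legacy ID, a link stored in another system would stay valid when the report's other fields are edited.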

High-level non-functional requirements

  • Improvements in speed for local operators - must work even with limited internet connectivity (solution: locally installed instance, which requires
  • Improvements in usability - readability, low number of clicks to achieve tasks
  • Web-deployed with a widely used database backend (preference: MySQL)
  • Use widely used web framework (preference: Django on top of Python)
  • Semi-manual: Progressive data clean-up of existing entries – remove corruptions, align instrument and operator names etc.
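
The semi-manual clean-up of instrument and operator names could follow a pattern like the one sketched below: known spelling variants are mapped to a canonical name, and anything unrecognized is left untouched and flagged for a human to review. The alias table and function names are hypothetical, and the instrument names are examples only.

```python
import re

# Hypothetical alias table: keys are normalized variants,
# values are the canonical names to store in the new database.
INSTRUMENT_ALIASES = {
    "mwr": "MWR",
    "microwave radiometer": "MWR",
    "mfrsr": "MFRSR",
}


def normalize_key(raw):
    """Lower-case, trim and collapse internal whitespace."""
    return re.sub(r"\s+", " ", raw.strip().lower())


def canonical_name(raw, aliases):
    """Return (name, matched).

    Unknown names come back unchanged with matched=False,
    so that they can be queued for manual review.
    """
    key = normalize_key(raw)
    if key in aliases:
        return aliases[key], True
    return raw, False
```

Keeping unmatched values unchanged means the automated pass never guesses; the alias table grows as reviewers resolve the flagged entries.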

Infrastructure

  • Master database, application and web server should, most likely, be ultimately hosted within ARM central facilities (DMF). This was suggested by Brad Perkins (OSS developer) and Mark Ivey (NSA operations manager) independently.
  • For this to work, we believe that a certain level of experience with working code and with day-to-day needs must first be acquired. During the initial pre-alpha phase, rapid and flexible intervention may be necessary to catch and correct major issues as they arise.
  • For this reason, we have elected to initially use an existing Unix server (nanuna.gi.alaska.edu), which is already part of the NSA operations computing infrastructure, as the master database/app/web server.
  • The software allows for local “satellite” instances which are intended to be synchronized bidirectionally with the master instance at appropriate intervals (for example nightly).
  • A VM running CentOS 5.7 has been brought up by SDS to run a local instance in Barrow
  • Software and network requirements for either the master or a secondary server are documented on the software requirements page.
  • Within UAF, the machine cloudberry is being used as a staging server.
  • Development takes place on the individual assigned work computer. Chris Waigl's environment: Mac OS X 10.6.8 / Python 2.7.1 / Django 1.2.5 / MySQL 5.1.54.
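
As a rough illustration of bidirectional synchronization between the master and a satellite instance, the sketch below merges two sets of records keyed by a globally unique identifier. It assumes a last-modified-wins policy, which is a simplification and not necessarily what the project implements: it ignores deletions, and it can discard one side's changes when both instances edited the same record. All function and key names are hypothetical.

```python
import uuid


def new_record_id():
    """Random UUID, so that instances never hand out the same ID."""
    return str(uuid.uuid4())


def merge_records(master, satellite):
    """Merge two {record_id: record} dicts.

    Each record is a dict with at least a 'modified' datetime.
    Returns (merged, conflicts): records found on one side only are
    copied, the later 'modified' wins when both sides have the record,
    and equal timestamps with different content are listed in
    conflicts (the master copy is kept in the meantime).
    """
    merged = {}
    conflicts = []
    for key in sorted(set(master) | set(satellite)):
        m = master.get(key)
        s = satellite.get(key)
        if m is None:
            merged[key] = s
        elif s is None or m == s:
            merged[key] = m
        elif m["modified"] > s["modified"]:
            merged[key] = m
        elif s["modified"] > m["modified"]:
            merged[key] = s
        else:
            merged[key] = m
            conflicts.append(key)
    return merged, conflicts
```

After a run, both instances would replace their record sets with the merged result, and the conflict list would go to an operator for manual resolution.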

Timeline

Pages are being maintained regarding challenges, features and notes for each milestone.

Milestone          ETA (tentative)  Status
Interface demo     2011-04          Done
Release zero       2011-07          In progress
Feature release 1  2011-08          In progress
Feature release 2  2011-09          Scheduled
Feature release 3  2011-10          Scheduled
Feature release 4  2011-11          Scheduled

Feedback

Please feel free to leave feedback on the Feedback page, or by sending email to Chris Waigl.

projects/new_cm_report_database.txt · Last modified: 2011-10-24 15:36 by Chris Waigl (admin)
 