The “How I” Files #2 – Auditing a Business Intelligence System

I finally got time to write another “How I” file – time has just been in short supply lately with SQLSaturday Wausau coming up fast!  In this file, I talk about how I would audit an existing business intelligence system to determine how well the system is performing.  Another caveat – this is more of a “How Will I” than a “How I” file meaning everything here is theory and thought experiments. Now that those warnings are out of the way, let’s get to it!

High Level Approach

Any business intelligence system can be broken down into components.  For example, in a very basic system we might have one Analysis Services instances pulling data directly from source systems with a nightly job to process the model.  However, in most cases we would be looking at a system consisting of an ETL job, data mart or data warehouse, and output in the form of data enabled spreadsheets, paginated reports, and dashboards.

I would break my audit down into a component specific approach.  This way I can focus on each individual piece of the puzzle and at the end look at the big picture.

ETL Audits

In this section, I would look to answer the following questions:

  1. Can the ETL process be restarted without destroying data in the data warehouse?
  2. Is data lineage preserved and recorded through the ETL process?
  3. Is anyone notified in case of an error condition in the ETL process?
  4. Are performance statistics such as rows extracted, rows loaded, and execution start and end times captured?
  5. Are rows flagged when they fail to pass business quality rules where appropriate?
  6. Are rows modified time stamped and indicated why there were modified?
  7. When was the last successful run of the ETL process?
  8. Are error messages captured and recorded during the ETL process?
  9. Does the ETL process store unmodified source data in an operational data store?

Data Mart / Data Warehouse Audits

In this section, I would look to answer the following questions:

  1. Are fact and dimension tables clearly identified?
  2. Is the staging area logically separated from the data warehouse area?
  3. Are security controls in place to prevent data warehouse users from accessing the staging area?
  4. Are tables and attributes named with business friendly names?
  5. Are measures in fact tables consistently defined?  For example, are all profit margin measures calculated the same way?
  6. Are dimension attributes consistently formatted?  For example, do all State attributes follow the same convention?
  7. Are all elements in the data warehouse mapped back to business requirements?
  8. Have all calculations been verified and validated with business users?
  9. Are all data warehouse queries properly tuned?
  10. Are views used to retrieve commonly accessed data sets?
  11. Do dimension tables have surrogate keys defined and indexed?
  12. Do dimension tables have business keys defined?

Report and Dashboard Audits

  1. Do all paginated reports follow a consistent layout with regards to page headers and footers?
  2. Do all reports retrieve data from views instead of ad-hoc queries?
  3. Do all graphs on a report follow consistent color schemes?
  4. Are all headers formatted consistently?
  5. Are explanations of the data included where needed?
  6. Do all reports map back to defined business requirements?
  7. Are reports accessed on a regular basis?
  8. Are reports filling a true business need?

Phew… that’s a lot of stuff to consider.  Especially the report and dashboard section – those questions apply to each and every output.  As you can see full BI systems can become quite complex very quickly and it’s important to address issues at design time rather than after the fact.  In fact, I had to re-architect an entire BI system because I overlooked some of the points above.

Do you feel I missed anything in this list?  Add it below in the comments!

2 thoughts on “The “How I” Files #2 – Auditing a Business Intelligence System

  1. Hi Jesse,

    These two articles are excellent and well written, I look forward to more of these. One of my views on the Reports and Dashboards, data retrieved by views. I personally prefer to retrieve them via Stored-procedures than Views, and use views only if the reporting tool doesn’t work well with Stored Procedures.

    The structure I use is Table->View-Stored Procedure->Consumer. This gives me a bigger toolset for performance tuning, a single contract with the report that allows me if needed to change underlying structure, auditing on execution and usage, and security options.

    • Post Author Jesse Seymour

      Thanks for the tip! I am basically learning all of this as I go, and taking the fake it until I make it approach 🙂 The single contract aspect is why I use views with all my reports. It gives me a single place for changes and abstracts the physical data structure from the logical. Although I can see using stored procedures to exactly that as well, plus giving me the ability to use multiple queries to massage the data before spitting it out.

Leave a Reply