The analogy between oil and data is credited to Clive Humby in 2006. Humby is a British mathematician who established Tesco’s Clubcard loyalty program. He highlighted that data, although inherently valuable, needs processing, just as oil needs refining before its true value can be unlocked.

Since then, the analogy has been widely used in marketing materials to bring attention to the value of data and the potential economic impacts the control and use of data can have.

An important aspect of the analogy that is often overlooked is that data (like oil) needs refinement before it can be usefully interpreted, and that refinement (or processing) must occur quickly enough for the insights to still be useful when acted upon.

Learning daily, rather than monthly, that your project is not achieving its target cost or production rates is far more valuable, because there is still time to react and change work practices. This is especially true if you are running a large construction project that is spending more than $1M per day – you want fast feedback.
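To put rough numbers on this, the sketch below takes the $1M per day spend mentioned above and assumes a hypothetical 5% cost overrun that continues until a report exposes it. The overrun rate and the function name are illustrative assumptions, not measured values.

```python
DAILY_SPEND = 1_000_000   # from the text: roughly $1M spent per day
OVERRUN_PERCENT = 5       # assumed overrun rate, purely illustrative


def overrun_before_detection(days_until_report: int) -> int:
    """Dollars of overrun accumulated before the next report reveals it."""
    return DAILY_SPEND * OVERRUN_PERCENT * days_until_report // 100


daily = overrun_before_detection(1)     # reported every day
monthly = overrun_before_detection(30)  # reported every 30 days
print(f"daily reporting: ${daily:,}  monthly reporting: ${monthly:,}")
```

Under these assumptions, daily reporting exposes the problem after about $50,000 of overrun, while monthly reporting lets it grow to about $1.5M before anyone can react.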

Construction data – much more than drawings and documents

Construction data is far more than simply drawings and documents. While these are important and communicate intent, there are many other types of data.

Consider these examples which might occur on a daily or even hourly basis.

  • Actuals (start and finish dates)
  • Site attendance records
  • Site supervisor diary
  • Weather records
  • Equipment engine run hours
  • Weigh-bridge records
  • Material delivery dockets
  • Material test results
  • Truck count sheets
  • GPS and Telemetry from fixed and mobile plant
  • Safety inspections
  • Progress records (quantity, rule of credit, …)
  • Unplanned project events (scope, site conditions, material defects, equipment breakdowns, …)

Simply converting paper data or documents into a scanned digital rendition will only solve some problems. What we need to do is capture the contained data in a structured form that can be easily transmitted and interpreted.

Because of the wide variety of data formats, significant human effort is required to gather, enter and process the above data. Furthermore, the above data can represent conflicting information that requires human effort to resolve.

Unfortunately, the effort required to generate the information can be so large that we lose sight of the need to use the information to make better project decisions.

Using this data we routinely need to generate construction information such as:

  • Daily cost and production rates
  • Accrued costs
  • Percentage complete
  • Actuals (start and finish dates)
  • Earned value and other metrics like CPI and SPI
  • Progress claims
  • Commercial notices and claims …
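As an illustration of how captured data becomes information, the sketch below derives earned value, CPI and SPI using the standard definitions (CPI = earned value / actual cost, SPI = earned value / planned value). The class name, field names and figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class CostCodeStatus:
    """Hypothetical snapshot for one cost code; all values are illustrative."""
    budget_at_completion: float  # total budgeted cost for the scope
    planned_fraction: float      # planned fraction complete to date (0 to 1)
    actual_fraction: float       # physical fraction complete to date (0 to 1)
    actual_cost: float           # accrued cost to date

    @property
    def planned_value(self) -> float:
        return self.budget_at_completion * self.planned_fraction

    @property
    def earned_value(self) -> float:
        return self.budget_at_completion * self.actual_fraction

    @property
    def cpi(self) -> float:
        # Cost Performance Index: below 1.0 means over budget for work done
        return self.earned_value / self.actual_cost

    @property
    def spi(self) -> float:
        # Schedule Performance Index: below 1.0 means behind plan
        return self.earned_value / self.planned_value


status = CostCodeStatus(
    budget_at_completion=1_000_000,
    planned_fraction=0.50,
    actual_fraction=0.40,
    actual_cost=500_000,
)
print(f"CPI={status.cpi:.2f} SPI={status.spi:.2f}")
```

In this made-up case both indices come out at 0.80, signalling that the work is costing more and progressing more slowly than planned. The inputs (percentage complete and accrued cost) are exactly the items that depend on timely site data.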

We then combine the above data and information to generate internal and external reports.

At E7 we are seeing new construction projects every week and are helping these project teams grapple with data and information challenges. Overwhelmingly we see paper dockets, paper timesheets, paper diaries and reporting spreadsheets as the standard tools that many teams are working with.

Paper has obvious limitations in that the data must be transcribed into an electronic system. Not only does this take time, it also limits the speed at which the data can be used, reviewed and enriched for other purposes.

The humble site docket

It might seem that the docket typically used on sites with a subcontract workforce is best captured on paper, then passed to a supervisor for approval, then passed to an admin person for entry into the cost control system, and then passed to site engineers for entry into a progress spreadsheet.

However, a docket is actually the source of many pieces of information, including:

  • Person/Equipment/Material
  • Start time, End Time, Breaks
  • Quantity
  • Prestart checks
  • Signature
  • Company, Date, Unique identifier, …

These fields support a wide range of downstream processes, including:

  • Record of attendance on site, accrued cost, approval of work, material placement, …
  • Safety exposure hours, completion of prestart checks, …
  • Physical progress (load counts, amount of material moved, …)

So when a docket is captured electronically, these processes can occur in almost real time instead of waiting for the docket to be passed between teams. When accumulated, the delays inherent in passing paper around mean that the information being gathered is less valuable because it is less current.
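A minimal sketch of what an electronically captured docket could look like is shown below. The `Docket` class, its field names, the approval rule and the hourly rate are all hypothetical; they simply mirror the docket fields listed above and show how attendance hours and accrued cost fall out of a single structured record.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Docket:
    """Hypothetical structured docket; field names are illustrative only."""
    docket_id: str        # unique identifier
    company: str
    resource: str         # person, equipment or material reference
    start: datetime
    end: datetime
    breaks: timedelta
    quantity: float       # e.g. loads carted
    prestart_done: bool
    approved: bool        # supervisor sign-off

    def worked_hours(self) -> float:
        """Hours on site excluding breaks (attendance and exposure hours)."""
        return (self.end - self.start - self.breaks).total_seconds() / 3600


def accrued_cost(docket: Docket, hourly_rate: float) -> float:
    """Accrue cost only for approved dockets (an assumed business rule)."""
    return docket.worked_hours() * hourly_rate if docket.approved else 0.0


docket = Docket(
    docket_id="D-0001",
    company="Example Haulage",
    resource="TRUCK-07",
    start=datetime(2024, 3, 4, 6, 30),
    end=datetime(2024, 3, 4, 15, 0),
    breaks=timedelta(minutes=30),
    quantity=18,
    prestart_done=True,
    approved=True,
)
print(docket.worked_hours(), accrued_cost(docket, hourly_rate=150.0))
```

Because every downstream figure is computed from the same record, attendance, cost and progress can be updated as soon as the docket is approved rather than after each team has re-keyed it.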


Spreadsheets have been long understood as a huge risk for large businesses, but they have become ubiquitous for many industries, including construction.

Research has found that over 90% of spreadsheets contain errors and 23% contain serious errors, largely because errors in spreadsheets are very difficult to find and fix. Calculation errors are therefore commonplace, yet engineers still rely on spreadsheets for everyday work – even when alternatives exist.

The European Spreadsheet Risks Interest Group documents these risks and details significant errors that have resulted. A few interesting examples include:

  1. Vote counting in a Malaysian election
  2. Financial reporting to the London Stock Exchange
  3. Misinterpretation of human genome data

Information decay and decision delay

The combined effect of slow and unreliable information is that decisions are delayed. In isolation, a delay of 7 days may not be significant, but when this becomes the norm and applies to the majority of data being captured from the site, it becomes a constant drag on decision making.

Cognitive bias and data-informed decisions

As a result, decisions are made based on anecdotes or gut feel, instead of being informed by data. When we make decisions based on instinct and gut feel, we become more susceptible to cognitive bias. Some common examples that affect our decisions include:

  1. Confirmation bias – our tendency to search for and favour information that confirms our beliefs while simultaneously ignoring or devaluing information that contradicts our beliefs.
  2. Availability heuristic – a common mistake that our brains make by assuming that the examples which come to mind easily are also the most important or prevalent things.
  3. Anchoring – our tendency to stubbornly cling to a number once we hear it and evaluate all other offers based on that previous number, even if that isn’t the most relevant bit of information.
  4. Sunk cost fallacy – once we’ve invested time and/or money in something, we become vastly less likely to abandon it, even once it should be clear that the project will ultimately fail.
  5. Survivorship bias – our tendency to focus on the winners and try to learn from them while completely forgetting about the losers who are employing the same strategy.

Data-driven decision making

Being informed by data helps us be objective about the decisions we make and minimises the effect of cognitive biases such as confirmation bias.

A great example of using data to inform decision making is demonstrated in the film “Moneyball”, in which the process for selecting baseball players is challenged using a data-first approach instead of the traditional gut-feel approach.

If you’re not familiar with Moneyball, Michael Lewis’s book of the same name details the surprising success of a small-market baseball team, the Oakland Athletics, which competes against large-market teams with much deeper pockets such as the New York Yankees or Boston Red Sox.

In order to maximize his player budget (a fifth of the size of larger teams’ budgets), Oakland A’s General Manager, Billy Beane, broke with tradition and applied an analytical approach to baseball’s flawed and subjective scouting system. His staff drafted young, inexpensive players and obtained unwanted, affordable veterans with high on-base percentages as well as unorthodox pitchers who generated a lot of ground outs. Using statistical analysis known as sabermetrics, the Oakland A’s were able to level the playing field and proceeded to outsmart and outperform much richer teams. All of the MLB teams had access to the same data; however, the Oakland A’s identified inefficiencies in how the data was being used and capitalized on them.

Another fantastic example of a work practice that uses data to inform decisions is from the father of Lean Thinking – W. Edwards Deming.

Deming was an American mathematician who pioneered the use of statistics in manufacturing process control and continuous improvement. He helped revolutionize the Japanese manufacturing industry after World War 2 using the Plan-Do-Check-Act cycle. Born from this work are the Toyota Production System and Lean.

When we capture data, whether on paper, electronically or otherwise, the value of the data depends on how easily we can extract, link, transform, interpret and reuse it.

The health care industry has tackled these issues over recent years and anyone who has visited the hospital may have noticed the transformation to digital medical records. The construction industry may be able to leverage some of these learnings.

Principles for improving data capture

Here are some guiding principles that can help improve your data maturity:

  • Capture data electronically, as close as possible to the person, place and time of the event
  • Plan the master/reference data to align with how the project performance is to be measured and align with the financial system.
  • Use reference codes/identifiers/geolocation for resources (people, equipment, materials, companies) to allow easy cross-referencing
  • Express data in self-describing formats where possible
  • Exchange data in neutral file formats – avoid proprietary file formats
  • Agree a “source of truth” for core data like budgets, people, equipment, …
  • Organise and store for ease of accessibility
  • Favour a smaller amount of high-quality data over a larger amount of unreliable data
  • Balance detail (granularity) against the cost of capture
  • Convert data into information and drive decision making
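Several of these principles (reference codes, self-describing formats and neutral file formats) can be illustrated together. The sketch below is one possible approach, assuming JSON as the neutral format; the record layout, the `EQ-014` reference code and the master data entries are all hypothetical.

```python
import json

# Hypothetical master data keyed by reference code (the agreed "source of truth").
equipment = {
    "EQ-014": {"description": "30t excavator", "cost_code": "1200-EARTHWORKS"},
}

# A self-describing record: explicit field names, units and an ISO 8601 timestamp.
record_json = json.dumps({
    "record_type": "engine_hours",
    "equipment_ref": "EQ-014",
    "recorded_at": "2024-03-04T15:00:00+10:00",
    "engine_hours": {"value": 7.5, "unit": "h"},
    "location": {"lat": -27.47, "lon": 153.03},
})

# Any system that can parse JSON can read the record and link it to master data
# through the shared reference code.
record = json.loads(record_json)
linked = {**record, **equipment[record["equipment_ref"]]}
print(linked["cost_code"], linked["engine_hours"]["value"])
```

The reference code is what makes the cross-referencing cheap: the site record never needs to repeat the equipment description or cost code, and a correction to the master data flows through to every linked record.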

Tackling the above challenges is not easy, especially in an environment where systems are fragmented and each party in the supply chain is not always incentivised to share information freely.

Projects that have converted their paper-based processes into digital ones by applying the above principles realise significant benefits, not only in the administration effort saved, but also in having accrued costs visible for performance analysis, usually by the next day.

A project team that adopts a data-driven mindset and starts using that information to inform decisions will naturally improve project performance.

The data we collect (and the way we collect it) is an asset and can be used to improve the productivity of projects. We already have the capability and existing technology to collect and refine the data into useful information. What we need to do is challenge the old habits of data collection and processing and get smarter about what we do.

Refining our construction data relies on being able to process and link with other information so that we can make informed decisions.