In my day-to-day job as a Digital Collaboration Coordinator, I spend many hours collecting, checking and managing various kinds of BIM data. Far too many times, after receiving a set of data, I threw up my hands helplessly: the quality was far below the point where I could use it. Then I would start the mundane process of data cleaning. I cursed this situation countless times before it struck me: most people on a construction project don’t even know how to handle data correctly! We have never been taught what makes a good data set, how to fill in Excel spreadsheets to allow easier data management, or how to populate our models with good-quality data.
On our blog, there is a series of articles about Data-Driven Design and using a database during the design process. This is a useful process, though I realized that numerous projects are not yet ready to implement it! The reason is poor data quality or a lack of routines for data entry and management.
I decided to take a step back and start from scratch with an education process about data. Therefore, I want to start at the very basic level and dwell on data management on construction projects. In this entry, I will cover some definitions and theory (with practical examples, as we usually do!), and in later entries I will write about data entry, quality, management and more.
What is data?
Let us begin with something absolutely basic and then gradually raise the difficulty level. So, what is data?
According to the Wikipedia definition:
Data are individual facts, statistics, or items of information, often numeric. In a more technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects.
Data gives us information about objects or persons and it can be transmitted or processed. A single value (datum) is often called a data point. Data is everything we create on our projects – from meeting recordings to complicated models. Some examples of data in our projects:
- a PDF file (e.g. a Product Data Sheet),
- an e-mail sent to a coworker,
- a photograph of a construction site,
- the fire rating of a wall in our model.
Let’s move to something less obvious now.
Structured and Unstructured data
The data we produce can be structured or unstructured. It depends on how it looks and how we create it. I touched on this subject here. Let’s start by defining the difference between them.
Structured data follows a data model that organises data points and defines the relations between them. As the name suggests, the data has to be given a structure before it is placed in data storage (such as a relational database).
BIM objects are a great example: a data model representing a wall is composed of the elements that define a wall: thickness, length, fire rating, material, etc. To create a wall you have to put data into a predefined schema (each data point into its corresponding data field). Hence, structured data is also called schema-on-write. The most important feature of structured data is how simple it is to query. On the other hand, it takes effort to create such a data set in a database.
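To make schema-on-write more tangible, here is a minimal Python sketch. The `Wall` class, its field names, units and the sample values are all made up for illustration and do not come from any BIM tool; the point is only that each data point has to fit a predefined field before the record is accepted.

```python
from dataclasses import dataclass


# Hypothetical schema for a wall; field names and units are illustrative only.
@dataclass(frozen=True)
class Wall:
    thickness_mm: float
    length_mm: float
    fire_rating: str
    material: str

    def __post_init__(self):
        # Schema-on-write: bad values are rejected when the record is created.
        if self.thickness_mm <= 0 or self.length_mm <= 0:
            raise ValueError("thickness and length must be positive")
        if not self.fire_rating or not self.material:
            raise ValueError("fire rating and material must be filled in")


# Made-up sample data.
walls = [
    Wall(200, 4500, "EI60", "Concrete"),
    Wall(100, 3000, "EI30", "Gypsum board"),
    Wall(250, 6000, "EI60", "Concrete"),
]

# Because every record follows the same schema, a query is a one-liner.
ei60_walls = [w for w in walls if w.fire_rating == "EI60"]
print(len(ei60_walls))
```

The effort goes into defining the schema and entering clean values; the reward is that the query at the end needs no interpretation at all.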
Unstructured data is, frankly speaking, everything else. Unstructured information does not have a pre-defined data model. It is stored in its native file format. Accordingly, e-mails, pictures, PDF documents, meeting notes, etc. are all unstructured data. The biggest advantage of unstructured data is how simple it is to create and store. Yet to query it, a user has to understand how the format translates into pure information. Thus it is also called schema-on-read. A picture of a construction site has no data model, and only a person familiar with the subject matter can translate it into data, such as the number of floors, the materials used, the size of the construction, the type of bearing elements, etc.
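Schema-on-read can be sketched in a few lines of Python as well. The meeting note below is invented, and the two regular expressions are just one reader's assumption about how the text is phrased; nothing in the note itself guarantees that structure.

```python
import re

# Invented free-text note: stored as-is, with no data model behind it.
note = "Site visit 12 March: building has 5 floors, facade in brick, 2 cranes on site."

# Schema-on-read: the structure is imposed only now, by the reader's own rules.
floors = re.search(r"(\d+)\s+floors", note)
cranes = re.search(r"(\d+)\s+cranes", note)

record = {
    "floors": int(floors.group(1)) if floors else None,
    "cranes": int(cranes.group(1)) if cranes else None,
}
print(record)
```

If the next note says "five storeys" instead of "5 floors", the rules silently miss it, which is exactly why querying unstructured data requires someone who understands the content.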
This table shows examples of data we encounter on construction projects:
|Structured Data|Unstructured Data|
|---|---|
|Bills of quantities|Photographs|
|BIM objects|Meeting recordings|
|BIM models|Meeting notes|
|Excel spreadsheets (depending on quality)|Excel spreadsheets (depending on quality)|
What are properties?
We already know what data we have in our projects. From now on, I will focus on only one type of project data: BIM models and what they contain. In fact, BIM is all about objects’ properties. What are those, then? This is a somewhat philosophical and surprisingly deep subject, but let us keep it stupid simple.
A property is a physical or abstract characteristic of an object. Physical properties indicate what the object is in the physical world: its colour, thickness, length or the material it is made of. Abstract properties could be, for instance, cost, an object’s code (according to Uniclass, for example) or a Control Area code.
What kinds of properties do we have in our projects? To answer that question, let us leap back as far as our first entry on this blog. As I described there, we divide BIM data into graphical and non-graphical.
Graphical data is everything we see on the screen. It is unstructured data, used mostly to distinguish what different lines or surfaces mean. Examples of graphical data are:
- Line thickness
- Line type (solid, dashed, dotted)
- Shape of the model
- Characters and symbols
Non-graphical data is all the information that sits within the drawings or models. This can be various schedules, room areas or volumes. Those are physical properties that derive directly from the graphical design. If we think of a 3D design as a plain combination of surfaces, these geometry-derived values are precisely where its properties are.
BIM models offer a greater variety of data. Because building elements are separated from plain 3D shapes and divided into classes (or categories), we can assign different properties to different elements. Each category has numerous properties, both physical and abstract.
A BIM object can have hundreds of properties. Listing them all one after another would be extremely cluttered. Therefore we use property sets: groupings of properties. The IFC schema has its own grouping, and so does each BIM authoring tool. You could think of them as chapters in a book or sheets in an Excel spreadsheet.
Predefined property sets are grouped logically. In an IFC viewer you will commonly see property sets such as Identification, Location, Relations or Quantities.
User-defined property sets group together user-defined properties. Such grouping should be described in the BEP (BIM Execution Plan), and I recommend sticking to some rules for grouping them. Otherwise, users might keep misclicking or misspelling the name of the set (been there).
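A simple way to picture property sets, and to catch the misspelling problem, is a nested dictionary checked against the list of names agreed in the BEP. This is only a sketch: the set name `ACME_Design`, the property names and the values are all hypothetical, and real authoring tools store this differently.

```python
# Hypothetical list of property set names agreed in the BEP.
ALLOWED_PSETS = {"Identification", "Location", "Quantities", "ACME_Design"}

# One object, with its properties grouped into property sets.
wall = {
    "Identification": {"Name": "Wall-001", "Code": "W-01"},
    "Quantities": {"Length": 4500, "Thickness": 200},
    "ACME_Desing": {"Responsible": "Architect"},  # misspelled on purpose
}


def unknown_property_sets(obj, allowed):
    """Return, sorted, the property set names that are not in the agreed list."""
    return sorted(name for name in obj if name not in allowed)


print(unknown_property_sets(wall, ALLOWED_PSETS))
```

Running such a check on every received model is a cheap way to find properties that ended up in the wrong or a misspelled set before anyone tries to filter on them.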
What is a BIM model?
To conclude this theory introduction, let’s now join all these parts together and describe what a BIM model is, data-wise.
First of all, a BIM model is what we see on the screen: a 3D model that presents unstructured graphical data (we apply schema-on-read to understand what is shown on the screen). But in the background, a BIM model is nothing but a database: non-graphical, structured data with both the physical and the abstract properties of the objects. Each object category is a table, each object is a row and each property is a column. Objects relate to one another in the same way that records in a relational database are connected.
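The database analogy can be tried out with Python’s built-in `sqlite3` module. The sketch below is one possible reading of the analogy (a table per category, a row per object, a column per property); the table names, columns and values are invented and this is not how any particular BIM tool stores its model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: each category is a table, each property a column.
conn.executescript("""
    CREATE TABLE walls (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        thickness_mm REAL,
        fire_rating TEXT,
        cost REAL
    );
    CREATE TABLE doors (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        width_mm REAL,
        wall_id INTEGER REFERENCES walls(id)  -- relation: the door sits in a wall
    );
""")

# Made-up objects: each row is one object.
conn.executemany(
    "INSERT INTO walls VALUES (?, ?, ?, ?, ?)",
    [(1, "Wall-001", 200, "EI60", 1200.0), (2, "Wall-002", 100, "EI30", 650.0)],
)
conn.executemany(
    "INSERT INTO doors VALUES (?, ?, ?, ?)",
    [(1, "Door-001", 900, 1), (2, "Door-002", 1000, 1)],
)

# Follow the relation: which wall, with which fire rating, hosts each door?
rows = conn.execute("""
    SELECT doors.name, walls.name, walls.fire_rating
    FROM doors JOIN walls ON doors.wall_id = walls.id
    ORDER BY doors.name
""").fetchall()
print(rows)
```

The join at the end is the relational counterpart of clicking a door in a model and seeing its host wall.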
Take a look at the infographic below to get an understanding of how the data sets relate to one another.
Why do we need properties?
By now you should know what data is, what properties are, what kinds of properties we have in our projects and how we group them. Before we wrap up, I want to address the last question: what do we use those properties for? There are several reasons why we need properties on projects.
To have all information available in one place.
Scheduling and Quantity surveying.
Here comes the real power of properties. Earlier, when describing structured data, I stated that its biggest advantage is how simple it is to query and filter. BIM models make it easy to filter and sort the information we receive. With filtering, requests like “Show me all objects that I am responsible for designing” or “Show me all objects that are still in the early design phase” can be easily answered.
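As a rough illustration of those two requests, here is a small Python sketch. The objects, the property names `responsible` and `phase`, and their values are all hypothetical; in practice they would be whatever properties the project has agreed on.

```python
# Made-up objects with two user-defined properties.
objects = [
    {"name": "Wall-002", "responsible": "Architect", "phase": "Detailed design"},
    {"name": "Wall-001", "responsible": "Architect", "phase": "Early design"},
    {"name": "Slab-001", "responsible": "Structural engineer", "phase": "Early design"},
    {"name": "Door-001", "responsible": "Architect", "phase": "Early design"},
]


def filter_objects(items, **criteria):
    """Keep the items whose properties match every given criterion."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in criteria.items())
    ]


# "Show me all objects that I am responsible for designing."
mine = filter_objects(objects, responsible="Architect")

# Both requests combined, sorted by name.
early_mine = sorted(
    filter_objects(objects, responsible="Architect", phase="Early design"),
    key=lambda item: item["name"],
)
print([item["name"] for item in early_mine])
```

The filter only works because every object spells the property names and values the same way, which is the data-quality point this whole entry is about.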
Exporting data to Business Intelligence software.
Since BIM is a database and properties are columns, it means that we can easily reuse our design in various types of software. We are not forced to cooperate only through a model. We can for instance focus only on data and send our BIM database to software such as Power BI (if you are keen on learning how to do it, let me know in the comments and I can create a tutorial!). And this opens a whole new world for the data analysis of our projects.
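Until then, a minimal sketch of the idea: flatten object properties into a CSV table, a format that Power BI and most other BI tools can import. The objects and property names below are made up, and a real export would normally start from an IFC file or the authoring tool’s own schedule export rather than a hand-written list.

```python
import csv
import io

# Made-up objects; note that not every object has every property.
objects = [
    {"name": "Wall-001", "category": "Wall", "thickness_mm": 200, "fire_rating": "EI60"},
    {"name": "Door-001", "category": "Door", "width_mm": 900},
]


def to_csv(items):
    """Write the objects as CSV text: one row per object, one column per property."""
    # Use the union of all property names so objects with missing properties still fit.
    columns = sorted({key for item in items for key in item})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


csv_text = to_csv(objects)
print(csv_text)
```

Empty cells in the output show at a glance which properties were never filled in, which is often the first useful finding of such an export.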