The post-BIM world. Transition to data and processes and whether the construction industry needs semantics, formats and interoperability

artem boiko
52 min read · Dec 20, 2024


With the advent of digital data in the 1990s, the construction industry began to transform actively. Computer technology was introduced into the design, management and construction processes, which led to the emergence of concepts such as CAD (Computer-Aided Design systems), PLM (Product Lifecycle Management) and, later, BIM (Building Information Modeling).

However, like any innovation, they are not the end point of development. Concepts like BIM have become an important milestone in the history of the construction industry, but sooner or later they will give way to better tools and approaches that will better meet the challenges of the future.

Overwhelmed by the influence of CAD vendors and confused by the complexities of its own implementation, the BIM concept that emerged in 2002 may well not live to see its thirtieth birthday, like a rock star that flashed brightly but quickly faded away. The reason is simple: the demands of data specialists are changing faster than CAD vendors can adapt to them.

Faced with a lack of quality data, today’s construction industry professionals demand cross-platform interoperability and access to open data for easier analysis and processing. The lack of data and its complexity has a negative impact on everyone involved in the construction process: designers, project managers, on-site construction workers and ultimately the client.

Instead of a complete data set ready for operations, today the customer and investor receive containers in formats that require complex geometric kernels, an understanding of data schemas, annually updated API documentation and specialized CAD-BIM software to work with the data. At the same time, much of the design data remains unutilized.

This approach is outdated and no longer meets the demands of today’s digital environment. The future will divide companies into two types: those who use data effectively and those who leave the market.

In this article we will look both at the existing BIM framework, including the use of CAD (BIM), IFC and USD formats, and at alternative and future approaches to working with data and processes, and answer key questions of interest to data professionals in the construction industry today:

  • What is BIM — marketing or real innovation for the construction industry
  • Why CAD formats are killing interoperability and the concept of BIM
  • USD vs. IFC: why do vendors impose new formats?
  • Why new formats from CAD vendors IFC and USD do not solve key interoperability problems
  • Why are CAD vendors abandoning files and moving to granular data starting in 2023?
  • Why parametric CAD formats and BREP geometry are not essential for the construction industry
  • How Strabag and Züblin challenged CAD vendors and what came out of it
  • Why the use of semantics and ontologies in data management falls short
  • Why construction businesses will oppose the use of open data
  • And what are the tools of the future for dealing with construction project data

Thank you very much for the valuable discussions and debate on the issues raised by Rasso Steinmann, Thomas Liebich, Ulf-Günter Krause, Bernd Müller-Jürries, Simon Dihlas, Michael Maass, dear members of BuildingSMART, the OSArch community and the DataDrivenConstruction Chat group.

🔗 Original Article: The post-BIM world. Transition to data and processes or whether the construction industry needs geometric kernels, semantics, formats and interoperability

Content:

  1. BIM is about data and processes, but then why the acronym?
  2. Each CAD (BIM) data user serves ten data consumers
  3. Any CAD (BIM) program is a data compiler that visualizes geometry through a geometry kernel
  4. CAD, IFC and geometric kernels: who’s in charge?
  5. IFC is CAD within CAD with a dependency on the geometry kernel and the SDK
  6. Why do builders need geometry? When lines turn into money
  7. Basic calculations in construction or from lines to volumes: How area and volume become numbers
  8. Why do we need triangles? The whole truth about tessellation in construction
  9. Züblin-Strabag’s attempt to subordinate CAD (BIM) vendors to the interests of the construction industry
  10. The emergence of semantics and ontology in the construction industry
  11. Semantics and ontology: how to make data talk?
  12. From graphs to tables: labor costs in grouping and filtering
  13. In the shadow of ISO and buildingSMART: the war for control of the data format
  14. Why do builders and customers need to control data?
  15. Uberization and open data is a threat to the construction business
  16. Do BIM, openBIM, BIM Level 3, and noBIM actually exist, or are they marketing gimmicks?
  17. What’s next? Simple formats and user-friendly tools
  18. The emergence of LLMs and ChatGPT in project data processes: in lieu of a conclusion

1. BIM is about data and processes, but then why the acronym?

The concept of BIM (Building Information Modeling), resurrected in the construction industry with the publication of Autodesk’s BIM Whitepaper in 2002 and complemented by the mechanical engineering concept of BOM (Bill of Materials), originated from the parametric approach to creating and processing project data. This approach was first implemented in the Pro/ENGINEER system for mechanical engineering design (MCAD), which became the prototype for many modern CAD solutions, including those used in the construction industry today.

Fig. 1. History of the emergence of Revit, SolidWorks, AutoCAD and the BIM Whitepaper

Quote by Samuel Geisberg, founder of PTC, developer of the MCAD product Pro/ENGINEER, and mentor to Leonid Raiz, creator of Revit:

The goal is to create a system that is flexible enough to encourage the engineer to easily consider different designs. And the cost of making changes to the design should be as close to zero as possible. The traditional CAD/CAM software of the time unrealistically limited inexpensive design changes to the very earliest stage of the design process.

Already in the late 1980s, the goal was to eliminate the limitations of the CAD programs existing at the time. The main objective was to reduce the labor required to change the parameters of design elements and to make it possible to update the model from data outside the CAD program. The key role here was to be played by parameterization of the task and automatic input of parameters from a database to update the model in the CAD system.

After Autodesk finalized its purchase of Revit (which inherited the BOM concept from Pro/ENGINEER), two Autodesk vice presidents prepared the Whitepaper that marked the arrival of the BIM concept in the construction industry in 2002.

Figure 2. Appearance of the BIM Whitepaper in 2002

In the whitepaper published on its website, Autodesk effectively reproduced the marketing materials of the BOM (Bill of Materials) concept used for Pro/ENGINEER products back in the early 1990s.

BIM is described as building information management, where all updates and all changes take place in a database. So whether you are dealing with schematics, sections or sheet drawings, everything is always coordinated, consistent and up-to-date.

Almost all modern concepts for adding and processing parameters in construction, such as IFC, BIM, openBIM and buildingSMART, were created with the support of, and are promoted by, CAD solution providers. However, many of these ideas are borrowed from other industries or acquired from startups. For example, Autodesk did not create Revit, the BIM concept, AutoCAD Architecture, Navisworks, Civil 3D or Advance Steel on its own, but acquired these solutions from startups.

Similarly, buildingSMART did not create the IFC format or the openBIM® acronym. IFC was adapted from the mechanical engineering format STEP by the Technical University of Munich and later rebranded by HOK to create the IAI Alliance, while openBIM was rebranded as a buildingSMART trademark only in the early 2020s, after its initial registration by several CAD vendors in 2012. The IFC-STEP format, in turn, is based on IGES, created in 1979 by a group of CAD users and vendors with support from NIST and the U.S. Department of Defense. The links between the developers and the ideas behind modern concepts are presented in the “History of BIM” map.

Figure 3. Map of connections of the main teams of CAD (BIM) solutions developers

Mechanical engineering, where the core tools and concepts came from, is gradually moving away from the terms CAD, CAM and PLM toward digital twins and end-to-end data management processes. Design, manufacturing and operations processes are no longer viewed through the lens of specific tools (from companies such as PTC, Siemens and Dassault Systèmes), but through unified approaches to data and processes. Instead of specialized terms such as BOM, PLM or PDM, the concepts of data management, process management and data analytics are increasingly being used. This transition from the narrow abbreviations of software vendors to universal concepts centered on “data” and “processes” is observed not only in mechanical engineering but in other industries as well.

Users and developers in the construction industry, like their counterparts in other industries, will inevitably move away from the vague software vendor terminology that has dominated the last 20 years, focusing on the key aspects of digitalization — “data” and “processes.”

Figure 4. Construction processes and design are inextricably linked to data and processes

Without free access to high-quality structured data, it will be impossible to build efficient processes and automate them in the future. Therefore, the first priority will be to discover, structure, unify and organize data, which will create the basis for automation and optimization of business processes.

To understand the complexity and confusion of the current CAD-BIM concept of data and process management, we will look at the basics of creating the geometry and meta-information that populate design data. This will answer the question of why the use, automation and standardization of design data has remained a challenge for the past 30 years.

2. Each CAD (BIM) data user serves ten data consumers

Before the early 2000s, the share of data stored in digital form was extremely small, and the question of using this data in other systems was practically never raised. Thanks to the emergence of CAD (BIM), PLM, ERP, Excel and SQL-based applications, the situation has changed dramatically. The number of systems consuming data has grown tenfold, causing a real crisis for managers and specialists who have to receive, process and transmit data.

Figure 5. Data and process integration scheme in the construction company ecosystem

Since the early 2000s, the number of office employees involved in maintaining various user-interface systems and databases has grown exponentially, dramatically increasing the importance of data access and sharing. As Autodesk’s CEO noted back in 2002, for every CAD engineer there were already at least a dozen other specialists who needed to work with data created in CAD systems (quote):

You need to be able to manage all that data (CAD — author’s note), store it digitally and sell lifecycle and process management software, because for every engineer who creates something, there are ten people who have to work with the data.

By the early 2020s, the number of professionals working with data from CAD and BIM programs has grown exponentially, reaching hundreds of professionals in large companies — from GIS, ERP and CAFM system managers to construction site foremen.

Figure 6. Dozens of specialists per engineer creating data in CAD (BIM) programs

All these users and their data managers in the construction industry are striving for full compatibility between different programs and platforms, and more specifically between databases and data formats. And while almost all systems and databases in the construction industry are open to engineers and IT professionals, only CAD (BIM) systems remain closed databases with proprietary formats. These closed outposts of project information have impacted dozens of other departments and hundreds of professionals over the past 30 years, creating a dependency on limited access to information.

Issues of true cross-platform and interoperability face the fundamental problem of the closed nature of CAD programs and the complex proprietary geometry kernels, third-party SDK tools, and various data schemas they use in proprietary formats.

Figure 7. Of all the databases, only CAD (BIM) access remains unavailable

Why do we need CAD (BIM) systems in construction? Their main task is to help the designer create new data based on the initial parameters of the project tasks, which include both the geometry of design elements and the related meta-information. According to the definition from Wikipedia:

Building Information Modeling (BIM) is a method of creating and managing digital representations of the physical and functional characteristics of buildings and other objects.

Most CAD (BIM) systems use closed databases and proprietary storage formats to create and store these features and parameters. In addition, they use sophisticated proprietary geometry kernels that provide visualization and interactive interaction with the project geometry.

3. Any CAD (BIM) program is a data compiler that visualizes geometry through a geometry kernel

Each CAD (BIM) program either uses its own geometry kernel or relies on a third-party proprietary solution, which greatly complicates data exchange between different platforms.

Any CAD (BIM) program is a compiler of geometry data that is displayed using a geometry kernel.

The market is dominated by geometry kernels such as Siemens Parasolid, Dassault Systèmes CGM, PTC Granite, Autodesk ShapeManager and ASCON C3D. The only free open-source geometry kernels are OpenCascade and the CGAL library, distributed under open licenses.

Figure 8. Architecture of CAD/BIM systems workflow: from code bases to visualization

There are no problems with drawing geometry by parameters when it comes to simple geometric elements such as lines or planes. However, when working with complex or compound elements, the situation changes. Even with the same input parameters, different geometric kernels can produce different results due to the peculiarities of their operation and processing algorithms. As a result, the geometric entities of the project saved as parametric geometry will be displayed differently in different CAD (BIM) products.

Figure 9. Different input parametric models give different results when using different geometric kernels

Full cross-platform compatibility at the level of parametric geometry remains unattainable for most CAD vendors. This is due to the lack of standardization of the algorithms used in geometry kernels, which are often not even owned by the CAD vendors themselves (more often by MCAD vendors), as well as the need to create open exchange specifications and develop universal converters. Currently, only a handful of initiatives, including OpenCascade, the Open Design Alliance and CADExchanger, are addressing such challenges. Historically, initiatives to convert and unify multi-format data have come under intense pressure from CAD vendors, and many of these conflicts have ended in litigation, with decisions not in favor of the vendors.

Let’s look at why a geometric kernel is needed in data handling at all, and why it is so complex that even CAD (BIM) system developers have to turn to third-party vendors and developers for solutions.

4. CAD, IFC and geometric kernels: who is in charge?

Geometry kernel is necessary for visualization and processing of geometry. One of the most common methods of geometry representation in CAD (BIM) systems is Boundary Representation (BREP or B-rep).

BREP (Boundary Representation) is a way of describing the geometry of an object through boundary parameters: surfaces, edges and vertices.

Other forms of geometry representation, such as CSG and Swept Solids, are used in IFC format and internal containers of CAD programs, which find their application in certain tasks. However, due to its versatility, BREP is the leading form of geometry representation and has become the standard for most engineering and architectural solutions in the CAD (BIM) environment.

In any CAD program, user actions such as mouse clicks in the program interface to select points or lines are converted to BREP using a geometric (mathematical) kernel and displayed as a finished shape in the program window.

However, since each CAD vendor uses its own geometry kernel with a unique code base, often consisting of tens of millions of lines, transferring BREP parameters between programs does not guarantee their identical representation in another CAD system.

Figure 10. Generation of an element’s shape from BREP parametrics using a geometric kernel

BREP (Boundary Representation) within the IFC format is not a fundamental representation, as it is described parametrically without a specific geometric kernel behind it. This creates a situation where the same parametric IFC model can be interpreted differently by software products that use different geometry kernels, such as OpenCascade, ShapeManager or Siemens Parasolid, which are used in CAD (BIM) programs.

A key element in the process of creating real cross-platform interoperability could potentially be the utopian idea of creating a universal geometric core to which the parametrics of elements can be bound. This would ensure correct interpretation of the same geometry in any CAD (BIM) system.

Therefore, either buildingSMART should have its own free and open geometric kernel, or the same BREP element from the IFC format will continue to be displayed differently in different tools and programs.

IFC itself can be seen as a kind of CAD within CAD or a skeleton for a computer-aided design system, which in 30 years has not officially appeared for IFC.

5. IFC is CAD within CAD with dependency on geometry kernel and SDKs

The IFC format uses different ways of representing geometry, such as CSG and Swept Solids, but BREP has become the leading standard for transferring element geometry in IFC, as it is supported when exporting from CAD (BIM) programs and allows for potential editing of elements when importing IFC back into CAD programs (so far implemented only in experimental products).

In most cases, when geometry in IFC is defined parametrically (BREP), it becomes impossible to obtain properties such as the volume or area of project entities from the IFC file alone: working with this geometry and visualizing it requires a geometry kernel, which the format itself does not provide.
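To make this concrete, here is a minimal sketch, using hypothetical sample data (not a real project file), of what an IFC-SPF file actually contains as plain STEP text: the attributes are readable with ordinary text tools, but no volume or area is stored anywhere in the file.

```python
import re

# A tiny fragment in IFC-SPF (STEP) syntax -- hypothetical sample data.
ifc_text = """#101=IFCWALL('2O2Fr$t4X7Zf8NOew3FNr2',#5,'Basic Wall:200mm',$,$,#102,#103,$,.NOTDEFINED.);
#200=IFCEXTRUDEDAREASOLID(#201,#202,#203,3000.);"""

# Every entity line has the shape "#id=TYPENAME(attributes);"
entity_re = re.compile(r"#(\d+)=([A-Z0-9]+)\((.*)\);")

entities = {}
for line in ifc_text.splitlines():
    m = entity_re.match(line)
    if m:
        entities[int(m.group(1))] = (m.group(2), m.group(3))

# The attribute data (GUIDs, names, the 3000 mm extrusion depth) is plain
# readable text...
print(entities[101][0])   # IFCWALL
print(entities[200][0])   # IFCEXTRUDEDAREASOLID
# ...but no volume or area appears anywhere: those values exist only
# after a geometry kernel evaluates the IFCEXTRUDEDAREASOLID.
```

The point is not the parsing itself but the asymmetry: the meta-information is trivially accessible, while the quantities builders actually need remain locked behind kernel evaluation.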

To support IFC in any program it is actually necessary to create another ideal IFC CAD inside the existing solution with its own geometry core and its own logic of working with geometry.

Figure 11. Is it possible to standardize geometry kernels and SDKs used by CAD vendors?

And while primitive elements in IFC-BREP format may pose no problems, beyond the differences between geometric kernel engines there are plenty of elements whose peculiarities complicate correct mapping. This problem is discussed in detail in the international “Reference study of IFC software support” published in 2019. Quote:

The same standardized datasets produce inconsistent results, few detectable common patterns, and serious problems are found in the standard’s support (IFC — author’s note), probably due to the very high complexity of the standard data model. The standards themselves are partly to blame here, as they often leave some details undefined, with high degrees of freedom and various possible interpretations. They allow for high complexity in the organization and storage of objects, which is not conducive to effective universal understanding, unique implementations, and consistent data modeling

A correct understanding of these “undefined details” is available to paying members of buildingSMART, and often only in behind-the-scenes discussions. As a consequence, anyone who wants access to important knowledge about certain features of IFC must either cooperate with large companies or reach it through their own research. From an interview with a developer of the CAD program Renga:

You come across a question about importing and exporting data via the IFC format and ask your fellow vendors: “Why is the information about parametric transfer of premises transferred in the IFC file in this way? The open specification from buildingSMART does not say anything about it”. Answer from “more knowledgeable” European vendors: “Yes, it is not said, but it is allowed”.

Fig. 10: The same IFC project in different CAD (BIM) programs gives different results

All the features of IFC parameter mapping and generation in a geometry kernel can only be realized by large development teams. The current complexity of the IFC format therefore benefits primarily the CAD vendors and has much in common with Microsoft’s “embrace, extend, extinguish” strategy, where the growing complexity of a standard creates barriers for smaller market players. Microsoft’s strategy was to adopt open standards, add its own extensions and features to create user dependence on its products, and then drive out competitors. Microsoft long imposed its own standards (e.g., Internet Explorer), which slowed the adoption of more advanced and universal technologies such as CSS, HTML5 and independent browsers.

As a result, today only large CAD vendors, who can invest significant resources to support all entities and their mapping to their internal geometry core, can fully implement the IFC ontology. Large vendors also have the ability to harmonize technical details of features among themselves that may not be available to even the most active participant in buildingSMART.

Learn more about the challenges faced by development teams working with IFC formats in the Reference study of IFC software support

Fig. 12: International study of IFC format cross-platformability

For small independent teams and open source projects that want to support the development of interoperable formats, the lack of their own geometry kernel becomes a serious problem. Without it, it is virtually impossible to take into account all the many subtleties and nuances associated with cross-platform data exchange.

Currently, the only widely used open-source kernel underlying popular openBIM tools such as IfcOpenShell, BlenderBIM (Bonsai), IFC.js and FreeCAD is OpenCascade. However, it too has its limitations. buildingSMART has no direct tools to influence OpenCascade (OCC) development and licensing. Historically, the OCC development team was centered in Nizhny Novgorod, but in recent years some OCC specialists have moved to Portugal, and the larger remaining part of the team has focused on developing a branch of the OCC project under the brand of the new Chinese open-source geometry kernel OGG.

And why do we need hard-won geometry in construction at all, and do we need it in parametric form?

6. Why do builders need geometry? When lines turn into money

Geometry, in addition to visualization, supplements existing element parameter lists with key volumetric characteristics, such as area and volume, which are automatically calculated based on the shape of the project entity object. These parameters play a crucial role as they serve as the basis for subsequent calculations, computations and analysis.

The automatic calculation of geometry becomes the link between abstract data in the form of problem parameters and their physical realization.

Geometry has historically been the foundation of engineering communication, providing the ability to calculate lengths, areas and volumes. From the earliest papyrus drawings to modern digital formats, drawings have always served as a key tool for communicating information about quantities of materials and work between engineers, foremen and estimators. For millennia, until the 1980s, estimators manually collected quantity and volume data based solely on visual representations, using rulers and protractors as the primary measurement tools.

Figure 13: The main purposes of geometric data in construction business processes

With the advent of computers, the manual and time-consuming task of calculating volumetric characteristics has been fully automated: volumetric modeling in modern CAD (BIM) tools makes it possible to obtain the volumetric attributes of any element automatically, without calculating these values by hand with a calculator.

In CAD programs, the geometric elements used for calculations are created through the user interface. To transform points and lines into volumetric bodies, a geometry kernel is used. The geometry kernel performs the key task: transforming geometry into volumetric models, from which, after approximation, element volumes are calculated automatically.
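For prismatic elements, the “lines to volumes” step is simpler than the kernel machinery suggests. A minimal sketch, assuming an illustrative rectangular slab outline: the polygon a designer draws (an ordered list of XY points) becomes an area via the shoelace formula, and the slab volume is simply area times thickness.

```python
def polygon_area(points):
    """Shoelace formula for a simple (non-self-intersecting) polygon."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

# Illustrative 10 m x 6 m rectangular slab outline, 0.2 m thick
outline = [(0, 0), (10, 0), (10, 6), (0, 6)]
area = polygon_area(outline)   # 60.0 m^2
volume = area * 0.2            # 12.0 m^3
print(area, volume)
```

This is exactly the kind of quantity that feeds estimating: square meters of slab and cubic meters of concrete, derived from nothing but the drawn outline.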

Fig. 14: Path of geometric shape visualization outside CAD (BIM) systems

Just as thousands of years ago during the construction of the pyramids, when the cubit was used for measurement, so today in CAD programs the accuracy of geometry interpretation plays a key role: the accuracy of project budget calculations and the correct determination of the cost and timing of works, which make up any construction project, depend on it.

Accurate calculations are a key factor for survival in the construction industry. In a highly competitive environment, access to quality volume data is crucial for successful project implementation and maintaining competitiveness.

If accurate calculations play a key role in determining the resources, materials and time parameters of a project, it is important to understand exactly how these calculations are made.

7. Basics of calculations in construction or from lines to volumes: How area and volume become numbers

In practice, triangulation is often used to compute areas and volumes of geometric surfaces defined analytically or via NURBS in BREP, which converts complex surfaces into a grid of triangles.

NURBS (Non-Uniform Rational B-Splines) is a mathematical way of describing curves and surfaces, whereas BREP is a structure for describing the complete three-dimensional geometry of an object, including its boundaries, which can be defined using NURBS.

Even if the surface is given analytically or via NURBS, it is most often approximated by tessellation, since exact computations via integrals or complex analytical methods are rarely realized in practice due to their complexity and high computational cost.

The essence of tessellation is to break complex surfaces down into simpler elements: triangles or polygons. This approach is used for surface and volume calculations, on-screen visualization, export to mesh formats, and clash-detection and collision-analysis systems. In games, tessellation is used to create realistic landscapes; in CAD/CAE systems, for computation and visualization. Bee honeycombs are an example of tessellation in nature.
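Once a surface has been tessellated, area and volume follow from elementary arithmetic over the triangles, with no geometry kernel involved. A minimal pure-Python sketch, illustrated on a unit cube: surface area is the sum of triangle areas (half the cross-product magnitude), and volume is the sum of signed tetrahedron volumes via the divergence theorem, which requires a closed, consistently outward-oriented mesh.

```python
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def sub(a, b):
    return (a[0]-b[0], a[1]-b[1], a[2]-b[2])

def mesh_area_volume(vertices, triangles):
    area = 0.0
    volume = 0.0
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        n = cross(sub(b, a), sub(c, a))
        area += 0.5 * (n[0]**2 + n[1]**2 + n[2]**2) ** 0.5
        # signed volume of the tetrahedron (origin, a, b, c)
        volume += (a[0]*(b[1]*c[2] - b[2]*c[1])
                 - a[1]*(b[0]*c[2] - b[2]*c[0])
                 + a[2]*(b[0]*c[1] - b[1]*c[0])) / 6.0
    return area, abs(volume)

# Unit cube tessellated into 12 outward-oriented triangles
V = [(0,0,0),(1,0,0),(1,1,0),(0,1,0),(0,0,1),(1,0,1),(1,1,1),(0,1,1)]
T = [(0,2,1),(0,3,2),(4,5,6),(4,6,7),(0,1,5),(0,5,4),
     (1,2,6),(1,6,5),(2,3,7),(2,7,6),(3,0,4),(3,4,7)]
area, vol = mesh_area_volume(V, T)
print(area, vol)   # 6.0 and 1.0 for the unit cube
```

About twenty lines of arithmetic replace, for this purpose, what a BREP pipeline needs a full kernel to compute.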

Figure 15: BREP and polygonal representation of the sphere

BREP (NURBS) used in CAD (including BIM systems) is not a fundamental model of geometry. This method was created as a convenient tool for representing circles and rational splines. However, it has limitations — for example, the inability to accurately describe the sinusoid that underlies helical lines and surfaces. As a result, BREP (NURBS) remains only a method of approximation, but not a fundamental means of describing geometry.

In contrast, triangle meshes and parametric tessellation are characterized by their simplicity, efficient memory usage and ability to process large amounts of data. These advantages make it possible to do without complex and expensive geometry kernels and hundreds of millions of lines of code when calculating geometric shapes.

Figure 16. Examples of cylinder and cube tessellation

In the construction industry, it does not matter how the volumetric characteristics are determined — by parametric models in internal CAD or IFC formats, or by using simplified geometric representations in USD, glTF, DAE or OBJ formats.

Geometry defined as polygons or as BREP (NURBS) is, in either case, a way of approximating a continuous shape. Just as Fresnel integrals have no exact analytic expression, describing geometry through polygons or NURBS is always an approximation, just like a triangular MESH.

Both polygonal MESH and parametric BREP have their advantages and limitations, but the goal is the same — to describe geometry efficiently and conveniently, taking into account the user’s tasks. In the end, the accuracy of the geometric model depends not only on the method of its representation, but also on the requirements of a particular task.

Fig. 17: Difference of volumetric characteristics of figures with different number of polygons
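The effect described in the figure above (volumetric characteristics drifting with polygon count) can be reproduced in a few lines. A sketch approximating a cylinder of radius 1 and height 1 by an n-sided prism: the tessellated volume approaches the exact value pi * r^2 * h as the polygon count grows.

```python
import math

def prism_volume(n, r=1.0, h=1.0):
    # area of a regular n-gon inscribed in a circle of radius r,
    # times the height of the prism
    base_area = 0.5 * n * r * r * math.sin(2 * math.pi / n)
    return base_area * h

for n in (8, 32, 128, 512):
    v = prism_volume(n)
    error = 100 * (math.pi - v) / math.pi
    print(f"n={n:4d}  volume={v:.5f}  error={error:.3f}%")
```

At 8 segments the volume is understated by roughly ten percent; by a few hundred segments the error drops below a tenth of a percent, which is why tessellation density matters for quantity takeoff.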

Parametric geometry in BREP format is necessary mainly where minimal data size is important and it is possible to use resource-intensive and expensive geometry kernels for its processing and display. Most often it is characteristic for CAD programs that use MCAD vendor geometry kernels for this purpose.

In most construction applications, the need for parametric geometry and complex geometry kernels is rare, unless its importance is promoted by the CAD vendors themselves or by the geometry kernel vendors who develop these tools.

As a result, inside CAD (BIM) programs, parametric geometry (BREP, RVT, IFC, PLN) is still converted into MESH triangles and polygons for calculations through the tessellation process. Outside the CAD (BIM) environment, in most cases, triangulated MESH geometry (USD, SVF, glTF, CPIXML, DAE, NWC) is also used for visualization, calculation and collision search, which makes the need for parametric geometry even less obvious.

So which format should be chosen as the standard for data exchange, and is IFC the appropriate exchange format: triangulated IFC-MESH (glTF, OBJ, DAE, SVF) or parametric IFC-BREP (RVT, PLN, DGN)?

8. Why do we need triangles? Using tessellation in construction

A specific CAD (BIM) program should not be the basis of the exchange format to be used both in costing departments and on the construction site. Geometric information should be presented in the format directly, without reference to a geometric core or CAD architecture.

In the construction industry, in systems and databases that utilize design data, the dependency on the CAD editor and geometry kernel should be minimized.

Geometry parametrics from CAD programs can be part of the process, but only as input data, not as the basis for an exchange format. This is the only way to ensure universality and independence of geometry descriptions. Most parametric geometries, including BREP and NURBS, are converted to a triangular mesh (tessellation) for computation and visualization. After tessellation, you still get a triangle MESH. If you can’t see the difference in the end, why complicate the process?

Parametric formats are not fundamental; conversely, formats such as OBJ, STL, glTF, SVF, CPIXML, USD and DAE are, because they use the simple and universal triangle-mesh structure. It is understandable and efficient in computer-graphics architectures, and it does not require additional geometry kernels with tens of millions of lines of code for visualization or element calculations.
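That simplicity can be shown directly. A minimal sketch, with illustrative coordinates, of writing and re-reading a triangle mesh in the Wavefront OBJ format: the entire file is vertex lines and face lines, and the entire “parser” a consumer needs is a plain text loop (OBJ face indices are 1-based).

```python
vertices = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
triangles = [(0, 1, 2), (0, 2, 3)]

# Writing: a complete, valid OBJ file for a triangulated quad
lines = [f"v {x} {y} {z}" for x, y, z in vertices]
lines += [f"f {i+1} {j+1} {k+1}" for i, j, k in triangles]
obj_text = "\n".join(lines)
print(obj_text)

# Reading it back needs no geometry kernel and no SDK
verts, faces = [], []
for line in obj_text.splitlines():
    parts = line.split()
    if parts[0] == "v":
        verts.append(tuple(float(p) for p in parts[1:]))
    elif parts[0] == "f":
        faces.append(tuple(int(p) - 1 for p in parts[1:]))
print(len(verts), len(faces))   # 4 vertices, 2 triangles
```

Compare this round trip with what it takes to consume a BREP container: no schema documentation, no kernel licensing, no vendor SDK.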

Fig. 18: Transition of geometry from parametric format to polygonal representation

Due to problems with IFC interpretation and differences between geometry kernels, all CAD vendors, without exception, use reverse-engineering SDKs to transfer data between different vendors' solutions; none of them relies on the complex IFC or USD formats for its own interoperability.

The MESH USD format, offered from 2023 by CAD vendors in the new AOUSD alliance discussed in last week’s article (An Era of Change: IFC is a thing of the past or why Autodesk and other CAD vendors are willing to give up IFC for USD in 14 key facts), has the potential to become the new standard for replacing parametric geometry in proprietary formats. It can also serve as a means to describe BREP geometry in IFC, which CAD vendors themselves have failed to do so far.

But instead of using concepts promoted by alliances of CAD vendors that they themselves do not use, it is more productive to focus on understanding the benefits of each approach in a specific context and choose one or another type of geometry depending on the use case.

The choice between different geometric representations is a trade-off between accuracy, computational efficiency, and the practical needs of a particular problem.

Figure 19: Tessellated BREP sphere with different number of polygons

The complexity and use of geometric kernels that CAD vendors impose on the construction industry in the processing of design data may not be necessary at all. The USD format with MESH geometry may be a Pandora’s box for the industry, opening up alternative approaches to IFC and BREP data exchange for designers. It turned out that there are other, simpler and more open formats that can provide quality interaction between CAD (BIM) engineers and dozens of other specialists.

One of the most popular construction ERP systems — iTWO/MTWO — is an existing example of such an effective application of MESH geometry in the business processes of construction companies. This product, developed by Germany's RIB Software (now part of France's Schneider Electric), demonstrated the use of the MESH format with a simplified meta-information storage scheme back in the mid-2000s. Instead of the IFC and USD formats, iTWO/MTWO uses the proprietary but readable CPIXML format.

Behind the development of the iTWO/MTWO ERP system is the construction corporation Strabag, and more specifically its subsidiary Züblin from Stuttgart. It was Züblin that initiated the development of iTWO, which at the time was positioned not just as an ERP system, but as an international platform for 4D-5D BIM — the integration of design data with schedules and estimates.

9. Zublin-Strabag’s attempt to subordinate CAD (BIM) vendors to the interests of the construction industry

Züblin-Strabag is one of Europe’s largest construction companies with a deep understanding of all phases of the construction process. STRABAG SE’s turnover reached 17.67 billion euros in 2023. By comparison, the total global market size of CAD software companies, including Autodesk, was estimated at around $18.54 billion in 2021.

The developers at Züblin (Strabag) and RIB Software in Stuttgart were able to create plug-in converters for the major CAD programs that take geometric data and tessellate it into the CPIXML format (an OBJ-like structure, similar to USD). Such triangulated geometry made it possible to calculate more than 150 different volumetric characteristics from primitive MESH geometry outside the CAD (BIM) programs, in a tabular ERP, in addition to those already obtained within the CAD program itself. After the sale of iTWO to France’s Schneider Electric for €1.5 billion, Züblin (Strabag) focused on creating a new product for interoperability in the construction industry.
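The exact quantity algorithms in iTWO are not public, but the principle of deriving volumetric characteristics from bare triangles is simple. As an illustrative sketch, the volume of any closed triangle mesh follows from the divergence theorem, summing signed tetrahedra between each face and the origin:

```python
def mesh_volume(verts, faces):
    """Volume of a closed triangle mesh: sum the signed volumes of the
    tetrahedra formed by each face and the origin (divergence theorem)."""
    total = 0.0
    for ia, ib, ic in faces:
        ax, ay, az = verts[ia]
        bx, by, bz = verts[ib]
        cx, cy, cz = verts[ic]
        # scalar triple product a . (b x c) / 6
        total += (ax * (by * cz - bz * cy)
                + ay * (bz * cx - bx * cz)
                + az * (bx * cy - by * cx)) / 6.0
    return abs(total)

# unit cube as 12 triangles over 8 vertices (index = x*4 + y*2 + z)
verts = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
faces = [(0, 1, 3), (0, 3, 2), (4, 6, 7), (4, 7, 5), (0, 4, 5), (0, 5, 1),
         (2, 3, 7), (2, 7, 6), (0, 2, 6), (0, 6, 4), (1, 5, 7), (1, 7, 3)]
print(mesh_volume(verts, faces))  # ≈ 1.0
```

No geometry kernel is involved: the same loop works on triangles coming from OBJ, CPIXML or any other MESH source.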

Figure 20: History of Zublin-Strabag swallows and the emergence of the OGG geometric kernel

Since the mid-2010s, Züblin (Strabag) has been developing a platform called SCOPE, which acquires geometry from various CAD programs via API connections and reverse-engineering SDKs and converts it into a neutral format based on OpenCascade. This allows geometry data to be used in various business cases without being tied to specific CAD (BIM) programs. The main idea of the project is to separate project data management from CAD applications and to ensure users’ independence from vendor-specific tools. The SCOPE project description echoes the idea of Samuel Geisberg (founder of PTC and mentor of the Revit project), who sought to reduce the influence of CAD programs on the processes of changing and adding data:

During the first eight months of the project, the consortium succeeded in producing the first complete component descriptions in the World Wide Web formats RDF and OWL. To this end, the functionality of the openCASCADE geometry kernel was converted into web-addressable structures. The stated goal of SCOPE is to create web structures to overcome the barriers of digitalization. For this purpose, all components of a digital twin of a building are represented independently of the software and made accessible via web interfaces. If successful, this means that all digitization activities required to create a building twin can be performed by the involved companies, specialized disciplines and industries independently of each other and still be networked, provided the content is delivered on the same technical basis: a single server structure and a single data schema.

Figure 21: SCOPE semantic project by the Züblin-Strabag team

The idea of the SCOPE project is certainly promising and necessary for the industry, but it is being implemented within the closed Züblin-Strabag ecosystem, with changing teams of developers and managers. This risks producing a cumbersome and inefficient system with drawn-out delivery timelines. The OpenCascade-based Züblin project also faces the technical limitations of the OpenCascade geometry kernel itself and the ever-changing APIs of CAD programs. Whether a single company, however talented its specialists, can cope with these issues remains an open question.

In summary, given the limitations faced by developers, SCOPE cannot be called a complete solution at the moment. A development created by a private organization for internal use is not a universal tool.

On the other hand, CAD vendors themselves are making a step towards simplification and want to give the construction industry a new USD exchange format, which is similar to the CPIXML format that has already covered all 4D-7D processes in construction companies in Central Europe.

As a result, companies in the construction industry face a choice in the future: follow the approaches proposed by Züblin and the Fraunhofer Institute, or adopt solutions promoted by HOK and CAD vendors through the buildingSMART or AOUSD alliance.

However, both approaches, whether they originate from the construction industry or the CAD world, come to the same conclusion: for efficient data exchange, it makes sense to use simple flat meta-information storage formats and triangulated formats such as OBJ, CPIXML, DAE, SVF, glTF or USD that store the same element data.

Fig. 22: Different data formats contain identical geometry and meta-information of project elements

After reviewing geometry transfer and its complexities, let’s move on to the second integral part of CAD (BIM) formats: the meta-information and semantic ontology of elements, which features in the official press releases of both the SCOPE and buildingSMART teams.

Similar to the buildingSMART approach, Züblin has applied the concept of data semantics to its SCOPE platform. The basis of this approach is the idea of the semantic web proposed by Tim Berners-Lee: to structure data in such a way that it can be understood not only by humans but also by machines.

10. Emergence of semantics and ontology in construction

Standardization and unification in construction borrowed semantics and ontology from the semantic-web concept of the late 1990s, which buildingSMART adapted for the IFC standard. The basic idea behind semantics is that data should make sense not only to humans but also to machines, allowing them to “understand” information rather than just transmit it. Ontology, in turn, provides clear definitions of terms and their relationships, giving all systems a unified framework.

buildingSMART has attempted to scale this approach to the entire construction industry. In one of the key documents on the future of the IFC5 format, entitled “Future of the Industry Foundation Classes: Towards IFC 5”, the semantic approach is mentioned 32 times, emphasizing its importance for the further development of the standard:

Especially in the Semantic Web domain, much effort has been invested in transforming, modularizing and simplifying the IFC schema (or ontology; according to the typical idiom in this domain). It started with a straightforward transformation of the IFC EXPRESS schema and modifications that led to a more idiomatic ontology (Beetz et al. 2009), as well as analyses aimed at introducing modularity (Terkaj and Pauwels 2017).

The goal of buildingSMART is to create a single universal standard for describing objects and their relationships. This approach should be applicable throughout the construction world, ensuring data unification and improving its interpretation by different systems. Buying membership in buildingSMART gives member companies not only the opportunity to join the future, but also to actively influence its formation today.

Figure 23: 1994 quote from the chairman of the international board of directors of BuildingSMART

However, the implementation of semantics and ontologies does not always succeed. The reality has proven to be much more complex. In the gaming industry, attempts to describe game objects and interactions through ontologies have encountered problems due to the high dynamics of change and the creative nature of the industry. As a result, standard data formats (XML, JSON) and algorithms proved to be more efficient. A similar situation was observed in the real estate market, where the variety of local terms and rapid changes made ontologies overly complex. Simple databases and standards such as RETS performed better in data exchange and processing.

Entities must not be multiplied beyond necessity

Occam’s Razor

Technical difficulties, such as the complexity of markup and high labor intensity of support, as well as low motivation of developers, hampered the development of this idea in other industries. RDF (Resource Description Framework) did not become a mass standard, and ontologies proved to be overly complex and economically unjustifiable. As a result, the ambition to create a global semantic web failed to materialize. Some ideas have been adapted in corporate solutions, but the original goal of creating a single comprehensive graph has not been achieved.

Figure 24. Comparison of relational and ontology databases

Ontologies and semantic technologies promised to create meaning from data, but in practice they work more as a unification and standardization mechanism. Moving from tables to data graphs improves search and unifies the data model, but does not make the data more meaningful to machines. The question is not whether semantic technologies should be used, but where they really make a difference.

The interest in semantic technologies and ontologies in the construction industry is sustained by buildingSMART initiatives. However, the need for full formal logic and a single unified ontology for the whole industry remains controversial. The experience of the Cyc project (from “encyclopedia”) and of the semantic web shows that abandoning the idea of a single universal ontology in favor of local micro-theories, applicable only within a specific task, project or company, may be the more productive approach.

11. Semantics and ontology: how to make data talk?

Thanks to the efforts of buildingSMART, semantics and ontologies have become not only the key idea behind CAD vendor-driven standardization, but also the foundation for projects such as SCOPE, promoted by Züblin (Strabag), and aimed at freeing the construction industry from the dependency of CAD systems.

Semantic technologies are about unifying, standardizing and modifying large arrays of heterogeneous data, and about implementing complex search. However, semantics does not create new meaning or knowledge; in this respect it is no better than other data storage and processing technologies.

Representing data from a relational database as a set of triples does not add meaning to the data itself. Replacing tables with a graph can be useful for unifying the data model, implementing complex searches, and safely modifying business models. However, it does not make the data more “intelligent” — the computer does not begin to understand its meaning any better.

When people speak of storing “OWL data”, that data is in fact stored as RDF triples (RDF: Resource Description Framework; OWL: Web Ontology Language).

Figure 25: Graph data model: Nodes, Edges, and Triples illustrating relationships between building blocks

Theoretically, logical inference by reasoners (programs for automated logical inference) makes it possible to derive new statements from ontologies. For example, if a building ontology records that “a foundation is a support for a wall” and “a wall is a support for a roof”, a reasoner can automatically infer that “a foundation is a support for a roof”. This mechanism is indeed useful for optimizing data analysis, as it eliminates the need to spell out every dependency explicitly. However, it does not create new knowledge; it merely combines already known facts automatically.
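The foundation-wall-roof chain can be sketched with plain Python tuples standing in for RDF triples and a naive forward-chaining loop standing in for a reasoner. This illustrates only the mechanism; it is not real OWL tooling:

```python
# Facts as RDF-style (subject, predicate, object) triples
triples = {
    ("Foundation", "supports", "Wall"),
    ("Wall", "supports", "Roof"),
}

def infer_transitive(triples, predicate="supports"):
    """Naive forward chaining: a supports b and b supports c => a supports c."""
    derived = set(triples)
    changed = True
    while changed:
        changed = False
        for s1, p1, o1 in list(derived):
            for s2, p2, o2 in list(derived):
                if p1 == p2 == predicate and o1 == s2:
                    new = (s1, predicate, o2)
                    if new not in derived:
                        derived.add(new)
                        changed = True
    return derived

closure = infer_transitive(triples)
print(("Foundation", "supports", "Roof") in closure)  # True
```

The “new” statement is just the transitive closure of facts that were already there.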

Logical links in an ontology, if they are needed, can be organized without complex semantic technologies, for example with relational databases (SQL) or CSV and XLSX tables. In columnar databases and formats, one can add a “roof support” column and programmatically ensure that the link from roof to foundation is recorded when the wall is created. This is accomplished without RDF, OWL, graphs or reasoners.
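A minimal sketch of that tabular alternative, using SQLite’s recursive queries to derive the same foundation-to-roof link (the table and column names are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE supports (element TEXT, supported TEXT)")
con.executemany("INSERT INTO supports VALUES (?, ?)",
                [("Foundation", "Wall"), ("Wall", "Roof")])

# Transitive closure with a recursive CTE: no RDF, OWL or reasoner needed
rows = con.execute("""
    WITH RECURSIVE chain(element, supported) AS (
        SELECT element, supported FROM supports
        UNION
        SELECT c.element, s.supported
        FROM chain c JOIN supports s ON c.supported = s.element
    )
    SELECT element, supported FROM chain ORDER BY element, supported
""").fetchall()
print(rows)  # includes ('Foundation', 'Roof')
```

Plain SQL reaches the same derived fact as the reasoner, using tooling every enterprise stack already has.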

Fig. 26: Comparison of graphical and tabular models of representation of the same logical relations

The decision of buildingSMART and Strabag (Zublin) to follow the semantic web concept, which seemed promising and popular in the late 1990s, has influenced the entire construction industry. However, the paradox is that the semantic web concept, which was originally proposed for the Internet, has not been widely used even in its native environment. On the Internet, for which RDF and OWL were developed, these concepts are hardly used today. A full-fledged semantic web in the original architecture has never appeared, and its creation is probably not foreseen.

The idea of creating an Internet where computers would understand the meaning of content proved too complex and uneconomical. The corporations that originally backed the semantic web therefore kept only its useful elements, such as ontologies and SPARQL, and applied them for corporate purposes rather than for the Internet as a whole. A look at Google Trends over the last 20 years suggests the topic has little prospect of revival.

Fig. 27: Interest in the topic of semantic web according to Google query statistics

A logical question arises here: why use triples, reasoners and SPARQL in construction at all, if data can be processed with popular structured-query tools (SQL, Pandas, Apache)? In enterprise applications, SQL is the standard for working with databases. SPARQL, by contrast, requires complex graph structures and specialized software and, judging by Google Trends, does not attract developers’ interest.

Graph databases and classification trees may be useful in some cases, but their application is not always justified for everyday tasks. As a result, building knowledge graphs and using semantic-web technologies makes sense only when data from different sources must be unified or complex logical inference is required. For everyday tasks, such as construction data management, relational databases, CSV, SQL and Excel remain simpler, more accessible and more efficient tools.

12. From graphs to tables: labor costs in grouping and filtering

Any ontology or relation fundamentally describes the parameters of project elements and entities using key-value pairs. The only question is how, and in what form, to communicate these key-value dictionaries. The difference is not in the storage mechanism or structure, but in the depth of semantic understanding and the ability to make meaningful connections between concepts. Whether EXPRESS markup and the IFC ontology are the ideal tool for this is a big question.

Fig. 28: Information about element entities is stored in key-value form as different forms of representation

Other industries have the same elements with parameters and geometry, and similar ontology-transfer problems. However, specialists in those industries transfer meta-information using popular formats such as XML, DB, JSON, CSV, HDF5 and XLSX. A logical question arises: why did the construction industry decide to transfer meta-information in line-by-line EXPRESS markup, a technology whose roots go back to the 1970s IGES-STEP effort, invented in the days of punch cards? Yes, there are converters from IFC to JSON, XML, CSV or XLSX. But then the question arises: why use the intermediate IFC step at all?
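As an illustration of how little machinery those formats need, here is a hypothetical wall record serialized to both JSON and CSV with nothing but the Python standard library (the field names are invented):

```python
import csv
import io
import json

# Hypothetical element records: plain key-value pairs, no EXPRESS schema required
elements = [
    {"id": "w-001", "category": "Wall", "material": "Concrete",
     "volume_m3": 4.2, "area_m2": 14.0},
    {"id": "w-002", "category": "Wall", "material": "Brick",
     "volume_m3": 2.8, "area_m2": 9.5},
]

# JSON: nested and self-describing
print(json.dumps(elements, indent=2))

# CSV: flat and spreadsheet-friendly
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=elements[0].keys())
writer.writeheader()
writer.writerows(elements)
print(buf.getvalue())
```

Either output can be opened by a spreadsheet, a database loader or a one-line pandas call, without any CAD-specific parser.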

It is much more logical to export data from CAD programs directly into JSON, XML or structured CSV/XLSX formats using the reverse-engineering SDKs that all CAD vendors already use. In that case, the IFC intermediate step loses its meaning. And if there is no difference in information completeness between graphs and tables, the choice reduces to picking a data schema and a record format.

The form and schema of the data should fit the use case for particular tasks.

Figure 29: Graph and table structure contain identical information about project elements

The semantic, graph format only simplifies the creation of new relationships, that is, it allows new data types to be added to the graph without any changes to the storage structure.

Compared to relational tables, there is no special, additional data connectivity in a graph — translating two-dimensional database data into a graph does not increase the number of relationships or allow for new information.

To get data into business processes, we should strive to use those tools that help us get results as quickly and easily as possible.

The main task when processing data from CAD (BIM) models and databases remains the same: quickly grouping and filtering the common project database to extract key information. The results of this work are presented as tables, charts or documents, enabling informed, data-driven business decisions.

Figure 29: One of the main use cases for design data in construction is QTO grouping and filtering

Despite identical input and output data, the approaches and time required to perform these operations can vary significantly depending on the CAD (BIM) systems used and, more precisely, on the formats, data schemas and conceptual approaches within them.

For example, the task of obtaining a table of volumes for all types of elements of one category (take the walls category — OST_Walls, IfcWalls) looks different depending on the tool. In the Revit GUI, grouping and retrieving the table takes 17 mouse clicks. Dynamo for Revit automates the process, but requires importing and linking 13 code blocks in its IronPython environment. Writing a Python script for Revit takes about 40 lines of code and knowledge of the API, which gives more flexibility but demands more effort from the engineer or programmer.

Working with IFC files through the Solibri program interface is similar to Revit in terms of labor intensity — here, to get a simple table with volumes from the project, you will also need to perform 17 mouse clicks. When using the Python library IfcOpenShell or JavaScript IFC.js, the processing becomes automated, but this again requires writing about 40 and more than 100 lines of code, respectively.

Additionally, the problem with proprietary formats and formats of CAD (BIM) tools is that, on the one hand, they provide a convenient environment for developing your own algorithms for automating production processes. However, on the other hand, these algorithms become rigidly tied to a specific data format or the way they are interpreted. As a result, automation loses its universality and starts to depend on the specifics of the format, instead of working with entities and being independent of technical limitations.

Figure 30: The same result is achieved with different labor inputs

By normalizing and structuring data from CAD project formats without opening the CAD (BIM) programs themselves (using, for example, reverse-engineering tools), we gain access to data-analytics tools for grouping, filtering and analysis. With such tools, grouping, filtering and processing operations take literally one line of code, whereas in traditional CAD systems and BIM concepts, where data is represented as graphs, you have to traverse hierarchies and classes to reach the elements. As a result, the same actions that require dozens of clicks in a user interface or hundreds of lines of script can be performed faster and more easily on data in structured formats.
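For instance, once elements are exported to a flat table, the wall-volume QTO discussed above reduces in pandas to a single grouping line (the column names here are illustrative):

```python
import pandas as pd

# Flat table of project elements, as exported from any structured format
df = pd.DataFrame({
    "category": ["Walls", "Walls", "Walls", "Floors"],
    "type":     ["W1",    "W1",    "W2",    "F1"],
    "volume":   [4.2,     3.8,     2.5,     6.0],
})

# QTO in one line: total volume per wall type
qto = df[df.category == "Walls"].groupby("type")["volume"].sum()
print(qto)
```

Compare this to the 17 clicks, 13 Dynamo blocks or 40-plus lines of API code the same query costs inside the CAD (BIM) toolchain.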

It is structured and normalized data that gives engineers the ability to move quickly and efficiently between different types of data, eliminating the need to learn unique format schemas, interface features, API connections, and geometry kernels.

Figure 31. In a graph data form, users get to the data they need using an API or user interface

Normalizing and structuring data gives flexibility in processing without requiring significant effort to understand the data schema for each individual use case. Autodesk took up the same idea in 2023, announcing a new era of granular data in which there will no longer be a file system and projects will be broken down into minimal elements, as is already the case with data analytics and structured formats.

If CAD vendors are developing new formats and ways of working with data for their customers, who is responsible for the data formats and processes for the entire construction industry and who is involved in creating standards for their use?

13. In the shadow of ISO and buildingSMART: the war for control of the data format

The problems of cross-platform CAD-MCAD data were first encountered in mechanical engineering, where specialists routinely work with the STEP and IGES formats (the ideological predecessors of IFC) to transfer information between MCAD products and, as in construction, struggle with incompatible geometry kernels. Quoting the article “Choosing the Best CAD File Format”:

All CAD systems also have a geometric modeling kernel that underlies the native format and allows you to create and manipulate geometry. CATIA uses Convergence Geometric Modeler (CGM), Creo uses Granite (g), and Siemens NX and SOLIDWORKS use the Parasolid (x_t) kernel.

The geometric modeling kernel is exactly the same as the native format in terms of pure geometry, since the native format for any CAD system is based on its geometric modeling kernel. You can’t get better geometric accuracy than the kernel format as long as you’re using the kernel for the source or final application. For example, if you have a part from CATIA and you want to open it in MasterCam, save it in Parasolid kernel format, because that is what the MasterCam software application is based on.

Figure 32. Data quality in Closed (Closed BIM) and Open (Open BIM) formats

Programs such as Revit (based on Pro/Engineer) and AutoCAD (based on CADDS 3), as well as most modern geometry kernels, including open-source OpenCascade, came to construction from mechanical engineering. It was in this industry that data exchange standards such as STEP (analogous to IFC for mechanical engineering) were created.

In general, although the problems of mechanical engineers when working with formats are similar and lead us to the issues of using geometric kernels, the approaches to standardizing and promoting these formats in the mechanical engineering and construction industries are markedly different.

In contrast to construction, where the IFC standard is developed and promoted by buildingSMART, standards in mechanical engineering are formed at the level of the International Organization for Standardization (ISO). In ISO, standards are developed with equal participation of all member countries, which makes the process global and more independent. ISO can be seen as the “United Nations for standardization” with its central office in Geneva.

The STEP standard is developed by ISO technical committee TC 184, in close cooperation with the American National Standards Institute (ANSI). Its history dates back to the IGES project, launched in 1979 by a group of CAD users and vendors including Boeing, General Electric, Xerox, Computervision and Applicon. The project received support from the U.S. National Institute of Standards and Technology (NIST) and the U.S. Department of Defense, with development funded and overseen by the military-industrial complex. Later, the STEP standard was created on the basis of IGES, and IFC became its offshoot in the 1990s. Unlike open initiatives, key decisions on STEP development were made behind the scenes, without excessive publicity and noise.

STEP is a frankly North American standard, and its use is not imposed on mechanical engineers: use it if you want, skip it if you don’t. In contrast, buildingSMART, originally created in 1994 to promote the interests of HOK and Autodesk, positions its building-flavored STEP, the IFC, as a global initiative advocating interoperability and universal solutions for the whole world.

Unlike buildingSMART, the developers of STEP are not in the business of selling subscriptions or creating divisions such as chapters and rooms through which to collect contributions for promoting a parametric format without its own geometry kernel, together with a semantic ontology meant to do what the semantic web failed to do.

But even the attempt to bring STEP-IFC to the construction industry and the creation of alliances has not overcome the chaos created by exchange formats. The situation has reached the point where even CAD vendors themselves can no longer support IFC within their products without special SDKs, which we discussed in detail in the article Struggle for open data in the construction industry. History of AUTOLISP, intelliCAD, openDWG, ODA and openCASCADE

14. Why do builders and clients need to control data?

Data creation in construction is a continuous process of generating parameters and converting them into readable formats. Each project entity — a wall, window or foundation — is an object with a set of attributes such as material, type, cost, volume and area. This data needs to be stored somewhere, processed and made available to end users.

Developers of CAD (BIM) programs strive to keep users in their ecosystems, moving more and more data processing every year to cloud-storage partners such as Amazon, Microsoft Azure and Huawei. The main marketing advantage of closed systems is high-quality geometry transfer from the geometry kernel of CAD (BIM) solutions to a cloud MESH format, plus the ability to import and export third-party formats using expensive reverse-engineering SDKs.

But why do ordinary engineers, logisticians, builders, foremen and estimators need geometric kernels and cloud technologies? The accuracy of tessellated formats is sufficient for most construction tasks, and the use of complex geometry kernels and SDKs only complicates the process.

Figure 34. The construction industry faces an inevitable transition to open structured data and open tools

Most modern graphics engines work specifically with triangle meshes, not with BREP geometry. The irony is that all CAD vendors continue to persistently promote complex geometry kernels, sometimes not their own, while visualization, simulation and animation are gradually moving to popular and free tools for non-commercial use — Blender, Unity, Unreal Engine and Omniverse.

By abandoning parametric geometry for data exchange and processing in favor of MESH geometry, dependence on CAD programs and geometry kernels will be significantly reduced. This will make data handling more transparent and accelerate the adoption of already popular formats such as OBJ, glTF, DAE and USD. Recognizing this trend, CAD vendors strive to keep users within their ecosystems by promoting “user-friendly” interoperability formats that are formally positioned as open. In practice, however, exporting them requires a subscription to CAD (BIM) programs, and processing them requires API knowledge and skill in mapping the features of complex geometry kernels.
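How open a triangulated format really is can be shown directly: a complete, minimal Wavefront OBJ writer fits in a few lines of Python (the mesh here is a single illustrative triangle):

```python
def write_obj(verts, faces):
    """Serialize a triangle mesh to Wavefront OBJ: plain text that is
    readable without any geometry kernel (OBJ face indices are 1-based)."""
    lines = [f"v {x} {y} {z}" for x, y, z in verts]
    lines += [f"f {a + 1} {b + 1} {c + 1}" for a, b, c in faces]
    return "\n".join(lines) + "\n"

# a single triangle in the XY plane
obj = write_obj([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
print(obj)
```

The output is plain text that any engine, script or spreadsheet pipeline can parse; no kernel, SDK or subscription is required.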

In today’s design and construction world, the complexity of data access has led to over-engineering of project management. Medium to large construction and design companies are forced to either maintain close relationships with CAD (BIM) solution providers to access data via APIs and products such as Forge and ACC, or bypass the limitations of CAD vendors by using expensive SDK converters for reverse engineering.

Access to your own project data shouldn’t require a special key, a subscription fee to a cloud-based solution, or a magic spell in the form of an API request.

Access to information is a right, not a privilege.

Tim Berners-Lee, inventor of the World Wide Web

Access to open data will open the next Pandora’s box for construction businesses, inevitably leading to the transformation of the entire construction industry.

15. Uberization and open data are a threat to the construction business

Investors and construction finance clients will inevitably realize the value of open data and historical data analytics. This will open up opportunities to automate the calculation of project schedules and costs, allowing better cost control and faster identification of excess costs. For example, if concrete volumes on site can be automatically checked against simple flat MESH project data with structured meta-information, without CAD (BIM) programs and their complex geometry kernels, it will be immediately obvious where volumes are being overstated.

This openness and transparency of data poses a threat to construction companies used to making money from opaque processes and confusing reports, where speculation and padded costs hide behind complex, closed data formats and platforms. Construction companies are therefore unlikely to be interested in fully implementing open data in their business processes. If the data is available and easy for the customer to process, it can be checked automatically, eliminating the possibility of inflating volumes and manipulating estimates.

The loss of control over volume and cost calculations has already transformed other industries, allowing customers to directly achieve their goals. Digitalization and data transparency have transformed many traditional business models, such as cab drivers with Uber, hoteliers with Airbnb and retailers with Amazon, where direct access to information and automated payments have significantly reduced the role of intermediaries.

Figure 35. The construction business will soon face the same processes that cab drivers, hoteliers and retailers faced 10 years ago

Investors, customers and banks have already started to demand transparency in the construction industry as well. The process of opening up data and providing unrestricted access to it is inevitable, and in time open data will become the new standard. Therefore, the implementation of open formats will be demanded most of all by investors, customers, banks and private equity funds: those who are ultimately the end users of the constructed objects.

In the future, the investor’s and customer’s journey from idea to finished building will be akin to traveling on autopilot: without a driver in the form of a construction company, it promises to become independent of speculation and uncertainty.

Figure 36. Transition from decision-making based on the opinions of important professionals (HIPPO) to data analysis in the construction industry: today and tomorrow

Data and processes in all human economic activities are no different from what professionals in the construction industry have to deal with. The era of open data and automation will inevitably change the construction business, just as it has already done in banking, commerce, agriculture and logistics. In these industries, the role of intermediaries and traditional ways of doing business are giving way to automation and robotization, leaving no room for unjustified mark-ups and speculation.

In the long term, construction companies, which today dominate the market by setting price and service quality standards, may lose their role as the key intermediary between the customer and their construction project.

Open, structured data and processes will provide clients and investors with the basis for accurate estimates of project cost and schedule, eliminating the opportunity for construction companies to speculate on opaque data and complex formats. This is both a challenge and an opportunity for the industry to rethink its role and adapt to a new environment where transparency and efficiency will become key success factors. But where does that leave BIM in this story?

16. Do BIM, openBIM, BIM Level 3, and noBIM actually exist, or are they marketing gimmicks?

When we talk about BIM, images of 3D models, collision detection tools, and model viewers in ACC (Autodesk Construction Cloud) come to mind. But if you dig deeper, the question arises: what is BIM really? Is it a set of data, parameters and processes, or just a marketing slogan? To answer this question, we need to go beyond the acronyms and concepts promoted by CAD vendors and look at the essence of working with design information: data and processes.

Any business process in construction does not start with working in CAD (or BIM) tools. In any business process we first form the parameters of the task and define the requirements for future elements: we specify a list of entities, their initial characteristics and boundary values. This is usually done in the form of several columns of a table, database or lists of key-value pairs (1–2).

Only on the basis of these initial parameters are objects created in CAD/BIM programs, automatically or manually via an API (3–4), after which they are again checked for compliance with the initial requirements (5–6). This cycle of definition, creation, verification and correction (2–6) is repeated until the data quality reaches the level required by the target system: documents, tables or dashboards (7).
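The cycle above can be sketched in a few lines of plain Python. All requirement names, keys and limits here are invented for illustration; in practice steps 3–4 would call a CAD/BIM API rather than build dicts.

```python
# Minimal sketch of the define -> create -> verify cycle.

requirements = {  # steps 1-2: task parameters as key-value pairs
    "W-01": {"type": "Wall", "min_thickness_mm": 200, "fire_rating": "REI 90"},
    "W-02": {"type": "Wall", "min_thickness_mm": 175, "fire_rating": "REI 60"},
}

def create_elements(reqs):
    """Steps 3-4: in practice this would call a CAD/BIM API;
    here we simply return plain dicts."""
    return {key: {"thickness_mm": r["min_thickness_mm"],
                  "fire_rating": r["fire_rating"]}
            for key, r in reqs.items()}

def verify(reqs, elements):
    """Steps 5-6: check created elements against the initial requirements."""
    issues = []
    for key, r in reqs.items():
        e = elements.get(key)
        if e is None:
            issues.append((key, "missing"))
        elif e["thickness_mm"] < r["min_thickness_mm"]:
            issues.append((key, "too thin"))
        elif e["fire_rating"] != r["fire_rating"]:
            issues.append((key, "wrong fire rating"))
    return issues

elements = create_elements(requirements)
print(verify(requirements, elements))  # empty list: quality reached, go to step 7
```

An empty issue list ends the loop; any issue sends the process back to correction (step 2).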

Figure 37. Process automation cycle for business processes in the construction industry

If we consider CAD (BIM) as a way of transferring parameters, that is, sets of keys and values originally generated outside the CAD environment (1–2), then it becomes obvious that we are in fact working with a database of parameters (2–3, 5–6). This database is supplemented by various tools and at some point grows from simple requirements into a set of elements with parameters, which a CAD program usually treats as a closed database. Approaching BIM through the prism of this definition, we find principles similar to those used in data analytics and in ETL processes (data extraction, transformation and loading). The question arises: what is unique about BIM if the same approaches already exist in other industries?

For the last 20 years, CAD vendors have positioned BIM as something more than just a database. Marketing-wise, BIM is sold as a parametric tool capable of automating the design, modeling and lifecycle management processes of construction projects. In reality, however, BIM has become more of a tool to keep users on the vendors’ platform than a convenient method of managing data and processes.


Vendors have effectively isolated quality CAD (BIM) data within their platforms, hiding it behind proprietary APIs, SDKs and geometry kernels. This has deprived users of the ability to independently extract, analyze, and communicate data to bypass these ecosystems.

Today, most data analytics processes in the construction industry are similar to those in other industries. These are typical ETL (Extract, Transform, Load) cycles. In banking, for example, data is extracted, transformed into understandable formats and loaded into BI platforms for visualization and analysis. In construction, the same actions have been and will continue to be performed: data extraction from CAD (BIM) databases (Extract), normalization, structuring and analysis (Transform), and uploading to other systems and databases (Load). With full access to the CAD database and reverse engineering tools, we can obtain a flat set of entities with attributes and export them to any convenient open format containing both the geometry and the attributes of design elements, similar to what is implemented in the SCOPE project.
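A hedged sketch of that ETL cycle, using only the standard library: the element data below is invented, and in a real pipeline the Extract step would pull it from a CAD database via an SDK rather than from a literal list.

```python
# Extract -> Transform -> Load with stdlib tools only.
import csv
import io
import json

extracted = [  # Extract: raw entities pulled from a CAD (BIM) database
    {"id": "w1", "category": "Wall", "params": {"volume_m3": 12.4, "level": "L1"}},
    {"id": "s1", "category": "Slab", "params": {"volume_m3": 48.0, "level": "L1"}},
]

# Transform: normalize nested parameters into flat rows of key-value pairs
rows = [{"id": e["id"], "category": e["category"], **e["params"]}
        for e in extracted]

# Load: write the same flat data to open formats (CSV and JSON here)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
print(json.dumps(rows, indent=2))
```

Once the data is flat, any downstream tool, from a spreadsheet to a BI platform, can consume it without a geometry kernel or a vendor API.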

Figure 38. Comparison of data processing workflows: manual CAD (BIM) approach without processing and automated workflow with data preprocessing

When BIM data is transformed into easy-to-analyze formats, that is, structured representations in tables, databases, data warehouses (DWH) and data lakes (DL), developers stop depending on specific data schemas and closed ecosystems. This is what the future of the construction industry looks like: collecting data, analyzing it, validating it, and automating processes with data analytics tools.

Information is the oil of the 21st century, and analytics is the internal combustion engine.

Perhaps BIM (CAD) is not the end goal, but only a stage of evolution. When construction professionals realize that they can work directly with data, bypassing traditional CAD tools, “BIM” as a term will dissolve into a flood of information. We will start talking about granular or structured construction project data, but it will no longer be the BIM we know today.

17. What comes next. Simple formats and user-friendly tools

The key direction of development may be the transition to open and independent solutions free from proprietary SDKs and geometry engines. Instead of trying to “tweak” the IFC format, which remains difficult to work with without special knowledge of its peculiarities and a specialized geometry engine, the industry should pay attention to existing universal formats that have proven their effectiveness: mesh-based OBJ, glTF, DAE, FBX or USD for geometry, and CSV, JSON, XML, XLSX, SQL or YAML for metadata and parameters.
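To make the pairing concrete, here is an illustrative sketch of exporting one element as plain OBJ text for geometry plus a JSON sidecar for parameters. The file content, attribute keys and dimensions are all invented for the example.

```python
# One wall face exported as OBJ (geometry) + JSON (metadata).
import json

vertices = [(0, 0, 0), (4, 0, 0), (4, 0, 3), (0, 0, 3)]  # metres
faces = [(1, 2, 3), (1, 3, 4)]  # OBJ face indices are 1-based

obj_lines = [f"v {x} {y} {z}" for x, y, z in vertices]
obj_lines += [f"f {a} {b} {c}" for a, b, c in faces]
obj_text = "\n".join(obj_lines)

meta = {"id": "wall-001", "material": "concrete", "thickness_mm": 200}
meta_text = json.dumps(meta, indent=2)

print(obj_text)   # readable by virtually any 3D tool
print(meta_text)  # readable by virtually any data tool
```

Both artifacts are plain text: no SDK, schema knowledge or geometry kernel is needed to produce or consume them.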

The future of the construction industry is linked to the creation of simple and accessible tools, similar to those that emerged in 2D graphics in the late 2010s after two decades of Photoshop dominance. Simple flat formats such as JPEG, PNG and GIF, free from the redundant logic of 2D editors’ internal engines, enabled the development of thousands of compatible image processing solutions.

Similarly, the standardization and simplification of 3D formats and meta-information will stimulate the emergence of many convenient and independent tools for working with construction projects. Abandoning the complex logic of “vendor cores” and moving to universal, open formats will create conditions for more flexible and efficient work, as well as open access to data for all participants in the construction process.

Figure 39. CAD/BIM maturity level: from unstructured data to fully structured data and repositories.

There are promising projects in the Open Source community that can serve as examples of lightweight geometric solvers: SolveSpace, with the potential to run in the browser; Web-CAD.org, written in JavaScript; and various free CSG editors. Special attention should be paid to the Unity platform, which provides a ready base for creating complex tools with customizable rendering and extensible functionality. In my 2021 example, the engine of NoteCAD, a browser-based CAD program built on Unity, was reworked with its own geometry solver to match the functionality of the free online SketchUp as closely as possible. The result is a lightweight, free and open source product that runs quickly on any server.

Figure 40. BimCADOnline example of an open source online solution with its own geometric solver

The main principle of industry development should not be the creation of new formats, but the effective use and refinement of existing solutions. It is important to rely on international communities such as OsArch and FreeCAD, and to support teams developing open geometry kernels.

For those who need to work with BREP geometry, Open Source geometry kernels such as OCC and OGG can play a key role. For tasks where mesh geometry is sufficient, CGAL, OpenMesh and MeshLib are promising solutions. Reverse engineering SDKs from companies such as the Open Design Alliance, CAD Exchanger or HOOPS can be used to read and write data in various CAD and MCAD formats. These SDKs were created to give developers equal access to data and to ensure that everyone can develop CAD and MCAD solutions at the level of the leading CAD (BIM) developers.

Figure 41. The OpenDWG Alliance’s goal was to level the playing field for all CAD vendors, regardless of their relationship with Autodesk

Combined with LLM models, such tools will greatly improve the efficiency of the processes of creating, transforming and exporting data to target systems.

18. Emergence of LLM and ChatGPT in project data processes

In addition to the development of open formats and universal tools, Large Language Models (LLMs) such as LLaMA, ChatGPT, Gemini and Claude are revolutionizing the data processing industry. These technologies are fundamentally changing the way project data is handled and analyzed, making processing and automation accessible to everyone involved in the construction process.

Where previously information could be accessed only through complex APIs and SDK interfaces requiring special programming skills, it is now possible to interact with data in natural language.

Within a few years, a democratization of data access is inevitable. Ordinary engineers, managers and planners will be able to get the information they need from project data in structured formats simply by formulating queries in everyday language. For example, it will be enough to ask in a chat: “Show in a table, grouped by type, all walls with a volume of more than 10 cubic meters”, and the LLM will independently convert this query into an SQL or Pandas query, transforming the structured project data into the desired table, graph or finished document.
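The code an LLM might generate for that example query can be sketched as follows. Plain Python is used here so the example stays dependency-free; with Pandas the same filter-and-group is roughly a one-liner, noted in the comment. The wall data is invented.

```python
# "All walls with a volume of more than 10 m3, grouped by type."
# Pandas equivalent (not run here): df[df.volume_m3 > 10].groupby("type")
from collections import defaultdict

elements = [  # invented project data in flat, structured form
    {"type": "Basic Wall", "volume_m3": 12.5},
    {"type": "Basic Wall", "volume_m3": 8.0},
    {"type": "Curtain Wall", "volume_m3": 15.2},
    {"type": "Basic Wall", "volume_m3": 11.1},
]

grouped = defaultdict(list)
for e in elements:
    if e["volume_m3"] > 10:          # filter: volume over 10 m3
        grouped[e["type"]].append(e["volume_m3"])

for wall_type, volumes in sorted(grouped.items()):
    print(f"{wall_type}: {len(volumes)} walls, {sum(volumes):.1f} m3 total")
```

The point is not the code itself but that the user never has to write it: the natural-language request is the interface.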

With each passing day, the construction industry will hear more and more about granular structured data, DataFrames and columnar databases. Unified two-dimensional DataFrames formed from various databases and CAD (BIM) formats will be the perfect fuel for modern analytical tools. Tools like Pandas and similar libraries for working with two-dimensional tables and columnar databases pair well with LLMs thanks to their powerful data processing capabilities and efficient indexing. Such data does not require an understanding of schema formats and becomes an ideal source for RAG pipelines and LLM chatbots such as ChatGPT.

The automation process itself will be significantly simplified: instead of studying the APIs of closed products and writing complex scripts in Python, C# or JavaScript to analyze or transform parameters, it will be enough to formulate the task as a set of separate text commands, which will be assembled into the right pipeline or workflow in the appropriate programming language.
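A toy sketch of that idea: individual "commands" composed into a pipeline. The step names and data are invented; in the scenario described above, an LLM would generate such steps from natural-language instructions.

```python
# Separate commands folded into one pipeline: filter, sort, round.
pipeline = [
    lambda rows: [r for r in rows if r["category"] == "Wall"],       # keep walls
    lambda rows: sorted(rows, key=lambda r: -r["volume_m3"]),        # largest first
    lambda rows: [{**r, "volume_m3": round(r["volume_m3"], 1)}       # tidy values
                  for r in rows],
]

data = [{"category": "Wall", "volume_m3": 12.456},
        {"category": "Slab", "volume_m3": 48.0},
        {"category": "Wall", "volume_m3": 8.21}]

for step in pipeline:
    data = step(data)
print(data)  # walls only, largest first, volumes rounded
```

Each step is independent, so a chatbot can add, remove or reorder commands without touching the rest of the workflow.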

Figure 42. Moving from programming to query creation in the LLM to create a Pipeline process

No more waiting for new products, formats, plug-ins or updates from CAD (BIM) tool vendors. Engineers and builders will be able to work independently with data using simple, free and easy-to-understand tools with LLM chatbots.

Modern data analytics tools, combined with open data formats, will create a new paradigm of work in the construction industry, where the main thing will not be possession of particular software and understanding of its API, but the ability to effectively formulate and parameterize tasks and quickly analyze the resulting data.

In lieu of a conclusion

The situation with interoperability and cross-platform compatibility in the construction industry clearly demonstrates the underlying problems associated with transferring and using data between different CAD (BIM) systems. A prime example is Autodesk, which actively promotes the open IFC format but is itself unable to ensure its correct export from its own products, and has to turn to the reverse engineering SDK from the Open Design Alliance, an organization originally created in 1998 to counter Autodesk’s monopoly over data. Paradoxically, this situation forces all industry players to use similar SDKs to extract and adapt competing CAD solutions’ data for their platforms, effectively undermining the very idea of cross-platform interoperability for which the IFC format was created.

The IFC format, intended to be a universal bridge between different CAD (BIM) systems, in reality serves as an indicator of compatibility problems between the geometric kernels of different CAD platforms, much like the STEP format from which it originally emerged. Despite the active promotion of the format by the buildingSMART organization, the main efforts are focused on the standardization of geometry (which requires a unified geometry kernel that still does not exist) and on the unification of semantics and ontologies for the construction industry. However, these efforts echo the unrealized ambitions of the Semantic Web concept, where expectations far exceeded reality.

Figure 43. Geometry and information in construction processes: from complex CAD and BIM systems to simplified data for analytics

Industry specifics exacerbate this problem. The construction industry, which traditionally lags 10–20 years behind other industries in mastering new technologies, risks repeating this path. If in IT the failures of the Semantic Web were compensated by the emergence of new technologies (big data, IoT, machine learning, AR/VR), the construction industry has no such compensating technologies.

However, there are isolated examples of alternative approaches. Züblin, with their SCOPE project, demonstrates how it is possible to go beyond the classical logic of CAD (BIM) systems. Instead of trying to bend IFC to their needs or relying on proprietary geometry kernels, they use reverse engineering APIs and SDKs to extract data from various CAD programs, convert it into neutral formats such as OBJ or CPIXML based on the open source OpenCascade kernel, and then apply it to hundreds of business processes of construction and design companies. However, despite the progressiveness of the idea, such projects remain part of closed ecosystems that reproduce the logic of single-vendor solutions. As a result, the construction industry is once again trapped in a situation where cross-platform standards like IFC fail to fulfill their mission and local initiatives only partially mitigate the problem.

It is highly likely that CAD vendors will once again succeed in shifting the discussion about access to open data towards “new” concepts, formats and alliances that, like BIM and openBIM, will serve primarily as tools to keep users in proprietary ecosystems. This will once again stall productivity growth in an already unproductive industry, as resources will be directed not at simplifying and optimizing processes, but at maintaining control of ecosystems and attracting users to the new concept.

Figure 44. For the last 30 years CAD (BIM) technologies have failed to increase productivity in the construction industry

The purpose of this analysis is not to criticize existing approaches, but to initiate a discussion on the main question: how to increase productivity in the construction industry, and whether it is possible in principle. I have deep respect for the developers of CAD (BIM) solutions and Autodesk, as well as for the participants and members of buildingSMART. However, perhaps it is time to stop waiting for new concepts from software vendors and focus on independent development. By freeing itself from data access issues, the industry will be able to move to modern and user-friendly tools for working with and analyzing data. This will allow the industry to move in the direction that Samuel Geisberg pointed out back in the late 1980s.

Traditional CAD/CAM software unrealistically restricts making inexpensive changes at the very beginning of the design process. The goal is to create a system that is flexible enough to encourage the engineer to easily consider different designs. And the cost of making changes to the design should be as close to zero as possible.

This is the beginning of a discussion about industry transformation. Only by solving data access issues can we move on to exploring business processes and automating them.

This document discusses general topics in the field of construction and data processes, referencing historical and current industry practices. All trademarks and registered trademarks are the property of their respective owners, and this text is not affiliated with or endorsed by any trademark holders. The content herein is provided for informational purposes and does not constitute legal or technical advice.

1. Trademark Disclaimer:

All product names, logos, and brands mentioned in this document are the property of their respective owners. Use of these names, logos, or brands does not imply endorsement.

2. Non-Infringement Statement:

This document is intended for informational purposes only. The content presented does not aim to infringe upon or claim ownership of any proprietary trademarks, patents, or intellectual property.

3. Fair Use Notice:

This material is provided under the principles of fair use for educational, research, and informational purposes. The content is based on publicly available information and does not claim to represent or replicate proprietary knowledge.

4. Copyright Acknowledgment:

Certain sections of this document reference third-party materials and are duly credited. If there are any copyright concerns, please contact us to address them appropriately.

5. Attribution Statement:

Specific frameworks, formats, or tools mentioned are attributed to their respective authors or organizations. Their inclusion is intended for analysis and commentary.

📈 Open data and formats will inevitably become a standard in the construction industry — it’s just a matter of time. This transition will be accelerated if we all spread the word about open formats, database access tools and SDKs for reverse engineering. Each and every one of you can help in this process. If you find the information you read useful, please share it with your colleagues.

If you’d like to keep up with new updates and articles, sign up for the newsletters on the DataDrivenConstruction website or subscribe on LinkedIn and Medium.

🔗 Original Article: The post-BIM world. Transition to data and processes or whether the construction industry needs geometric kernels, semantics, formats and interoperability

🔗 LinkedIn: The post-BIM world. Transition to data and processes or whether the construction industry needs geometric kernels, semantics, formats and interoperability

👋 I would appreciate your comments and opinions! If any facts and sources raise questions for you or you want to share your own views — please write your thoughts in comments or private messages. Your point of view, observations and comments are important to the discussion. I will be glad to continue the dialog.

Other articles on these topics:

📰 The Age of Change: IFC is a thing of the past or why Autodesk is willing to give up IFC for USD in 14 key facts

📰 The struggle for open data in the construction industry. History of AUTOLISP, SDK, intelliCAD, openDWG, ODA, openCASCADE

📰 Lobbyist Wars and BIM Development. All Parts

📰 The book “DataDrivenConstruction. Navigating the Data Age in the Construction Industry”

Written by artem boiko

For the last ten years I have been working in the construction industry, implementing Python scripts and process automation.
