The use of robotics on construction sites has failed to reach its potential, says a University of Toronto researcher, but with more research, Daeho Kim argues, a higher-order fully autonomous mobile robot could be a step closer to realization.
Kim, an assistant professor in U of T’s Faculty of Applied Science and Engineering, says most current so-called robots that patrol construction sites would be more accurately described as tools that repeat pre-programmed tasks.
A few success stories aside, what is missing is full robotic automation and digitization: robots that use human-level visual artificial intelligence (AI) to fully understand the construction sites where they are deployed.
Attaining the high level of visual AI that will power robotics on sites requires millions of images, but for a variety of reasons, obtaining that number is impractical. Kim and his team are proposing two novel techniques: synthesizing virtual construction images and generating miniature-scale construction images.
“As we are developing new forms of construction robots, the hardware part has made a large stride, for example, Spot from Boston Dynamics, but the software development, that artificial intelligence part, still has a long way to go,” said Kim.
“The problem is that we are lacking training data for construction scenes. DNN, deep neural network, the core engine of visual AI, is a supervised model, which naturally becomes data-greedy. To develop a well-trained artificial intelligence…we need a giant number of well-diversified training images for the construction scenes.”
Kim’s research program was one of 251 university initiatives that were announced as recipients of a total of $64 million in funding from the Canada Foundation for Innovation’s John R. Evans Leaders Fund in September.
His project summary, submitted to the innovation centre, stated, “Robotic solutions with enhanced AIs will collaborate with field workers safely, improving productivity and profitability while offsetting the growing labour shortage. The proposed research project is essential to realizing this vision, delivering optimized field-applicable DNN models — a critical next step in the development of autonomous construction robots.”
The robotics will collect, analyze and document site information, allowing the creation of live digital twin models of ongoing construction sites.
Synthesizing the images that will train the visual AI is necessary, Kim explained, first because it is hard to collect the data in person.
Surveillance cameras and drones suffer from occlusions, are very costly (Kim cited $2 to $10 per image) and present other problems.
Gathering one million images would be time-consuming, and regulations and confidentiality issues add further obstacles.
Commercializing and sharing the data in a competitive construction setting are other problems.
Work is proceeding swiftly at Kim’s U of T lab, with the team using five Tensor Processing Units and Google Cloud software. More computational resources are required.
“We fully focused on developing a simulation software that can automatically synthesize non-real but real-looking construction images, and a few weeks ago, we kick-started generating one million construction training images actively. This is thrilling news to me as, to my best of knowledge, we never had a chance before to use one million training images in construction DNN training,” he said.
Steps in the synthesis include creating a 3D human model, followed by applying motion-capture data from real workers; creating a 3D construction worker avatar by mapping a 2D or 3D clothing map onto the 3D human model; setting the imaging conditions randomly, including camera distance and lighting conditions; and synthesizing construction images or videos by superimposing the virtual construction worker avatar onto 3D construction backgrounds.
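The final compositing step can be illustrated in a simplified 2D form. The sketch below is not Kim’s actual 3D pipeline; it is a minimal stand-in that randomizes placement (a proxy for camera distance) and brightness (a proxy for lighting), alpha-blends an avatar cutout onto a background, and returns the bounding-box label for free — automatic labels being one of the main attractions of synthetic training data. The function name and array shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def composite(background, avatar, alpha, brightness_range=(0.7, 1.3)):
    """Superimpose an avatar onto a background at a random position with
    random brightness; return the image and its bounding-box label."""
    H, W, _ = background.shape
    h, w, _ = avatar.shape
    # Random placement: a crude stand-in for random camera distance/angle.
    y = int(rng.integers(0, H - h + 1))
    x = int(rng.integers(0, W - w + 1))
    # Random global brightness: a crude stand-in for lighting conditions.
    out = background.astype(np.float32) * rng.uniform(*brightness_range)
    # Alpha-blend the avatar cutout into the chosen region.
    a = alpha[..., None].astype(np.float32)
    out[y:y+h, x:x+w] = a * avatar + (1 - a) * out[y:y+h, x:x+w]
    # The label comes for free -- no human annotation needed.
    return np.clip(out, 0, 255).astype(np.uint8), (x, y, w, h)

# Toy example: grey 64x64 background, white 16x16 avatar, fully opaque mask.
bg = np.full((64, 64, 3), 128, dtype=np.uint8)
av = np.full((16, 16, 3), 255, dtype=np.uint8)
mask = np.ones((16, 16), dtype=np.float32)
img, box = composite(bg, av, mask)
```

Repeating this over many avatars, poses and backgrounds is what lets a team generate training images at scale without per-image labelling cost.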
Later comes prototyping of a fully autonomous mobile robot for construction digital twinning that deploys the higher-order DNN models.
The construction robots will need to be able to monitor and analyze location, moving speed and direction, pose, proximity and other factors that capture the presence of construction workers.
“It’s still not clear how effective synthetic images are in training visual AI models for a construction scene, which is highly dynamic and unstructured,” said Kim. “We may or may not need our own unique solution.”
For the final step, Kim will need partners in the private sector; he is seeking an innovative construction firm that would support the research financially.
Follow the author on Twitter @DonWall_DCN