5 CONCLUSIONS AND FUTURE WORK
We have proposed an approach to extract equipment and technical attributes from P&IDs and to retrieve technical documentation from technical sheets, and we have described an architecture to support this approach. We performed experiments on a manually labelled dataset of P&IDs containing 607 equipment items, and the equipment extraction stage achieved approximately 97.2% precision and 71.2% recall.
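For reference, these figures follow the standard definitions (a brief recap of how the reported numbers are computed, not a new result):
\[
\text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN}
\]
With 607 labelled equipment items, a recall of 71.2% corresponds to roughly $0.712 \times 607 \approx 432$ correctly extracted items.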
In the Oil & Gas Upstream industry, EPC projects
take in average 90k hours of technical engineers to
create the asset register, to do criticality studies (to
define the criticality of each equipment), the defini-
tion of the spare parts to be procured and put in stock
for later use, and the definition of maintenance plans,
procedures, and manuals. The proposed architecture,
Gutenbrain, allows the saving of 16k per project, ei-
ther by extraction automatically the equipment infor-
mation or by allowing to search in the technical in-
formation using semantic search in content otherwise
unsearchable. The experimental validation presents
an average reduction of approximately the 60% of
engineers’ effort in cumbersome tasks of extracting
equipment information allowing the saved hours to be
spent by engineers on tasks with higher value. This
approach still requires users to validate the extracted
information and extract the undetected information,
but we provide a user interface for engineers to have
the autonomy to do so.
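To illustrate the semantic-search capability mentioned above, the following is a minimal sketch of searching technical-sheet passages with sentence embeddings. It assumes the sentence-transformers library and an illustrative model name and example passages; it is not the exact Gutenbrain implementation.

```python
# Minimal sketch: semantic search over technical-sheet passages.
# Library, model name, and passages are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Passages extracted from technical sheets (hypothetical examples).
passages = [
    "Centrifugal pump P-101, design pressure 16 bar, flow rate 120 m3/h.",
    "Heat exchanger E-201, shell-and-tube, design temperature 180 C.",
]
passage_embeddings = model.encode(passages, convert_to_tensor=True)

# An engineer's free-text query over otherwise unsearchable content.
query = "What is the design pressure of pump P-101?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank passages by cosine similarity and return the best match.
scores = util.cos_sim(query_embedding, passage_embeddings)[0]
best = scores.argmax().item()
print(passages[best], float(scores[best]))
```

In such a setup, the embedding index is built once per project and queried interactively, which is what allows content locked inside technical sheets to be searched at all.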
In the future, we would like to fine-tune the question-answering models with closed-context data from past projects to improve the results. We would also like to complement the equipment extraction with a computer vision approach that detects symbols within diagrams, the hypothesis being that an ensemble method combining text and symbols might outperform our current approach.
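As a rough illustration of the question-answering component to be fine-tuned, the sketch below runs an extractive QA model over a closed-context passage. The model name and the data-sheet text are illustrative assumptions, not the paper's setup; fine-tuning itself would train such a model on SQuAD-style annotations built from past-project documentation, e.g. with the Hugging Face Trainer API.

```python
# Minimal sketch: extractive question answering over a technical data sheet.
# Model name and context are illustrative assumptions.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",
)

# Closed-context passage from a (hypothetical) past-project data sheet.
context = (
    "Pump P-101 is a centrifugal pump with a design pressure of 16 bar "
    "and a rated flow of 120 m3/h, driven by a 75 kW electric motor."
)

result = qa(question="What is the design pressure of P-101?", context=context)
print(result["answer"], result["score"])
```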