• PostGIS becomes slow once single-point granularity
(using PC_Intersection) is desired. In almost all cases,
working with unindexed files is faster, and this does not
yet include the substantial time needed to import the
point cloud into PostGIS.
5 CONCLUSION
In this paper, we demonstrated how fast search al-
gorithms enable working with raw, unindexed geo-
data. We implemented several experimental applica-
tions that perform queries that are answered ad-hoc
based on raw geodata, without using any index struc-
ture. We released these applications under an open-
source license on GitHub (Fraunhofer IGD, 2021a).
The experiments show that for both building and point
cloud data, ad-hoc queries perform very well in gen-
eral. They achieve times reasonable for practical use
cases and comparable to those of the index-based
solutions. In particular, they allowed us to directly
work with the data without having to wait several
hours for it to be imported into a database. This even
works for data that does not fit into main memory.
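To illustrate the kind of search such ad-hoc queries rely on, the following minimal sketch implements the Horspool variant of Boyer-Moore string search (Horspool, 1980) over a raw byte buffer. The function name and the sample data are hypothetical and are not taken from the released benchmark implementations.

```python
def horspool_find_all(haystack: bytes, needle: bytes) -> list:
    """Return the start offsets of all occurrences of needle in haystack."""
    m, n = len(needle), len(haystack)
    if m == 0 or m > n:
        return []

    # Bad-character table: how far the search window may be shifted,
    # depending on the byte aligned with the last position of the needle.
    shift = [m] * 256
    for i in range(m - 1):
        shift[needle[i]] = m - 1 - i

    hits = []
    pos = 0
    while pos <= n - m:
        if haystack[pos:pos + m] == needle:
            hits.append(pos)
        # Skip ahead based on the last byte of the current window.
        pos += shift[haystack[pos + m - 1]]
    return hits


if __name__ == "__main__":
    # Hypothetical raw, unindexed input; in practice this could be a
    # memory-mapped file that is larger than main memory.
    raw = b'<building id="a"><zip>10001</zip></building>' \
          b'<building id="b"><zip>10019</zip></building>'
    print(horspool_find_all(raw, b"<zip>10019</zip>"))
```

Because the window can skip up to the full pattern length per step, long search patterns are found while inspecting only a fraction of the input bytes, which is what makes scanning unindexed files practical.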
For point cloud data, we also discovered potential
for optimization by changing the data layout within
the LAS file format so a search algorithm has to fetch
less data from disk. Compression is a limiting fac-
tor that makes querying raw point cloud data slow.
We were unable to achieve interactive query response
times for all tested compressed formats due to the
large computational overhead.
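The effect of the data layout can be sketched as follows. This is an illustration under stated assumptions, not necessarily the layout evaluated in the experiments: the 20-byte record (three 32-bit coordinates plus 8 bytes of other attributes) is loosely modeled on LAS point data record format 0, and all function names are hypothetical. The sketch compares an interleaved layout with one that stores all positions contiguously and reports how many bytes a bounding-box query has to scan in each case.

```python
import struct

POS = struct.Struct("<iii")   # x, y, z as little-endian 32-bit integers
ATTR_SIZE = 8                 # remaining attribute bytes per point (assumed)
RECORD_SIZE = POS.size + ATTR_SIZE


def in_box(p, lo, hi):
    """True if point p lies inside the axis-aligned box [lo, hi]."""
    return all(l <= c <= h for l, c, h in zip(lo, p, hi))


def pack_interleaved(points):
    """Interleaved layout: each record holds a position, then its attributes."""
    return b"".join(POS.pack(*p) + bytes(ATTR_SIZE) for p in points)


def pack_positions_first(points):
    """Alternative layout: all positions contiguously, then all attributes."""
    positions = b"".join(POS.pack(*p) for p in points)
    return positions + bytes(ATTR_SIZE * len(points))


def query_interleaved(buf, lo, hi):
    """Return (matching indices, bytes spanned by the scan)."""
    count = len(buf) // RECORD_SIZE
    hits = [i for i in range(count)
            if in_box(POS.unpack_from(buf, i * RECORD_SIZE), lo, hi)]
    # Positions are spread over every record, so the scan spans all records.
    return hits, count * RECORD_SIZE


def query_positions_first(buf, count, lo, hi):
    """Return (matching indices, bytes spanned by the scan)."""
    hits = [i for i in range(count)
            if in_box(POS.unpack_from(buf, i * POS.size), lo, hi)]
    # Only the contiguous position block has to be scanned.
    return hits, count * POS.size


if __name__ == "__main__":
    pts = [(0, 0, 0), (5, 5, 5), (10, 10, 10), (3, 4, 5)]
    lo, hi = (1, 1, 1), (6, 6, 6)
    print(query_interleaved(pack_interleaved(pts), lo, hi))
    print(query_positions_first(pack_positions_first(pts), len(pts), lo, hi))
```

Both queries return the same points, but in this sketch the positions-first layout scans 12 of every 20 bytes, and the attributes of non-matching points never need to be read.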
We believe that our results form the basis for de-
veloping further applications based on ad-hoc queries,
harnessing the power of modern computers. Espe-
cially in the scientific community, we often encounter
single-user scenarios where data is analyzed in a way
that is similar to what ad-hoc queries offer. Integrating
fast search algorithms like those presented in this paper
into existing data analysis libraries could speed up these
scenarios, enabling users to work more efficiently with
geospatial data.
REFERENCES
Alagiannis, I., Borovica, R., Branco, M., Idreos, S., and
Ailamaki, A. (2012). NoDB: Efficient query exe-
cution on raw data files. Proceedings of the ACM
SIGMOD International Conference on Management
of Data, pages 241–252.
American Society for Photogrammetry and Remote Sensing
(ASPRS) (2013). LAS specification, version 1.4 - R13.
https://www.asprs.org/wp-content/uploads/2010/12/LAS_1_4_r13.pdf.
Accessed: 2022-04-06.
Boyer, R. S. and Moore, J. S. (1977). A fast string searching
algorithm. Communications of the ACM, 20(10):762–
772.
Cesium Team (2018). CesiumGS/3d-tiles: Specification
for streaming massive heterogeneous 3D geospatial
datasets. https://github.com/AnalyticalGraphicsInc/3d-tiles.
Accessed: 2022-04-06.
Cura, R., Perret, J., and Paparoditis, N. (2017). A scalable
and multi-purpose point cloud server (pcs) for easier
and faster point cloud data management and process-
ing. ISPRS Journal of Photogrammetry and Remote
Sensing, 127:39–56.
Department of City Planning (DCP) of the City of
New York (2021). Primary land use tax lot output (pluto).
https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-pluto-mappluto.page.
Accessed: 2021-07-10.
DoITT (2021). Department of Information Technology
& Telecommunications (DoITT) of the City of New
York – NYC 3-D building model.
https://www1.nyc.gov/site/doitt/initiatives/3d-building.page.
Accessed: 2021-07-01.
El-Mahgary, S., Virtanen, J. P., and Hyyppä, H. (2020). A
simple semantic-based data storage layout for querying
point clouds. ISPRS International Journal of
Geo-Information, 9(2).
Fraunhofer IGD (2021a). Ad-hoc queries on 3d
building models and point clouds - benchmark
implementations.
https://github.com/igd-geo/adhoc-queries-building-models,
https://github.com/igd-geo/adhoc-queries-pointclouds.
Accessed: 2022-04-06.
Fraunhofer IGD (2021b). Enhanced NYC 3-D building
model. version 20v5.
https://github.com/georocket/new-york-city-model-enhanced/.
Accessed: 2022-04-06.
Gröger, G., Kolbe, T. H., Nagel, C., and Häfele, K.-H.
(2012). OGC city geography markup language
(CityGML) encoding standard. Open Geospatial
Consortium, 2.0.0 edition.
Holanda, P., Raasveldt, M., Manegold, S., and Mühleisen,
H. (2020). Progressive indexes: Indexing for interactive
data analysis. Proceedings of the VLDB Endowment,
12(13):2366–2378.
Horspool, R. N. (1980). Practical fast searching in strings.
Software: Practice and Experience, 10(6):501–506.
Isenburg, M. (2013). LASzip: Lossless compression of li-
dar data. Photogrammetric Engineering and Remote
Sensing, 79(2):209–217.
Isenburg, M., Liu, Y., Shewchuk, J., Snoeyink, J., and
Thirion, T. (2006). Generating raster dem from mass
points via TIN streaming. In Raubal, M., Miller, H. J.,
Frank, A. U., and Goodchild, M. F., editors, Interna-
tional conference on geographic information science,
pages 186–198. Springer.
Krämer, M. (2020). Georocket: A scalable and cloud-based
data store for big geospatial files. SoftwareX, 11.
Liu, H., Van Oosterom, P., Meijers, M., and Verbree, E.
(2020). An optimized SFC approach for nd window
DATA 2022 - 11th International Conference on Data Science, Technology and Applications