INTELLIGENT INFORMATION TECHNOLOGY FOR THE AUTOMATIC GEOREFERENCING OF NATURAL-LANGUAGE ECOLOGICAL TEXT INFORMATION
Keywords: named entity recognition (NER), natural language processing (NLP), data georeferencing, spatial relations, GIS, machine learning, artificial intelligence, ecological text information
The research is devoted to the development of an intelligent information technology for the automated georeferencing of natural-language text information by means of named entity recognition (NER) and natural language processing (NLP), with reference to the geographical objects of vector maps. The training set is formed by partitioning the labeled location entities and organization entities into separate samples. These samples contain entities combined in a certain way: those that characterize areal objects of larger surface area and, separately, those that characterize smaller areal objects, linear objects and point objects. This division of the data makes it possible to organize a multistage verification of the identification results and of the models used, which simultaneously increases the completeness, accuracy and speed of georeferencing for a given set of ecological text information.
Recommendations are developed for applying this technology to Ukrainian, English and other languages, as well as for the algorithm for preparing the input cartographic data with the ArcGIS package of GIS programs. Examples are given of applying separate elements of the proposed technology to real text data on the state of water bodies in the Southern Bug basin.
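The partition of map objects into larger areal objects and smaller areal, linear and point objects, followed by a staged check of the matches, can be illustrated with a minimal sketch. It is not the implementation described in this work: the `Feature` record, the `partition` and `georeference` helpers, the area threshold and the toy gazetteer ("Region A", "Region B", "Riverside") are all hypothetical, and the entity strings are assumed to have already been extracted by an NER model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Feature:
    """A simplified vector-map object (hypothetical record layout)."""
    name: str
    geometry_type: str             # "polygon", "line" or "point"
    area_km2: float                # 0.0 for lines and points
    parent: Optional[str] = None   # name of the enclosing large areal object


def partition(
    features: List[Feature], area_threshold_km2: float
) -> Tuple[Dict[str, Feature], Dict[str, List[Feature]]]:
    """Split features into large areal objects and all remaining objects."""
    large: Dict[str, Feature] = {}
    small: Dict[str, List[Feature]] = {}
    for f in features:
        key = f.name.lower()
        if f.geometry_type == "polygon" and f.area_km2 >= area_threshold_km2:
            large[key] = f
        else:
            small.setdefault(key, []).append(f)
    return large, small


def georeference(
    entities: List[str],
    large: Dict[str, Feature],
    small: Dict[str, List[Feature]],
) -> Dict[str, Optional[Feature]]:
    """Two-stage matching: large areal objects first, then smaller objects."""
    # Stage 1: large areal objects named in the text define the spatial context.
    context = {large[e.lower()].name for e in entities if e.lower() in large}

    results: Dict[str, Optional[Feature]] = {}
    for e in entities:
        key = e.lower()
        if key in large:
            results[e] = large[key]
            continue
        # Stage 2: check smaller objects against the stage-1 context.
        candidates = small.get(key, [])
        inside = [c for c in candidates if c.parent in context]
        chosen = inside if inside else candidates
        # Keep a match only when it is unambiguous; otherwise leave it unresolved.
        results[e] = chosen[0] if len(chosen) == 1 else None
    return results


# Toy gazetteer with made-up names and placeholder areas (not real map data).
features = [
    Feature("Region A", "polygon", 26000.0),
    Feature("Region B", "polygon", 24000.0),
    Feature("Riverside", "point", 0.0, parent="Region A"),
    Feature("Riverside", "point", 0.0, parent="Region B"),
]
large, small = partition(features, area_threshold_km2=1000.0)

with_context = georeference(["Region B", "Riverside"], large, small)
without_context = georeference(["Riverside"], large, small)
```

In this sketch the ambiguous name "Riverside" is resolved only when a large areal object from the same text narrows the search; without such context the entity stays unresolved rather than being assigned to an arbitrary candidate. In a real pipeline the `parent` links would presumably come from a spatial containment query over the prepared vector layers.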