The EO Open Science 2017 conference was held in Frascati, a picturesque town near Rome. It is home to multiple science organizations, including the European Space Agency's Earth Observation headquarters, the Italian National Agency for New Technologies, and several international scientific laboratories.
Commercial impact and opportunities
Beyond purely scientific research, earth observation analytics has many industrial and commercial applications: real estate, agriculture, and financial trading, to name a few. In commercial activity monitoring, for example, an analysis can predict economic results from the occupancy rate of parking lots near supermarkets and the level of activity in trading ports.
In one of the presented cases, StarLab, a company founded by neuroscience and image recognition scientists, worked with the local Barcelona council on the Street Health project, identifying dynamics in the city’s forestation. They now successfully use the same methodology to create services for construction, real estate, and insurance organizations.
On the receiving side, earth observation, like other complex industry ecosystems, has a strong demand for tools that automate data exploration, storage, and integration. This is particularly true for smaller companies and for researchers who are less specialized in the topic. Larger players have more flexibility to create their own tools, but they too are looking for better infrastructure and integrations across software and data processing platforms. Organizations of all sizes would benefit from the ability to integrate geospatial data into their business processes with little or no technical infrastructure or data analytics specialists of their own.
A separate sector was represented by software companies building their solutions on top of data from the Sentinel satellites and other open sources. Sinergise, a Ljubljana-based software development team, explained that they used to build “pyramids”, complex bespoke solutions for each client. At some point they had an aha moment and made a deal with Amazon: they would unpack the data from ESA and upload it to the AWS Open Data storage on Amazon S3 in exchange for free processing time. Later on, they turned this into a self-service platform product, which allows them to move toward more scalable, product-based offerings.
Open data for collaboration and innovation
A vast volume of high-resolution imagery is available from proprietary satellite constellations such as Planet and DigitalGlobe, while the data collected by the Sentinel satellite constellations can be obtained for free from Sentinel Hub. A representative of Planet, a San Francisco-based data provider that operates its own satellite constellation, presented how her company approaches data processing and sharing. Planet offers free demo access to its lower-resolution images. Another example is DigitalGlobe, one of the leaders in commercial satellite surveillance and other services. Google Maps and Google Earth use satellite imagery provided by this company.
Data science toolbox
For our data engineering team, earth observation is one of the areas where we can apply the methodologies and algorithms we use. It was interesting to discuss in detail how the methods Skein already uses in other areas help in gaining insights and designing solutions for satellite applications. Such methods and capabilities include computational infrastructure, image recognition techniques, data storage and ETL (Extract, Transform, Load), technology for harnessing crowd input for algorithm training and analytics, and user-friendly interfaces for presenting analytics results.
Earth observation data comes mostly as images, often as time series and sometimes in very high resolution (VHR), as well as geospatial GIS data and other forms. Correspondingly, a number of machine learning approaches, such as unsupervised learning, K-means, machine vision neural net processing, and pattern recognition, translate well between earth observation research and other industries. In the same way, Python, which our team at Skein uses, is the most popular language among both data scientists and earth observation scientists.
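To illustrate how an unsupervised method such as K-means applies to imagery, here is a minimal sketch in Python using only NumPy. It is not taken from any of the conference presentations. The two-band synthetic "image", the function name `kmeans`, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def kmeans(pixels, k, n_iter=20, seed=0):
    """Cluster rows of `pixels` (n_samples, n_bands) into k groups."""
    rng = np.random.default_rng(seed)
    # Start from k distinct pixels chosen at random (fancy indexing copies).
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every pixel to every center.
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its pixels; leave it if empty.
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels, centers


# Synthetic 20x20 image with 2 bands (made-up values, not real satellite data):
# 200 dark pixels followed by 200 bright pixels.
rng = np.random.default_rng(42)
dark = rng.normal(loc=0.1, scale=0.02, size=(200, 2))
bright = rng.normal(loc=0.8, scale=0.02, size=(200, 2))
image = np.vstack([dark, bright]).reshape(20, 20, 2)

# Flatten to (n_pixels, n_bands), cluster, then fold labels back into a map.
labels, centers = kmeans(image.reshape(-1, 2), k=2)
label_map = labels.reshape(20, 20)
```

The resulting `label_map` assigns each pixel to one of two groups without any labelled training data. In practice a library implementation would likely be preferable to this hand-written loop.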
Use of crowd science
Crowdsourcing of solutions from citizen scientists and anyone else interested in contributing to the greater scientific good is a particularly important topic. It is used for tasks such as training machine learning algorithms and filling in where full automation is not yet achievable with current technology. Similarly, Skein worked with a leading library to automate human-powered library records processing.
One of the major objectives of ESA’s earth observation programmes is to promote citizen science and collaboration, inviting a wide audience of data scientists, engineers, and lay people to help with tasks such as training a machine learning algorithm to differentiate and classify clouds. Clouds are a big deal among image recognition scientists because they obscure photographs of the earth’s surface, and human help is invaluable in training cloud-recognition and clean-up algorithms. As part of Sentinel Hub, Sinergise integrates a manual cloud classification platform.
Quantitative data is collected at a massive scale and is often presented as time series. Since April 2014, big data archives from the Sentinel satellites have been freely available. The scale is staggering, at over 30PB collected daily. Computational infrastructure is a particularly important topic, since processing such volumes of information requires large capacities. Many calculations are most cost-efficient to run on a local server. For tasks that require cloud computing, however, cost remains a challenge, especially given that most of the large providers, such as Amazon, are US companies.
In summary, ESA’s activities and the dynamic development of an ecosystem of technology and collaborators at all levels create a rich platform not only for research and innovation in environmental management, security, and economic development, but also for commercial opportunities. These opportunities both benefit from and foster the development of machine learning, data science, and global computational infrastructure.
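One simple way to turn volunteer input into training labels is a majority vote per image tile. The sketch below is an assumption about how such aggregation could work, not a description of the Sentinel Hub platform. The tile ids, labels, function name, and two-thirds agreement threshold are all hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label and the share of volunteers who chose it."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)


# Hypothetical volunteer votes per image tile (ids and labels are made up).
crowd_votes = {
    "tile_001": ["cloud", "cloud", "clear"],
    "tile_002": ["clear", "clear", "clear"],
    "tile_003": ["cloud", "clear", "cloud", "cloud"],
    "tile_004": ["cloud", "clear"],
}

# Keep only tiles where at least two thirds of the volunteers agree.
training_labels = {}
for tile, votes in crowd_votes.items():
    label, agreement = majority_label(votes)
    if agreement >= 2 / 3:
        training_labels[tile] = label
```

Tiles without a clear majority, such as `tile_004` here, are left out rather than guessed, so they could be sent back for more votes.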