ArcGIS Pro Deep Learning Blog Post
antoniocarlos

580 post(s)
#24-Feb-21 15:25

I was looking at this article on the ESRI Blog (https://www.esri.com/arcgis-blog/products/arcgis-pro/imagery/deep-learning-with-arcgis-pro-tips-tricks/) related to Deep Learning and feature extraction from images. It is the first of three. It seems to me that Manifold 9 already has the infrastructure to do this kind of thing and has had it for some time now. So why not? I'll send a request this weekend. :-)


How soon?

Dimitri


6,512 post(s)
#25-Feb-21 08:22

I agree "deep learning" (also called "AI", "neural networks," and other phrases that all mean the same things these days...) is a sexy thing. You're right that Manifold's mastery of GPU plus the ability to handle big images means a lot of the necessary infrastructure is in place.

But the article does not go into the practical reality of what it takes to use deep learning, which is why very few Manifold people would actually use it.

The key takeaway from that article is this:

At the highest level, deep learning, which is a type of machine learning, is a process where the user creates training samples, for example by drawing polygons over rooftops, and the computer model learns from these training samples and scans the rest of the image to identify similar features.

But the above phrase doesn't tell you how many training samples the user has to provide for the capability to be useful. Typical models (which is what you create by providing training samples) require tens of thousands of training samples, and many models require hundreds of thousands of training samples.

Say you want to create a model that finds swimming pools in aerial photos. First, you need many thousands of aerial photos of swimming pools and then you need somebody who will sit there marking off in each of those photos what's a swimming pool. It's not like you do only five or ten photos and then the machine takes over. Training is a very costly investment of human labor.
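To give a sense of what that labeling feeds into: every hand-digitized photo eventually has to be turned into something a model can consume, typically by rasterizing the drawn polygons into mask images aligned with the photos. A rough sketch of just that one step, assuming rasterio and geopandas (the file names are placeholders; none of this comes from the ESRI article):

import geopandas as gpd
import rasterio
from rasterio.features import rasterize

# One aerial photo plus the polygons somebody drew over its swimming pools.
with rasterio.open("aerial_tile.tif") as src:
    meta = src.meta
    shape = (src.height, src.width)
    transform = src.transform

pools = gpd.read_file("pool_polygons.shp")

# Burn the polygons into a 0/1 mask aligned with the image grid.
mask = rasterize(
    ((geom, 1) for geom in pools.geometry),
    out_shape=shape,
    transform=transform,
    fill=0,
    dtype="uint8",
)

meta.update(count=1, dtype="uint8", nodata=None)
with rasterio.open("pool_mask.tif", "w", **meta) as dst:
    dst.write(mask, 1)

And that is the easy, automatable part: the polygons still have to be drawn by a person, for every one of those thousands of photos.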

The reality is that in most cases it's quicker and cheaper to simply digitize manually. If you're some very big organization that needs to find swimming pools for tens of thousands of different cities and counties around the world, well, then maybe it would be cost effective for you to train up a model that could then use deep learning to automate some of the work.

As you can see from the above, deep learning is one of those things that works great as a marketing pitch so long as the practical limitations on using it aren't considered. ESRI's pitching it hard because it's an easy way for them to be able to say "Hey, we're not idiots about GPU, we have deep learning!" - even though only about one in a thousand of their customers can make use of it.

From an implementation perspective, deep learning is much easier than general purpose parallelism on GPU. Why? Because deep learning is basically the same thing for all applications. It's putting together a neural net, the very same neural net software in most cases, and training it. The difference between looking for swimming pools and looking for roadways or rooftops is just the training material you feed it. Even easier is that Nvidia hands out free libraries that do the neural net for you on the GPUs. There is basically no GPU code to write, which is why an organization with such poor GPU skills as ESRI can do it.
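To make that concrete, here's a minimal sketch of what training looks like with an off-the-shelf network, assuming PyTorch and torchvision (nothing ESRI or Manifold ships), where PoolDataset is a hypothetical class that yields (image, mask) pairs built from the hand-labeled samples. Swap that dataset for rooftops or roads and everything else stays the same:

import torch
from torch.utils.data import DataLoader
from torchvision.models.segmentation import deeplabv3_resnet50

device = "cuda" if torch.cuda.is_available() else "cpu"
model = deeplabv3_resnet50(num_classes=2).to(device)   # 2 classes: pool / not pool

# PoolDataset is hypothetical: it yields image tensors and integer class masks.
loader = DataLoader(PoolDataset("training_tiles/"), batch_size=8, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(20):
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device)
        out = model(images)["out"]          # (N, 2, H, W) logits
        loss = loss_fn(out, masks)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "pool_model.pth")   # the trained "model" people talk about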

In contrast, every use of GPU for parallel processing is different: it's a different parallel algorithm with different math to process, say, filters as compared to cosines, and you have to write that code in each case. That is much harder to do, which is why ESRI to this day still has only three simple raster functions that are GPU parallel.
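To illustrate the difference, here is a toy sketch, assuming Numba's CUDA support, of nothing more than a 3x3 mean filter kernel; a cosine, a slope calculation or a viewshed would each need entirely different code like this written for it:

import numpy as np
from numba import cuda

@cuda.jit
def mean_filter_3x3(src, dst):
    # Each GPU thread handles one interior pixel of the raster.
    y, x = cuda.grid(2)
    if 1 <= y < src.shape[0] - 1 and 1 <= x < src.shape[1] - 1:
        total = 0.0
        for dy in range(-1, 2):
            for dx in range(-1, 2):
                total += src[y + dy, x + dx]
        dst[y, x] = total / 9.0

raster = np.random.rand(2048, 2048).astype(np.float32)
out = np.zeros_like(raster)

threads = (16, 16)
blocks = ((raster.shape[0] + 15) // 16, (raster.shape[1] + 15) // 16)
mean_filter_3x3[blocks, threads](raster, out)   # Numba copies the arrays to and from the GPU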

It's true Manifold has most of the infrastructure required to support neural networks applied to raster processing. But Manifold doesn't have all of it. The missing parts are a variety of user interfaces to enable training and to manage saving, loading and use of models that have been trained. There is also some very straightforward work to manage installation and use of machine learning modules that NVIDIA provides for use with NVIDIA GPUs, and to do that in a way that doesn't turn Manifold and Viewer installation packages into utter bloatware. All of that stuff is easy - no rocket science involved - but there's a fair amount of it.

But when you do that, what do you have? A collection of features that allow somebody who has the time and resources to acquire tens or hundreds of thousands of images and then to hand-process each of those images to teach the model what's desired. That's not something most people in the Manifold user community have the resources to do.

And, even for those who have the resources, it takes a really huge investment in training labor to get results that are worthwhile. A good example is in this topic, which uses a building footprints data set created by Microsoft using a neural network (that is, a deep learning process). Despite effectively infinite resources, the result of Microsoft's deep learning process is really awful and inaccurate, as examples in that topic show.

There are cases where deep learning can help despite inaccuracies. Finding swimming pools is one of those cases, because it allows tax assessors to hunt for undeclared pools using the output of a well-trained model as a first cut filter of possible pools to check. But getting all the training done is a big deal even for organizations that are as large as, say, Los Angeles County tax office.

One shortcut to that is to provide a capability to load pre-trained models that other organizations have created and then sell. I think that's where ESRI is going with this, since it's clear that while few people have the resources to train models, many more people would like to take advantage of somebody else's training investment. But that, too, is an expensive, niche interest, not a broad interest like many other things within the Manifold user community.

I don't doubt this is something Manifold will do eventually, as it's a cool thing and, you're right, given Manifold's ability to handle GPU well and to handle big rasters well, it's an intellectually appealing fit. But first Manifold has to focus on a variety of pending goals of broader interest. :-)

antoniocarlos

580 post(s)
#25-Feb-21 16:23

Thanks so much for your response. I guess I don't have to send the request yet. :-).


How soon?

joebocop
457 post(s)
#25-Feb-21 19:46

It's a cool idea! Even the ability to consume the pre-trained model format they are advertising (dlpk in the form of zip files) would be great though.

"Pretrained deep learning packages (dlpks) are becoming more readily available as the trend of deploying deep learning workflows shifts from complex Python scripts to out-of-the-box tools."

A DLPK file seems to be "... composed of the Esri model definition JSON file (.emd), the deep learning binary model file, and optionally, the Python raster function to be used." (https://enterprise.arcgis.com/en/portal/latest/use/deep-learning-in-raster-analysis.htm)
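Going only by that description, and since a dlpk is just a zip, cracking one open doesn't look hard. A rough sketch using only the Python standard library; the file name is a placeholder and the .emd key names are from my reading of the ESRI docs, so treat them as assumptions:

import json
import zipfile
from pathlib import Path

dlpk = Path("building_footprints.dlpk")   # placeholder name

with zipfile.ZipFile(dlpk) as z:
    z.extractall("unpacked_dlpk")

# The .emd JSON describes the model: framework, tile size, classes, etc.
emd_path = next(Path("unpacked_dlpk").rglob("*.emd"))
emd = json.loads(emd_path.read_text())

print(emd.get("Framework"))    # which deep learning library the binary model needs
print(emd.get("ModelFile"))    # name of the binary weights file inside the package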

The dlpks obviously wouldn't need to be included in Release 9 installers, but could be consumed off disk after being downloaded from their source. Of course, a full environment for training, saving, modifying models "would be great", but represents a huge endeavour, and probably is bad ROI for Manifold engineering.

ESRI has done the labour of creating the models and packaging them for use; Manifold would just need to map their facilities onto the equivalent Release 9 functions ("geoprocessing tools"), if they exist.

Would be incredible to demo this functionality with the free Viewer application, especially for forest fire fighting applications, I think.

dchall8
848 post(s)
#25-Feb-21 23:24

When I was with the appraisal district we purchased aerial imagery from EagleView, formerly Pictometry. For the low low price of roughly $235,000, they flew the county and sent the images to India for processing. They identified every building and provided tools to do some rudimentary and advanced measurements without us having to set foot on the property. In Texas, landowners do not have to let us onto their property, so this service paid for itself quickly. If you subscribe to annual or semi-annual updates, the updates will highlight buildings which are new, missing, or have changed in size significantly. They can also identify swimming pools.

EagleView can also provide LiDAR imagery and 4-band infrared.
