Images can be Inefficient

Most of the visual impressions we see published on the web and in print media are images. See the Images topic to understand the structure of images.


Images consist of a sea of pixels that are distinguished from each other only by the color of the pixel. Unlike drawings, there are no "objects" in images. Any objects we see in images are really illusions in the human mind. When we see a cloud of pixels that appears to be a line from far away, the appearance of the line is an interpolation created by the human eye and brain. There is no "line" actually there.


Newcomers to GIS are often seduced into using images inappropriately for a variety of reasons:


·      Images are easy to create by scanning photographs or paper maps.

·      When scanned from paper maps, images can have a very rich visual appearance.

·      Creating new drawings with high detail is a tedious and time-consuming process.

·      Formatting drawings to provide a rich visual appearance is very time consuming and requires good taste and expertise.


However, one can pay a high price for using images inappropriately within GIS. There are many circumstances where using images is the right thing to do and there are times when using images is a very bad idea. One must be aware of the benefits and costs of using images. In particular, using images as a replacement for vector drawings is rarely a good idea. Using images for photographs, scanned data samples and other continuously varying, analog data representations is fine.


Since the usual problem is inappropriate use of images, the following sections focus on the downside of using images.


Large Size / Low Information Content


When used to replace vector drawings, images can be unacceptably large for the amount of information they contain. Let's see how by doing a thought experiment. Suppose we have a room with a large, smooth floor like a basketball court and we want to draw a line on the floor that shows an outline of Europe. We could proceed in at least two ways.


One way to draw the line would be to place thumbtacks at places marking the outline and to then stretch a ribbon or string between the thumbtacks to mark the outline in a "connect the dots" fashion. This method is analogous to how a drawing works in that the only data necessary to keep on hand are the coordinates that mark the shape of the outline.


Another way would be to cover the entire surface of the floor with small, flat pebbles. We would use white pebbles throughout the entire floor except for those places where we wanted to mark the outline of Europe. To mark the outline, we could carefully replace some white pebbles with black pebbles. If we placed the black pebbles carefully, we could view the floor from an elevated location and see the outline of Europe in the pattern of black pebbles winding its way through a sea of white pebbles. This method is analogous to how an image works. The pebbles are the same as pixels.


It takes no special genius in computer software design to consider the above situation and realize that marking a line on the floor using a ribbon to connect the essential locations that define the shape of Europe is a lot faster and more efficient than placing millions of pebbles on the floor. It is also less wasteful of resources. In the case of using pebbles, most of the "pixels" placed on the floor are not necessary. They simply provide a sea of white against which the black pebbles become visible to our eye as a pattern that we see as the shape of Europe.


Note also that in the first case we have a true object, the ribbon, which makes up the outline of Europe. In the second case there is no object, just a visual effect we reckon to be an outline by virtue of contrasting color.


Suppose we had a computer file that told us where to place thumbtacks in the first method, and another computer file that listed the location and color of each pebble in the second method. The file for the first method would have to list only the coordinate locations needed to define the shape of Europe, perhaps only a few thousand numbers. With the pebbles, though, each pebble's location and color must be noted whether or not the pebble participates in showing the actual shape of Europe. The computer file for the second method would therefore use millions of numbers to contain the same information as the "drawing" file.
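
The thought experiment can be put into rough numbers. The sketch below is illustrative only: the thumbtack count and the pebble grid dimensions are assumed values chosen for the example, not measurements. It also assumes the most favorable case for the pebbles, where each pebble's position is implied by its order in the grid, so only its color is stored.

```python
# Illustrative count of stored numbers for the two methods.
# All figures below are assumptions made up for the thought experiment.

def vector_number_count(vertex_count: int) -> int:
    """Each thumbtack needs two numbers: an x and a y coordinate."""
    return vertex_count * 2


def raster_number_count(columns: int, rows: int) -> int:
    """Each pebble needs one color value; position is implied by grid order."""
    return columns * rows


outline_numbers = vector_number_count(2_000)         # assumed 2,000 thumbtacks
pebble_numbers = raster_number_count(3_000, 2_000)   # assumed 3,000 x 2,000 pebbles

print("numbers in the drawing file:", outline_numbers)
print("numbers in the image file:  ", pebble_numbers)
print("ratio:", pebble_numbers // outline_numbers)
```

With these assumed figures the "image" file holds six million numbers against four thousand for the "drawing" file, and the gap widens as the pebbles get smaller.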


The situation in GIS is very similar. A drawing showing urban streets in high detail might be only five megabytes in size. An image showing the same streets as a raster could easily be 125 megabytes in size. Although images can be compressed to eliminate some of their waste, it seems unreasonable to first choose a highly inefficient method and then spend a lot of time trying to make it somewhat more efficient. Compression also does not eliminate the need to manage pixels in uncompressed form when the images are actually used within a GIS.


There are cases, of course, where images can be an efficient way of presenting data in terms of the amount of information captured for a given size of file. The obvious case is photographic images. Suppose the floor of our hypothetical basketball court was covered with an enlarged photographic image of a region shot from space. Each "pebble" or pixel could be the right color to form an overall photographic image. To capture the same image using a drawing would not gain any efficiency since the drawing would have to contain at least as many points as pixels in the image.


In another example, suppose we wanted to present a detailed pattern of data such as minute variations in temperature or reflectance over our basketball floor. In that case, a vector drawing representation would have to put points at every location we wished to note together with a measurement. There would be as many vector points as there would be "pebbles." In this case as well, an image makes sense. In general, where the data represented are analog or continuously varying in nature, images are a good choice.


Low Accuracy


Returning to our example of two ways of drawing an outline of Europe, it's clear that the drawing approach gives a precise position for every thumbtack marking the outline and for the ribbon or string that passes between them. In contrast, when marking the outline of Europe with black pebbles placed within a sea of white pebbles we will inevitably confront "the jaggies" where the pattern of black pebbles stair-steps between pebble locations.


No matter how far we zoom into a vector drawing, the accuracy of the shapes drawn between coordinates is perfect. A line drawn between various coordinates will always be perfectly razor-sharp at any zoom level.




With images, in contrast, we can always zoom into the image to the point that pixels appear as large square shapes. At that point, it is not exactly clear where lines may be located. We may think we can interpolate by eye but that will not work if we zoom even further into the image. Images are therefore "accurate" only if they are seen from far enough away that our eye doesn't see the inaccuracy of the image.
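
The stair-step effect can be measured. The following is a minimal sketch, not a production rasterizer: it snaps a straight line to the nearest pixel row in each pixel column and records how far each pixel center lies from the true line. The endpoints are arbitrary illustrative values, and the function assumes a line that is wider than it is tall.

```python
import math

# Snap a straight line onto a pixel grid and measure the stair-step error.

def rasterize_line(x0, y0, x1, y1):
    """Return (column, row, error) for each pixel column the line crosses.

    Assumes integer endpoints with x1 > x0 and a slope no steeper than 1,
    so that one pixel per column is enough to mark the line.
    """
    pixels = []
    slope = (y1 - y0) / (x1 - x0)
    for column in range(x0, x1 + 1):
        true_y = y0 + slope * (column - x0)
        row = math.floor(true_y + 0.5)      # nearest pixel row, ties round up
        pixels.append((column, row, abs(true_y - row)))
    return pixels


pixels = rasterize_line(0, 0, 8, 2)
rows = [row for _, row, _ in pixels]
worst_error = max(error for _, _, error in pixels)

print("pixel rows:", rows)
print("worst vertical error in pixels:", worst_error)
```

The rows climb in steps rather than smoothly, and the marked pixel can sit up to half a pixel away from the true line. In ground units that error is half the pixel size, whatever the zoom level.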


A further point of inaccuracy is that most images representing vector maps are derived from paper maps, which in turn were created by human, subjective interpolation of vector data. For example, terrain elevations are usually measured at specific points. When those points are marked out and joined by contours, the contour drawing is inevitably a process of interpolation. When maps are printed, they are further interpolated into the pattern of small dots used in the printing process.


To convert a printed map into a digital image, the printed map is interpolated yet again when it is scanned. The result is a highly irregular pattern of pixels like that seen above (taken from a USGS digital image of a scanned paper map) that represents several levels of interpolation from whatever original data was used to create the map.


Limited Data Content


Drawings can contain arbitrarily rich data for each object. In a drawing, each object is connected to a record in a database table. That table can contain many fields of very rich data types, and the table can be linked to other tables via relations. Drawings can therefore act like visual windows into a database. The database can also act as an algorithmic window into the drawing since objects in drawings can be selected using SQL and other database methods.
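
As a rough illustration of selecting drawing objects through their table records, the sketch below uses Python's built-in sqlite3 module as a stand-in database. This is not Manifold's own query interface, and the table name, field names and street records are hypothetical.

```python
import sqlite3

# A stand-in attribute table: one record per drawing object.
# Table, fields and values are hypothetical examples.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE streets (object_id INTEGER PRIMARY KEY, name TEXT, lanes INTEGER)"
)
connection.executemany(
    "INSERT INTO streets VALUES (?, ?, ?)",
    [(1, "Main Street", 4), (2, "Mill Lane", 1), (3, "Ring Road", 6)],
)

# Select objects by what they are (an attribute), not by what color they are.
selected_ids = [
    row[0]
    for row in connection.execute(
        "SELECT object_id FROM streets WHERE lanes >= 4 ORDER BY object_id"
    )
]
connection.close()

print("selected objects:", selected_ids)
```

The query returns the identifiers of the matching objects, which is the kind of selection an image of the same streets cannot offer.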


Images, in contrast, can contain no data other than the color of their pixels. With an image there is no connection to database information and little opportunity to select except by color.


Drawings also have the advantage that the specific coordinates that define the objects within the drawing also define precise spatial relationships between those objects. The drawing provides a rich set of implied data by virtue of arrangement of the objects it contains. One can use drawings to compute relationships such as how much of a particular object is contained within a different object, or to find the longest line or the largest area.
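
Because the objects are defined by coordinates, such questions reduce to simple arithmetic. The sketch below is one way to find the longest line and the largest area, using straight-line distance and the shoelace formula on planar coordinates. The object names and coordinates are invented for illustration, and a real GIS would also account for projections and units.

```python
import math

def line_length(points):
    """Sum of straight segment lengths along a line object."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(points):
    """Shoelace formula. The ring is given without repeating the first vertex."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Hypothetical objects in planar coordinates.
lines = {
    "road": [(0, 0), (3, 4)],
    "path": [(0, 0), (1, 0), (1, 1)],
}
areas = {
    "park": [(0, 0), (4, 0), (4, 3), (0, 3)],
    "plot": [(0, 0), (2, 0), (0, 2)],
}

longest_line = max(lines, key=lambda name: line_length(lines[name]))
largest_area = max(areas, key=lambda name: polygon_area(areas[name]))

print("longest line:", longest_line, line_length(lines[longest_line]))
print("largest area:", largest_area, polygon_area(areas[largest_area]))
```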


Images have no such rich spatial relationships because they have no objects, only pixels. There are no objects in the image so there is no way to say where one object ends and the next begins. With images, where an object begins or ends or whether it exists at all is a matter of opinion, not geometry.


Some enthusiasts of an exclusively raster approach to GIS may complain that there are methods for assigning database information to regions of pixels and to "classify" or otherwise assign object characteristics to regions of pixels. However, what is really going on in such cases is the creation of a set of meta-information that is really a type of vector drawing. It is more efficient to simply use real drawings, perhaps in combination with images.


Restricted User Lifestyle


When using images to represent vector data better represented as drawings, one often ends up with file sizes that are larger by a factor of twenty or more than the equivalent vector data set. Manipulating such large amounts of data takes far more computer resources than are required for the vector drawing equivalent. From a user interface perspective, the result is that operations with large images will be so slow that interactive work will be difficult.


When operations run many times slower the effect is a reduction in the quality of life of a user. A fast, instantaneously responsive user interface is a joy to use. It enhances the mental engagement of users with their work. A slow, tedious interface that requires pauses of many seconds between all operations causes unhappiness and stress. Using images inappropriately as replacements for vector drawings is a pathway to unnecessary stress.


Using images has real financial effects as well. When every layer in a map requires over one hundred megabytes, the machine required for operation will be considerably more expensive than that required when every layer is a mere five megabytes in size. Because the number of pixels to be processed in an image goes up by the square of any increase in linear dimension, the processor speed required increases much more rapidly than the apparent increase in size of an image. An image that appears only slightly larger, about 40 percent wider and taller, contains twice as many pixels and could require twice the processor speed to maintain the same level of system responsiveness.
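
The square-law growth is easy to check. This short sketch computes how the pixel count changes when an image's width and height are both scaled by the same factor. The scale factors are arbitrary examples.

```python
def pixel_growth(linear_scale: float) -> float:
    """Factor by which the pixel count grows when width and height
    are both multiplied by linear_scale."""
    return linear_scale ** 2


for scale in (1.1, 1.41, 2.0, 3.0):
    print(f"{scale}x wider and taller -> {pixel_growth(scale):.2f}x the pixels")
```

Doubling the linear dimensions quadruples the pixels to be processed, and a scale factor of about 1.41 (the square root of two) is already enough to double them.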


Therefore, the costs of running large images go up faster than the size of the images that can be reasonably used at various machine price points. Large images also require large amounts of RAM and larger hard disks for storage. Finally, projects involving large images are more difficult to transmit over the Internet or private wide area networks for collaborative work than projects based on equivalent drawings.


Advice to the User


·      Avoid using images to replace vector drawings. Instead, make a special effort to acquire the necessary vector drawings and invest time into formatting those drawings.

·      If no drawing exists and only an image is available to represent a vector drawing (such as a scanned image of a paper map), invest the resources required for creating a drawing from the image. Use Manifold's tracing tools or hire a digitization contractor to create a drawing version.

·      Sometimes, of course, it makes sense to bow to expediency and simply use a fast machine and a scanned image of a paper map.

·      Use images for photographic data.

·      Use images for data sets that represent continuously varying data such as terrain elevations, temperatures or other intrinsically "analog" data.

·      Use images in combination with drawing layers to enhance presentations through artistic effects.