Manifold System 9.0.170.1
adamw


10,447 post(s)
#12-Dec-19 14:35

9.0.170.1

Here is a new build.

manifold-9.0.170.1-x64.zip

SHA256: 4bb3733da22894e128a38f3d8f37f028b5b3e1f17dc31e32a1651e24d9e72309

manifold-viewer-9.0.170.1-x64.zip

SHA256: 5a77ff2cb88c3dc21bde695ce2cae634184d24c9682adead94c3ddd8411c7c2e

adamw


10,447 post(s)
#12-Dec-19 14:36

Rasters

The main focus of this build was making better use of heavily multi-core systems, particularly for raster computations. We started with watersheds and related algorithms, but the improvements we made are pretty general in nature and we are going to carry them to other algorithms in the future.

As an illustration, here's what computing watersheds looked like prior to this build, using a reasonably-sized test image on a 24-core system:

  • 4 threads: 10:30 (10 minutes 30 seconds)
  • 8 threads: 7:00 -- a fair improvement over 4 threads
  • 16 threads: 6:55 -- basically the same as 8 threads
  • 24 threads: 7:50 -- no improvement, actually a bit of a slowdown
  • more threads: no further improvement; the times stay around 7:00

There is always a saturation point after which adding threads does not help, but we did not like that this saturation point was so low and so easily reached on modern systems.

After our changes, computing watersheds on the same image on the same system looks like this:

  • 4 threads: 6:40
  • 8 threads: 3:20 -- a fair improvement over 4 threads
  • 16 threads: 2:40 -- a fair improvement again
  • 24 threads: 2:20 -- a small improvement
  • more threads: no further improvement

The saturation point is much further out and the overall performance is significantly better. With time, we will push the saturation point even further, even for algorithms as heavily optimized as watersheds in 9, but we feel this is a good step forward.
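As a rough way to read these numbers, one can fit Amdahl's law, T(n) = s + p/n, to two of the post-change timings. A back-of-the-envelope sketch in Python (the real algorithm's scaling is of course more complex than a fixed serial part plus a perfectly parallel part):

  def fit_amdahl(n1, t1, n2, t2):
      # Solve T(n) = s + p/n for serial seconds s and parallel seconds p.
      p = (t1 - t2) / (1.0 / n1 - 1.0 / n2)
      s = t1 - p / n1
      return s, p

  # Post-change runs from above: 4 threads -> 400 s, 24 threads -> 140 s.
  s, p = fit_amdahl(4, 400.0, 24, 140.0)
  print(round(s), round(p))      # ~88 s serial, ~1248 s parallel
  print(round((s + p) / s, 1))   # implied ceiling: ~15.2x over one thread

On this fit, roughly 88 of the 4-thread run's 400 seconds behave as serial work, implying a speedup ceiling of about 15x over a single thread, which is consistent with the gains flattening out around 24 threads.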

A number of the implemented improvements benefit heavily from increasing the cache size in the Options dialog. If you have a system with 16 GB of RAM or more, consider increasing the cache size from the default 4 GB to 8 GB; half of the available physical memory, or slightly less, is reasonable for a desktop system with a single user session.
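A minimal sketch of that sizing rule in Python, assuming the psutil package is available (the heuristic is from the paragraph above, nothing Manifold-specific):

  import psutil

  total_gb = psutil.virtual_memory().total / 2**30
  suggested_gb = max(4, int(total_gb // 2))  # never below the 4 GB default
  print(f"{total_gb:.0f} GB installed -> try a cache of ~{suggested_gb} GB")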

As we said, the improvements are pretty general in nature and we are going to carry them to other algorithms beyond watersheds. We are also working on the next version of the MAP file, which will have quite a number of these improvements built into it directly. We also have a bunch of improvements planned for vectors.

Other

Exporting an image whose channels have been remapped using the Style pane to TIFF now writes data with the remapped channels, so that the exported image looks the same as the original.

(Fix) Pressing Test in the Database Login dialog for a MySQL database now actually attempts to connect to the database instead of merely validating the connection string.

(Fix) Attempting to test a connection to an Oracle database no longer logs an error into the log window if the attempt to connect fails.

The Options dialog includes a switch to control access keys: auto (default) / show / hide.

(Fix) Importing a SHP file with no associated coordinate system info no longer sets the coordinate system of the produced drawing to the default pseudo-Mercator system; the coordinate system is now left unfilled.

New coordinate transform type: NADCON5. Built-in coordinate system data include EPSG transforms based on NADCON5. (NADCON5 is essentially the next version of NADCON, with updated grids and an optional adjustment for altitude; see the sketch after the list for how such grid files are applied conceptually.)

GRIDS.DAT has been updated to include grid files for NADCON5 transforms.

The new version of GRIDS.DAT is backwards-compatible with the old version and can be downloaded from:

http://www.manifoldsoftwarelimited.mobi/updates/working/grids.dat (379 MB)

Reprojection dialogs collapse names of grid files in conversion paths from 'xxxxx.las + xxxxx.los' to 'xxxxx.las / los' to avoid redundant repetition.

Reprojection dialogs allow selecting a conversion path when converting between systems which have equal parameters but different EPSG codes. (EPSG data contains a fair number of such conversions.)

Exporting data to various ESRI formats writes the coordinate system as a PRJ file in addition to MAPMETA.

End of list.
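As an aside on the NADCON5 items above: conceptually, grid-based datum shifts store shift values at regular latitude / longitude nodes and bilinearly interpolate between them. A toy sketch in Python; the grid values and layout are made up and do not reflect the actual GRIDS.DAT format:

  import numpy as np

  def bilinear_shift(grid, lat0, lon0, step, lat, lon):
      # grid[i, j] = shift at (lat0 + i*step, lon0 + j*step).
      fi, fj = (lat - lat0) / step, (lon - lon0) / step
      i, j = int(fi), int(fj)
      di, dj = fi - i, fj - j
      return ((1 - di) * (1 - dj) * grid[i, j] + (1 - di) * dj * grid[i, j + 1]
              + di * (1 - dj) * grid[i + 1, j] + di * dj * grid[i + 1, j + 1])

  lat_shift = np.array([[0.10, 0.12], [0.11, 0.15]])  # arcseconds, made up
  print(bilinear_shift(lat_shift, 40.0, -100.0, 0.25, 40.1, -99.9))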

Further builds will focus on various small to medium items in editing / layouts / possibly labels.

tjhb
10,094 post(s)
#12-Dec-19 14:59

Wow, this looks great!

A small question about the tests. Does the 24-core system used have 12 physical cores and 24 logical, or 24 physical cores?

adamw


10,447 post(s)
#12-Dec-19 15:03

It has 24 physical cores (=48 logical).
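(For anyone curious about their own machine, a quick check in Python, assuming the psutil package: os.cpu_count() reports logical processors, while psutil can count physical cores.)

  import os, psutil

  print("logical :", os.cpu_count())                   # e.g. 48
  print("physical:", psutil.cpu_count(logical=False))  # e.g. 24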

tjhb
10,094 post(s)
#12-Dec-19 15:23

Thank you.

tjhb
10,094 post(s)
#12-Dec-19 17:04

We are also working on the next version of the MAP file which will have quite a number of these improvements built into it directly.

This is very interesting. Does it imply that some of the (massive) improvement you have created for many cores has involved (more) parallelisation of data transport? If so, we will also see this in basic SQL ops on native sources (to be simplistic: FROM, INSERT...).

adamw


10,447 post(s)
#13-Dec-19 13:57

Exactly.

Some of the improvements that we implemented can be built into the data structures that access the MAP file, making those data structures perform faster with a lot of threads and use memory more efficiently. In terms of queries, this will affect pretty much all statements to varying degrees.

Also, as I said, we have some improvements related to threads / memory planned for vectors. Some of those can also be bundled into the MAP file.

Finally, we have several improvements and additions planned specifically for the next generation of the MAP file: (a) a specialized version of the spatial index for point clouds, similar to what we are using in the LAS dataport, with some extensions; (b) a number of optimizations for existing indexes and record storage; etc. We also want to remove or extend a couple of limits that we currently have, most importantly the limit of 2 billion records per table.
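Purely as a generic illustration of what "perform faster with a lot of threads" can mean for a storage-facing data structure (this is not Manifold's actual MAP file design): a common trick is to shard the structure so threads rarely contend on a single lock.

  import threading

  class ShardedCache:
      # One dict + lock per shard; threads on different shards never contend.
      def __init__(self, shards=16):
          self._shards = [({}, threading.Lock()) for _ in range(shards)]

      def _shard(self, key):
          return self._shards[hash(key) % len(self._shards)]

      def get(self, key):
          data, lock = self._shard(key)
          with lock:
              return data.get(key)

      def put(self, key, value):
          data, lock = self._shard(key)
          with lock:
              data[key] = value

  cache = ShardedCache()
  cache.put("tile:0:0", b"...")
  print(cache.get("tile:0:0"))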

2020 is going to be fun. :-)

Mike Pelletier

2,122 post(s)
#16-Dec-19 15:46

I'm a bit confused about the multi-core benefits of CPUs vs GPUs. I'm assuming that the CPU is more readily available to many processes than the GPU, hence the excitement over a 24-core CPU vs a GPU with hundreds of cores. Could you expand on this, or is it as simple as that?

Looking forward to joining in on the fun!

adamw


10,447 post(s)
#16-Dec-19 16:13

CPU vs GPU and what is better / faster in a nutshell:

The number of CPU threads that can run in parallel is relatively small (however many CPU cores you have, before hyperthreading), but these threads can run general-purpose code. General-purpose code can access everything: RAM, disk, etc. Writing general-purpose code is relatively easy. If a problem is parallelizable, it can usually be parallelized efficiently to the CPU; writing parallel general-purpose code that runs on the CPU is the natural first choice.

The number of GPU threads that can run in parallel is large, but these threads can only run specialized code tailored for the GPU. This specialized code can access very little: memory on the GPU and, at a heavy price in performance, RAM. No disk. The amount of memory that the specialized code can access efficiently is limited. Writing specialized code is relatively difficult. Technologies like CUDA try to make this easier, but even then there are many limitations you have to fit into, which makes coding for the GPU much more difficult than coding for the CPU. Moreover, not all problems that are parallelizable in general are a good fit for the GPU. But if a problem is a good fit for the GPU, then the huge number of GPU threads overwhelms everything and brings big benefits.

TLDR: not everything that can be parallelized is, or even can be, parallelized to the GPU; but when something is parallelized to both GPU and CPU, the GPU version typically wins big, although it is harder to develop.
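To make the CPU side concrete, here is a minimal sketch (plain Python, nothing Manifold-specific) of the "natural first choice": split a raster into strips and hand each strip to a worker process. Boundary rows between strips are ignored for brevity:

  import numpy as np
  from multiprocessing import Pool

  def smooth_strip(strip):
      # General-purpose per-strip logic; it may branch, hit RAM or disk, etc.
      return (strip[:-2] + strip[1:-1] + strip[2:]) / 3.0

  if __name__ == "__main__":
      dem = np.random.rand(4096, 4096)
      strips = np.array_split(dem, 8)     # one strip per worker
      with Pool(processes=8) as pool:
          parts = pool.map(smooth_strip, strips)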

Mike Pelletier

2,122 post(s)
#16-Dec-19 23:16

Thanks Adam and Tim for the explanations. It is difficult to know how much time is spent in processing vs. RAM or disk. To translate that to common specific tasks: it seems GPUs are really good at transforming images. Can anything general be said about working between images and vectors (creating contours), or about crunching vectors alone (merging areas)? Where does 9 stand today on maximizing the benefit of GPUs?

adamw


10,447 post(s)
#17-Dec-19 07:34

We are using GPUs for raster operations. We do have experimental code that uses GPUs for vector and mixed raster / vector operations, but the benefits of using GPUs with just rasters are currently bigger so we are implementing them first.

tjhb
10,094 post(s)
#16-Dec-19 16:17

You know this but...

GPUs can execute a single, identical, defined task many times at once. Branching makes them choke.

CPUs can execute multiple, different, adaptive tasks at once. They can branch efficiently.

CPUs can think and solve, GPUs only crunch.

[Underlapped with Adam's better post.]

tjhb
10,094 post(s)
#16-Dec-19 19:15

Not quite accurate. Each functional unit (streaming multiprocessor) on a GPU can execute a single identical task at a time. But if there is more than one unit, different units can in principle execute different tasks at the same time; and each unit can schedule different tasks at different times and splice them efficiently.

The main idea is that the GPU executes one or more warps of identical instructions at a time, each warp targeting a different set of data. The warps, and the threads within each warp, can also share and swap data.

But whenever execution steps need to depend on the data, the CPU is inherently better.

tjhb
10,094 post(s)
#16-Dec-19 19:47

In some cases, an algorithm that depends on actual data can be rewritten to perform the same task(s) on all data, then filter by result (discarding some or many; you might only want one).

Those cases can be ported to GPU, with some wastage of electricity but perhaps massive savings in time.

If I understand it, that is similar to what CPUs do with speculative execution.
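A tiny numpy sketch of that rewrite (with numpy's uniform array operations standing in for the GPU's execution model; the operation is illustrative):

  import numpy as np

  x = np.random.randn(1_000_000)

  # Branchy per-element logic, conceptually:
  #   y[i] = sqrt(x[i]) if x[i] >= 0 else 0.0
  # Branchless rewrite: do the same work on all data, then filter by result.
  y = np.where(x >= 0, np.sqrt(np.clip(x, 0, None)), 0.0)

The sqrt is computed even for elements that end up discarded, which is the "wastage of electricity" above; the win is that every element follows an identical instruction stream.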

Dimitri


7,413 post(s)
#20-Dec-19 13:12

It's a case by case thing. We're used to thinking of GPU cores as miracle solutions, but the reality is that CPU cores are phenomenal, general-purpose compute engines; with many cores they can often finish a task in less time than it takes to set up and dispatch to GPU cores, which tend to be less general purpose. So it's not so much whether parallelizing to GPU is hard or easy as whether you're likely to gain beyond effective parallelization to CPU, which generally has to be done anyway. If more effective use of more CPU cores will be faster than a borderline case of using GPU cores, you want to make sure you get the CPU parallelism done right.

If anything, the increase in CPU core counts now available at low cost, plus the improvements in CPU parallelization that have come with experience (and will appear in the product more and more), are somewhat moving the needle for when it pays to dispatch to GPGPU. Raster computations usually are particularly well suited for dispatch to GPU, so that's always going to be big, but other areas, like general purpose DBMS and vectors, are very much case by case to decide where the best investment should go first.

GPU per-core prices are dropping too, so you don't want to leave any big wins unexploited with GPU, but given the revolution in manycore CPUs it is non-negotiable that CPU parallelism be done effectively. That helps with GPGPU as well, since in a very parallel system you want many CPU threads working effectively in conjunction with, and supporting, whatever you choose to do by way of GPGPU dispatch.

Anyway, getting back to one of the original questions, why the excitement over many CPU cores, I think the answer is that with GPGPU there are often many cases where you end up saying "doesn't really pay to dispatch to GPGPU..." but there are fewer cases where it doesn't pay to use parallel CPU. So having lots and lots of CPU cores available is a good thing, allowing more use of parallelism in more cases. It's a nice addition to having plenty of cases where GPGPU can do magic.
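If it helps to picture that trade-off, here is a toy cost model in Python. Every constant below is invented for illustration; real decisions are of course made per algorithm with measured numbers:

  def best_device(n_bytes, flops, cpu_gflops=200.0, gpu_gflops=5000.0,
                  pcie_gb_s=12.0, setup_s=0.005):
      # GPU pays a fixed setup cost plus transfer both ways over the bus.
      cpu_s = flops / (cpu_gflops * 1e9)
      gpu_s = setup_s + 2 * n_bytes / (pcie_gb_s * 1e9) + flops / (gpu_gflops * 1e9)
      return ("GPU", gpu_s) if gpu_s < cpu_s else ("CPU", cpu_s)

  print(best_device(n_bytes=8e6, flops=1e8))    # small job: CPU wins
  print(best_device(n_bytes=8e8, flops=5e12))   # big compute-heavy job: GPU wins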

danb

2,064 post(s)
#13-Dec-19 22:46

Those are terrific improvements to algorithms that are already second to none.

Over the next two years or so, we will be receiving an estimated 20 TB of new LiDAR point clouds and derivatives. I can see that Manifold 9 will have a big part to play in realizing the true worth of that data set.

I know you will, but keep up the great work.


Landsystems Ltd ... Know your land | www.landsystems.co.nz

ColinD

2,081 post(s)
#20-Dec-19 06:59

Using 9.0.170.1 with a 14,000 x 10,000 raster (DEM) derived from LiDAR at 2 m resolution.

Creating contour lines at 2 m intervals took 28 sec and produced ~8,900 lines.

I found that the contours transform became active the moment it was selected and gave a screen whiteout each time I changed a parameter (just the active window). Am I missing something? I expected a run button such as !

Do all transforms behave this way? Fill Sinks for example.

System

Dual Xeon E5-2630 v4 20 physical/40 logical

Quadro M5000

128GB RAM

Dedicated Temp/PF SSD


Aussie Nature Shots

adamw


10,447 post(s)
#20-Dec-19 12:40

That's probably the preview. It's harmless, as in, no data gets modified; a preview is just computed from whatever portion of the data is currently visible. For some transforms the preview is a little jerky, in that nothing gets painted until the preview is computed, and sometimes that takes noticeable time. We have some ideas regarding how to make it feel smoother.

Dimitri


7,413 post(s)
#20-Dec-19 12:46

I expected a run button such as !

The Transform pane is always on... it starts doing a preview as soon as a template has been selected or enough of an expression to be parsed has been written. See the "Live Previews" section of the Transform Pane topic.

Previews are just previews. They might not appear for a while given complex / lengthy tasks for large images in view. Zooming in can result in much quicker previews, as discussed in the Example: Zoom In to See Transform Previews for Big Images topic.

But whether or not you wait for the preview, the transform still works. The "run" button in the Transform pane is the action button down in the lower left corner, which you can usually set to Add Component or to Update Field. See the illustrations in the Transform Pane topic.

[Crossed with adam's post... all the same, I suggest a visit to the above topics.]

adamw


10,447 post(s)
#24-Dec-19 12:22

Status update:

We are planning one more build this year. In addition to the traditional streaming in of new features, the build is going to have some long-awaited quality-of-life improvements, in the spirit of New Year festivities. :-)

We are going to issue the build right before going for a short break for the holidays, to continue in 2020.

KlausDE

6,410 post(s)
#24-Dec-19 14:25

German UI file for 9.0.170.1

Attachments:
ui.de.txt


Do you really want to ruin economy only to save the planet?
