Manifold 9.0.177.2 + 8.0.33.1
adamw


10,175 post(s)
#19-Jul-22 16:19

The names of the install packages have changed; read the thread for details.

9.0.177.2

manifold-9.0.177.2.zip

SHA256: a5fd077e29a355da78ace09d689fc00f5a7103cbb240d6b2d0b638159be03145

manifold-viewer-9.0.177.2.zip

SHA256: 3fb001f656490a853e836a612dac84a472f58875bf2b7407e1e353c9bbc2ac72

sql4arc-9.0.177.2.zip

SHA256: 22b5581caa2971c96074aee9c60c10468e7241c3a0c195093aaac5263e8b3d4b

8.0.33.1

manifold-8.0.33.1.zip

SHA256: d158f805225194510a864d45c1e1f2ae26bd5868295a46a12bccaa9fb1df77ca

adamw


10,175 post(s)
#19-Jul-22 16:21

64-bit Only

Manifold 8 and Manifold 9 no longer include 32-bit modules.

**

We were among the first to move to 64-bit code; our first 64-bit products date as far back as Windows NT x64, a long time ago. Even so, for many years we maintained both the 64-bit and 32-bit versions of all our products.

Over time most, if not all, of our users naturally moved to primarily using the 64-bit versions of the products. At the same time, as we continue to add new features, we find that maintaining compatibility with the 32-bit world becomes increasingly difficult. We now frequently have to introduce 32-bit-only limits for memory-intensive features (e.g., when we compute field statistics for the Style pane, special provisions keep the 32-bit code from running out of memory; there are many places like that). We have to disable features that are no longer available in 32-bit mode due to lack of support from third parties (e.g., CUDA is 64-bit only, and ArcGIS Pro, for which we provide an add-in, is 64-bit only as well). We have to continue supporting outdated versions of database clients because they are the last versions of such clients available in 32-bit mode, and so on.

We decided that the increasing effort to maintain compatibility with the 32-bit world is no longer worth the decreasing value of that compatibility. We will no longer try to maintain it and will stop shipping 32-bit modules, going 64-bit only. This will make it easier for us to add new features and will allow us to design some of these features differently. This will also make many things simpler for our users.

We see several areas in which you could be relying on 32-bit code directly or indirectly:

  • 32-bit Jet -- it used to be that MDB / XLS files could only be read in 32-bit mode. This changed after Microsoft issued the 64-bit version of Access Database Engine. If you are currently working with MDB / XLS files from 32-bit mode, switch to Access Database Engine.
  • 32-bit ActiveX controls -- Manifold 8 form components use ActiveX controls. Most ActiveX controls don't have 64-bit versions. The standard Microsoft controls installed with Manifold 8 also don't have 64-bit versions. If you are currently using Manifold 8 form components with 32-bit controls, consider rewriting these forms using .NET and Windows Forms.
  • Third-party applications -- if you have third-party applications which use Manifold 8 or Manifold 9 through either the object model or the ODBC driver, these applications might be compiled as 32-bit. If so, before committing to going 64-bit only, you need to recompile these applications as 64-bit.

If you are currently relying on 32-bit code in any way, you don't have to switch immediately; the move to 64-bit only can be gradual. You can use the new 64-bit-only builds of Manifold 8 and Manifold 9 as portable installs in parallel with the last base builds that still include 32-bit code. Eventually, you might switch roles: use the last base builds that still include 32-bit code as portable installs, with the new 64-bit-only builds installed the normal way.

**

Manifold 9 products no longer use the Bin / Bin64 folders. The DLL and EXE modules are placed directly into the install folder. The Shared folder is renamed to Extras.

SQL for ArcGIS Pro no longer includes the 'Open SQL 32-bit' button. The 'Open SQL 64-bit' button is renamed to 'Open SQL'.

SQL for ArcGIS Pro tries to locate MANIFOLD.EXE using a dialog if it cannot be found automatically. The location of MANIFOLD.EXE is then saved in an environment variable for future reference.

SQL for ArcGIS Pro supports ArcGIS 3.x. The install packages include two versions of the add-in: the version for ArcGIS 2.x in the ArcGis2 folder and the version for 3.x in the ArcGis3 folder. The EXE / MSI install packages detect the installed version of ArcGIS and register the corresponding version of the add-in.

adamw


10,175 post(s)
#19-Jul-22 16:22

Table Window

The maximum number of records that can be shown in the table window is now 2 billion. (This number is a few records lower than the maximum number of records that can be stored in a table in a MAP file; the tiny difference is due to internal housekeeping. In the future we are planning both to allow a table in a MAP file to contain essentially as many records as will fit on disk, and to allow a table window to show all these records.)

The table window can either show all records in a table (up to 2 billion) or show the first sample of records. Showing the first sample of records instead of all records is useful when working with external data sources like PostgreSQL or, say, MANIFOLDSRV: not fetching all records immediately avoids needlessly wasting server resources and network traffic. Tables from MAP files show all records by default, while tables from other data sources show the first sample of records. This also applies to the result tables of queries, including ad-hoc queries in command windows.

(The rule above is not absolutely bulletproof. For example, a query stored in a MAP file may access records in a table stored on SQL Server. When the user runs such a query, the table window will try to fetch all records in the result table, because the query itself is stored in a MAP file, even though the records come from SQL Server. We feel that in cases like this, the responsibility for not overloading the server for no good reason is on whoever writes or runs the query.)

The Options dialog includes the 'Initial number of records to show (non-MAP)' option to specify the size of the first sample of records fetched for non-MAP data sources. The available values are: auto (default), all, or a specific number of records. The 'auto' value is currently set to 5,000 records. (The previous default was as high as 50,000 records because it applied to tables in MAP files as well. 5,000 records is much friendlier to database servers. If you find that 5,000 records is too low for the data sources you are working with, you can always change the option to a higher value.)

The Info pane shows the number of fetched records in the table window. If there are more records available, the Info pane also shows '(+)'. If the table window is showing the first sample of records instead of all records, the Info pane also shows a button to fetch all records. Once a table window has been told to fetch all records for the displayed table, it will keep fetching all records for all further operations that refresh or replace the table.

The Info pane shows the number of displayed records in the table window, that is, the number of fetched records that pass the filter.

The Info pane shows the Component tab for a command window with a table.

The table window detects changes to displayed data from a non-MAP data source and shows an action bar below the record list with a prompt to refresh data manually. To refresh data, use the View - Refresh Data command or click the prompt.

The table window displaying data from a MAP file does not automatically refresh data after changes if the number of records in the table exceeds 64k. Instead, the window shows an action bar with a prompt to refresh data manually.

The table window detects deleted records and shows them in gray, with an icon on the record handle.

The context menu for the fill record in the table window no longer includes the Stop Reading Records command.

The table window always applies filters only to the fetched records. The Filter Fetched Records Only command is removed. (Now that the table window can show up to 2 billion records, always using only the fetched records is no longer a limitation in the vast majority of cases. Tables bigger than 2 billion records can still be filtered using a query composed with the Filter using Query command.)
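
For illustration, such a filter is just an ordinary ad-hoc query; a minimal sketch, where [Soundings] and [depth] are hypothetical names invented for this example:

-- Hypothetical table and field, for illustration only. A query scans
-- the table directly, so the 2-billion-record limit of the table
-- window does not constrain the filtering itself.
SELECT * FROM [Soundings]
WHERE [depth] < -50;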

The table window always applies orders only to the fetched records. (If the table window shows the first sample of records instead of all records and you want to sort the entire table, fetch all records first.) The maximum allowed size of field values used for ordering is 16 GB.

adamw


10,175 post(s)
#19-Jul-22 16:23

Other

Exporting ECW / JPEG2K tracks progress and can be canceled.

Exporting ECW / JPEG2K aligns reads to tile boundaries for better performance.

Exporting ECW / JPEG2K automatically adjusts internal buffer sizes when exporting very big images for performance (and, in very rare cases, to allow the export to succeed where it previously was failing due to exceeding one of the internal limits on temporary data).

(Fix) Exporting the result of a query with computed fields to MDB / SQLITE / similar formats no longer fails.

(Fix) Printing geometry to WKT prints points with Z values as POINT Z / MULTIPOINT Z instead of POINT / MULTIPOINT.

(Fix) Printing geometry to WKT no longer ignores Z values for areas.

(Fix) Reading FLT with a malformed HDR no longer erroneously sets pixel scales to zero.

(Fix) Attempting to use components from a data source that fails to connect, say, because it references a file that does not exist, no longer disrupts the UI (map window / layout window / panes).

Failed attempts to connect to a data source are recorded, and subsequent attempts to connect to the same data source are throttled. The delay starts at 1 sec and gradually increases to 15 sec. Errors for failed connection attempts are logged, with repeat errors (same error message) omitted.

(Fix) Reading XYZ no longer ignores the first record if the file starts with UTF8 BOM.

Reading XYZ with a non-space delimiter allows whitespace both before and after the delimiter.

Reading XYZ with integer values that do not fit into INT32 converts all values to FLOAT64.

Reading XYZ skips NaN values. (Records with such values have no business being in the file, but they might technically be there and some software packages apparently produce files with them.)

Reading XYZ tracks progress and can be canceled.

Reading XYZ supports files bigger than 4 GB.

Reading XYZ protects against tiny drift in XY values to handle regular grids more reliably.

(Fix) Reading NC creates an RTREE index on tiles.

Reading XYZ / NC performs significantly faster. (4-5 times faster, more on big files.)

(Fix) Writing XYZ prints all values with full precision. (Previously, values that were very small or very large could lose some digits.)

Writing XYZ performs significantly faster. (5-6 times faster, more on big files.)

End of list.

artlembo


3,277 post(s)
#19-Jul-22 17:18

Just imported 93,000,000 records in 10 minutes. So, that is a good 5x faster.

I like the new table readout. But, I noticed one odd thing. I ran this query:

SELECT * FROM [BeaverLake2018_bathy]
WHERE MFD_ID < 10000
ORDER BY MFD_ID;

the component window says Displayed: 9,992. And, when I scroll to the bottom, the last record is 9992. Not sure where the other 8 records are. Now, it does say it fetched 9,992+. So, for some reason they aren't all displayed.

Then, I ran this query:

SELECT * FROM [BeaverLake2018_bathy]
WHERE MFD_ID BETWEEN 9990 AND 10000
ORDER BY MFD_ID;

and it says there are 9 records displayed (9990 - 9997). But again, missing 9998, 9999. So, I issued this query:

SELECT * FROM [BeaverLake2018_bathy]
WHERE MFD_ID IN (9998, 9999);

that selected the 2 records.

So, I'm not sure why some things are missing.

Also, this query is really slow:

SELECT count(*) FROM [BeaverLake2018_bathy]
WHERE MFD_ID < 10000;

so, I thought I'd help myself out a little, and ran this query:

SELECT count(*) FROM
(SELECT * FROM [BeaverLake2018_bathy]
WHERE MFD_ID < 10000);

but that too was really slow. It has gone on for minutes, and it is still not returning anything.

Mike Pelletier

2,039 post(s)
#19-Jul-22 18:43

Same trouble here with my data. To make sure mfd_id is in fact a good increment of the records (no deleted records), I deleted that field and then re-added it. Same result.

Really like the new additions to the Info pane and looking forward to trying the new ECW export. Thanks!

Dimitri

7,090 post(s)
#19-Jul-22 19:07

But, I noticed one odd thing. I ran this query:

SELECT * FROM [BeaverLake2018_bathy]
WHERE MFD_ID < 10000
ORDER BY MFD_ID;

the component window says Displayed: 9,992. And, when I scroll to the bottom, the last record is 9992. Not sure where the other 8 records are. Now, it does say it fetched 9,992+. So, for some reason they aren't all displayed.

I downloaded the Australia Hydro demo .map and opened the Watercourse Lines table, which has 1,295,011 records. In a Command Window I then ran:

SELECT * FROM [WatercourseLines  Table]
WHERE [mfd_id] < 10000 ORDER BY [mfd_id];

OK, what the Info pane shows at first is 9,992(+), and if you Ctrl-End to jump to the end of the results table, you see 9992. But then, after it thinks for a bit, it switches to 9,999 as you might expect, and the results table fills in with all the records.

What happens when you run your query and let it cook for a bit more?

artlembo


3,277 post(s)
#19-Jul-22 23:14

Yes, if you let it cook for a while, it will eventually show up. But it does take some time. Peculiar that it loads everything but the last 8 records.

I did this with the Philadelphia parking ticket database of 9.4M records. I even selected just 1000 records, and it displays 992 records. And it takes quite a while to "cook" to show the rest. I then tried it with selecting only 100 records, and it displayed 96. But it still took a long time to finish out the list. And, just for fun, I selected only 10 records. It displayed 8 records. And it took just as long to display the remaining 2 records.

And like I mentioned in the earlier post, using COUNT takes a long time, even when issuing a sub query.

tjhb

9,993 post(s)
#19-Jul-22 19:21

My guess, reading adamw…

…This number is a few records lower than the maximum number of records that can be stored in a table in a MAP file, the tiny difference is due to internal housekeeping. …

…is that the last few bytes returned for each fetch are one or more pointers for the next fetch.

Apparently not observed by all internals?

Dimitri

7,090 post(s)
#20-Jul-22 06:33

I could be wrong about this (no doubt adamw will comment...) but I don't think that is it. Those "few records lower than the maximum" I believe refers to not quite 2 billion records, but a number just a few records short of 2 billion. It's just easier to say "two billion" than "1,999,999,996".

adamw


10,175 post(s)
#20-Jul-22 10:11

Correct, that's not it.

By the way, the exact number of records that can be shown in the table window is 2147483643. The exact number of records that can be stored in a table in a MAP file is 2147483647. I used "2 billion" as an approximation for a power of two.

tjhb

9,993 post(s)
#20-Jul-22 22:11

Thanks adamw. Still intrigued then regarding what housekeeping tasks those extra four longs (I think) are used for, although just idly; absolutely no need for me to know this.

adamw


10,175 post(s)
#20-Jul-22 10:05

Query 1. MFD_ID<10000 is not optimized by the index on MFD_ID. None of the comparison operators except = are, because each of them can return the entire table. The query has to run through all records and check MFD_ID for each. The records start displaying nearly immediately because the first records happen to be below 10000; try changing the condition to MFD_ID>92000000 and you will have to wait until the records start appearing. The query will eventually find all 9999 records that are below 10000, you just have to wait until it completes. The number of displayed records stays at 9992 for a long time because the SELECT returns results in batches and the default batch size is 8. The last 7 records are already found when you see Displayed: 9992; the SELECT just does not return them, trying to complete the batch. Try adding BATCH 50 after WHERE and the number of displayed records will stay at 9950 instead.
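
A minimal sketch of that suggestion applied to the first query (BATCH placed after the WHERE clause per the tip above; ORDER BY omitted for brevity):

-- BATCH 50 makes the SELECT return results in batches of 50 instead
-- of the default 8.
SELECT * FROM [BeaverLake2018_bathy]
WHERE MFD_ID < 10000
BATCH 50;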

Query 2. MFD_ID BETWEEN 9990 AND 10000 is also not optimized by the index. Unlike < / <= / <> / >= / >, it makes sense to optimize BETWEEN, and we will likely do that in the future. The result table will eventually show 9998 and 9999; the query engine has already found them, see above. But you have to let the query complete, and since BETWEEN is not optimized and we have 93 million records, this takes time.

Query 3. MFD_ID IN (9998, 9999) *is* optimized by the index. So the query completes fast.

Query 4. SELECT Count(*) is as slow or fast as the table under it. Since the table under it contains MFD_ID<10000 and that is not optimized, it is slow.

Query 5. As above, adding one more SELECT does not change that MFD_ID<10000 is not optimized.

In sum: everything works, it's just that with 93 million records unoptimized scans are slow. We will optimize BETWEEN in the future, but < / <= / <> / >= / > are just bad for performance.
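
Side by side, the queries from earlier in the thread illustrate the difference:

-- Not accelerated by the btree index: the range condition can match
-- an arbitrarily large share of the table, so all 93 million records
-- are scanned.
SELECT * FROM [BeaverLake2018_bathy] WHERE MFD_ID < 10000;

-- Accelerated by the index: equality tests via IN complete fast.
SELECT * FROM [BeaverLake2018_bathy] WHERE MFD_ID IN (9998, 9999);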

artlembo


3,277 post(s)
#20-Jul-22 19:05

Hi Adam,

thanks for the clarification. I had a follow-up. When you say that the query has to run through all records and check the MFD_ID, I'm a little confused. MFD_ID has a btree index. Does the benefit only work for equality?

I know for searches if the btree is balanced, then theoretically, once you "dive right" (the number is larger than the search number), you now have pointers to half the database, and never have to worry about the other half. Dive left (the number is less than the leaf we just entered), and you cut your universe of data by another half.

So, if we are looking for records greater than 10,000 and the first node in the binary tree is 5,000,000, we dive left and perhaps hit 2,500,000. Then, we dive left again, perhaps for 3 or 4 leaves, and eventually get to a point where we dive right (the leaf is lower than 10,000). Then, everything downstream from there consists of the records with mfd_id less than 10,000.

I did try the search with an indexed field and a non-indexed field, and the results took the same amount of time. Also, I tried a count(*) using the search criteria for mfd_id < 100 and also mfd_id < 10000 and the result took the same amount of time.

It's just puzzling that whether there are 10 records or 10,000 records being returned, about 99%+ are immediately displayed, but the remaining few take a bit longer (in the case of my 9,000,000 records, about 72 seconds). So, just curious why 99% is found immediately.

tjhb

9,993 post(s)
#20-Jul-22 22:14

Art,

I think I remember adamw explaining why a BTREE can't be used for < > <= >= (but can for equality) in a previous thread, regarding join conditions.

If I can find it I’ll post a link.

adamw


10,175 post(s)
#21-Jul-22 10:52

Given a b-tree and a condition like key<100, you can ignore keys at or above 100, correct. But the number of keys below 100 that will pass the condition is expected to be big: on average, we expect half of the records. The overhead of using the tree in the first place makes this not worth the effort.

I mean, fine: instead of going through 90 million records fast (without the index), we now expect to go through 45 million records more slowly (with the index; the index adds new reads). Maybe the second time will be shorter than the first, maybe not. In the best case (WHERE MFD_ID<10000 -- your example) it will be significantly shorter. In the worst case (WHERE MFD_ID>10000 -- the opposite of your example) it will be significantly longer. It isn't worth bothering with the index when the results of using it are so unclear.

BETWEEN is a different story, because with BETWEEN we have both bounds. We expect that the user will set the bounds close enough to each other that the number of keys between them is small. If the user uses bounds that pass tons of records, and it turns out that just reading the whole table without the index would have been faster, that's on the user.
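
For reference, the thread's Query 2 is exactly this shape and would benefit from such an optimization:

-- Both bounds are known, so an index could narrow the scan to the
-- keys between 9990 and 10000 (not yet optimized in this build).
SELECT * FROM [BeaverLake2018_bathy]
WHERE MFD_ID BETWEEN 9990 AND 10000;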

It's just puzzling that whether there are 10 records or 10,000 records being returned, about 99%+ are immediately displayed, but the remaining few take a bit longer (in the case of my 9,000,000 records, about 72 seconds). So, just curious why 99% is found immediately.

For your condition (MFD_ID<10000), 100% of the records are found immediately. 99% is returned immediately and 1% is withheld because SELECT returns data in batches. The last 1% is an incomplete batch, the SELECT tries to complete it.

If you try a different condition, e.g., MFD_ID>92990000, none of the records will be found immediately. The table window will show nothing for a long time, then show all records at once at the end of the search.

adamw


10,175 post(s)
#21-Jul-22 11:33

An addition: we will still implement optimizations for < / <= / >= / >, and even <>, but only based on table statistics, so that when we see key<value, we can estimate whether that specific value will make the condition strong enough to be worth using the index for.

LEC

2 post(s)
#02-Sep-22 06:48

Can anyone tell me if any of the recent Manifold 8 or Manifold 9 updates have included Australian GDA2020, or if any are planned? I am aware I can add datums/projections, but thought I'd better check updates first.

Thank you!

adamw


10,175 post(s)
#02-Sep-22 07:30

Manifold 9 supports EPSG, which has tons of systems based on GDA2020. See the screen.

Manifold 8 does not include systems based on GDA2020 directly, but you can add it as a custom datum. Example thread (that adds a custom datum and a custom datum-to-datum transform):

M8 Custom Projection GDA2020

Attachments:
gda2020-9.png

LEC

2 post(s)
#05-Sep-22 03:25

Perfect, thank you! PS: I'll be hoping for a Manifold 8 upgrade to include GDA2020 nonetheless!
