I was getting no information that it was actually doing anything for the first couple minutes.
? Didn't it show the import dialog?
I can open the .xyz in Notepad+ basically instantly
If you wanted Manifold to do only what Notepad+ does, Manifold also would open that .xyz for text editing very fast. But you want to do more than text editing, right?
Notepad+ is literally about a thousand times simpler than what is necessary for GIS, so the data structures it uses internally can be much simpler. When it comes to something like 9, the data structures are even more sophisticated than those in, say, PostgreSQL, which is terribly slow with rasters, or ArcGIS Pro, which takes forever to open a big project with many components and is also terribly slow with vectors and rasters.
It seems like Manifold would import the data quickly, then go slow on the processing to pixels.
That could be done if the path to very high speed storage was a simple matter of importing the data all at once and then doing some processing. Alas, that's not the case.
There's a lot of pre-computation that goes into storing data so that a project with hundreds of components totaling hundreds of gigabytes can open instantly, store changes almost instantly, and instantly pop open components that are themselves over 100 gigabytes in size. Add to that the ability to do parallel work efficiently, for example throwing a thousand GPU cores at a task that takes hours or days in ArcGIS Pro and finishing it in seconds, and all that asks a lot of the data structures and access methods within Manifold's internal database. Converting data from ordinary formats into those data structures takes time.
That's especially true of plain text formats that have no intelligence to them, no spatial indexes, etc., that can be used to get a head start on organizing the data. The best way to process all that is not necessarily to first read all of the text, either.
But you only have to do all that once, when the data is imported. After that, it's much, much faster than leaving the data in dumb formats.
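The pay-once trade-off above can be sketched in miniature. A plain text .xyz file has to be scanned point by point for every spatial query, while a one-time import into even a crude spatial index lets later queries touch only a few buckets. A hedged sketch in Python, where the grid-bucket index is a hypothetical stand-in for Manifold's far more sophisticated internal structures:

```python
from collections import defaultdict

def parse_xyz(text):
    """Parse plain 'x y z' lines into (x, y, z) tuples -- the 'dumb' format."""
    return [tuple(map(float, line.split())) for line in text.strip().splitlines()]

def build_grid_index(points, cell=10.0):
    """One-time import cost: bucket points by grid cell for fast box queries."""
    index = defaultdict(list)
    for x, y, z in points:
        index[(int(x // cell), int(y // cell))].append((x, y, z))
    return index

def query_box_indexed(index, x0, y0, x1, y1, cell=10.0):
    """Visit only the buckets overlapping the box, not every point."""
    hits = []
    for cx in range(int(x0 // cell), int(x1 // cell) + 1):
        for cy in range(int(y0 // cell), int(y1 // cell) + 1):
            for x, y, z in index.get((cx, cy), []):
                if x0 <= x <= x1 and y0 <= y <= y1:
                    hits.append((x, y, z))
    return hits

def query_box_scan(points, x0, y0, x1, y1):
    """What re-reading the text file amounts to: test every point, every query."""
    return [(x, y, z) for x, y, z in points if x0 <= x <= x1 and y0 <= y <= y1]
```

The index costs one extra pass to build, after which each query inspects only nearby buckets; leaving the data in the text format means paying the full scan cost on every query.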
When nothing shows up for minutes, it's hard to know at what point to give up or to let it ride.
The dataport's job is to import data from a format accurately, to build internal structures within Manifold's database that allow the data to be operated on at full performance, and to do all that as fast as possible. Anything that slows the process down should be avoided.
For example, while it's nice to reassure beginners with a new format that 9 hasn't crashed, that's not worth doing if it slows the process down. It's not safe to assume there is no cost to surfacing from what might be very complex internal processes, like recursion, just to give an honest report of "Still working!"
A more efficient approach is to simply do the job as fast as possible. Most of the time imports happen so fast it doesn't matter. In those few cases of long imports, new users very quickly learn that 9 doesn't crash. They know if it pops open the dialog to start importing something, it's on the job and it will get it done.
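To make that cost concrete: checking in with a progress callback from inside a hot recursive process means paying that check once per node visited, which is exactly the overhead the paragraph above argues against. A hypothetical sketch, not Manifold's actual internals:

```python
def walk(node, visit, progress=None):
    """Recursively visit a nested structure. The optional progress callback
    fires once per node -- a cost paid in the hottest part of the code."""
    visit(node["value"])
    if progress is not None:
        progress()          # every single node pays for the reassurance
    for child in node.get("children", []):
        walk(child, visit, progress)
```

With the callback attached, a traversal of a million nodes makes a million extra calls; without it, the recursion does nothing but the real work.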
As for reassuring people by issuing progress reports on different steps in the process, unless somebody understands how all the internal data structures work, it seems unproductive to fire incomprehensible phrases at them. May as well just show a rotating circle "in progress" graphic or alternate between various phrases ("Still working!", "Gosh, this is taking a while!", "I'm doing my best!", "Good news! There's time for a coffee break!") so people are reassured it hasn't stopped working.
I'm personally not a big fan of progress bars, because they invite misunderstandings: most people look at a progress bar and expect to see a linear representation of a linear process, but many tasks in programs are not linear. You see that effect with Windows updates, where it may go from 5% done to 95% done in seconds and then you stare at 95% done for the next ten minutes.
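The Windows-update effect can be shown with simple arithmetic: when a job's items complete at very different speeds, percent-of-items-done says almost nothing about percent-of-time-elapsed. A small sketch with made-up phase costs:

```python
def progress_vs_time(phases):
    """phases: list of (item_count, seconds_per_item) tuples.
    Returns, for the moment each phase finishes, the pair
    (fraction_of_items_done, fraction_of_time_elapsed)."""
    total_items = sum(n for n, _ in phases)
    total_time = sum(n * cost for n, cost in phases)
    done_items = done_time = 0.0
    checkpoints = []
    for n, cost in phases:
        done_items += n
        done_time += n * cost
        checkpoints.append((done_items / total_items, done_time / total_time))
    return checkpoints

# 95 cheap items followed by 5 expensive ones: the bar races to 95%
# almost immediately, then sits there for nearly all of the real run time.
```

With phases like `[(95, 0.1), (5, 20.0)]`, the bar reads 95% when less than a tenth of the wall-clock time has elapsed.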
That progress bars report non-linear phenomena is especially true when the same progress bar interface is used for hundreds of different dataports working on wildly varying data formats and servers, where there is very great variation in what has to be done to extract data and to structure it within Manifold's internal high performance database.
One more thing: if you do very many imports from some particular format that might be a rare format for other people, don't hesitate to read the advice on suggestions and then send in a request to speed that format import up. Something like "XYZ" format is really a family of formats where there might be opportunities for optimization that are tuned to specific types of vector or raster data stored in one of those formats.
The process of developing new dataports is also non-linear, especially for dataports that are rarely used or for which there is very little sample data. The first versions of rarely-used dataports tend to focus on reliability and quality; it makes no sense to invest vast amounts of engineering time to speed up something that is rarely used. When many people use a dataport, it's natural to apply extra effort on optimizations that speed it up.
PBF, for example, is a notoriously slow format that was rarely used. The dataport for importing PBF started out very reliable but slow. Once people started using PBF a lot more, Manifold returned to the PBF dataport and tuned it. That happened twice, if I recall correctly, with each iteration increasing speed. The latest PBF dataport is much faster than the original version.