Subscribe to this thread
Home - General / All posts - Sql - How to do the same function as topology overlay "identify" using sql
tgazzard
146 post(s)
#14-Jan-14 00:18

Hi,

I run a lot of sql queries in sequence using some script that was posted a number of years ago on the forum.

Is there a way to perform the equivalent action as performed by topology overlay - "identify" in sql.

I have had a look intersect and split by branches but am struggling to put something meaningful together.

Any thoughts

Cheers

Tim G.

tjhb
10,105 post(s)
#15-Jan-14 09:26

Is there a way to perform the equivalent action [to] topology overlay - "identify" in sql.

You mean "identity" I think. Yes there is. Do you have test data?

tgazzard
146 post(s)
#16-Jan-14 10:07

Hi Tim,

Thanks for your reply.

Yes I did mean "identity".

Attached is test map with two small areas of datasets that I commonly use in this type of operation.

SC is a set of management units which we create 40 year future planned burn histories from.

OD is a historic record of the fire history in this area.

I would normally have the OD drawing as the data drawing and the SC drawing as the overlay drawing.

I generally keep all the columns (force transfer rules).

Thanks in advance.

Tim G.

Attachments:
sql identity test map.map

tjhb
10,105 post(s)
#17-Jan-14 00:41

I have this working now. I'll be back after some further checks.

tgazzard
146 post(s)
#17-Jan-14 01:39

Thanks Tim.

Look forward to seeing how this can work.

tjhb
10,105 post(s)
#17-Jan-14 02:12

Try this Tim. Comments welcome. Not all *that* easy to edit for different drawings (and their columns), I'm afraid. If this was a major problem then the edits could, with a bit of work, be scripted.

This is not as fast as Topology Overlay--on my laptop and with your test data, 2.6s versus 0.7s for the real thing. (But my first working version took 47s, so I'm not completely unhappy.)

If you have a large drawing you probably don't want a straight SELECT query, but an INSERT INTO query. Let me know if you need help with that.

At first I coudn't work out why I got a resut with 420 objects, versus the 329 objects returned by Tpology Overlay. But then I noticed you have set location precision to 1m, and my initial queries were using maximum precision.

The query below uses location precision internally (and sets the source precision for a linked drawing, via the OPTIONS clause), and produces the same result as Topology Overlay.

This was an interesting exercise thanks.

OPTIONS 

    COORDSYS("OD" AS COMPONENT), 

    PRECISION("OD" AS COMPONENT);

SELECT

    [ID][Season][Intensity][Disturbance Type][Age][dist_id],

    [THEME1][THEME2][THEME3][BU][Y01][Y02],

    [Part]

FROM

    (SELECT

        [ID][Season][Intensity][Disturbance Type][Age][dist_id],

        [THEME1][THEME2][THEME3][BU][Y01][Y02],

        [Type],

        CASE [Type]

            WHEN "intersection" THEN

                CASE

                    WHEN [Mask A] IS NULL THEN NULL

                        -- No intersection between this data area and any overlay mask

                    ELSE ClipIntersect([ID][Mask A]-- using epsilon

                END

            WHEN "exclusion" THEN

                CASE

                    WHEN MAX([Mask B]IS NULL THEN Geom([ID])

                        -- No intersection between this data area and any overlay mask

                    ELSE ClipSubtract([ID], UnionAll([Mask B])) -- with epsilon

                        -- NULL if this data area is completely covered by overlay masks

                END

        END AS [Part]

    FROM

        (SELECT

            [Data].[ID]--[Data].[Geom (I)]),

            [Data].[Season][Data].[Intensity][Data].[Disturbance Type],

             [Data].[Age][Data].[dist_id],

                -- Must list source columns explicitly, 

                -- if not here then at least in the GROUP BY

            [Operation].[Type],

            CASE [Operation].[Type] 

                WHEN "intersection" THEN [Overlay].[ID] -- [Geom (I)]

                WHEN "exclusion" THEN NULL

            END AS [Mask A],

                -- One reference to this overlay mask for the intersection operation

                -- Used individually, for ClipIntersect

                -- For exclusion substitute NULL (ignored in grouping)

            CASE [Operation].[Type] 

                WHEN "exclusion" THEN [Overlay].[ID] -- [Geom (I)]

                WHEN "intersection" THEN NULL

            END AS [Mask B],

                -- A second reference to this overlay mask for the exclusion operation

                -- Used after grouping, for ClipSubtract

                -- For intersection substitute NULL (ignored in grouping)

            IIf([Operation].[Type] = "intersection"[Overlay].[THEME1]NULLAS [THEME1],

            IIf([Operation].[Type] = "intersection"[Overlay].[THEME2]NULLAS [THEME2],

            IIf([Operation].[Type] = "intersection"[Overlay].[THEME3]NULLAS [THEME3],

            IIf([Operation].[Type] = "intersection"[Overlay].[BU]NULLAS [BU],

            IIf([Operation].[Type] = "intersection"[Overlay].[Y01]NULLAS [Y01],

            IIf([Operation].[Type] = "intersection"[Overlay].[Y02]NULLAS [Y02]

                -- Overlay attributes are included for the intersection operation only

                -- For exclusion substitute NULL (ignored in grouping)

        FROM

            [OD] AS [Data]

            LEFT JOIN

            [SC] AS [Overlay]

            ON Touches([Data].[ID][Overlay].[Geom (I)]-- with epsilon

            AND NOT Adjacent([Data].[ID][Overlay].[Geom (I)]-- with epsilon 

            CROSS JOIN

            (VALUES ("intersection"), ("exclusion"NAMES ([Type])) AS [Operation]

        )

    GROUP BY

        [ID][Season][Intensity][Disturbance Type][Age][dist_id]--[Geom (I)],

        [THEME1][THEME2][THEME3][BU][Y01][Y02],

        [Type]

        [Mask A]

            -- Grouping by Mask A to give a separate row for each intersection

            -- Not grouping by Mask B because we combine these copies of the masks for exclusion

    ) AS [T]

WHERE [Part] IS NOT NULL

;

[Code tidied slightly just after posting. No functional changes.]

Attachments:
Identity overlay SQL.txt

tgazzard
146 post(s)
#05-Feb-14 03:54

Hi Tim,

Apologies for the not replying sooner. Unexpectedly got called away for a few weeks.

This is awesome. Appreciate all the effort you have gone to solve this problem.

Cheers

Tim G.

tjhb
10,105 post(s)
#06-Feb-14 00:44

Cool Tim.

It would be interesting to know how you plan to use it--if at all (maybe just an experiment). I didn't ask.

Tim

tgazzard
146 post(s)
#09-Jun-15 11:22

Hi Tim,

I never got back to about how I use your query. I have used it in a couple of projects now and am finding it really helpful to enable large sequences of sql statements to be joined together.

In particular I have found that it is great where I have overlapping polygons and want to cut each one up separately - see the code below.

Or by adding an insert into statement at the front of query and putting the data directly into a new drawing.

Also, the identify function sometimes returns a polygon that isn't cut up (particularly when there are overlapping polygons). That is it partially cut the data up, but it returns a single polygon that is over the top of all the other polygons (I wondered whether this was a topology issue....) Your set of nested queries overcame this.

I have also run this query with 150 columns and it didn't seem to affect performance a whole lot.

I have been caught out a couple of times by not having a clean drawing. That is I haven't run normalise topology. It will still return a result but if you try to use insert into at the front it will stop at point of bad topology.

I do have a question about performance....

Why is it that a drawing with a lot more polygons is able to be cut up faster than a drawing with not many polygons. For example, I have a base drawing which is a million hectares and has 40000 polygons. If I run your query and intersect this with say a drawing with 16000 polygons it will take days to solve. But if I split the 40000 polygons into 800000 polygons and run the query with the overlay drawing of 16000 polygons it will only take 8 hours to solve.

This is also the same with normalize topology. For example, the 40000 polygon drawing hadn't got to 1% complete after running all night. Where as I did a test with the 800000 polygons and it completed over night.

I generally split my base dataset up with a grid of some size (say 10km).

I have found I had to play around a lot with combinations of polygon numbers to try to get an optimal result. Is there any rules of thumb to this? I guess it depends on things like the number of vertices?

FROM

 [OD] AS [Data]

 LEFT JOIN

 [SC] AS [Overlay]

 ON Touches([Data].[ID][Overlay].[Geom (I)]-- with epsilon

 AND NOT Adjacent([Data].[ID][Overlay].[Geom (I)]-- with epsilon 

 CROSS JOIN

            (VALUES ("intersection"), ("exclusion"NAMES ([Type])) AS [Operation]

 

         where column >= to some value of interest

----

I also tried using art lembo's query on his website - it may have been me, but it appear to only work with single objects.... See his code below.

https://artlembo.wordpress.com/category/manifold-gis/

SELECT * FROM

(

SELECT

      ClipIntersect(circle.id,rectangles.id) g, int(circle.idAS cid, int(rectangles.idAS rid

FROM circle, rectangles

UNION ALL

SELECT

ClipSubtract(rectangles.id,circle.idAS g, int(circle.idAS cid, int(rectangles.idAS rid

FROM circle, rectangles

)

RIGHT JOIN [circle] ON circle.id = cid

RIGHT JOIN [rectangles] ON rectangles.id = rid

Thanks again. 

tg

artlembo


3,425 post(s)
online
#09-Jun-15 14:59

I'm not sure what you mean about single objects. Can you explain? I wonder if the issue is due to the ClipSubract problem I wrote about in this thread:

https://artlembo.wordpress.com/2014/05/23/arcinfo-functions-union/

Here, we had to make a couple of modifications and do a UnionAll in the ClipSubtract portion. I wonder if the same thing holds true here. Take a look at that post and tell me if that is the issue.

artlembo


3,425 post(s)
online
#09-Jun-15 15:22

I think I found the issue. With the multiple objects, the query is returning null objects in addition to the areas. I just added this line at the end

WHERE IsArea(g)

and that fixed it. I ran the following query on your layers and it was identical to the Identity command in the GUI - and, it ran in 11 seconds!

SELECT * FROM

(

SELECT

      ClipIntersect(circle.id,rectangles.id) g, int(circle.id) AS cid, int(rectangles.id) AS rid

FROM circle, rectangles

UNION ALL

SELECT

ClipSubtract(rectangles.id,circle.id) AS g, int(circle.id) AS cid, int(rectangles.id) AS rid

FROM circle, rectangles

)

RIGHT JOIN [circle] ON circle.id = cid

RIGHT JOIN [rectangles] ON rectangles.id = rid

WHERE IsArea(g)

oh, and to make things easier, I renamed your original layers "circle" and "rectangle" because I didn't want to modify the code. Now, if Manifold supported functions, then you could just pass the drawings to the functions and you wouldn't have to change the names of the layers :-)

tgazzard
146 post(s)
#04-Aug-15 14:10

Hi Art,

I just received your book and thought it would be good to retest the above query.

Using the manifold file above, I firstly ran Tim's query and it completed in 3.26 seconds.

Then I ran your query above and it finished in 29.5 seconds.

Then I tried the query in the book and it ran in 7.8 seconds (see below).

I can see the advantage of your method in that I don't have to set up all the column names which makes it more efficient for smaller jobs. Whereas for larger overlay operations, it will most likely be more efficient to spend a little time setting up Tim's query due to the faster run speed.

Your book is a good reference. Thanks

SELECT * FROM

(

SELECT

      ClipIntersect(circle.id,rectangles.id) g, int(circle.id) AS cid, int(rectangles.id) AS rid

FROM circle, rectangles

WHERE Touches (circle.id,rectangles.id)

UNION ALL

SELECT

ClipSubtract(rectangles.id,circle.id) AS g, int(circle.id) AS cid, int(rectangles.id) AS rid

FROM circle, rectangles

WHERE Touches (circle.id,rectangles.id)

)

RIGHT JOIN [circle] ON circle.id = cid

RIGHT JOIN [rectangles] ON rectangles.id = rid

WHERE IsArea(g)

tgazzard
146 post(s)
#04-Aug-15 14:39

Hi Art,

I did note something that could catch me out. I initially used the OD drawing as rectangles and the SC drawing as circle. The OD drawing has some areas that are not covered by the SC drawing. The result correctly cuts the two drawings together. But it incorrectly assigned attributes to the areas that were not covered by the SC drawing. That is these areas should of had no attributes assigned.

I checked this against Tim's query and the topology overlay tool and they both showed no attributes for the sc columns that correspond to the od areas that don't overlap the sc drawing. Any thoughts on why this is? And how it could be fixed? Would the Union query do the trick?

----

I have now tried the union query and it did fix the attribute issue. But I can see an instance were you may not want to use union (and union is a bit slower to run).... I guess just something to watch out for. Is there a way around this?

Also worth noting that the union query in the book is missing a drawing name above the where touches lines.

code

SELECT

ClipSubtract(circle.[Geom (I)],

(Select unionall(id) from rectangles)) as g,

0 as cid, circle.id as rid

from circle, rectangles

WHERE Touches (rectangles.id,circle.id) 

)

artlembo


3,425 post(s)
online
#04-Aug-15 15:18

Thanks for the update. I am away on vacation for the next week, but will look at this when I get back.

I think it will be an easy fix to catch the condition where the objects don't touch. I may have to look over the JOIN, as that should have caught it.

artlembo


3,425 post(s)
online
#04-Aug-15 20:02

that was a good catch tgazzard. Tell me if this works for you. I left out the WHERE touches clause (in bold).

I don't have time to change the code in the book right now, but the nice thing about on-demand publishing is that when I get back from vacation, I can make the change, upload the text, and the change will be in the book - don't have to wait a year for a second printing or anything like that.

SELECT * FROM

(

SELECT

      ClipIntersect(circle.id,rectangles.id) g, int(circle.id) AS cid, int(rectangles.id) AS rid

FROM circle, rectangles

WHERE touches(circle.id,rectangles.id)

UNION ALL

SELECT

ClipSubtract(rectangles.id,circle.id) AS g, int(circle.id) AS cid, int(rectangles.id) AS rid

FROM circle, rectangles

)

RIGHT JOIN [circle] ON circle.id = cid

RIGHT JOIN [rectangles] ON rectangles.id = rid

WHERE IsArea(g) 

Manifold User Community Use Agreement Copyright (C) 2007-2021 Manifold Software Limited. All rights reserved.