Parsing GeoJSON format in Sql Server 2016

Parsing GeoJSON format in Sql Server 2016

GeoJSON is commonly used format for representing various geo-objects such as lines, polygons, etc. Sql Server 2016 enables you to parse GeoJson format using OPENJSON function. GeoJSON format is described here. In this post we will see how you can parse various types of GeoJSON objects and extract their coordinates. GeoJSON types that will be described here are:

  • Line strings
  • Polygons
  • Multi polygons

Goal of this post is not only to show how to parse GeoJSON, but also to see how you can parse some real-world nested JSON structures.

LineString

Line strings are represented as an array of 2D coordinates. Each coordinate is represented as a two-number array, as it is shown in the following example:

Linestring
{ "type": "LineString", 
    "coordinates": [
        [30, 10], [10, 30], [40, 40]
    ]
}

 

We can seek into the array of coordinates (on path $.coordinates), and return (x,y) pairs of each point:

declare @linestring nvarchar(max) =
N'{ "type": "LineString",
    "coordinates": [
        [30, 10], [10, 30], [40, 40]
    ]
}'

SELECT x, y
FROM OPENJSON(@linestring, '$.coordinates')
              WITH (x int '$[0]', y int '$[1]')

 

In order to fetch coordinates of points you should take elements 0 and 1 from the array using paths $[0] and $[1]. In this example we are converting coordinates to int type. Results are shown in the following table:

x y
30 10
10 30
40 40

Polygons

Polygon represents a set of independent line strings. In the simplest case we can have only one line string in the set:

Polygon
{ "type": "Polygon", 
    "coordinates": [
        [[30, 10], [40, 40], [20, 40], [10, 20], [30, 10]]
    ]
}

Format is same as in the line string; however, in polygons we assume that first and last points are the same. OPENJSON enables you to find coordinates of points that belong to polygon. In this case, we will reference first element in the $.coordinates array (i.e. $.coordinates[0]) and open all (x,y) points in this array:

declare @polygon nvarchar(max) =
N'{ "type": "Polygon",
    "coordinates": [
        [[30, 10], [40, 40], [20, 40], [10, 20], [30, 10]]
    ]
}'

SELECT x, y
FROM OPENJSON(@polygon, '$.coordinates[0]')
              WITH (x int '$[0]', y int '$[1]')

Results are shown in the following table:

x y
30 10
40 40
20 40
10 20
30 10

Polygon can also have more than one line string (each line string is represented with a separate line string in the $.coordinates array):

Polygon with hole
{ "type": "Polygon", 
    "coordinates": [
        [[35, 10], [45, 45], [15, 40], [10, 20], [35, 10]], 
        [[20, 30], [35, 35], [30, 20], [20, 30]]
    ]
}

 

In this case we need to open json in two steps:

1. First we are going to open an array of line strings on the $.coordinates path and return array of coordinates belonging to each line string. Key column contains an index of the line string.

 

declare @polygon nvarchar(max) =
N'{ "type": "Polygon",
    "coordinates": [
        [[35, 10], [45, 45], [15, 40], [10, 20], [35, 10]],
        [[20, 30], [35, 35], [30, 20], [20, 30]]
    ]
}'

SELECT line.[key] as line_number, line.value
FROM OPENJSON(@polygon, '$.coordinates') line

 

Results are shown in the following table:

 line_number   value
 0  [[35, 10], [45, 45], [15, 40], [10, 20], [35, 10]] 
 1  [[20, 30], [35, 35], [30, 20], [20, 30]]

 2. Then we are going to open array of coordinates in the value column that belongs to each line. Now we would need additional CROSS APPLY OPENJSON part where we will pass array of coordinates end expand this array to sub-table.

 

declare @polygon nvarchar(max) =
N'{ "type": "Polygon",
    "coordinates": [
        [[35, 10], [45, 45], [15, 40], [10, 20], [35, 10]],
        [[20, 30], [35, 35], [30, 20], [20, 30]]
    ]
}'

SELECT line.[key] as line_number, x, y
FROM OPENJSON(@polygon, '$.coordinates') line
       CROSS APPLY OPENJSON(line.value)
              WITH (x int '$[0]', y int '$[1]')

 

CROSS APPLY is operator that you would use in most of the cases when you are dealing with opening JSON sub-arrays. It is very similar to standard JOIN with two differences:

  • You should use it to join a primary row(s) with a dynamically generated table (i.e. generated by some table value function such as OPENJSON in our case)
  • You don’t need to specify ON condition like in JOIN. CROSS APPLY implicitly know that it should join primary row with all dynamically generated rows extracted from the value cell in this row. In out case we are joining line row with a table of coordinates that are represented as an array of coordinates in line.value column.

In this example, we are returning id of the line, (x,y) coordinates of the point in the line:.

line_number   x  y
0 35 10
0 45 45
0 15 40
0 10 20
0 35 10
1 20 30
1 35 35
1 30 20
1 20 30

 

Multipolygon

Multipolygon is a multipart geometry structure that contains an array of polygons where each polygon can contain an array of lines:

Multi-polygon with hole
{ "type": "MultiPolygon", 
    "coordinates": [
        [
            [[40, 40], [20, 45], [45, 30], [40, 40]]
        ], 
        [
            [[20, 35], [10, 30], [10, 10], [30, 5], [45, 20], [20, 35]], 
            [[30, 20], [20, 15], [20, 25], [30, 20]]
        ]
    ]
}

Each line contains an array of 2D cells. Since we have three nested arrays we would need to apply three cascade OPENJSON functions, where the first one would open array of polygons, second arrays of line strings that belongs to polygons, and the third one two element arrays of (x,y) coordinates:

declare @multipolygon nvarchar(max) =
N' { "type": "MultiPolygon",
    "coordinates": [
        [
            [[40, 40], [20, 45], [45, 30], [40, 40]]
        ],
        [
            [[20, 35], [10, 30], [10, 10], [30, 5], [45, 20], [20, 35]],
            [[30, 20], [20, 15], [20, 25], [30, 20]]
        ]
    ]
}'

SELECT l1.[key] as polygon, l2.[key] as line, x, y
FROM OPENJSON(@multipolygon, '$.coordinates') as l1
       CROSS APPLY OPENJSON(l1.value) as l2
              CROSS APPLY OPENJSON(l2.value)
                     WITH (x float '$[0]', y float '$[1]')

As a result, we can return all polygons, lines in the polygons and coordinates:

polygon line x y
0 0 40 40
0 0 20 45
0 0 45 30
0 0 40 40
1 0 20 35
1 0 10 30
1 0 10 10
1 0 30 5
1 0 45 20
1 0 20 35
1 1 30 20
1 1 20 15
1 1 20 25
1 1 30 20

 

As we can see we can parse complex nested structures with OPENJSON.

Conclusion

OPENJSON function in Sql Server enables you to parse complex JSON structures and return them as row sets. Even if JSON objects are nested, you can easily transform them to flat relational format with one or two OPENJSON calls. There are other types of GeoJSON objects but all of them can be parsed using the similar rules.