Monday, 26 November 2018

Challenge: importing FAA digital obstacle file content into PostGIS database

plus bonus: browsing data using QGIS.

What FAA Digital Obstacle File (DOF) is and how to model it using PostGIS database - refer to post
Now, it's time to load some data into our database using Python script.

First we need to populate tables such as ctry_state, hacc, lighting because primary keys of this tables are foreign keys in main table obstruction.

To insert data into hacc (horizontal accuracy) table execute following DML statement:

INSERT INTO hacc VALUES
('1', 20, 'FEET'),
('2', 50, 'FEET'),
('3', 100, 'FEET'),
('4', 250, 'FEET'),
('5', 500, 'FEET'),
('6', 1000, 'FEET'),
('7', 0.5, 'NM'),
('8', 1, 'NM'),
('9', -1, 'UNKNOWN'),
('N', -1, 'NO DATA');
Note: While reading original data file (DOF) it might occur that some of the fields are 'empty', they are contains only blanks instead of value, e. g. horizontal accuracy column (in DOF) should have one of values from 1 to 9, in case missing this value new value N is introduced.

SQL script with DML (Data Manipulation Language) statements that insert data into all tables (without obstruction table): https://github.com/strpaw/PostGIS_FAA_DOF_database/blob/master/faa_dof_postgis_db_DML.sql

Script to insert data from file into database is very similar to that which is is used to convert DOF file into CSV file. Significant changes are:

Import psycop2 module, PostgreSQL database adapter

import psycopg2

How to install and use psycop2 module click here.

Before executing 'insert' statement  I want to convert coordinates from DMS format (that format is used in digital obstacle file) into DD format, to make query more simple:

def dms2dd(dms):
    h = dms[-1]
    dms_m = dms[:-1]
    dms_t = dms_m.split(' ')
    d = float(dms_t[0])
    m = float(dms_t[1])
    s = float(dms_t[2])

    dd = d + m / 60 + s / 3600
    if h in ['W', 'S']:
        dd = - dd
    return dd

It is worth to notice, that I do not check if input (dms) is in used in DOF, e. g. N25 20 20.00 is correct correct latitude, above function will be executed with following error:
ValueError: could not convert string to float: 'N25'
This might be a little challenge:  if format is other then described in documentation pass to next line.

We can define the function that will read DOF data and insert parsed values into database:

def dof2postgis(input_file):
Parameter input_file is full path of DOF with data.
To connect with our database and open a cursor:

conn = psycopg2.connect(host="localhost", database="database", user="user",
                        password="password")
cursor = conn.cursor()

You need to pass database name, user and password that you provide during installing PostgreSQL and creating database of course.

Now we open input file, read and parse line. Notice that I want to avoid null values in some fields, that means if input file contains only blanks for specified field (e. g. agl height) I will insert into database some arbitrary chosen value to distinguish that is something missing in input file. In other words what will be in database is not one-to-one copy of  original DOF!


with open(input_file, 'r') as dof_file:
    line_nr = 0
    for line in dof_file:
        line_nr += 1
        if line_nr < 5:
            continue
        else:
            if line_nr % 10000 == 0:
                print(line_nr)
            oas_code = line[0:2]
            obs_number = line[0:9]
            vs_code = line[10]
            city_name = line[18:34].rstrip(),
            lat_dms = line[35:47]
            lon_dms = line[48:61]
            obs_type = line[62:80].rstrip()

            # If there is missing height above ground level write number -9999
            aglhgt_str = line[83:88].rstrip('0')
            if aglhgt_str == '':
                agl_height = 0
            else:
                agl_height = int(aglhgt_str)

            amsl_height = int(line[89:94])

            # If there is missing data for horizontal accuracy - add additional code
            h_acc_code = line[97]
            if h_acc_code == ' ' or h_acc_code == '':
                h_acc_code = 'N'

            # If there is missing data for vertical accuracy - add additional code
            v_acc_code = line[99]
            if v_acc_code == ' ' or v_acc_code == '':
                v_acc_code = 'N'

            marking_code = line[101]
            lighting_code = line[95]
            lat_dd = dms2dd(lat_dms)
            lon_dd = dms2dd(lon_dms)

If line is parsed and values are assigned to local variables, we are ready to insert data from one line into database, by calling execute method. Parameters for this methods are: SQL statement (in this case insert) and tuple of values to be inserted.
This is might not to be the fastest and the most efficient way. Psycopyg2 module has other function to speed up such task.

cursor.execute("""
    insert into
        obstruction 
        (oas_code, 
         obs_number, 
         vs_code, 
         city_name, 
         lat_dms, 
         lon_dms,
         obs_type, 
         agl_height, 
         amsl_height, 
         lighting_code, 
         h_acc_code,
         v_acc_code, 
         marking_code, 
         geo_location)
        values
        (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
        ST_GeomFromText('POINT(%s %s)', 4326));
       """,
      (oas_code, obs_number, vs_code, city_name, lat_dms, lon_dms,
       obs_type, agl_height, amsl_height, lighting_code, h_acc_code,
       v_acc_code, marking_code, lon_dd, lat_dd))

Full source code for python script: https://github.com/strpaw/PostGIS_FAA_DOF_database/blob/master/faa_dof2postgis_db.py



No comments:

Post a Comment