We have a 'table' in a text file:
ABCDE 010101.10N 0010101.10W ABC DEF 100 500
ABCDE 010101.10N 0010101.10W ABC DEF 100 500
ABCDE 010101.10N 0010101.10W ABCDEF 100 500
ABCDE 010101.10N 0010101.10W
It is easy to spot the following pattern:
[variable_blanks][point_name][variable_blanks][DDMMSS.ssH][variable_blanks][DDDMMSS.ssH][variable_blanks][other_stuff]
We are going to use a regular expression to parse each line of the file and write the result to a CSV file. Moreover, we are going to convert the coordinates from DMS (degrees, minutes, seconds) to DD (decimal degrees) format and store them in the result file. We know the format, so let's use the 'predetermined' angle. Let's start by importing the required modules and building the regular expression:
import re
import csv
from core_predermined_angle import PredeterminedCoordinates2DD

regex_points = re.compile(r'''(?P<ident>[A-Z0-9]{5})
                              (\s+)
                              (?P<lat>\d{6}(\.\d+)?[NS])
                              (\s+)
                              (?P<lon>\d{7}(\.\d+)?[EW])
                              (.*?)
                           ''', re.VERBOSE)
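To see what the pattern captures, here is a small stand-alone check against one line shaped like the sample table. The sample string and the variable names are only for illustration; note that the fractional part has to be written as \.\d+ for both latitude and longitude, otherwise values with decimals will not match.

```python
import re

# Verbose-mode pattern: whitespace inside the pattern itself is ignored,
# so blanks in the data must be matched explicitly with \s+.
regex_points = re.compile(r'''(?P<ident>[A-Z0-9]{5})
                              (\s+)
                              (?P<lat>\d{6}(\.\d+)?[NS])
                              (\s+)
                              (?P<lon>\d{7}(\.\d+)?[EW])
                              (.*?)
                           ''', re.VERBOSE)

# Illustrative line with a variable number of blanks between columns.
sample = 'ABCDE 010101.10N   0010101.10W ABC DEF 100 500'

match = regex_points.match(sample)
if match:
    print(match.group('ident'), match.group('lat'), match.group('lon'))
```

Named groups keep the extraction readable: the code asks for 'ident', 'lat' and 'lon' instead of counting parentheses.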
Now we are ready to define the function that will parse the text file and write the result to a CSV file. The function takes the following arguments: input file path, output file path, ICAO code of the country the data comes from, and waypoint type.
As I said, we are going to convert DMS to DD format, and this will be done using an instance of the PredeterminedCoordinates2DD class from the core_predermined_angle module. Before we parse a line of the file, a bit of 'normalization' is required: removing leading blanks and newline characters.
def parse_points_txt2csv(input_file, output_file, icao_ctry, wpt_type):
    dd_coord = PredeterminedCoordinates2DD()
    csv_fnames = ['ctry_icao_code', 'ident', 'wpt_type', 'lat_src', 'lon_src', 'lat_dd', 'lon_dd']
    with open(input_file, 'r') as in_file:
        with open(output_file, 'w', newline='') as csv_file:
            writer = csv.DictWriter(csv_file, fieldnames=csv_fnames, delimiter=';')
            for line in in_file:
                line_mod = line.lstrip()
                line_mod = line_mod.strip('\n')
If the 'normalized' line matches the regex, extract the ident (point name), latitude and longitude:
                # Match once and reuse the result instead of matching and searching again
                match = regex_points.match(line_mod)
                if match:
                    ident = match.group('ident')
                    lat_src = match.group('lat')
                    lon_src = match.group('lon')
                    dd_coord.dmsh2dd(lat_src, lon_src)
Note that the regular expression does not check whether the DMS value is valid (e.g. that minutes are less than 60), so we need to check validity during conversion to DD format. If it is valid, meaning that both latitude and longitude have been converted successfully, write the result to the CSV file:
                    if dd_coord.is_valid is True:
                        writer.writerow({'ctry_icao_code': icao_ctry,
                                         'ident': ident,
                                         'wpt_type': wpt_type,
                                         'lat_src': lat_src,
                                         'lon_src': lon_src,
                                         'lat_dd': str(dd_coord.lat_dd),
                                         'lon_dd': str(dd_coord.lon_dd)})
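The validity check itself lives inside PredeterminedCoordinates2DD, whose code is not shown in this post. As a rough sketch of what such a check could look like, here is a hypothetical stand-alone helper. The names dms_to_dd, LAT_RE and LON_RE are made up for this example and are not part of the core_predermined_angle module; the range rules (minutes and seconds below 60, latitude up to 90, longitude up to 180) are the usual ones for DMS values.

```python
import re

# Hypothetical helper, NOT the PredeterminedCoordinates2DD implementation.
LAT_RE = re.compile(r'(\d{2})(\d{2})(\d{2}(?:\.\d+)?)([NS])')   # DDMMSS.ssH
LON_RE = re.compile(r'(\d{3})(\d{2})(\d{2}(?:\.\d+)?)([EW])')   # DDDMMSS.ssH


def dms_to_dd(dms, pattern, max_deg):
    """Return decimal degrees, or None when the DMS value is malformed or out of range."""
    match = pattern.fullmatch(dms)
    if match is None:
        return None
    degrees = int(match.group(1))
    minutes = int(match.group(2))
    seconds = float(match.group(3))
    # The regex only checks the shape, so the ranges are checked here.
    if minutes >= 60 or seconds >= 60:
        return None
    dd = degrees + minutes / 60 + seconds / 3600
    if dd > max_deg:
        return None
    # Southern and western hemispheres get a negative sign.
    return -dd if match.group(4) in 'SW' else dd


print(dms_to_dd('010101.10N', LAT_RE, 90))
print(dms_to_dd('0010101.10W', LON_RE, 180))
print(dms_to_dd('016101.10N', LAT_RE, 90))   # 61 minutes, so the result is None
```

Returning None for a bad value plays the same role as the is_valid flag above: the caller can skip the row instead of writing a wrong coordinate.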
And that is all: we are ready to extract data from 'text file' tables.
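One thing to keep in mind when reading the result back: the function never calls writer.writeheader(), so the output file has no header row and uses ';' as the delimiter. A minimal sketch of reading such a file is shown below. The row is a made-up stand-in for real output, and io.StringIO replaces the actual file so the example runs on its own.

```python
import csv
import io

# Same column order as used when writing the file.
csv_fnames = ['ctry_icao_code', 'ident', 'wpt_type', 'lat_src', 'lon_src', 'lat_dd', 'lon_dd']

# Stand-in for the output file; the values are invented for illustration only.
sample_output = io.StringIO('XX;ABCDE;WPT;010101.10N;0010101.10W;1.0169722;-1.0169722\r\n')

# Passing fieldnames tells DictReader that the first line is data, not a header.
reader = csv.DictReader(sample_output, fieldnames=csv_fnames, delimiter=';')
rows = list(reader)
for row in rows:
    print(row['ident'], float(row['lat_dd']), float(row['lon_dd']))
```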