Friday, 23 November 2018

Coordinates (3): way of regular expressions (continued)

In the second post on coordinates, only one DMS format was covered. As I pointed out there, to cover more formats we have to either write more regular expressions or write one complex regular expression that matches every combination of coordinate formats we want.

In this post I am going to show how to convert coordinates in various formats (DMSH, HDMS) with different separators (space, hyphen, word, letter, or no separator) to decimal degrees, and how to get the format of the coordinate.

Let's start with defining constants to describe various coordinate formats:

# Hemisphere prefix degrees, minutes, seconds separated
F_HDMS_SEP_HYPHEN = 'F_HDMS_SEP_HYPHEN'  # hyphen separator
F_HDMS_SEP_SPACE = 'F_HDMS_SEP_SPACE'  # space separator
F_HDMS_SEP_WORD = 'F_HDMS_SEP_WORD'  # word DEG, MIN SEC separator
F_HDMS_SEP_LETTER = 'F_HDMS_SEP_LETTER'  # letter D, M, S separator

# Hemisphere suffix degrees, minutes, seconds separated
F_DMSH_SEP_HYPHEN = 'F_DMSH_SEP_HYPHEN'  # hyphen separator
F_DMSH_SEP_SPACE = 'F_DMSH_SEP_SPACE'  # space separator
F_DMSH_SEP_WORD = 'F_DMSH_SEP_WORD'  # word DEG, MIN SEC separator
F_DMSH_SEP_LETTER = 'F_DMSH_SEP_LETTER'  # letter D, M, S separator

# Degrees, minutes, seconds compacted
F_HDMS_COMP = 'F_HDMS_COMP'  # Hemisphere prefix DMS compacted
F_DMSH_COMP = 'F_DMSH_COMP'  # Hemisphere suffix DMS compacted

Now we have to write a regular expression for each format.
Writing a separate regex for each format and then a function that checks them one by one (if .. elif statements) might not be the optimal solution. Let's put the regular expressions into a dictionary,
where the key is the coordinate format and the value is the regular expression for that format. Then we can go through the dict in a for loop to check which regex matches the input.
Dictionary of coordinate patterns:


import re

coord_regex = {F_HDMS_SEP_HYPHEN: re.compile(r'''(?P<hem>^[NSEW])
                                                 (?P<deg>\d{1,3})  # Degrees
                                                 (-) 
                                                 (?P<min>\d{1,2})  # Minutes
                                                 (-)
                                                 (?P<sec>\d{1,2}(\.\d+)?$)  # Seconds
                                              ''', re.VERBOSE),
               F_HDMS_SEP_SPACE: re.compile(r'''(?P<hem>^[NSEW])
                                                (?P<deg>\d{1,3})  # Degrees
                                                (\s) 
                                                (?P<min>\d{1,2})  # Minutes
                                                (\s)
                                                (?P<sec>\d{1,2}(\.\d+)?$)  # Seconds
                                             ''', re.VERBOSE),
               F_HDMS_SEP_WORD: re.compile(r'''(?P<hem>^[NSEW])
                                               (?P<deg>\d{1,3})  # Degrees
                                               (DEG) 
                                               (?P<min>\d{1,2})  # Minutes
                                               (MIN)
                                               (?P<sec>\d{1,2}(\.\d+)?$)  # Seconds
                                            ''', re.VERBOSE),
               F_HDMS_SEP_LETTER: re.compile(r'''(?P<hem>^[NSEW])
                                                 (?P<deg>\d{1,3})  # Degrees
                                                 (D) 
                                                 (?P<min>\d{1,2})  # Minutes
                                                 (M)
                                                 (?P<sec>\d{1,2}(\.\d+)?$)  # Seconds
                                              ''', re.VERBOSE),
               F_DMSH_SEP_HYPHEN: re.compile(r'''(?P<deg>^\d{1,3})  # Degrees
                                                 (-) 
                                                 (?P<min>\d{1,2})  # Minutes
                                                 (-)
                                                 (?P<sec>\d{1,2}(\.\d+)?)  # Seconds
                                                 (?P<hem>[NSEW]$)
                                              ''', re.VERBOSE),
               F_DMSH_SEP_SPACE: re.compile(r'''(?P<deg>^\d{1,3})  # Degrees
                                                (\s) 
                                                (?P<min>\d{1,2})  # Minutes
                                                (\s)
                                                (?P<sec>\d{1,2}(\.\d+)?)  # Seconds
                                                (?P<hem>[NSEW]$)
                                             ''', re.VERBOSE),
               F_DMSH_SEP_WORD: re.compile(r'''(?P<deg>^\d{1,3})  # Degrees
                                               (DEG) 
                                               (?P<min>\d{1,2})  # Minutes
                                               (MIN)
                                               (?P<sec>\d{1,2}(\.\d+)?)  # Seconds
                                               (?P<hem>[NSEW]$)
                                            ''', re.VERBOSE),
               F_DMSH_SEP_LETTER: re.compile(r'''(?P<deg>^\d{1,3})  # Degrees
                                                 (D) 
                                                 (?P<min>\d{1,2})  # Minutes
                                                 (M)
                                                 (?P<sec>\d{1,2}(\.\d+)?)  # Seconds
                                                 (?P<hem>[NSEW]$)
                                              ''', re.VERBOSE),
               F_HDMS_COMP: re.compile(r'''(?P<hem>^[NSEW])
                                           (?P<deg>\d{2,3})  # Degrees
                                           (?P<min>\d{2})  # Minutes
                                           (?P<sec>\d{2}(\.\d+)?$)  # Seconds 
                                        ''', re.VERBOSE),
               F_DMSH_COMP: re.compile(r'''(?P<deg>^\d{2,3})  # Degrees
                                           (?P<min>\d{2})  # Minutes
                                           (?P<sec>\d{2}(\.\d+)?)  # Seconds
                                           (?P<hem>[NSEW]$)   
                                        ''', re.VERBOSE)}
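To see what a single pattern from the dictionary gives us, here is a minimal sketch that copies only the space-separated HDMS pattern and reads its named groups. The variable names hdms_space, sample, found and parts, and the sample input itself, are made up for this illustration.

```python
import re

# One pattern copied from the dictionary (hemisphere prefix, space separator).
hdms_space = re.compile(r'''(?P<hem>^[NSEW])
                            (?P<deg>\d{1,3})  # Degrees
                            (\s)
                            (?P<min>\d{1,2})  # Minutes
                            (\s)
                            (?P<sec>\d{1,2}(\.\d+)?$)  # Seconds
                         ''', re.VERBOSE)

sample = 'N50 30 00.5'  # made-up sample input
found = hdms_space.match(sample)

# groupdict() returns only the named groups, so the unnamed
# separator groups do not show up in the result.
parts = found.groupdict() if found else None
print(parts)

# A hyphen-separated input does not match the space-separated pattern.
print(hdms_space.match('N50-30-00.5'))
```

The named groups are what lets the same conversion code work for every format: no matter where the hemisphere letter sits in the input, it is always read as group 'hem'.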

Now we can start writing the function that converts the DMS format into the DD (decimal degrees) format:

def coord2dd(regex_patterns, dms):

Basically, it will be very similar to the one written in the post Coordinates (2), with some changes.

I need additional variables to track whether the input is a valid coordinate (within the range of our regular expressions), the decimal degrees of the input, and the coordinate format:

flag = False
dd = None
coord_format = None

Checking whether the input matches a pattern is done in a for loop:

for pattern, regex in regex_patterns.items():
    groups = regex.match(dms)  # match once and reuse the result
    if groups:
        h = groups.group('hem')
        d = float(groups.group('deg'))
        m = float(groups.group('min'))
        s = float(groups.group('sec'))

Further on there are no significant changes, except where we need to assign values to the newly created variables.
Checking conditions for latitude and longitude:

if h in ['N', 'S']:
    if d > 90:
        flag = False
    elif d == 90 and (m > 0 or s > 0):
        flag = False
    else:
        if m >= 60 or s >= 60:
            flag = False
        else:
            flag = True
            coord_format = pattern
            dd = d + m / 60 + s / 3600
            if h == 'S':
                dd = -dd

elif h in ['E', 'W']:
    if d > 180:
        flag = False
    elif d == 180 and (m > 0 or s > 0):
        flag = False
    else:
        if m >= 60 or s >= 60:
            flag = False
        else:
            flag = True
            coord_format = pattern
            dd = d + m / 60 + s / 3600
            if h == 'W':
                dd = -dd

Finally, the function returns the values:

return flag, dd, coord_format

If the coordinate is not valid, the result will be the tuple (False, None, None).
If the coordinate is valid, the result will be a tuple, e.g. (True, 50.5, 'F_HDMS_SEP_SPACE').

The source code is available here: https://github.com/strpaw/python_examples/blob/master/latlon_dict_regexes_parser.py
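
To try the whole idea end to end, here is a rough, self-contained sketch. It is not the code from the repository: the function name coord2dd_sketch, the reduced two-format dictionary patterns and the MAX_DEG lookup are my assumptions for this illustration. The lookup folds the separate latitude and longitude branches into one check, and the sketch returns as soon as one pattern matches.

```python
import re

# Reduced dictionary with two of the formats from this post (assumption:
# written as one-line regexes here only to keep the sketch short).
patterns = {
    'F_HDMS_SEP_SPACE': re.compile(r'^(?P<hem>[NSEW])(?P<deg>\d{1,3})\s'
                                   r'(?P<min>\d{1,2})\s(?P<sec>\d{1,2}(\.\d+)?)$'),
    'F_DMSH_COMP': re.compile(r'^(?P<deg>\d{2,3})(?P<min>\d{2})'
                              r'(?P<sec>\d{2}(\.\d+)?)(?P<hem>[NSEW])$'),
}

# Assumption: largest degree value allowed for each hemisphere letter.
MAX_DEG = {'N': 90, 'S': 90, 'E': 180, 'W': 180}


def coord2dd_sketch(regex_patterns, dms):
    """Return (flag, dd, coord_format) for the first pattern that matches dms."""
    for coord_format, regex in regex_patterns.items():
        found = regex.match(dms)
        if not found:
            continue
        h = found.group('hem')
        d = float(found.group('deg'))
        m = float(found.group('min'))
        s = float(found.group('sec'))
        limit = MAX_DEG[h]
        # Same conditions as in the post: degrees within the limit, no minutes
        # or seconds at the limit itself, minutes and seconds below 60.
        if d > limit or (d == limit and (m > 0 or s > 0)) or m >= 60 or s >= 60:
            return False, None, None
        dd = d + m / 60 + s / 3600
        if h in ('S', 'W'):
            dd = -dd
        return True, dd, coord_format
    return False, None, None


print(coord2dd_sketch(patterns, 'N50 30 00'))  # (True, 50.5, 'F_HDMS_SEP_SPACE')
print(coord2dd_sketch(patterns, '0503000W'))   # (True, -50.5, 'F_DMSH_COMP')
print(coord2dd_sketch(patterns, 'N91 00 00'))  # (False, None, None)
```

Returning on the first match is fine here because an input can match only one of these formats; if overlapping patterns were ever added, the order of the dictionary would decide which format is reported.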

