Tuesday, March 21, 2023
No Result
View All Result
Get the latest A.I News on A.I. Pulses
  • Home
  • A.I News
  • Computer Vision
  • Machine learning
  • A.I. Startups
  • Robotics
  • Data science
  • Natural Language Processing
  • Home
  • A.I News
  • Computer Vision
  • Machine learning
  • A.I. Startups
  • Robotics
  • Data science
  • Natural Language Processing
No Result
View All Result
Get the latest A.I News on A.I. Pulses
No Result
View All Result

Extra on Highway Community Matching. Highway community matching shenanigans | by João Paulo Figueira | Feb, 2023

February 10, 2023
140 10
Home Data science
Share on FacebookShare on Twitter


Highway community matching shenanigans

Picture by Denys Nevozhai on Unsplash

The aim of this text is to enrich and proper the earlier one on the identical topic. In that article, I introduced an strategy to reconstructing lacking map-matched information from the Prolonged Automobile Vitality Dataset¹ (EVED) [1]. My method explored the mathematical properties of triangles to seek out the misplaced information on a highway community database shortly. Sadly, a refined error within the code made it fail beneath some circumstances.

Right here, I current a correction to the code and talk about the outcomes of making use of these methods to reconstruct greater than thirty-two thousand trajectories on the EVED.

Within the earlier article, I illustrated two strategies to suit map-matched areas to highway community segments utilizing triangular properties. The geometrical ideas that assist the concept are satisfactory, however the code must be corrected. The problem happens on the node filtering half and impacts each matching strategies, the ratio-based and the distance-based. Let us take a look at the unique code once more utilizing the ratio-based perform:

def get_matching_edge(self, latitude, longitude, bearing=None):loc = np.array([latitude, longitude])_, r = self.geo_spoke.query_knn(loc, 1)radius = self.max_edge_length + r[0]node_idx, dists = self.geo_spoke.query_radius(loc, radius)nodes = self.ids[node_idx]distances = dict(zip(nodes, dists))adjacent_set = set()graph = self.graph

best_edge = Nonefor node in nodes:if node not in adjacent_set:adjacent_nodes = np.intersect1d(np.array(graph.adj[node]),nodes, assume_unique=True)

adjacent_set.replace(adjacent_nodes)for adjoining in adjacent_nodes:edge_length = graph[node][adjacent][0][‘length’]ratio = (distances[node] + distances[adjacent]) / edge_lengthif best_edge is None or ratio < best_edge[2]:best_edge = (node, adjoining, ratio)

if bearing is just not None:best_edge = fix_edge_bearing(best_edge, bearing, graph)return best_edge

The perform makes use of a set to retailer all of the processed adjoining nodes to keep away from repetitions. And it does keep away from repetitions, sadly. That is the place the error lies as a result of by excluding a node, the perform can also be stopping all different adjoining edges from being examined. This example is particularly prevalent for nodes with a number of hyperlinks. Correcting the code requires a conceptual change, eradicating processed highway community edges as a substitute of processed nodes.

However one other challenge surfaced after utilizing the code with just a few completely different trajectories, a black swan. What occurs when the enter GPS location is near one of many nodes? In some uncommon conditions, the sampled GPS areas match one of many highway community nodes very carefully, which induces very refined floating level instabilities. The result’s the collection of a nonsensical edge, ruining the trail reconstruction. To resolve this challenge, I added a default parameter with the minimal distance from the enter GPS location to the closest node. When this distance is under one meter (three toes), the perform refuses to pick an edge and returns an empty worth. This answer is average as it will occur comparatively occasionally, and the following reconstruction mechanism compensates for that.

You’ll be able to see the corrected code under, that includes the set of examined highway community edges and the minimal distance examine.

def get_matching_edge(self, latitude, longitude,bearing=None, min_r=1.0):best_edge = Noneloc = np.array([latitude, longitude])_, r = self.geo_spoke.query_knn(loc, 1)if r > min_r:radius = self.max_edge_length + r[0]node_idx, dists = self.geo_spoke.query_radius(loc, radius)nodes = self.ids[node_idx]distances = dict(zip(nodes, dists))tested_edges = set()graph = self.graphnode_set = set(nodes)

for node in nodes:adjacent_nodes = node_set & set(graph.adj[node])

for adjoining in adjacent_nodes:if (node, adjoining) not in tested_edges:edge_length = graph[node][adjacent][0][‘length’]ratio = edge_length / (distances[node] + distances[adjacent])

if best_edge is None or ratio > best_edge[2]:best_edge = (node, adjoining, ratio)tested_edges.add((node, adjoining))tested_edges.add((adjoining, node))

if bearing is just not None:best_edge = fix_edge_bearing(best_edge, bearing, graph)return best_edge

Observe that I additionally modified the speed formulation in order that it ranges between zero and one, with bigger values equivalent to a greater match. Apparently, the rate-based perform works higher and sooner than the distance-based one.

The primary a part of the trail reconstruction implies making use of the perform above to each level of the enter trajectory and holding solely the distinctive highway community edges. The code under exhibits a Python perform that does that.

def match_edges(road_network, trajectory):edges = []unique_locations = set()edge_set = set()for p in trajectory:if p not in unique_locations:e = road_network.get_matching_edge(*p, min_r=1.0)if e is just not None:n0, n1, _ = eedge = (n0, n1)if edge not in edge_set:edge_set.add(edge)edges.append(edge)unique_locations.add(p)return edges

The perform assigns a highway community edge to every distinctive trajectory location. Determine 1 under exhibits a pattern of such matching.

Determine 1 — The map above exhibits the enter trajectory areas in crimson and the matched highway community edges in blue. (Picture supply: Creator utilizing Folium and OpenStreetMap imagery)

As you may see, there are some gaps between the matched hyperlinks as a result of GPS sampling frequency. Reconstructing the trail requires us to fill in these gaps, and we achieve this by discovering the shortest method between the highway community edges. The code under does that by holding collectively the related hyperlinks and filling within the gaps for all of the others utilizing OSMnx’s companies [2].

def build_path(rn, edges):path = []for e0, e1 in pairwise(edges):if not len(path):path.append(e0[0])if e0[0] != e1[0] and e0[1] != e1[1]:if e0[1] == e1[0]:path.prolong([e0[1], e1[1]])else:n0, n1 = int(e0[1]), int(e1[0])sp = ox.distance.shortest_path(rn, n0, n1)if sp is just not None:path.prolong(sp[1:])return path

After calling this perform on the earlier matches, we must always have the trajectory totally reconstructed as a sequence of highway community edges. We see a steady outcome by superimposing the trajectory areas to the patched highway community edges, as depicted in Determine 2 under.

Determine 2 — The map above exhibits the reconstructed hyperlink sequence with the unique trajectory areas superimposed. (Picture supply: Creator utilizing Folium and OpenStreetMap imagery)

The Case for a Minimal Distance

Whereas reviewing this answer, I discovered a problem when iterating by means of the person trajectories. Determine 3 under particulars the issue.

Determine 3 — The picture above exhibits a nook case the place this highway community edge matching system breaks down. One of many trajectory’s areas falls proper over one of many nodes, fooling the matching course of. (Picture supply: Creator utilizing Folium and OpenStreetMap imagery)

A spurious cycle within the route arises from an surprising worth within the enter information for the patching course of. We will dig deeper into what is occurring by reverting to the view displaying which highway community edges the algorithm matches to those GPS areas.

Determine 4 under illustrates the state of affairs on the fork, the place the matched GPS location coincides (or is extraordinarily shut) to one of many highway community nodes. This may probably trigger some numeric instability yielding a flawed outcome. A strong algorithm ought to keep away from these conditions, however how?

Determine 4 — The matching downside turns into evident as soon as we see the situation assignments. Clearly, within the above case, the very shut proximity of the GPS location to one of many highway community nodes induced instability within the edge choice course of. (Picture supply: Creator utilizing Folium and OpenStreetMap imagery)

After contemplating it, I made a decision to alter each matching algorithms and use the gap to the closest highway community node as a criterion to incorporate or exclude the enter location. If this distance is smaller than a given threshold (one meter or three toes as default), the match perform returns an empty worth signaling the calling code that it can’t safely assign a hyperlink. That is high-quality for the next step because it ought to infer the lacking hyperlinks as a result of that algorithm relies upon solely on highway community edges and never the enter areas.

Now that now we have sanitized the entire algorithm, allow us to see the way it performs on all the dataset.

To evaluate how the algorithm behaves on the entire EVED, I created a brand new standalone script that iterates by means of all trajectories, matches the sides, generates a path, and compares the unique trajectory’s size to the reconstructed path. The Python script information all of the variations so we are able to later analyze them utilizing a regular statistical toolset.

The script core lies in its essential loop, detailed within the code under.

def process_trajectories():rn = download_network()road_network = RoadNetwork(rn)

state = load_state()if state is None:state = {“trajectories”: get_trajectories(),”errors”: []}

save_counter = 0trajectories = state[“trajectories”]whereas len(trajectories) > 0:trajectory_id = trajectories[0]trajectory = load_trajectory_points(trajectory_id,distinctive=True)if len(trajectory) > 3:edges = match_edges(road_network, trajectory)path = build_path(rn, edges)

if len(path) > 0:diff = calculate_difference(rn, path, trajectory)print(f”Trajectory: {trajectory_id}, Distinction: {diff}”)state[“errors”].append((trajectory_id, diff))

trajectories = trajectories[1:]state[“trajectories”] = trajectories

save_counter += 1if save_counter % 100 == 0:save_state(state)

save_state(state)

The perform begins by downloading and preprocessing the highway community information from OpenStreetMap. As a result of it is a long-running script, I added a persistent state serialized as a JSON-formatted textual content file and saved each 100 trajectories. This characteristic permits the consumer to interrupt the Python script with out shedding all the information from already-processed trajectories. The perform that calculates the size distinction between the unique enter trajectory and the reconstructed path is displayed under.

def calculate_difference(rn, path, trajectory):p_loc = np.array([(rn.nodes[n][‘y’], rn.nodes[n][‘x’]) for n in path])t_loc = np.array([(t[0], t[1]) for t in trajectory])

p_length = vec_haversine(p_loc[1:, 0], p_loc[1:, 1], p_loc[:-1, 0], p_loc[:-1, 1]).sum()t_length = vec_haversine(t_loc[1:, 0], t_loc[1:, 1], t_loc[:-1, 0], t_loc[:-1, 1]).sum()return p_length – t_length

The code converts each inputs into sequences of geolocations, computes their cumulative lengths, and at last computes their distinction. When the script finishes, we are able to load the state information and analyze it utilizing Pandas.

We see how the distinction between each distances distributes in Determine 5 under (named “error”).

Determine 5 — The histogram above exhibits the distribution of the gap variations between the unique enter trajectories and the reconstructed paths. The horizontal axis shows distances in meters. (Picture supply: Creator)

To see extra element round zero, I clipped the histogram and created the one in Determine 6 under.

Determine 6 — The histogram above particulars the distinction distribution, highlighting the detrimental and majority optimistic variations. Most variations are minor, hinting at a superb efficiency of the matching algorithm. (Picture supply: Creator)

As you may see, there are some variations within the detrimental a part of the spectrum. Nonetheless, it’s primarily optimistic, that means that the reconstructed path lengths are normally longer than trajectories, which is expectable. Do not forget that trajectory factors seldom match highway community nodes, that means that the most typical state of affairs must be a match halfway by means of a phase.

We will additionally examine the place the mid-fifty p.c of variations is by drawing a box-and-whiskers plot, as depicted under in Determine 7.

Determine 7 — The box-and-whiskers plot above exhibits the place the center 50% of the distribution lies. The 25%, 50%, and 75% percentiles are 13.0, 50.2, and 156.5 meters, respectively. We will additionally see that the inliers band is comparatively slim, starting from -202.3 to 371.8 meters. (Picture supply: Creator)

The picture above accommodates important data on the distribution of variations and the variety of outliers. Utilizing Tukey’s fences technique, we are able to classify as outliers 5,332² trajectories out of 32,528. This proportion represents a comparatively excessive proportion worth of 16.4%, that means there are most likely higher reconstruction strategies than this or that there’s a sizable quantity of poorly processed trajectories. We will look into a few of the most dramatic distance distinction outliers to try to perceive these outcomes.

Outlier Evaluation

As beforehand famous, there are over five-thousand trajectories with poor matching. We will shortly examine just a few to find out why the distinction between the sampled trajectory and the reconstruction is so important. We begin with a revealing case, trajectory quantity 59, depicted under in Determine 8.

Determine 8 — The trajectory displayed above illustrates a map-matching error that led the reconstruction course of to construct an extended than vital path. (Picture supply: Creator utilizing Folium and OpenStreetMap imagery)

The case above exhibits that the map-matching course of incorrectly positioned three GPS samples on a parallel highway, forcing the reconstruction course of to discover a spurious detour. Such a error may outcome from the insufficient parameterization of the map-matching course of, one thing to analysis in a later article.

Lastly, we are able to see a really radical case in Determine 9 under. The sampled GPS areas are so distant from the highway that the map-matching course of gave up. As you may see, the reconstruction course of assigns highway community hyperlinks that bear no relation to the meant trajectory.

Determine 9 — The trajectory pictured above accommodates quite a few errors, and the preliminary map-matching process couldn’t get rid of them. (Picture supply: Creator utilizing Folium and OpenStreetMap imagery)

On this article, I corrected beforehand revealed code on highway community edge matching and assessed its efficiency utilizing the EVED dataset. I collected the efficiency evaluation information utilizing a specialised Python script that exposed a comparatively excessive variety of matched trajectories with seemingly irregular deviations within the measured distances. To grasp why this outcome may very well be higher, I’ll flip subsequent to the unique map-matching software program, Valhalla, to try to repair the problems above.

The unique paper authors licensed the dataset beneath the Apache 2.0 License (see the VED and EVED GitHub repositories). Observe that this additionally applies to by-product work.The supply of all pictures and calculations is in a Jupyter pocket book, out there from the general public GitHub repository.

[1] Zhang, S., Fatih, D., Abdulqadir, F., Schwarz, T., & Ma, X. (2022). Prolonged car vitality dataset (eVED): An enhanced large-scale dataset for deep studying on car journey vitality consumption. ArXiv. https://doi.org/10.48550/arXiv.2203.08630

[2] Boeing, G. 2017. OSMnx: New Strategies for Buying, Developing, Analyzing, and Visualizing Advanced Road Networks. Computer systems, Atmosphere and City Programs 65, 126–139. doi:10.1016/j.compenvurbsys.2017.05.004



Source link

Tags: FebFigueiraJoãoMatchingNetworkPauloRoadshenanigans
Next Post

What’s New in Robotics? 10.02.2023

Making Clever Doc Processing Smarter: Half 1

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Recent News

Modernización, un impulsor del cambio y la innovación en las empresas

March 21, 2023

How pure language processing transformers can present BERT-based sentiment classification on March Insanity

March 21, 2023

Google simply launched Bard, its reply to ChatGPT—and it needs you to make it higher

March 21, 2023

Automated Machine Studying with Python: A Comparability of Completely different Approaches

March 21, 2023

Why Blockchain Is The Lacking Piece To IoT Safety Puzzle

March 21, 2023

Dataquest : How Does ChatGPT Work?

March 21, 2023

Categories

  • A.I News
  • A.I. Startups
  • Computer Vision
  • Data science
  • Machine learning
  • Natural Language Processing
  • Robotics
A.I. Pulses

Get The Latest A.I. News on A.I.Pulses.com.
Machine learning, Computer Vision, A.I. Startups, Robotics News and more.

Categories

  • A.I News
  • A.I. Startups
  • Computer Vision
  • Data science
  • Machine learning
  • Natural Language Processing
  • Robotics
No Result
View All Result

Recent News

  • Modernización, un impulsor del cambio y la innovación en las empresas
  • How pure language processing transformers can present BERT-based sentiment classification on March Insanity
  • Google simply launched Bard, its reply to ChatGPT—and it needs you to make it higher
  • Home
  • DMCA
  • Disclaimer
  • Cookie Privacy Policy
  • Privacy Policy
  • Terms and Conditions
  • Contact us

Copyright © 2022 A.I. Pulses.
A.I. Pulses is not responsible for the content of external sites.

No Result
View All Result
  • Home
  • A.I News
  • Computer Vision
  • Machine learning
  • A.I. Startups
  • Robotics
  • Data science
  • Natural Language Processing

Copyright © 2022 A.I. Pulses.
A.I. Pulses is not responsible for the content of external sites.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In