Open source platform for analyzing risk on road networks
Thanks to our team member Sam Bail, we now have the mighty NYC included in our showcase. This one was a bit tricky because the crash data goes back to 2012. If we include ALL of that data, almost every street segment has at least one crash. That creates an imbalanced-classes problem for our risk prediction: with nearly every segment labeled as a crash segment, the algorithm essentially predicts ALL segments as high risk.
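To make the imbalance concrete, here is a minimal sketch of how the share of segments with at least one crash grows as more years are included. The function name `positive_rate_by_window`, the segment IDs, and the crash records are all made up for illustration; this is not the project's actual pipeline or data.

```python
def positive_rate_by_window(crashes, all_segments, latest_year, windows):
    """For each window length in years, return the share of segments
    that have at least one crash inside that window."""
    segment_set = set(all_segments)
    rates = {}
    for years in windows:
        first_year = latest_year - years + 1
        # Segments with one or more crashes between first_year and latest_year
        hit = {seg for seg, year in crashes if first_year <= year <= latest_year}
        rates[years] = len(hit & segment_set) / len(segment_set)
    return rates


# Hypothetical toy data: (segment_id, crash_year) pairs
segments = ["s1", "s2", "s3", "s4", "s5"]
crashes = [
    ("s1", 2012), ("s2", 2013), ("s3", 2015),
    ("s4", 2018), ("s1", 2019), ("s5", 2019),
]

rates = positive_rate_by_window(crashes, segments, latest_year=2019, windows=[1, 5, 8])
for years, rate in rates.items():
    print(f"last {years} year(s): {rate:.0%} of segments have a crash")
```

In this toy example the positive class grows from 40% of segments with one year of data to 100% with eight years, which is the same effect described above at a much smaller scale.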
I reduced the data to the past 5 years, but am STILL seeing pretty high risk scores out of the model. Comparing the distribution of scores for Boston versus NYC:
You can see that on average the risk scores for NYC are higher. The distribution is also less smooth, which might be a product of the more limited feature set used for NYC (vs Boston). We likely need more detailed features to make the risk scores more usable.
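For anyone who wants to run this kind of comparison on their own model output, a rough sketch might look like the following. It assumes risk scores are probabilities between 0 and 1; the `summarize` helper and the score lists are hypothetical placeholders, not real Boston or NYC results.

```python
import statistics


def summarize(scores, bins=5):
    """Return the mean, median, and a coarse histogram of scores in [0, 1]."""
    counts = [0] * bins
    for s in scores:
        # Clamp so that a score of exactly 1.0 lands in the last bin
        idx = min(int(s * bins), bins - 1)
        counts[idx] += 1
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "histogram": counts,
    }


# Hypothetical scores, for illustration only
boston_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.45]
nyc_scores = [0.40, 0.55, 0.60, 0.85, 0.90, 0.95]

boston_summary = summarize(boston_scores)
nyc_summary = summarize(nyc_scores)
print("Boston:", boston_summary)
print("NYC:   ", nyc_summary)
```

A higher mean and median point to scores that are shifted up overall, while empty or spiky histogram bins are one rough sign of a distribution that is not smooth.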
Interested in this or other problems? Join us on Slack.