Geospatial data is everywhere around us, and it’s essential for data professionals to know how to work with it. One common way to store this type of data is in GPX files. Today you’ll learn everything about it, from theory and common questions to R and GPX file parsing.
We’ll start simple – with just a bit of theory and commonly asked questions. This is needed to get a deeper understanding of how storing geospatial data works. If you’re already familiar with the topic, feel free to skip the first section.
Online route mapping services such as Strava and Komoot store the routes in GPX file format. It’s an easy and convenient way to analyze, visualize, and display different types of geospatial data, such as geolocation (latitude, longitude), elevation, and many more.
For example, take a look at the following image. It represents a Strava cycling route in Croatia I plan to embark on later this summer. It’s the highest paved road in the country, and I expect the views to be breathtaking:
Why is this relevant? Because Strava allows you to export any route or workout in GPX file format. But what is GPX anyway?
Put simply, GPX stands for GPS eXchange Format, and it’s nothing but a simple text file with geographical information, such as latitude, longitude, elevation, time, and so on. If you plot these points on a map, you’ll know exactly where you need to go, and what sort of terrain you might expect, at least according to the elevation.
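To make the format concrete, here is a minimal sketch that writes a tiny GPX track to a temporary file from R and prints it. The coordinates, elevations, timestamps, and the track name are all made up for illustration; a real export would contain many more points.

```r
# A tiny hand-written GPX document; every value below is invented.
gpx_text <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">',
  '  <trk>',
  '    <name>Example track</name>',
  '    <trkseg>',
  '      <trkpt lat="45.1000" lon="15.2000"><ele>120.5</ele><time>2022-06-01T08:00:00Z</time></trkpt>',
  '      <trkpt lat="45.1010" lon="15.2015"><ele>123.0</ele><time>2022-06-01T08:00:10Z</time></trkpt>',
  '    </trkseg>',
  '  </trk>',
  '</gpx>'
)

# Write the text to a temporary .gpx file and print it back.
gpx_file <- tempfile(fileext = ".gpx")
writeLines(gpx_text, gpx_file)
cat(readLines(gpx_file), sep = "\n")
```

Each point carries its latitude and longitude as attributes, with elevation and time as nested tags.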
The Strava route we’ll analyze today is just a plain route and has 1855 latitude, longitude, and elevation data points. If I were to complete this route and export the file from my workouts, it would also include timestamps.
These data points are ridiculously easy to load into R. You don’t need a dedicated package to combine R and GPX – everything is done with an XML parser. More on that in a bit.
Beginners often ask how GPS and GPX differ. GPS stands for Global Positioning System, which provides users with positioning, navigation, and timing services. GPX, on the other hand, is a file format used to exchange GPS data by storing geographical information at given intervals. These data include waypoints, tracks, elevation, and routes.
If you’re working on GPS programs or plan to build navigation applications, you’ll run into GPX files often, as they’re a common format for map data. GPX is an open standard in the geospatial world that has been around for two decades, so it’s important to know how to work with it.
A GPX file is plain text, so any text editor can show its raw contents, but viewing the route itself requires dedicated software or a programming language. Downloadable software includes Google Earth Pro and Garmin BaseCamp, just to name a few.
If you’re into coding, you should know that any major programming language can load and parse GPX files, R and Python included.
Now you’ll learn how to combine R and GPX. First things first, we’ll load a GPX file into R. To do so, we’ll have to install a library for parsing XML files. Yes – GPX is just XML with a specific schema:
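As a minimal sketch, assuming the `XML` package from CRAN is the parser of choice, the install and load step could look like this. The CRAN mirror URL is just one option; any mirror works.

```r
# Install the XML package only if it is not already available, then load it.
if (!requireNamespace("XML", quietly = TRUE)) {
  install.packages("XML", repos = "https://cloud.r-project.org")
}
library(XML)
```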
We can now use the package’s parsing function to read a GPX file. Make sure you know where your file is saved beforehand:
Printing the parsed object displays the file’s raw XML tree.
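Here is a hedged sketch of the reading step, assuming the `XML` package and its `xmlParse()` function. To keep it runnable, it first writes a tiny made-up GPX track to a temporary file; in practice, `gpx_path` would point to your own exported file. The name `gpx_parsed` is only an illustrative choice.

```r
library(XML)

# Stand-in for your exported file: a tiny made-up GPX track in a temp file.
gpx_path <- tempfile(fileext = ".gpx")
writeLines(c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">',
  '<trk><trkseg>',
  '<trkpt lat="45.1000" lon="15.2000"><ele>120.5</ele></trkpt>',
  '<trkpt lat="45.1010" lon="15.2015"><ele>123.0</ele></trkpt>',
  '</trkseg></trk>',
  '</gpx>'
), gpx_path)

# Parse the file into an in-memory XML document.
gpx_parsed <- xmlParse(gpx_path)

# Quick sanity check: the root element should be <gpx>.
xmlName(xmlRoot(gpx_parsed))
```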
If you think that looks like a mess, you are not wrong. The file is pretty much unreadable in this form, but you can spot a structure if you focus for long enough.
Each track point element (`<trkpt>`) stores the latitude and longitude of a point as attributes, and its nested `<ele>` tag holds the elevation.
Use the following R code to extract and store them in a more readable data structure – a data frame:
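The extraction can be sketched as follows, assuming the `XML` package and a GPX file whose points are stored as `<trkpt>` elements (a file that uses route points, `<rtept>`, would need a different path). The inline document and its values are made up so the example runs on its own, and the XPath uses `local-name()` to sidestep the GPX default namespace.

```r
library(XML)

# Made-up GPX content parsed from a string; replace with your own parsed file.
gpx_parsed <- xmlParse(paste0(
  '<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">',
  '<trk><trkseg>',
  '<trkpt lat="45.1000" lon="15.2000"><ele>120.5</ele></trkpt>',
  '<trkpt lat="45.1010" lon="15.2015"><ele>123.0</ele></trkpt>',
  '<trkpt lat="45.1020" lon="15.2030"><ele>127.2</ele></trkpt>',
  '</trkseg></trk></gpx>'
), asText = TRUE)

# XPath expressions that match elements by name regardless of namespace.
pt_path  <- "//*[local-name()='trkpt']"
ele_path <- paste0(pt_path, "/*[local-name()='ele']")

# Latitude and longitude are attributes; elevation is the text of <ele>.
lat       <- xpathSApply(gpx_parsed, pt_path, xmlGetAttr, "lat")
lon       <- xpathSApply(gpx_parsed, pt_path, xmlGetAttr, "lon")
elevation <- xpathSApply(gpx_parsed, ele_path, xmlValue)

# Everything arrives as text, so convert to numeric before storing.
df <- data.frame(
  lat       = as.numeric(lat),
  lon       = as.numeric(lon),
  elevation = as.numeric(elevation)
)

head(df)
```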
The route represents a roundtrip, so starting and ending data points will be almost identical. The fun part happens in the middle, but we can’t know that for sure before inspecting the data further.
The best way to do so is graphically, so next, we’ll go over a couple of options for visualizing GPX data in R.
When it comes to data visualization and GPX files, you have options. You can go as simple as using a built-in function or you can pay for custom solutions.
The best approach would be to use a package built on Google Maps, but that requires a GCP subscription to an API which isn’t free. We won’t cover it in the article, but we’ll go over the next best thing.
For starters, let’s explore the most basic option. It boils down to plotting a line chart that has all individual data points connected:
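A minimal sketch with base R’s `plot()` is shown here. The small data frame stands in for the one extracted from the GPX file, and its coordinates are made up.

```r
# Stand-in for the extracted data frame; coordinates are made up.
df <- data.frame(
  lat = c(45.100, 45.101, 45.103, 45.102, 45.100),
  lon = c(15.200, 15.202, 15.203, 15.201, 15.200)
)

# Longitude goes on the x axis and latitude on the y axis,
# with consecutive points joined by a line.
plot(
  x = df$lon, y = df$lat,
  type = "l", col = "black", lwd = 3,
  xlab = "Longitude", ylab = "Latitude"
)
```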
The route looks on point, but the visualization isn’t very useful on its own. There’s no map underneath it, so we have no idea where this route takes place.
The other, significantly better alternative is the `leaflet` package. It’s designed for visualizing geospatial data, so it won’t have any trouble working with our data frame:
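A hedged sketch of the map, assuming the `leaflet` package is installed. The data frame is a made-up stand-in for the extracted route, and the color, opacity, and weight values are arbitrary styling choices.

```r
library(leaflet)

# Stand-in for the extracted data frame; coordinates are made up.
df <- data.frame(
  lat = c(45.100, 45.101, 45.103, 45.102, 45.100),
  lon = c(15.200, 15.202, 15.203, 15.201, 15.200)
)

# Default OpenStreetMap tiles with the route drawn as a single polyline.
route_map <- leaflet() %>%
  addTiles() %>%
  addPolylines(
    data = df, lat = ~lat, lng = ~lon,
    color = "#000000", opacity = 0.8, weight = 3
  )

route_map
```

Printing `route_map` in RStudio or an R Markdown document renders the interactive map.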
Now we’re getting somewhere! The route looks almost identical to the one shown earlier on Strava, but we don’t have to stop here. You can invest hours into producing a perfect geospatial visualization, but for the purpose of this article, we’ll display one additional thing – elevation.
Leaflet doesn’t ship with an easy way of using elevation data (numeric) for coloring purposes, so we have to be somewhat creative. A helper function will return one of four colors, depending on the elevation group. Then, data points for groups are added manually to the chart inside a loop:
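One way to sketch this idea, assuming the `leaflet` package: the helper name `get_color()`, the elevation thresholds, and the four colors are all hypothetical choices, and the data frame is made up. As a variation on plain per-group filtering, this version draws each consecutive same-color stretch as its own polyline so that distant points sharing a color aren’t joined by a stray line.

```r
library(leaflet)

# Stand-in for the extracted data frame; all values are made up.
df <- data.frame(
  lat = c(45.100, 45.101, 45.102, 45.103, 45.104, 45.105, 45.106, 45.107),
  lon = c(15.200, 15.201, 15.202, 15.203, 15.204, 15.205, 15.206, 15.207),
  elevation = c(300, 450, 700, 1100, 1600, 1200, 800, 400)
)

# Hypothetical helper: maps an elevation (in meters) to one of four colors.
get_color <- function(elevation) {
  if (elevation < 500) {
    "green"
  } else if (elevation < 1000) {
    "yellow"
  } else if (elevation < 1500) {
    "orange"
  } else {
    "red"
  }
}

df$color <- vapply(df$elevation, get_color, character(1))

# Find consecutive runs of the same color along the route.
runs      <- rle(df$color)
run_end   <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

elevation_map <- leaflet() %>% addTiles()

for (i in seq_along(run_end)) {
  # Extend each run by one point so neighbouring stretches join up.
  idx <- run_start[i]:min(run_end[i] + 1, nrow(df))
  if (length(idx) < 2) next  # a single point cannot form a line
  elevation_map <- addPolylines(
    elevation_map,
    lng = df$lon[idx], lat = df$lat[idx],
    color = runs$values[i], weight = 3, opacity = 0.8
  )
}

elevation_map
```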
The map isn’t perfect, but it informs us which route segments have a higher elevation than the others.
And that’s the basics of R and GPX! You’ve learned the basic theory behind this file format, and how to work with it in the R programming language. We’ve only scratched the surface, as there’s plenty more you can do. For example, plotting the elevation profile or making the polyline interactive would be an excellent next step.
Now it’s time for the homework assignment. We encourage you to play around with any GPX file you can find and use R to visualize it. Feel free to explore other visualization libraries and make something truly amazing. When done, please share your results with us on Twitter – @appsilon. We’d love to see what you can come up with.
The post R and GPX – How to Read and Visualize GPX Files in R appeared first on Appsilon | Enterprise R Shiny Dashboards.