App Building day 3 – data smoothing…


I worked some more on my GPS app, and I’m looking for some advice from somebody with more experience dealing with chunky data than I have.


One of the problems I have with the GPS data I get is that it’s noisy. First of all, the altitude data from GPS isn’t as good as the location data; it acts as if there’s a constant bit of noise added to it. The second problem is that the receiver can’t always maintain a sync, and when it loses one, it produces a straight-line projection from the last good data, then abruptly re-syncs when it gets a good fix again.


That means I end up with discontinuities in the altitude data, which messes things up when I try to figure out the gradient of the data, and makes the plot look pretty ugly.


What I need is a good way to smooth over those sections of bad data, and I’m open to ideas.


I did realize as I was writing this that I may be able to use the data-quality information the GPS sends me to decide which data is bad.
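
For instance, if the receiver emits standard NMEA sentences, the fix-quality and satellite-count fields of the GGA sentence could drive that decision. A minimal sketch in Python (the field layout is the standard GGA one; the four-satellite threshold is an assumption):

```python
# Sketch: use the fix-quality and satellite-count fields of an NMEA GGA
# sentence to flag suspect samples.

def parse_gga(sentence):
    """Return (altitude_m, trustworthy) from a $GPGGA sentence."""
    fields = sentence.split(",")
    fix_quality = int(fields[6])   # 0 = no fix, 1 = GPS, 2 = DGPS
    num_sats = int(fields[7])      # satellites used in the solution
    altitude = float(fields[9])    # antenna altitude above mean sea level
    # A 3-D fix needs at least 4 satellites; require a real fix as well.
    trustworthy = fix_quality >= 1 and num_sats >= 4
    return altitude, trustworthy

alt, ok = parse_gga(
    "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47")
```

Samples that come back untrustworthy could then be skipped, held, or smoothed over, rather than fed straight into the gradient calculation.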

Comments (14)

  1. Yeah, I worked with GPS data a few years ago and was very surprised at how bad the altitude data was. We basically couldn’t use it for what we needed (vertical feet skied), which was very disappointing.

    And you can only imagine what happens when the skier goes into a gondola. Talk about straight line… from the bottom to the top. 🙂

    Good luck

  2. Of course I meant that his signal drops and you have one point before he gets in the gondola and one when he gets out.

  3. Doug McClean says:

    Check out my post on Savitzky-Golay smoothing filters for advice on how to smooth this type of data, especially if you plan to differentiate.

    GPS altitude data has gotten better since they turned off SA but is still not great. If you wanted to make this a commercial product for skiers though, you could probably set up a local DGPS system for the ski mountain.
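
    To give a feel for the Savitzky-Golay idea mentioned above: the filter fits a low-order polynomial over a sliding window. A minimal Python sketch using the classic 5-point quadratic smoothing kernel (-3, 12, 17, 12, -3)/35; a real implementation would more likely call a library routine such as scipy.signal.savgol_filter:

```python
# Minimal Savitzky-Golay smoother: 5-point quadratic smoothing kernel.

SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)

def savgol5(samples):
    """Smooth a list of altitudes; the two samples at each end pass through."""
    out = list(samples)
    for i in range(2, len(samples) - 2):
        window = samples[i - 2:i + 3]
        out[i] = sum(c * s for c, s in zip(SG5, window)) / 35.0
    return out
```

    Because the kernel reproduces quadratics exactly, a steady ascent passes through unchanged while uncorrelated noise is attenuated, which matters if you plan to differentiate afterwards.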

  4. Luc Cluitmans says:

    What I have seen of GPS altitude data is that if you don’t have a very good reason to use it, it might be better to forget about it (and consider GPS data as ‘2-dimensional’ – on a sphere’s surface, that is).

    The biggest problem I have seen with non-altitude GPS data is the big gaps that may appear when walking around in urban areas (where there is no good reception).

    The problem is that my GPS receiver simply doesn’t send anything when it has lost contact, and strictly speaking, my software therefore only ‘sees’ the gap when the next non-gap sample arrives.

    I have had good experiences with injecting NaNs (System.Double.NaN) into the coordinate streams, based on time-outs, to remedy this situation a bit. These NaNs act as explicit ‘there was no data here’ markers.

    Of course, this only applies if your signal processing can handle NaNs at all; most classical signal processing algorithms break down in the presence of NaNs. Don’t forget that the only way of testing for NaN values is by using the Double.IsNaN() function; accidentally using the normal comparison operators for doubles will give other results than you may expect.
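
    The time-out idea might look something like this (a Python sketch with illustrative names; math.nan plays the role of System.Double.NaN, and math.isnan the role of Double.IsNaN):

```python
# Walk (timestamp, altitude) pairs and insert an explicit NaN marker
# whenever consecutive samples are further apart than a timeout.
import math

def mark_gaps(samples, timeout_s=5.0):
    """samples: list of (time_s, altitude); returns the list with gap markers."""
    out = []
    prev_t = None
    for t, alt in samples:
        if prev_t is not None and t - prev_t > timeout_s:
            out.append((prev_t + timeout_s, math.nan))  # 'no data here' marker
        out.append((t, alt))
        prev_t = t
    return out
```

    Note that math.nan == math.nan is False, so downstream code must test with math.isnan, exactly the comparison-operator caveat described above for Double.IsNaN.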

    Some categories of ‘standard’ filter algorithms are perfectly able to handle NaNs: median filters and simple averaging filters are good examples. These filters simply take the median (or the mean in the case of an averaging filter) over the past N samples. Simply drop a sample (and reduce N) in the presence of NaNs…

    Median filters can perfectly well do the data smoothing, though you may want to verify the results when applying them independently to the 2 (or 3) GPS coordinates.
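
    A NaN-aware moving median along these lines might be sketched as (illustrative Python; the window size n is arbitrary):

```python
# Moving median over the past n samples; NaN markers are dropped from the
# window, shrinking it, as described above.
import math
from statistics import median

def nan_median_filter(samples, n=5):
    out = []
    for i in range(len(samples)):
        window = [s for s in samples[max(0, i - n + 1):i + 1]
                  if not math.isnan(s)]
        out.append(median(window) if window else math.nan)
    return out
```

    On a stream like [1.0, 1.0, 100.0, 1.0, 1.0] the single-sample spike is rejected entirely, which is the property that makes median filters attractive for glitchy GPS fixes.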

  5. Neil Cawse says:

    If you take a look at some of the other NMEA messages, you can tell the number of satellites used to acquire the position. At least then you know which data is suspect. Altitude is particularly sensitive to having few satellites, and requires at least one more satellite than lat and lng do.

    I’d keep a previous known position and altitude, and provided I have valid data for position and altitude, I’d update those values. If, say, altitude isn’t valid, use the last known valid altitude in its place. This is a very simple filter.
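
    The hold-last-valid scheme is only a few lines; a Python sketch, assuming each sample arrives with a validity flag:

```python
# Hold-last-valid filter: when a sample is flagged invalid, reuse the
# previous known-good altitude (names are illustrative).

def hold_last_valid(samples):
    """samples: list of (altitude, is_valid); returns filled altitudes."""
    out = []
    last = None
    for alt, valid in samples:
        if valid:
            last = alt
        out.append(last)  # stays None until the first valid sample arrives
    return out
```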

  6. Jeff Lewis says:

    One thought is that when you have long gaps in the data, you could use a vector between the last point and the next received point, plus a speed component (derived from the time between them), to fill in the missing data.

    Most of the time though, no data is better than bad data…
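
    For altitude, the vector-plus-speed fill-in above reduces to linear interpolation between the bracketing points; a Python sketch with illustrative names:

```python
# Linearly interpolate altitude across a gap using the good fixes on
# either side and their timestamps.

def fill_gap(t0, alt0, t1, alt1, gap_times):
    """Interpolate altitudes at gap_times between (t0, alt0) and (t1, alt1)."""
    rate = (alt1 - alt0) / (t1 - t0)  # implied vertical speed over the gap
    return [alt0 + rate * (t - t0) for t in gap_times]
```

    Whether the filled points should be plotted at all, given the "no data is better than bad data" caveat, is a separate judgment call.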

  7. IBMer says:

    In addition to what Neil suggested above, the GPS should provide an estimate of the quality of the fix, called dilution of precision (DOP). There are horizontal, vertical, and position values (HDOP, VDOP, and PDOP).

    When the DOP value is high you can ignore the position value.

    http://www.gps-practice-and-fun.com/gps-tests.html

    I also heard that Kalman filters are very useful for this kind of calculation.

    http://www.cs.unc.edu/~welch/kalman/
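
    To give a flavor of the Kalman approach: a scalar filter tracking a single altitude state, in the spirit of the Welch/Bishop introduction linked above. The process noise q and measurement noise r below are made-up tuning constants, not values the receiver reports:

```python
# Minimal scalar Kalman filter for a noisy altitude stream.

def kalman_1d(measurements, q=0.01, r=4.0):
    x, p = measurements[0], 1.0       # state estimate and its variance
    out = [x]
    for z in measurements[1:]:
        p += q                        # predict: uncertainty grows over time
        k = p / (p + r)               # Kalman gain
        x += k * (z - x)              # correct toward the new measurement
        p *= 1.0 - k
        out.append(x)
    return out
```

    A larger r trusts each measurement less and smooths harder; one could even scale r per sample from the DOP values mentioned above.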

  9. Jeremy says:

    Warning! Whatever you do, do not click the link above in Rolf O. Eiffel’s post.

  10. Jeremy says:

    Seriously, that link pretty much brought my machine to a crawl by continuously opening new browser windows – all with the most disturbing pictures I’ve ever seen. Happens both in IE and Firefox, so be careful! Pop-up blockers will not help.

  11. Eric says:

    Thanks Jeremy.
