Occassional thoughts about orienteering
Sunday, December 09, 2012
Playing with training text analysis
I read training logs for a number of orienteers at Attackpoint. It is fun to see how people train. It is especially fun to read what they write about their training and racing. Attackpoint makes it pretty easy to get summaries of the basic information about how someone trains. Say you want to see how Emily Kemp trained in October. In three mouse clicks you can get a summary of the amount of training by type, with a graph showing day-by-day totals. You also get narrative descriptions like:
my room mates had started hanging their socks on the spokes of my bike so i figured it was time to fix my flat tire and get it back on the road. the arm did fabulously!! hardly any pain at all and it was so amazing to feel the wind through my hair!! :)
And
For a university campus I definitely wasn't expecting so many little passageways and tricky spots. Good thing I've been practising at reading my control descriptions! I never really made any large mistakes however I did feel like I had a lot of hesitations. On the way to control 10 what is marked as an overhang passageway thingy actually goes right through a building. I think I spent a good 5sec with one foot in the doorway trying to figure out if I would be disqualified or not! The top 3 places were super close with Celine coming in first, me 4sec behind, and Isia 1sec behind me. Eek! I definitely put everything out there and don't have many regrets although it would have been nice to find those 4 sec somewhere ;)
The narratives are a lot more fun to read than the base description of a training session. The base description that goes with the first quote is "velo 41:31 ." The base description that goes with the second quote is "orienteering race 15:17 2.4 km (6:22/km)."
At work, I've been analyzing written responses to open-ended survey questions using "text mining," and that sort of approach seems worth trying on the narrative portion of an orienteer's training log at Attackpoint.
A simple example...
I started by collecting all of the narrative descriptions from my log for the last year of entries. That gives me 544 small bunches of text that I'd written in my log. I cleaned up the text by removing numbers, punctuation and extra white space. I did some further cleaning by taking out the common English words that don't really carry much information. These "stop words" are terms like "the", "is" and "at". Finally, I combined words that are essentially forms of the same term. So, the words "orienteering", "orienteer" and "orienteers" are all treated as the same and are renamed "orient". Once the text is cleaned up, I can start looking at it.
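For anyone who wants to try this, here is a minimal Python sketch of that kind of cleaning. It is not the code behind the numbers in this post. The short stop-word list and the suffix-stripping crude_stem function are hypothetical stand-ins; a real stemmer and a full stop-word list would do a better job.

```python
import re
import string

# A tiny illustrative stop-word list (assumption); real lists are much longer.
STOP_WORDS = {"the", "is", "at", "a", "an", "and", "of", "to", "in", "i", "my"}

# Hypothetical suffix list for a crude stand-in stemmer, checked in order.
SUFFIXES = ("eering", "eers", "eer", "ing", "ed", "s")

# Translation table that turns every punctuation character into a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if suffix == "s" and word.endswith("ss"):
            continue  # leave words like "compass" alone
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_entry(text):
    """Return the list of cleaned, stemmed terms for one log entry."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)       # remove numbers
    text = text.translate(PUNCT_TO_SPACE)  # remove punctuation
    tokens = text.split()                  # split() also drops extra white space
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


print(clean_entry("Orienteering at the park: 2 orienteers, 1 map!"))
```

With this sketch, "orienteering", "orienteer" and "orienteers" all come out as "orient", while short words like "beer" are left alone.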
I created a list of every word that appears and of how many times that word appeared in each of the 544 entries. The result is a big table that tells me which words appear or don't appear in each entry. For example, I know that the 7th entry doesn't include the words "basketball" or "beer" but does include the words "work" and "commute" (and it includes each of those words once).
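A table like that can be sketched in a few lines of Python. The three entries below are made-up stand-ins rather than real entries from my log, and build_term_matrix is just a name I picked for illustration.

```python
from collections import Counter


def build_term_matrix(entries):
    """Build a term-count table from a list of entries.

    entries is a list of lists of cleaned terms. Returns the sorted
    vocabulary and one row of counts per entry, in vocabulary order.
    """
    vocabulary = sorted({term for entry in entries for term in entry})
    rows = []
    for entry in entries:
        counts = Counter(entry)  # missing terms count as 0
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


# Hypothetical stand-in entries, not the real log.
entries = [
    ["bike", "commute", "work"],
    ["orient", "map", "compass", "map"],
    ["work", "commute"],
]

vocabulary, rows = build_term_matrix(entries)
print(vocabulary)
for row in rows:
    print(row)
```

Each row is one entry and each column is one term, so a zero means the word doesn't appear in that entry.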
You can start to look at the entire year of training and find the words that show up most often. For example, among the 48 terms that show up at least 30 times are: bike, fun, jog, map, mtb, orient[eer], train, warm and work.
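Finding the frequent terms amounts to adding up the counts across all entries and keeping the terms above a threshold. A rough sketch, again with made-up entries and a hypothetical function name:

```python
from collections import Counter


def frequent_terms(entries, min_count):
    """Return the terms that appear at least min_count times overall."""
    totals = Counter()
    for entry in entries:
        totals.update(entry)  # add this entry's term counts to the totals
    return sorted(term for term, n in totals.items() if n >= min_count)


# Hypothetical stand-in entries, not the real log.
entries = [
    ["bike", "commute", "work"],
    ["orient", "map", "compass", "map"],
    ["work", "commute"],
]

# With a real year of entries the threshold would be something like 30.
print(frequent_terms(entries, min_count=2))
```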
You can also figure out how different words are correlated with each other. Take a term like "compass" and calculate the terms most frequently correlated with compass, which include:
I know why those terms are correlated (I ran a race in North Carolina without a compass because I feel like my navigation is sharper when I run without a compass - especially if I haven't recently done much O' technique training).
It can be fun to explore the text by looking for correlated terms. I do a lot of my running and biking at Clinton Lake. Here are some terms correlated with Clinton: trail, run and snake.
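One way to compute correlated terms is sketched below. I'm assuming that "correlated" means the Pearson correlation between two terms' per-entry counts, which is one common choice but not the only one. The entries, the function names and the 0.5 cutoff are all made up for illustration.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a term with constant counts has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def correlated_terms(entries, target, min_corr):
    """Return (term, correlation) pairs with correlation >= min_corr."""
    counts = [Counter(entry) for entry in entries]
    vocabulary = {term for entry in entries for term in entry}
    target_column = [c[target] for c in counts]
    result = []
    for term in vocabulary - {target}:
        r = pearson(target_column, [c[term] for c in counts])
        if r >= min_corr:
            result.append((term, round(r, 2)))
    # Strongest correlation first, ties broken alphabetically.
    return sorted(result, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical stand-in entries, not the real log.
entries = [
    ["clinton", "trail", "run"],
    ["clinton", "trail", "snake"],
    ["work", "commute", "bike"],
    ["orient", "map"],
]

print(correlated_terms(entries, "clinton", min_corr=0.5))
```

Terms that tend to show up in the same entries as the target get a high score, and terms that show up mostly in other entries get a negative score and are filtered out.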
Playing around with the text is fun and I suspect that it could even be useful once I've learned more about how to do "text mining." One of the really easy things to do with text data is to create a word cloud. Here's the word cloud of my last 365 days of training log narrative:
posted by Michael | 11:19 AM