SXCTrack

Visit SXCTrack

SXCTrack is an online database of cross-country and track events and athletes at Summit High School.

However, this database is really just an exercise in user interface design: the data SXCTrack pulls from, and the management of that data, come from Geoffrey Buchan’s database of these events and athletes.

Why make a different database if this one already exists? Well, I had a lot of free time over COVID, and I was learning about statistics, which suddenly made this dataset pretty interesting.

Long story short, the database is ‘complete’ - as in, it’s perfectly usable, and noticeably more usable on mobile devices than the source data. I made it so that both athletes and their parents could more easily view their times and data.

If you want a more detailed write-up on the creation of this project, below is a description of what went into a project of this scale.


The process

Initial concepts

On the Summit cross country team in the fall of 2018, we used “RunningToWin” to log our miles for the week. It became a paid service, so we stopped using it. This is when many of us were introduced to Strava! Our coach had to make a point at the time to not buy the paid subscription for Strava, ironically called “Summit”. You could see how high schoolers could have been confused.

For a short time between when we stopped using RunningToWin and started using Strava, I thought I could create a clone of RunningToWin. I ‘did’, but only to the extent of “I tried copying the homepage in HTML and CSS”.
This is what I made.

Sometime later, this idea morphed into a “show an athlete’s bests and records” website. The screenshot of my mockup may be lost to time.

And then I discovered Python and BeautifulSoup, the web-scraping library, and began seeing how far I could get with ‘reading’ Geoffrey’s database. Around this time, I created this prototype of a running website that I called hilltop.run. “Hilltop” was the nickname of our town. I think the domain may still be available.

Data collection

Long story short, it was an iterative process of learning how to work with Python, how to work with BeautifulSoup, and how not to anger the poor old computer that runs Geoffrey’s website. (I had a while loop request pages from the server with no delay, which I think made the website stop responding for minutes. Rookie mistake.)
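The fix was basic rate limiting: make one request, then pause. Below is a minimal sketch of the polite version, assuming a generic results-table layout; the real pages on Geoffrey’s site are structured differently, so treat the selectors here as hypothetical:

```python
import time
import urllib.request

from bs4 import BeautifulSoup

DELAY_SECONDS = 2  # breathing room between requests; the line my while loop lacked


def fetch_page(url):
    """Fetch one page, then pause so the next call can't hammer the server."""
    with urllib.request.urlopen(url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    time.sleep(DELAY_SECONDS)
    return html


def parse_times(html):
    """Pull (name, time) pairs out of a generic results table."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.select("table tr"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) >= 2:
            results.append((cells[0], cells[1]))
    return results
```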

Before I knew what the phrase ‘tech debt’ meant, I was already an expert at creating it. Instead of learning how to use a proper database such as MySQL, or one of the json-based options such as Mongo, I created an assortment of json files output by the Python script. The format was never standardized, and was extended and changed arbitrarily to fit whatever new feature I wanted the front-end to be able to display.
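For illustration, here is a made-up example of the kind of per-athlete json file the script produced. Every field name here is hypothetical, since the real format was never standardized:

```python
import json

# Hypothetical schema -- the real files were never this consistent.
athlete = {
    "name": "Jane Doe",
    "grad_year": 2022,
    "bests": {"5000m": "19:42.0", "1600m": "5:28.3"},
    "seasons": {
        "XC 2021": [{"meet": "County Championship", "time": "19:42.0"}],
    },
}

# One file per athlete, dumped straight from the scraper's dicts.
with open("jane_doe.json", "w") as f:
    json.dump(athlete, f, indent=2)
```

Every new front-end feature meant bolting another key onto files like this, which is exactly how the format drifted.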

Site building

Eventually, the json files escaped my computer and got onto an ftp server. Using gloriously stock PHP, I made an API that allowed me to develop a front-end to display the json data I had created before.

(This API is quite possibly the slowest and least efficient way to parse json and send it back as more json. I would not recommend this route to others. Loading a ‘Bests’ query takes multiple seconds.)

I chose to use the Bootstrap UI library to speed up the styling of the website. Anyone who’s seen any website made with Bootstrap v4 has already seen SXCTrack.

As the front end was being developed, it gave me ideas on newer features to add, which meant continuous adjustment of the underlying json files.

Eventually, I was satisfied with the feature set, and I largely stopped working on the website for most of my senior year of high school; the biggest remaining effort was updating the data to match what was on Geoffrey’s site. The Python script grew the ability to update the files on the FTP server directly, so I didn’t have to manually drag and drop them from my computer.

Automation

I bought a VPS at the tail end of 2022, and with this new tool, I was off to the races, attempting to completely automate the update script. It took a couple of months of troubleshooting, as my philosophy of ‘testing in prod’ can only go so fast when new data to test the automatic updates against came only once every few weeks at most. However, with the exception of some storage issues on the VPS from error logs clogging up /var/spool and /var/mail, the VPS does a daily update with no hiccups (at least, none that I’ve noticed).
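The /var/mail clog happens because cron mails every line a job prints. One way out, sketched here with a hypothetical script name, is a small Python wrapper that captures the job’s output and writes it to a size-capped log instead of letting it reach cron at all:

```python
import logging
import logging.handlers
import subprocess

# Rotate at ~1 MB and keep 3 old copies, so logs can't quietly fill the disk.
handler = logging.handlers.RotatingFileHandler(
    "update.log", maxBytes=1_000_000, backupCount=3
)
logging.basicConfig(level=logging.INFO, handlers=[handler])
log = logging.getLogger("sxctrack-update")


def run_daily_update(cmd=("python3", "update_script.py")):  # hypothetical name
    """Run the scraper, capturing its output so cron has nothing to mail."""
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    if result.returncode == 0:
        log.info("update succeeded")
    else:
        log.error("update failed:\n%s", result.stderr)
    return result.returncode
```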

The future

This project is ‘maintained’, and will operate for the foreseeable future, as the server, domain, and VPS costs are all very reasonable. As long as Geoffrey’s database doesn’t change much, it should continue to work.

I tried to pass on some of the responsibility to another student at Summit High School, but it didn’t pan out. The Python back end is on GitHub. The web end (jQuery for the front end and PHP for the back end) is on an FTP server, but none of it is particularly complex and could be built up again. (It mostly requests a large json file and returns a specific key within the json object.)
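For the curious, the PHP layer does little more than the following. This is a rough Python equivalent of the lookup, with the file and key names invented for illustration:

```python
import json


def api_lookup(json_path, key):
    """Load the whole json file and return one key, serialized back to json.

    Re-reading the full file on every request is also why a 'Bests'
    query takes multiple seconds.
    """
    with open(json_path) as f:
        data = json.load(f)
    return json.dumps(data.get(key))
```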

I always wanted to do some proper data analysis with the database’s data, things like ‘if an athlete runs a 5:30 mile, what’s their predicted 800m’, trained on the history of our school’s running program, but I never made it that far. Maybe next time.
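For what it’s worth, a common no-training-data baseline for that kind of prediction is Riegel’s formula, T2 = T1 · (D2/D1)^1.06; a model fit on the school’s own history could later replace the fixed exponent. A quick sketch:

```python
def riegel_predict(known_time_s, known_dist_m, target_dist_m, exponent=1.06):
    """Riegel's endurance model: T2 = T1 * (D2 / D1) ** k, with k ~ 1.06.

    Fitting the exponent on SXCTrack's own history would be the
    school-specific upgrade; 1.06 is the classic published value.
    """
    return known_time_s * (target_dist_m / known_dist_m) ** exponent


# A 5:30 mile (330 s over 1609 m) projects to roughly 2:37 for the 800 m.
predicted_800m = riegel_predict(330, 1609, 800)
```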