Introduction
They say necessity is the mother of invention, but sometimes laziness is the true MVP. As a grad student at Boston School (shoutout to the AI program!), I’m always looking for ways to make my life a bit easier and, let’s be honest, cooler.
Living in Boston, I rely on BlueBikes to zip around the city, but the eternal struggle of finding an available e-bike when I need one? Not so fun. So one night, I thought, why not build a web app that not only shows bike availability but even *predicts* it? A BlueBike crystal ball at my fingertips! And just like that, onebike.app was born.
One problem, though: I didn’t exactly know where to start.
Why BlueBikes?
Good question. BlueBikes are a fantastic alternative to other forms of public transport in Boston: eco-friendly and fast. But as someone who often bikes around the city, I’m no stranger to the “empty bike rack” disappointment. BlueBikes are great, but finding an e-bike exactly when you need one? Not so easy. I figured it would be cool if I could predict availability, saving people from constantly refreshing the app and hoping an e-bike would magically appear.
And honestly, it seemed like the perfect excuse to learn something new.
Step 1: Getting the Data
DATA IS KEY.
But I ran into two big questions: first, where could I find real-time data for free? And second, where could I get historical data to train my model?
Problem one: real-time data.
Fortunately, BlueBikes publishes real-time system data in the General Bikeshare Feed Specification (GBFS) format, following the North American Bike Share Association (NABSA) standards. This includes up-to-date information on each station, all free to use under the “BlueBikes Data License Agreement” for lawful purposes. In simple terms: problem one, sorted.
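For the curious, here is a minimal sketch of what reading a GBFS station status payload can look like in Python. The sample payload is made up, and the field names (`num_bikes_available`, `num_ebikes_available`, `num_docks_available`) are my assumption about what the feed exposes, so check the live feed before relying on them. This is an illustration, not the exact code behind onebike.app.

```python
import json

# Hypothetical sample shaped like a GBFS station_status.json payload.
# Station ids and counts are invented for illustration.
SAMPLE = """
{
  "last_updated": 1700000000,
  "ttl": 60,
  "data": {
    "stations": [
      {"station_id": "s1", "num_bikes_available": 7,
       "num_ebikes_available": 3, "num_docks_available": 12},
      {"station_id": "s2", "num_bikes_available": 0,
       "num_docks_available": 19}
    ]
  }
}
"""


def parse_station_status(payload_text):
    """Flatten a station_status payload into one dict per station."""
    payload = json.loads(payload_text)
    rows = []
    for station in payload["data"]["stations"]:
        rows.append({
            "station_id": station["station_id"],
            "bikes": station.get("num_bikes_available", 0),
            # Assumed field: not every feed reports e-bikes separately,
            # so fall back to 0 when it is missing.
            "ebikes": station.get("num_ebikes_available", 0),
            "docks": station.get("num_docks_available", 0),
            "fetched_at": payload["last_updated"],
        })
    return rows


if __name__ == "__main__":
    for row in parse_station_status(SAMPLE):
        print(row)
```

Flattening each station into a small, uniform record up front makes it much easier to store snapshots and compare them over time.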
Problem two: historical data.
While BlueBikes provides historical trip data, they don’t provide historical station-level data. Now, if you think really hard, you can probably infer station data from trip data. But hey, I’m not a “thinker” here; I’m a “doer”! In truth, I couldn’t figure it out, and it felt like a long shot anyway.
So instead, I decided to write a script that would hit the GBFS feed and fetch the data every minute. By running this script for several weeks, I’d accumulate enough data points to build something meaningful (albeit basic). I chose AWS for this since they were nice enough to offer a free tier for a year. I set up an S3 bucket and Lambda functions, then scheduled an event in AWS EventBridge to automate the process.
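A collector like this can be sketched in a few lines of Python. Everything named here is a placeholder of mine: `FEED_URL`, `BUCKET`, and the key layout are hypothetical, and the real setup may look different. The idea is that an EventBridge schedule invokes `lambda_handler` once a minute, and each run saves one raw snapshot to S3 under a timestamped key.

```python
import urllib.request
from datetime import datetime, timezone

# Placeholders, not the real feed URL or bucket name.
FEED_URL = "https://example.com/gbfs/en/station_status.json"
BUCKET = "my-bike-snapshots"


def snapshot_key(now):
    """Build an S3 key with one object per minute, grouped by date."""
    return now.strftime("station_status/%Y/%m/%d/%H%M.json")


def collect(fetch, put, now):
    """Fetch one snapshot and store it. fetch/put are injected for testing."""
    body = fetch(FEED_URL)
    key = snapshot_key(now)
    put(BUCKET, key, body)
    return key


def lambda_handler(event, context):
    """Entry point invoked by the scheduled EventBridge rule."""
    import boto3  # bundled with the AWS Lambda Python runtime

    s3 = boto3.client("s3")

    def fetch(url):
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read()

    def put(bucket, key, body):
        s3.put_object(Bucket=bucket, Key=key, Body=body)

    key = collect(fetch, put, datetime.now(timezone.utc))
    return {"saved": key}
```

Keeping the fetch and store steps injectable means the key logic can be tested locally without touching AWS at all.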
This was my first time seriously using AWS, and I was pretty pumped when I saw the data pouring in: station capacity, e-bike count, regular bike count, dock capacity, free docks, e-bike range, GPS locations, all in real time!
Step 2: Making Use of the Data
Machine learning is hard. With so many options, it’s tough to tell the difference between what can work and what will work.
To keep things short, I’ll skip over the part where I experimented with different time-series models (that’s a story for another time) and jump straight to how I ended up with an XGBoost model. Let’s just say I like XGB; it’s a beast, known for its speed and performance with structured data. It turned out to be the best choice for my needs.
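To give a flavor of what "structured data" means here, this is a rough sketch of turning per-minute snapshots for one station into training rows that a gradient-boosted model such as XGBoost could consume. The feature set (time of day, weekday, current e-bike count) and the 15-minute horizon are assumptions I picked for illustration, not necessarily what the production model uses.

```python
import math
from datetime import datetime, timezone


def build_rows(snapshots, horizon=15):
    """Turn snapshots into (features, target) pairs.

    snapshots: list of (unix_timestamp, ebikes) sorted by time, one per minute.
    horizon: how many steps ahead the target looks (assumed value).
    """
    rows = []
    for i in range(len(snapshots) - horizon):
        ts, ebikes = snapshots[i]
        # UTC keeps the sketch simple; local Boston time would likely
        # capture commute patterns better.
        t = datetime.fromtimestamp(ts, tz=timezone.utc)
        minute_of_day = t.hour * 60 + t.minute
        # Cyclic encoding so 23:59 and 00:00 end up close together.
        angle = 2 * math.pi * minute_of_day / 1440
        features = [
            math.sin(angle),
            math.cos(angle),
            t.weekday(),
            ebikes,
        ]
        # Target: e-bike count `horizon` steps in the future.
        target = snapshots[i + horizon][1]
        rows.append((features, target))
    return rows


if __name__ == "__main__":
    demo = [(1700000000 + 60 * i, i % 5) for i in range(20)]
    for features, target in build_rows(demo):
        print(features, "->", target)
```

Rows in this shape are plain tabular data, which is exactly the kind of input tree-based models tend to handle well.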
This is also where studying ML at a top institution with some of the brightest minds really pays off. Enter Bargav Jagatha, my classmate and fellow student. With his help in navigating model selection, our combined efforts gave us a solid starting point. We finally had a model that could provide basic but reliable e-bike availability predictions.
With real-time data flowing in and a solid model to work with, I was able to tackle the next step: putting that data to work in a usable, accessible way.
Step 3: Showcasing Our Work
Building something is only part of the puzzle. It has to be usable day to day too. My approach was simple: mimic what people are already used to seeing and enhance it with the predictive features we’ve added.
To do that, I used AWS API Gateway to expose a Lambda function that fetches both the current and predicted status of stations. Then, using Node.js, HTML, and CSS, I displayed this data on a map powered by Apple MapKit JS. Both the frontend and the AWS backend are full of interesting details, but to keep things high-level, I’ll save the technical bits for another post. Let’s just say AWS security groups and permissions can be a nightmare if you don’t know how they work. And since I’m no expert yet, I, of course, made plenty of mistakes along the way.
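As a teaser, here is roughly the shape of a Lambda function sitting behind API Gateway with proxy integration, written in Python for illustration. The station id, the counts, and the response field names are all made up, and the wide-open CORS header is an assumption you would want to tighten in a real deployment.

```python
import json


def build_response(current, predicted):
    """Merge current and predicted e-bike counts into a proxy-style response."""
    stations = []
    for station_id, ebikes_now in current.items():
        stations.append({
            "station_id": station_id,
            "ebikes_now": ebikes_now,
            # None when no prediction exists for this station.
            "ebikes_predicted": predicted.get(station_id),
        })
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # Assumed permissive CORS so a browser frontend can call the API.
            "Access-Control-Allow-Origin": "*",
        },
        # Proxy integration expects the body as a string.
        "body": json.dumps({"stations": stations}),
    }


def lambda_handler(event, context):
    # Hypothetical hard-coded data; a real handler would load the latest
    # snapshot and run the model instead.
    current = {"s1": 3}
    predicted = {"s1": 1}
    return build_response(current, predicted)
```

The frontend then only needs one request to get everything the map has to draw.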
And then came one of the hardest things in software development: naming things (insert scared emoji here). I got some help from a wise old man (thanks, Vishal Agarwal), who thought onebike.app was easy enough to remember. He helped out in plenty of other ways too, but hey, that’s the literal job of a mentor, so no extra credit, sorry.
Finally, I hosted it on S3, routed the DNS with Route 53, and set up a CloudFront distribution in front of my static-hosting S3 bucket to enable HTTPS with an SSL certificate. And with that, I had a functional, secure, scalable, serverless, and live website.
Conclusion
Taking onebike.app live was a journey full of trial and error, learning curves, and unexpected challenges. It really put me outside my comfort zone and forced me to evolve. From figuring out how to collect data in real time to building a predictive model and actually launching a web app, every step taught me something new. And every step was extremely hard. More importantly, it showed me the impact of combining machine learning with real-world applications.
Anyway, here’s to many more e-bikes, fewer empty docks, and the fun of taking ideas online. Check out the web app at onebike.app and let me know what you think. I’d love your feedback!
Stay tuned for more incredible features that make onebike.app even smarter, more user-friendly, and more secure.
For now, it’s only useful to Beantown peeps, but I plan on adding more cities and taking this nationwide. Onebike.app is a testament that, even in an age dominated by LLMs, traditional machine learning has the power to transform everyday life.
Disclaimer: onebike.app is an independent, personal project and is not officially affiliated with, endorsed by, or associated with BlueBikes or any city transport authority. The predictions are based on real-time and historical data collected under the terms of the BlueBikes Data License Agreement and may not always reflect the current availability of bikes at each station. Please use onebike.app as an informational tool to help in planning your trip, but check the official BlueBikes app or website for the most up-to-date information. All predictions are estimates and should be used at your own discretion.