Below is a quick overview of samples that demonstrate how to use the machine learning capabilities in Spark on IBM Bluemix.
Flight Delay Predictions
David Taieb posted the slides of his hands-on session on how to predict flight delays based on historical data and weather predictions. The sample uses the machine learning algorithms Logistic Regression, Random Forest, Decision Tree and Naive Bayes.
When playing rock-paper-scissors, everyone has their own strategies, e.g. always throwing rock. The sample recognizes these patterns and uses them together with the history data to predict moves (via the FPGrowth algorithm).
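To give a feel for the idea, here is a minimal sketch in plain Python. It is not the sample's code and it does not implement FPGrowth: it simply counts which move most often follows the opponent's last two moves, a much simpler stand-in for frequent pattern mining. The move history and the function names are made up for illustration.

```python
from collections import Counter, defaultdict

# Maps a move to the move that beats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def learn_patterns(history, context_len=2):
    """Count which move follows each run of `context_len` moves."""
    patterns = defaultdict(Counter)
    for i in range(context_len, len(history)):
        context = tuple(history[i - context_len:i])
        patterns[context][history[i]] += 1
    return patterns


def predict_next(history, patterns, context_len=2):
    """Predict the opponent's next move from the most recent moves."""
    context = tuple(history[-context_len:])
    followers = patterns.get(context)
    if not followers:
        # Unseen context: fall back to the opponent's overall favourite move.
        return Counter(history).most_common(1)[0][0]
    return followers.most_common(1)[0][0]


# Hypothetical history: this opponent tends to play paper after two rocks.
history = ["rock", "rock", "paper", "rock", "rock", "paper",
           "rock", "rock", "scissors", "rock", "rock"]

patterns = learn_patterns(history)
predicted = predict_next(history, patterns)
counter_move = BEATS[predicted]
print("opponent will probably play", predicted, "so throw", counter_move)
```

A real frequent pattern miner such as FPGrowth finds such recurring item sets without enumerating every context explicitly, which matters once the history gets large.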
Titanic Survival Predictions
Manisha Sule describes in her article how to predict whether a given passenger would have survived the Titanic disaster, based on a decision tree algorithm.
Online Advertising Click Through Rate Predictions
In another sample Manisha explains how to predict click-through rates, an important metric for evaluating online ad performance, via Logistic Regression.
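As a rough illustration of what Logistic Regression does here, the following plain Python sketch fits a tiny model with batch gradient descent and outputs a click probability per impression. It is not Manisha's code and does not use Spark. The two features (`is_top_banner`, `is_mobile`) and the eight impressions are invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=500):
    """Fit weights and bias with batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_ctr(weights, bias, x):
    """Predicted probability that an impression with features x is clicked."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical impressions: [is_top_banner, is_mobile] and clicked (1) or not (0).
impressions = [[1, 0], [1, 0], [1, 1], [1, 1], [0, 1], [0, 1], [0, 0], [0, 0]]
clicked = [1, 1, 1, 0, 0, 0, 0, 1]

weights, bias = train_logistic_regression(impressions, clicked)
ctr_top_desktop = predict_ctr(weights, bias, [1, 0])
ctr_side_mobile = predict_ctr(weights, bias, [0, 1])
print("top banner on desktop:", round(ctr_top_desktop, 2))
print("side ad on mobile:", round(ctr_side_mobile, 2))
```

The model output is a probability between 0 and 1, which is why Logistic Regression is a natural fit for click-through rates.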
Drop off Locations of Taxis
On developerWorks there is a tutorial describing how to determine the top drop off locations for New York City taxis using a popular algorithm known as KMeans.
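The clustering idea can be sketched in a few lines of plain Python. This is not the tutorial's code and does not use Spark: it is a basic Lloyd's k-means loop with a deterministic farthest-first initialization (an assumption made here to keep the example reproducible), and the drop-off coordinates are invented.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two (lat, lon) pairs."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def farthest_first(points, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iterations=10):
    """Return (center, cluster_size) pairs, largest cluster first."""
    centers = farthest_first(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    sizes = [len(c) for c in clusters]
    return sorted(zip(centers, sizes), key=lambda pair: -pair[1])


# Hypothetical drop-off coordinates (latitude, longitude) in three groups.
drop_offs = [
    (40.750, -73.990), (40.751, -73.991), (40.749, -73.989), (40.752, -73.992),
    (40.641, -73.778), (40.642, -73.779), (40.640, -73.777),
    (40.773, -73.872), (40.774, -73.873),
]

ranked = kmeans(drop_offs, k=3)
for center, size in ranked:
    print(size, "drop-offs near", center)
```

The cluster centers ranked by size are the "top drop-off locations". On real trip data the number of clusters k has to be chosen, and results depend on the initialization.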
Last week I posted an article describing how to run the movie recommendations sample that comes with Spark on Bluemix. To predict ratings it uses the Collaborative Filtering technique.
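To show the intuition behind Collaborative Filtering, here is a small plain Python sketch. Note the assumption: it uses a simple neighborhood-based variant (similarity-weighted average of other users' ratings), not the alternating least squares (ALS) matrix factorization that Spark's MLlib provides. The users, movies and ratings are invented.

```python
import math

# Hypothetical ratings on a 1 to 5 scale.
ratings = {
    "alice": {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "bob": {"Alien": 4, "Brazil": 5, "Dune": 5},
    "carol": {"Alien": 1, "Casablanca": 5, "Dune": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity of two users over the movies both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, movie, ratings):
    """Similarity-weighted average of other users' ratings for `movie`.
    Returns None if nobody comparable has rated it."""
    weighted, total = 0.0, 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        sim = cosine_similarity(ratings[user], theirs)
        weighted += sim * theirs[movie]
        total += sim
    return weighted / total if total else None


predicted = predict_rating("alice", "Dune", ratings)
print("predicted rating of alice for Dune:", round(predicted, 2))
```

Alice's tastes are closer to Bob's than to Carol's, so the prediction leans toward Bob's rating. ALS reaches the same goal differently: it learns latent factors for users and movies and scales to much larger rating matrices.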