Top Machine Learning Project Ideas to Try Out
As data collection has grown, the ability to interpret that data has become increasingly important. Enter Machine Learning (ML), which combines statistics with computing power to find trends in data and make predictions from them. It is no wonder that it is one of the most widely used techniques in artificial intelligence.
Why Is Building ML Projects Important?
With so much data and so many factors to consider, those who have worked on Machine Learning projects say it is often more an art than a science. There are many different approaches to take, and often you can’t anticipate exactly what a model will do. Since you only know what the model will output once it has run, you have to see what worked and what didn’t and adjust your approach accordingly.
When building projects, you can take different approaches to the features you’ll create (if any), how you will preprocess (clean and organize) the data, and the Machine Learning model or models you will apply.
You also have options to consider, such as using pre-trained models created by others trying to solve problems similar to yours. Experience will familiarize you with how to best make those choices. Some popular models are:
- Naive Bayes
- Decision Trees
- K-Nearest Neighbors (kNN)
- Linear Regression
- Logistic Regression
- Ensemble Methods (using multiple models): Bagging (Bootstrap Aggregating), Random Forest, etc.
- Boosting (ensemble meta-algorithms): AdaBoost, XGBoost, CatBoost, etc.
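One way to build that experience is to run several of the models listed above on the same data and compare their cross-validated scores. The sketch below assumes scikit-learn is installed and uses a synthetic dataset from make_classification as a stand-in for real data. The function name compare_models and the model settings are illustrative choices, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, cv=5):
    """Return {model name: mean cross-validated accuracy}."""
    models = {
        "Naive Bayes": GaussianNB(),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
        "kNN": KNeighborsClassifier(n_neighbors=5),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
        "AdaBoost": AdaBoostClassifier(random_state=0),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv).mean()
        for name, model in models.items()
    }


# Synthetic stand-in data (an assumption; swap in your own features and labels).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
scores = compare_models(X, y)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The ranking you see on synthetic data says little about your own problem; the point is the habit of trying several models under the same evaluation before committing to one.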
Kaggle: A Good Place to Start
Kaggle, an online data science community, hosts ML competitions for various levels, including beginners. One such example is the Titanic ML competition, which involves creating a model to predict which passengers survived the Titanic shipwreck.
The Titanic competition is aimed specifically at beginners who want to gain familiarity with machine learning techniques. Participating in competitions like this one introduces you to ML projects without the work of collecting and assembling the data yourself. You can also see what other competitors in the Kaggle community have used to train their models.
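A first Titanic attempt can be quite small. The sketch below assumes pandas and scikit-learn are installed and that the competition's train.csv uses its usual column names (Pclass, Sex, Age, Fare, Survived). The Age column has gaps in that file, so the sketch fills them with the median. The helper names prepare_features and train_baseline are hypothetical, and logistic regression is just one reasonable baseline.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def prepare_features(df):
    """Select a few columns and fill gaps so a model can consume them."""
    features = df[["Pclass", "Sex", "Age", "Fare"]].copy()
    features["Sex"] = (features["Sex"] == "female").astype(int)  # encode as 0/1
    features["Age"] = features["Age"].fillna(features["Age"].median())
    features["Fare"] = features["Fare"].fillna(features["Fare"].median())
    return features


def train_baseline(df, seed=0):
    """Fit logistic regression and return (model, held-out accuracy)."""
    X = prepare_features(df)
    y = df["Survived"]
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return model, model.score(X_val, y_val)


# Usage, assuming the competition's train.csv sits in the working directory:
# model, accuracy = train_baseline(pd.read_csv("train.csv"))
# print(f"validation accuracy: {accuracy:.3f}")
```

Holding out a quarter of the rows gives you a rough accuracy estimate before you submit anything, which makes it easier to tell whether a new feature actually helped.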
Five Project Ideas for Machine Learning
What can I build to practice my machine learning skills? Good question. Here are five project ideas to help you build your knowledge of machine learning:
Idea #1: Particle Distinguishing
Ever wonder how researchers work out the structure of molecules? One popular way is cryo-electron microscopy (cryo-EM). In this process, researchers flash-freeze proteins and other biomolecules, bombard them with electrons to capture microscope images, and then reconstruct 3D structures from those images. The data bank for structures solved using this technique, the Electron Microscopy Data Bank (EMDB), has thousands of entries for you to choose from.
One of the most time-consuming aspects of solving a structure is that the particles (sometimes thousands) in these images have to be selected manually, which takes a lot of concentrated effort. Using data from the EMDB, however, you can train a neural network to identify the relevant particles.
Idea #2: Language Recognition
There are many translators for a variety of languages, but imagine not knowing which one to use because you do not recognize the language being spoken. Getting familiar with language recognition methods is always useful.
One way to do this is to look at voiceprints, or spectrograms. A Convolutional Recurrent Neural Network (CRNN) can be trained to identify words or even speech patterns unique to a language from the spectrogram of a spoken dialog. Speech recording data sets exist for many languages (Persian, Finnish, and Spanish, for example), but for some other languages you may not find ready-made spectrograms.
This Kaggle post describes how to use Python, specifically the Matplotlib and Librosa packages, to convert audio to spectrograms.
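To see what such a conversion involves, here is a minimal sketch that computes a spectrogram with a short-time Fourier transform. It deliberately uses only NumPy rather than Librosa, so it is not the method from the post; the function name, frame size, and hop length are assumptions chosen for illustration, and the input is a synthetic tone standing in for a real speech recording.

```python
import numpy as np


def spectrogram(signal, sample_rate, frame_size=1024, hop=256):
    """Return (freqs_hz, times_s, power_db) for a mono 1-D signal."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_size] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    power_db = 10 * np.log10(power + 1e-10)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(frame_size, d=1 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_size / 2) / sample_rate
    return freqs, times, power_db.T  # rows = frequency, columns = time


# Synthetic 440 Hz tone (an assumption standing in for a speech recording).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate  # one second of audio
tone = np.sin(2 * np.pi * 440 * t)
freqs, times, spec = spectrogram(tone, sample_rate)
print(spec.shape)
```

The returned array can be drawn as an image, for example with Matplotlib's pcolormesh(times, freqs, spec), and images like that are what a CRNN would be trained on.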
Idea #3: Stock Prediction
The stock market is a tumultuous storm of buying and selling and can be too imposing to dive into. But you could train a model to predict future stock prices and help demystify which stocks might be good to invest in.
In this project, you can use Reinforcement Learning to train your model. Use the Python Keras package to build the neural network layers. The Yahoo Finance website allows you to select a stock’s history and download it as a .csv file. To train your model, use several years’ worth of data for a particular stock, like this AMZN stock data.
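Whichever learning approach you use, the downloaded price history first has to be turned into model inputs. The sketch below shows one common preparation step, not the reinforcement learning itself: slicing a price series into fixed-length windows, each paired with the next value, and splitting them chronologically. The function names, the 30-day window, the file name AMZN.csv, and the "Close" column are assumptions for illustration.

```python
import numpy as np


def make_windows(prices, window=30):
    """Split a 1-D price series into (past window, next value) pairs."""
    prices = np.asarray(prices, dtype=float)
    if len(prices) <= window:
        raise ValueError("need more prices than the window length")
    X = np.stack([prices[i : i + window] for i in range(len(prices) - window)])
    y = prices[window:]
    return X, y


def chronological_split(X, y, train_fraction=0.8):
    """Keep the earliest rows for training so the model never sees the future."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


# Usage, assuming a Yahoo Finance export named AMZN.csv with a "Close" column:
# import pandas as pd
# closes = pd.read_csv("AMZN.csv")["Close"].to_numpy()
# X, y = make_windows(closes, window=30)
# (X_train, y_train), (X_test, y_test) = chronological_split(X, y)
```

Splitting by time rather than at random matters here: a random split would let the model train on days that come after the ones it is tested on.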
Idea #4: Image Restoration
Although black-and-white filters make anyone look great in a photograph, colorizing a photo can really bring it to life. Deep learning, a subfield of machine learning, is the primary tool for building image-processing models.
Neural networks can be trained to fix “errors” in photos such as faded or partially missing elements. Then, a generative adversarial network (GAN) can be used to train your model to colorize. This project will familiarize you with generator and discriminator models. You can choose photos from collections such as TensorFlow Datasets.
Idea #5: Sketchify an Image
Have you ever used an app that converts your photos to art and wondered how it works? Wonder no longer. Computer vision is the key. It can be used to process images and perform various transformations on them.
Your project will take an image input from the user and convert it into a pencil sketch. For this project, install NumPy and the Open Source Computer Vision Library (OpenCV) package for Python (imported as cv2). OpenCV is a computer vision library that also includes machine learning tools. You can use it to automate grayscaling, edge detection, and the production of an overall “sketch” effect. You can choose images from collections such as TensorFlow Datasets or capture an image from a webcam.
If you want to take this project idea a bit further, try using a neural network such as a convolutional neural network (CNN) to process sketches and predict what image they represent.
Conclusion
If you came to this article, you are probably already familiar with algorithms such as decision trees and with programming languages like R and Python. In machine learning, you can implement any number of algorithms with Python to suit your project’s needs, and experience using them is important. For example, for projects with a lot of data points and few features, one popular opinion is that Naive Bayes is a good algorithm to try first.
There’s also much more to do than implementing algorithms: preprocessing the data and creating features, for example. You also have the option to use pre-trained models and add features.
All in all, after you’ve mastered the basics, it’s projects that will give you a sense of what can work.