
The Data Daily

5 Machine Learning Projects You Should Not Overlook


It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out!

After a hiatus, the "Overlook..." posts are making their comeback this month, continuing the modest quest of bringing formidable, lesser-known machine learning projects to a few additional sets of eyes.

Check out the 5 projects below for some potential fresh machine learning ideas.

fastText works only on text data, so it will use just a single column from a dataset that might contain many feature columns of different types. As such, a common use case is to have the fastText classifier take one column as input and ignore the rest. This is especially true when fastText is used as one of several classifiers in a stacking classifier, with the other classifiers using non-textual features.

Understanding fastText is the important piece of the puzzle, but once you have that understanding, skift helps you easily implement fastText and integrate it with other scikit-learn functionality in general.
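To make the single-column idea concrete, here is a minimal pure-Python sketch. It is not skift's code and it does not call fastText: `ToyTextClassifier` is a made-up stand-in for a text-only model, and `FirstColumnTextClassifier` is a hypothetical wrapper that hands only the first column of each row to that model while ignoring the other features.

```python
from collections import Counter, defaultdict


class ToyTextClassifier:
    """Stand-in for a text-only model such as fastText (NOT fastText itself)."""

    def fit(self, texts, labels):
        # Count how often each word appears under each label.
        self.classes_ = sorted(set(labels))
        self.word_counts_ = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts_[label].update(text.lower().split())
        return self

    def predict(self, texts):
        # Pick the label whose training words overlap most with the text.
        preds = []
        for text in texts:
            words = text.lower().split()
            scores = {
                c: sum(self.word_counts_[c][w] for w in words)
                for c in self.classes_
            }
            preds.append(max(self.classes_, key=lambda c: scores[c]))
        return preds


class FirstColumnTextClassifier:
    """Hypothetical wrapper: feeds only column 0 of each row to a text model."""

    def __init__(self, text_model):
        self.text_model = text_model

    def fit(self, X, y):
        self.text_model.fit([row[0] for row in X], y)
        return self

    def predict(self, X):
        return self.text_model.predict([row[0] for row in X])


# Each row mixes a text column with non-text features (price, in_stock).
X_train = [
    ["great phone love it", 299.0, True],
    ["terrible battery awful screen", 199.0, False],
    ["love the camera great value", 399.0, True],
    ["awful support terrible experience", 99.0, True],
]
y_train = ["pos", "neg", "pos", "neg"]

clf = FirstColumnTextClassifier(ToyTextClassifier()).fit(X_train, y_train)
predictions = clf.predict(
    [["great camera", 10.0, False], ["awful battery", 500.0, True]]
)
print(predictions)
```

In a stacking setup, a second model would consume the remaining columns (price and stock status here), and a final estimator would combine the two sets of predictions.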

Tired of not having decent machine learning alternatives for PHP? Are you a masochist (if you're using PHP this answers itself)? Well then, this project just may be for you!

While I kid, I am far enough removed from the PHP world that I don't know whether this serves any particularly pressing requirement; the 5K+ stars would suggest that it likely does! Beyond that, I'm always interested in seeing how machine learning ecosystems unfold in different programming language environments. Perhaps you are too, or more importantly, you may actually have a use for what seems at first glance to be a solid library for the PHP people out there.

While this is not technically its own project, I find it important enough to highlight here.

Just as the underlying project was the most important thing to understand for skift (above), the important piece of this puzzle is understanding how to implement neural networks with Keras, itself a high-level API. What these wrappers accomplish is letting you integrate Keras with additional scikit-learn functionality while using the familiar API and methods. Find the API on the official Keras GitHub repository.

If you are already using Keras, there is a good chance this is not new to you. If you aren't, knowing that this integration is possible may be enough to have you take a look.
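For a rough feel of what such a wrapper does, here is a pure-Python sketch of the adapter pattern. It is not the Keras wrapper code and it uses neither Keras nor scikit-learn: `TinyPerceptron` stands in for a neural network, `build_model` is a hypothetical model-building function, and `BuildFnClassifier` is a made-up adapter that exposes the `fit`, `predict`, `score`, and `set_params` methods that scikit-learn-style tooling expects.

```python
class TinyPerceptron:
    """Toy model standing in for a neural network (NOT a Keras model)."""

    def __init__(self, lr, epochs):
        self.lr, self.epochs = lr, epochs
        self.w, self.b = None, 0.0

    def predict_one(self, xi):
        return 1 if sum(w * v for w, v in zip(self.w, xi)) + self.b > 0 else 0

    def fit(self, X, y):
        # Classic perceptron update rule, one pass over the data per epoch.
        self.w = [0.0] * len(X[0])
        for _ in range(self.epochs):
            for xi, yi in zip(X, y):
                err = yi - self.predict_one(xi)
                self.w = [w + self.lr * err * v for w, v in zip(self.w, xi)]
                self.b += self.lr * err


def build_model(lr=1.0, epochs=10):
    # Hypothetical build function: returns a fresh, untrained model.
    return TinyPerceptron(lr=lr, epochs=epochs)


class BuildFnClassifier:
    """Hypothetical adapter giving a built model an estimator-style API."""

    def __init__(self, build_fn, **params):
        self.build_fn = build_fn
        self.params = params

    def set_params(self, **params):
        self.params.update(params)
        return self

    def fit(self, X, y):
        # Rebuild the model from the current hyperparameters, then train it.
        self.model_ = self.build_fn(**self.params)
        self.model_.fit(X, y)
        return self

    def predict(self, X):
        return [self.model_.predict_one(xi) for xi in X]

    def score(self, X, y):
        # Plain accuracy.
        return sum(p == t for p, t in zip(self.predict(X), y)) / len(y)


# Logical AND as a tiny dataset.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]

clf = BuildFnClassifier(build_model, lr=1.0)
# A hand-rolled hyperparameter sweep, the kind of thing a grid search automates.
scores = {e: clf.set_params(epochs=e).fit(X, y).score(X, y) for e in (1, 10)}
print(scores)
```

Because the adapter rebuilds the model from its stored parameters on every `fit`, generic tooling can swap hyperparameters in and out without knowing anything about the model underneath, which is the convenience the real wrappers provide.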

Gradient boosting continues to be all the rage. Or some of the rage, at the very least. A recent entrant into the gradient boosted trees arena is CatBoost.

CatBoost is available in Python, R, and command-line interface flavors. Check out tutorials here, and much more in its full documentation here.

...and more. You can check out the getting started guide here and the API quick start guide here.
