Demystifying decision forests

By Michele Usuelli, Lead Data Scientist

This article doesn’t require a data science background, only a basic understanding of predictive analytics. Beyond that, all the concepts are explained from scratch, including a popular algorithm called the “decision forest”. Throughout the article you won’t see any fancy or advanced machine learning algorithms, but by the…

What is the role of a data scientist?

By Michele Usuelli, Lead Data Scientist

Data science has been around for decades, but it has recently grown in popularity among companies. Although the tools and techniques already existed, some things have changed. Digital technologies generate more data, which can drive new advanced analytics use cases. There are also more success stories showcasing the value in data, making…

Scaling up Scikit-Learn's Random Projection using Apache Spark

By Sashi Dareddy, Lead Data Scientist

What is Random Projection (RP)? Random Projection is a mathematical technique for reducing the dimensionality of a problem, much like Singular Value Decomposition (SVD) or Principal Component Analysis (PCA), but simpler and computationally faster. [Throughout this article, I will use Random Projection and Sparse Random Projection interchangeably.] It…