A good friend of mine asked a question the other day that I’ve been asked before – and I have a rule that if I’m asked the same question by multiple people, I blog it. The question was this: “Should I learn statistics first and then focus on R or learn statistics along the way?”…

# Month: October 2017

## Python for the Data Scientist

In a previous notebook I introduced the R programming language and environment. While R is very powerful, widely used and has multiple packages, another language called “Python” is also popular with Data Scientists. Yes, you can do amazing things in R – in fact, part of R is written in R (think about that for a…

## Reducing (although sadly not eliminating) bias in sample gathering

To obtain the data a Data Scientist needs for an analysis, there are two options: you can get all the data (called a population or “X”) or a subset of the data (called a sample, or “x”). Most of the time the information you need to perform analysis is too large to…

## The Data Scientist’s Computer

Everyone uses a computer for lots of things, from e-mail to chat, from gaming to office work. And yet, there are some specific needs a Data Scientist has for their primary system. While I don’t recommend a specific brand or model (these things change too quickly to make this notebook entry useful for any length…

## Microsoft R

What, Why, How

One of the most distinctive features of Data Science, as opposed to working with databases, Business Intelligence or other data professions, is its heavy use of statistical methods. When computing science first emerged, programs and algorithms were created to deal with the large number of calculations required in statistics. One of…

## Descriptive Statistics – Initial Evaluation of the Data

The most important part of data analysis is a thorough understanding of the data we’re looking at. Once we’ve verified what the source of the data actually means and that we can trust it, we need to do some simple visualizations and calculations to see what it means. I find that using even basic descriptions is…

## Learning Data Science

Updated: 8 February 2018

It’s been over a year since I wrote the original of this article – and much has changed in the world of Data Science. I’ve decided to update the information from time to time, since it’s the most popular one I’ve done – there is clearly a lot of demand and need…

## Prepping for learning Data Science

In the September 2015 issue of the Communications of the ACM magazine, there is an article titled “Automated Education and the Professional” – highly recommended reading. It feeds nicely into our journey in learning Data Science. The article covers the conflict between traditional college-degree education and the newer competency-based learning systems. I prefer both –…