Dom Fernandez   My Press Releases

Data Science: Raw data to understanding, insight, knowledge

Published on 6/1/2018

Data science allows you to turn raw data into understanding, insight, and knowledge. To do this you need a number of tools and a programming language (R with RStudio is referenced here). One cannot master data science by reading a single book. The tools needed in a typical data science project are pictured below:

Tools for data science

You must first import your data into R - you take data stored in a file, database, or web API, and load it into a data frame in R. If you can’t get your data into R, you can’t do data science on it!
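The import step can be sketched in a few lines. The article refers to R; this illustration uses Python's standard library instead, and the CSV content, column names, and the `load_rows` helper are all made up for the example. A list of dictionaries stands in for a data frame.

```python
import csv
import io

# Hypothetical CSV content standing in for a file on disk.
RAW_CSV = """city,year,sales
Toronto,2017,120
Toronto,2018,150
Ottawa,2018,90
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one dict per row."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        # CSV values arrive as strings; convert numeric columns explicitly.
        rows.append({
            "city": record["city"],
            "year": int(record["year"]),
            "sales": float(record["sales"]),
        })
    return rows


rows = load_rows(RAW_CSV)
print(len(rows), "rows loaded; first row:", rows[0])
```

To read a real file, the same reader can be given the object returned by `open(path, newline="")` in place of the `io.StringIO` wrapper.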

Once you’ve imported your data, it is a good idea to tidy it. Tidying your data means storing it in a consistent form that matches the semantics of the dataset with the way it is stored. In brief, when your data is tidy, each column is a variable, and each row is an observation. Tidy data is important because the consistent structure lets you focus your effort on questions about the data, not on fighting to get the data into the right form for different functions.
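As a small illustration of tidying, the sketch below reshapes a "wide" table, where the years are spread across column names, into a form with one observation per row. It is written in Python rather than R, and the data and the `tidy` helper are hypothetical; it assumes every column other than the identifier holds a year.

```python
# Hypothetical "wide" table: one column per year, so the variable "year"
# is hidden in the column names instead of having a column of its own.
wide = [
    {"city": "Toronto", "2017": 120, "2018": 150},
    {"city": "Ottawa", "2017": 80, "2018": 90},
]


def tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows so that each output row is one observation."""
    tidy_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            tidy_rows.append({
                id_column: row[id_column],
                # Assumption: the remaining column names are years.
                variable_name: int(column),
                value_name: value,
            })
    return tidy_rows


long_rows = tidy(wide, "city", "year", "sales")
for r in long_rows:
    print(r)
```

After the reshape, "city", "year", and "sales" are each a column, and each row describes the sales for one city in one year.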

Once you have tidy data, a common first step is to transform it. Transformation includes narrowing in on observations of interest (like all people in one city, or all data from the last year), creating new variables that are functions of existing variables (like computing speed from distance and time), and calculating a set of summary statistics (like counts or means). Together, tidying and transforming are called wrangling, because getting your data in a form that’s natural to work with often feels like a fight!
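The three kinds of transformation named above (filtering observations, deriving a new variable, and summarizing) can be sketched as follows. This is an illustration in standard-library Python, not R, and the trip records and column names are invented for the example.

```python
from statistics import mean

# Hypothetical tidy data: one trip per row.
trips = [
    {"city": "Toronto", "distance_km": 12.0, "time_h": 0.5},
    {"city": "Toronto", "distance_km": 30.0, "time_h": 1.0},
    {"city": "Ottawa", "distance_km": 8.0, "time_h": 0.25},
]

# 1. Narrow in on observations of interest: trips in one city.
toronto = [t for t in trips if t["city"] == "Toronto"]

# 2. Create a new variable as a function of existing ones.
with_speed = [
    dict(t, speed_kmh=t["distance_km"] / t["time_h"]) for t in toronto
]

# 3. Calculate summary statistics: a count and a mean.
summary = {
    "count": len(with_speed),
    "mean_speed_kmh": mean(t["speed_kmh"] for t in with_speed),
}
print(summary)
```

Each step returns a new list and leaves `trips` unchanged, which makes it easy to inspect the intermediate results while exploring.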

Once you have tidy data with the variables you need, there are two main engines of knowledge generation: visualization and modelling. These have complementary strengths and weaknesses so any real analysis will iterate between them many times.

Visualization is a fundamentally human activity. A good visualization will show you things that you did not expect, or raise new questions about the data. A good visualization might also hint that you’re asking the wrong question, or that you need to collect different data. Visualizations can surprise you, but they don’t scale particularly well because they require a human to interpret them.

Models are complementary tools to visualization. Once you have made your questions sufficiently precise, you can use a model to answer them. Models are a fundamentally mathematical or computational tool, so they generally scale well. Even when they don’t, it’s usually cheaper to buy more computers than it is to buy more brains! But every model makes assumptions, and by its very nature a model cannot question its own assumptions. That means a model cannot fundamentally surprise you.
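To make the point about assumptions concrete, here is a minimal sketch of one very simple model, a straight line fitted by ordinary least squares, written in standard-library Python rather than R. The sales figures and the `fit_line` helper are hypothetical. The model assumes the relationship is linear; it will happily extrapolate along that line whether or not the assumption holds.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly sales figures.
years = [2015, 2016, 2017, 2018]
sales = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_line(years, sales)

# The prediction is only as good as the linearity assumption behind it.
prediction_2019 = slope * 2019 + intercept
print("slope:", slope, "prediction for 2019:", prediction_2019)
```

Nothing in the fitted numbers reveals whether a straight line was the right shape to begin with, which is why a plot of the data alongside the fit remains useful.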

The last step of data science is communication, an absolutely critical part of any data analysis project. It doesn’t matter how well your models and visualizations have led you to understand the data unless you can also communicate your results to others.

Surrounding all these tools is programming. Programming is a crosscutting tool that you use in every part of the project. You don’t need to be an expert programmer to be a data scientist, but learning more about programming pays off because becoming a better programmer allows you to automate common tasks, and solve new problems with greater ease.

"Data Analytics for Business-minds" teaches business-minded people to analyze data. The training is intended for business people who may well be doing what data scientists or technical developers would otherwise do: analyzing mounds of data!

Together we can tackle your mounds of data and gain understanding, insight, and knowledge.

Contact > Dominic Fernandez

Defining, designing, and delivering solutions to help businesses achieve results, efficiently and effectively!
