### Data and You: Navigating the Information Sea

Explore what data is, what people are doing with your data, and learn strategies for navigating the information sea.

In the Information Age, we are awash in a sea of data – it is all around us, and is used to make important decisions about our lives. But what exactly is data? What can you do with it? And what can other people do with your data? In this short course we aim to give participants the ability to appreciate data and its uses, through discussions, demonstrations, and exploration.

**Length**: 4 sessions x 2 hours/session

**Level**: General

**Cost:**

Public Course – $550 Per Person + HST

In-House Course – $4,600 Per Day + HST

**Outline**:

- Data privacy and ethics in an electronic world
- Data visualization and storytelling
- Data fundamentals and applications
- Artificial intelligence and the future of data

**Audience and Requirements**: This course is intended for a general audience. No technical knowledge or equipment is required.

### Introduction to Data Analysis

This course will introduce participants to basic data analysis concepts using hands-on demonstrations with familiar tools (e.g. Excel, Power BI).

When we do data analysis, we often miss key elements and approaches that could make the outputs richer and more valuable for consumers and end users. This course will introduce participants to basic data analysis concepts using hands-on demonstrations with familiar tools (e.g. Excel, Power BI). Possible uses and appropriate applications of these techniques will also be discussed. The course will conclude by showcasing some more advanced data analysis concepts to lay a foundation for future development.

**Length:** 1 day

**Level:** Novice

**Cost:**

Public Course – $550 Per Person + HST

In-House Course – $4,600 Per Day + HST

**Outline:** This course will help you to:

- Set a baseline:
  - Understand what underpins data analysis
  - Understand what data analysis is
  - Take the first steps in modeling your data
  - See what types of analytical models exist and which apply to your organization
- Get ready to start your analysis:
  - Set up the data analysis pipeline
  - Understand and communicate the importance of data validation
  - Understand the best practices for dealing with missing values
- Do the analysis:
  - Deal with both quantitative and qualitative analysis
  - Appreciate the importance of defining and developing measures and metrics
  - Understand and apply basic techniques for analysing categorical data
  - Understand and apply basic techniques for analysing numeric data
  - Draw appropriate conclusions from your analysis
- See the possibilities: a showcase of advanced analysis techniques
  - Time series analysis for forecasting
  - Decision tree classifiers
  - Anomaly detection
  - Unsupervised learning techniques

**Audience and Requirements:** Anybody who works with data and would like to increase their understanding of the types of analysis they can carry out on different data types. No technical background is required.

### Advanced Data Science Training I

An introduction to Data Science, data analysis, and data understanding.

An introduction to Data Science, data analysis, and data understanding, with topics selected from:

*Data Science Prerequisites*

Theory: 8 hours, Practice: 7 hours. Topics: universals of data science, basics of R/Python, data cleaning, data reduction, case studies.

*Data Visualization*

Theory: 5 hours, Practice: 1 hour. Topics: simple graphical methods, representations of multi-dimensional data, design suggestions.

*Introduction to Machine Learning*

Theory: 13 hours, Practice: 5 hours. Topics: association rules, decision trees, k-means, issues and challenges, naïve Bayes, hierarchical clustering.

*Text Mining and Natural Language Processing*

Theory: 8 hours, Practice: 4 hours. Topics: basics, classification, clustering, sentiment analysis, named-entity recognition, summarization, topic modeling, etc.

*Focus on Supervised and Unsupervised Learning*

Theory: 6 hours, Practice: 3 hours. Topics: density-based clustering, spectral clustering, validation metrics, support vector machines, neural networks, etc.

The goal of this workshop is to provide an introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls.

There will be a combination of traditional lectures with Jupyter notebook exercises.

**Length:** 20 x 3-hour sessions, weekly

**Level:** Halfway between a university course and the Applied Data Science and Visualization Training

**Audience and Requirements:** This workshop is for individuals who wish to understand the functionality and capabilities offered by different data science concepts and methods, even if they won’t be the ones implementing them. This workshop requires very little mathematical or computer programming knowledge. Some experience with quantitative ideas is assumed. Necessary concepts will be introduced in the course.

*Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge; some issues have been encountered with Internet Explorer and the WebSockets driver).* Exposure to programming (R/Python/SAS/etc.) is a plus.

**Expectations:** 60 hours of additional independent learning outside of class; suggested readings and exercises

### Advanced Data Science Training II

A continuation of the program started in Advanced Data Science Training I.

A continuation of the program started in Advanced Data Science Training I, with topics selected from:

*Focus on Supervised and Unsupervised Learning*

Theory: 8 hours, Practice: 4 hours. Topics: expectation-maximization clustering, latent Dirichlet allocation, fuzzy clustering, cluster ensembles, logistic regression, rare occurrence mining, ensemble learning, gradient boosting.

*Big Data Analysis*

Theory: 3 hours, Practice: 3 hours. Topics: simple graphical methods, representations of multi-dimensional data, design suggestions.

*Recommender Engines*

Theory: 4 hours, Practice: 2 hours. Topics: collaborative filtering, content-based systems, knowledge-based systems, hybrids, evaluation.

*Deep Learning*

Theory: 4 hours, Practice: 2 hours. Topics: deep feedforward networks, regularization, convolutional networks, recurrent networks, autoencoders, Boltzmann machines.

*Bayesian Data Analysis*

Theory: 4 hours, Practice: 2 hours. Topics: statistics, inference, Bayesian methods and machine learning, belief networks, model selection.

*Social Network Analysis*

Theory: 4 hours, Practice: 2 hours. Topics: network data, representations, structural and locational properties, roles and positions, dyads, triads, interactions.

*Data Streams*

Theory: 4 hours, Practice: 2 hours. Topics: maintaining statistics, classification, clustering, ensemble methods, frequent itemsets.

*Automated Data Collection*

Theory: 3 hours, Practice: 3 hours. Topics: web scraping for data analysis, web technologies, scraping toolbox, applications.

*Putting it All Together*

Practice: 6 hours

The goal of this workshop is to provide an introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls.

**Length:** 20 x 3-hour sessions, weekly

**Level:** Halfway between a university course and the Applied Data Science and Visualization Training. Combination of traditional lectures with Jupyter notebook exercises.

**Audience and Requirements:** This workshop is for individuals who wish to understand the functionality and capabilities offered by different data science concepts and methods, even if they won’t be the ones implementing them. This workshop requires very little mathematical or computer programming knowledge. Some experience with quantitative ideas is assumed. Necessary concepts will be introduced in the course. Familiarity with the concepts in Advanced Data Science Training I is an asset.

*Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge; some issues have been encountered with Internet Explorer and the WebSockets driver).* Exposure to programming (R/Python/SAS/etc.) is a plus.

**Expectations:** 60 hours of additional independent learning outside of class; suggested readings and exercises