**Contents:**

- Introduction to Data Science
- Advanced Data Science
- Selected Topics in A.I. and Data Science
- Data Science Modules
- Data Science Topics

On-site training for groups of 12-24 participants is available by request, as are customized workshops; contact info@data-action-lab.com for more information.

All catalogue courses are open to the general public; consult the website schedule for session dates.

Course content and pricing subject to change.

### Introduction to Data Science

An introduction to data science, data analysis, data understanding, and artificial intelligence.

**Topics:** (contents subject to change)

*Data Analysis Basics*

Concepts: 12 hours, Labs: 6 hours - universals of data science; basics of R/Python programming; data cleaning; data reduction; case studies

*Data Visualization*

Concepts: 9 hours, Labs: 9 hours - simple graphical methods, representations of multi-dimensional data, design suggestions, graphical displays with ggplot2, dashboards with Power BI

*Introduction to Machine Learning*

Concepts: 12 hours, Labs: 12 hours - association rules, decision trees, *k*-means, issues and challenges, naïve Bayes, hierarchical clustering

The goal of this workshop is to provide an introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls, and reinforced through the analysis of real-life data in 10 labs.

**Length:** 20 weekly sessions of 3 hours each

**Cost:** $2,500 per participant + HST (pricing subject to change)

**Audience and Requirements:** this training course is for individuals who wish to understand the functionality and capabilities offered by different data science concepts and methods, even if they won’t be the ones implementing them. This training course requires very little mathematical or computer programming knowledge. Some experience with quantitative ideas is assumed (an undergraduate degree in a quantitative discipline would be an asset). The necessary concepts will be introduced in the course.

*Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge).* Some issues have been encountered with Internet Explorer and the WebSockets driver. Exposure to programming (R/Python/SAS/etc.) is a plus.

**Expectations:** 30 hours of additional independent learning outside of class; suggested readings and exercises

### Advanced Data Science

A continuing foray into the major tasks of data science, data analysis, data understanding, and artificial intelligence.

**Topics:** (contents and order of presentation subject to change)

*Text Mining and Natural Language Processing*

Concepts: 12 hours, Labs: 12 hours - basics, classification, clustering, sentiment analysis, named-entity recognition, summarization, topic modeling, etc.

*Focus on Supervised and Unsupervised Learning*

Theory: 18 hours, Labs: 18 hours — density-based clustering, spectral clustering, validation metrics, support vector machines, neural networks, value estimation, expectation-maximization clustering, latent Dirichlet allocation, fuzzy clustering, cluster ensembles, logistic regression, rare occurrence mining, ensemble learning, gradient boosting

The goal of this workshop is to provide a deeper introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls, and reinforced through the analysis of real-life data in 10 labs.

**Length:** 20 weekly sessions of 3 hours each

**Cost:** $2,500 per participant + HST (pricing subject to change)

**Audience and Requirements:** this training course is for individuals who wish to further understand the functionality and capabilities offered by different data science concepts and methods. This training course requires some mathematical knowledge and some computer programming (R and/or Python) experience. Familiarity with the concepts introduced in *Introduction to Data Science* is assumed.

*Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge).* Some issues have been encountered with Internet Explorer and the WebSockets driver. Exposure to programming (R/Python/SAS/etc.) is a plus.

**Expectations:** 30 hours of additional independent learning outside of class; suggested readings and exercises

### Selected Topics in A.I. and Data Science

A continuing foray into data science, data analysis, data understanding, and artificial intelligence, with a special emphasis on less traditional topics.

**Topics:** (contents and order of presentation subject to change)

*Big Data Adventures*

Concepts: 3 hours, Labs: 3 hours — MapReduce, finding similar itemsets, clustering, large scale machine learning, setting up a Google Cloud platform, Spark

*Recommender Engines*

Concepts: 3 hours, Labs: 3 hours - collaborative filtering, content-based systems, knowledge-based systems, hybrids, evaluation

*Deep Learning*

Concepts: 3 hours, Labs: 3 hours - deep feedforward networks, regularization, convolutional networks, recurrent networks, autoencoders, Boltzmann machines

*Bayesian Data Analysis*

Concepts: 3 hours, Labs: 3 hours - statistics, inference, Bayesian methods and machine learning, belief networks, model selection

*Social Network Analysis*

Concepts: 3 hours, Labs: 3 hours - network data, representations, structural and locational properties, roles and positions, dyads, triads, interactions

*Data Streams*

Concepts: 3 hours, Labs: 3 hours - maintaining statistics, classification, clustering, ensemble methods, frequent itemsets

*Automated Data Collection*

Concepts: 3 hours, Labs: 3 hours - web scraping for data analysis, web technologies, scraping toolbox, applications

*Special Topic, selected from: Anomaly Detection, Multimedia Data Mining, Interactive Data Visualization, Matrix Decomposition, Applied Robotics, the Semantic Web, or a deeper look at some previously discussed topic*

Concepts: 3 hours, Labs: 3 hours

*Reporting and Deployment*

Concepts: 3 hours, Labs: 3 hours - R Markdown, Shiny, Flask, web apps

*Putting it All Together*

Labs: 6 hours

The goal of this workshop is to provide a deeper introduction to various concepts and algorithms used in artificial intelligence and data science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls, and reinforced through the analysis of real-life data in 10 labs.

**Length:** 20 weekly sessions of 3 hours each

**Cost:** $2,500 per participant + HST (pricing subject to change)

**Audience and Requirements:** this training course is for individuals who wish to further understand the functionality and capabilities offered by different data science and artificial intelligence concepts and methods. This training course requires some familiarity with quantitative ideas and some computer programming (R and/or Python) experience. Familiarity with the concepts introduced in *Introduction to Data Science* is assumed.

*Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge).* Some issues have been encountered with Internet Explorer and the WebSockets driver. Exposure to programming (R/Python/SAS/etc.) is a plus.

**Expectations:** 30 hours of additional independent learning outside of class; suggested readings and exercises

### Data Science Modules

Individual modules covered in Advanced Data Science Training Courses are also available, by request, as week-long 20-hour modules (Theory: 10 hours, Practice: 10 hours).

### Data Science Topics

These one- to two-day topic-based workshops are available by request, as are additional topics.

- Data Science Topics: A Crash Course in Data Science for Programmers and Statisticians
- Data Science Topics: Classification and Clustering
- Data Science Topics: An In-Depth Survey of Clustering Techniques
- Data Science Topics: GeoSpatial Data
- Data Science Topics: Categorical Data
- Data Science Topics: Health Data
- Data Science Topics: Data Visualization
- Data Science Topics: Data Science Dashboards and Dynamic Reporting