Data Science

Contents:

  • Introduction to Data Science
  • Advanced Data Science
  • Selected Topics in A.I. and Data Science
  • Data Science Modules
  • Data Science Topics

On-site training for groups of 12-24 participants is available by request, as are customized workshops; contact info@data-action-lab.com for more information.

All catalogue courses are open to the general public; consult the website schedule for session dates.

Course content and pricing subject to change.


Introduction to Data Science

An introduction to data science, data analysis, data understanding, and artificial intelligence.

Details

Topics: (contents subject to change)

  1. Data Analysis Basics
    Concepts: 12 hours, Labs: 6 hours — universals of data science; basics of R/Python programming; data cleaning; data reduction; case studies
  2. Data Visualization
    Concepts: 9 hours, Labs: 9 hours — simple graphical methods, representations of multi-dimensional data, design suggestions, graphical displays with ggplot2, dashboards with Power BI
  3. Introduction to Machine Learning
    Concepts: 12 hours, Labs: 12 hours — association rules, decision trees, k-means, issues and challenges, naïve Bayes, hierarchical clustering

The goal of this workshop is to provide an introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls, and reinforced through the analysis of real-life data in 10 labs.

Length: 20 x 3-hour sessions, weekly

Cost: $2,500 per participant + HST (pricing subject to change)

Audience and Requirements: this training course is for individuals who wish to understand the functionality and capabilities offered by different data science concepts and methods, even if they won’t be the ones implementing them. This training course requires very little mathematical or computer programming knowledge. Some experience with quantitative ideas is assumed (an undergraduate degree in a quantitative discipline would be an asset). The necessary concepts will be introduced in the course.

Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge; some issues have been encountered with Internet Explorer and the WebSockets driver). Exposure to programming (R, Python, SAS, etc.) is a plus.

Expectations: 30 hours of additional independent learning outside of class; suggested readings and exercises


Advanced Data Science

A continuing foray into the major tasks of data science, data analysis, data understanding, and artificial intelligence.

Details

Topics: (contents and order of presentation subject to change)

  1. Text Mining and Natural Language Processing
    Concepts: 12 hours, Labs: 12 hours — basics, classification, clustering, sentiment analysis, named-entity recognition, summarization, topic modeling, etc.
  2. Focus on Supervised and Unsupervised Learning
    Concepts: 18 hours, Labs: 18 hours — density-based clustering, spectral clustering, validation metrics, support vector machines, neural networks, value estimation, expectation-maximization clustering, latent Dirichlet allocation, fuzzy clustering, cluster ensembles, logistic regression, rare occurrence mining, ensemble learning, gradient boosting

The goal of this workshop is to provide a deeper introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls, and reinforced through the analysis of real-life data in 10 labs.

Length: 20 x 3-hour sessions, weekly

Cost: $2,500 per participant + HST (pricing subject to change)

Audience and Requirements: this training course is for individuals who wish to further understand the functionality and capabilities offered by different data science concepts and methods. This training course requires some mathematical knowledge and some computer programming (R and/or Python) experience. Familiarity with the concepts introduced in Introduction to Data Science is assumed.

Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge; some issues have been encountered with Internet Explorer and the WebSockets driver). Exposure to programming (R, Python, SAS, etc.) is a plus.

Expectations: 30 hours of additional independent learning outside of class; suggested readings and exercises


Selected Topics in A.I. and Data Science

A continuing foray into data science, data analysis, data understanding, and artificial intelligence, with a special emphasis on less traditional topics.

Details

Topics: (contents and order of presentation subject to change)

  1. Big Data Adventures
    Concepts: 3 hours, Labs: 3 hours — MapReduce, finding similar itemsets, clustering, large scale machine learning, setting up a Google Cloud platform, Spark
  2. Recommender Engines
    Concepts: 3 hours, Labs: 3 hours — collaborative filtering, content-based systems, knowledge-based systems, hybrids, evaluation
  3. Deep Learning
    Concepts: 3 hours, Labs: 3 hours — deep feedforward networks, regularization, convolutional networks, recurrent networks, autoencoders, Boltzmann machines
  4. Bayesian Data Analysis
    Concepts: 3 hours, Labs: 3 hours — statistics, inference, Bayesian methods and machine learning, belief networks, model selection
  5. Social Network Analysis
    Concepts: 3 hours, Labs: 3 hours — network data, representations, structural and locational properties, roles and positions, dyads, triads, interactions
  6. Data Streams
    Concepts: 3 hours, Labs: 3 hours — maintaining statistics, classification, clustering, ensemble methods, frequent itemsets
  7. Automated Data Collection
    Concepts: 3 hours, Labs: 3 hours — web scraping for data analysis, web technologies, scraping toolbox, applications
  8. Special Topic, selected from: Anomaly Detection, Multimedia Data Mining, Interactive Data Visualization, Matrix Decomposition, Applied Robotics, the Semantic Web, or a deeper look at some previously discussed topic
    Concepts: 3 hours, Labs: 3 hours
  9. Reporting and Deployment
    Concepts: 3 hours, Labs: 3 hours — R Markdown, Shiny, Flask, web apps
  10. Putting it All Together
    Labs: 6 hours

The goal of this workshop is to provide a deeper introduction to various concepts and algorithms used in artificial intelligence and data science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls, and reinforced through the analysis of real-life data in 10 labs.

Length: 20 x 3-hour sessions, weekly

Cost: $2,500 per participant + HST (pricing subject to change)

Audience and Requirements: this training course is for individuals who wish to further understand the functionality and capabilities offered by different data science and artificial intelligence concepts and methods. This training course requires some familiarity with quantitative ideas and some computer programming (R and/or Python) experience. Familiarity with the concepts introduced in Introduction to Data Science is assumed.

Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge; some issues have been encountered with Internet Explorer and the WebSockets driver). Exposure to programming (R, Python, SAS, etc.) is a plus.

Expectations: 30 hours of additional independent learning outside of class; suggested readings and exercises


Data Science Modules

Individual modules covered in Advanced Data Science Training Courses are also available, by request, as week-long 20-hour modules (Theory: 10 hours, Practice: 10 hours).


Data Science Topics

These one- to two-day topic-based workshops are available by request. Additional topics are also available by request.

Data Science Topics: A Crash Course in Data Science for Programmers and Statisticians
Data Science Topics: Classification and Clustering
Data Science Topics: An In-Depth Survey of Clustering Techniques
Data Science Topics: GeoSpatial Data
Data Science Topics: Categorical Data
Data Science Topics: Health Data
Data Science Topics: Data Visualization
Data Science Topics: Data Science Dashboards and Dynamic Reporting