Data Science Workshops

Advanced Data Science Training I
Advanced Data Science Training II
Data Science Modules
Data Science Topics


Advanced Data Science Training I

An introduction to Data Science, data analysis, and data understanding, with topics selected from:

  1. Data Science Prerequisites
    Theory: 8 hours, Practice: 7 hours — universals of data science; basics of R/Python; data cleaning; data reduction; case studies
  2. Data Visualization
    Theory: 5 hours, Practice: 1 hour — simple graphical methods, representations of multi-dimensional data, design suggestions
  3. Introduction to Machine Learning
    Theory: 13 hours, Practice: 5 hours — association rules, decision trees, k-means, issues and challenges, naïve Bayes, hierarchical clustering
  4. Text Mining and Natural Language Processing
    Theory: 8 hours, Practice: 4 hours — basics, classification, clustering, sentiment analysis, named-entity recognition, summarization, topic modeling, etc.
  5. Focus on Supervised and Unsupervised Learning
    Theory: 6 hours, Practice: 3 hours — density-based clustering, spectral clustering, validation metrics, support vector machines, neural networks, etc.

The goal of this workshop is to provide an introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls.

The workshop combines traditional lectures with Jupyter notebook exercises.

Length: 20 x 3-hour sessions, weekly
Level: Halfway between a university course and the Applied Data Science and Visualization Training

Audience and Requirements: This workshop is for individuals who wish to understand the functionality and capabilities offered by different data science concepts and methods, even if they won’t be the ones implementing them. This workshop requires very little mathematical or computer programming knowledge. Some experience with quantitative ideas is assumed. Necessary concepts will be introduced in the course.

Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge; some issues have been encountered with Internet Explorer and the WebSockets driver). Exposure to programming (R, Python, SAS, etc.) is a plus.

Expectations: 60 hours of additional independent learning outside of class; suggested readings and exercises



Advanced Data Science Training II

A continuation of the program started in Advanced Data Science Training I, with topics selected from:

  1. Focus on Supervised and Unsupervised Learning
    Theory: 8 hours, Practice: 4 hours — expectation-maximization clustering, latent Dirichlet allocation, fuzzy clustering, cluster ensembles, logistic regression, rare occurrence mining, ensemble learning, gradient boosting
  2. Big Data Analysis
    Theory: 3 hours, Practice: 3 hours — simple graphical methods, representations of multi-dimensional data, design suggestions
  3. Recommender Engines
    Theory: 4 hours, Practice: 2 hours — collaborative filtering, content-based systems, knowledge-based systems, hybrids, evaluation
  4. Deep Learning
    Theory: 4 hours, Practice: 2 hours — deep feedforward networks, regularization, convolutional networks, recurrent networks, autoencoders, Boltzmann machines
  5. Bayesian Data Analysis
    Theory: 4 hours, Practice: 2 hours — statistics, inference, Bayesian methods and machine learning, belief networks, model selection
  6. Social Network Analysis
    Theory: 4 hours, Practice: 2 hours — network data, representations, structural and locational properties, roles and positions, dyads, triads, interactions
  7. Data Streams
    Theory: 4 hours, Practice: 2 hours — maintaining statistics, classification, clustering, ensemble methods, frequent itemsets
  8. Automated Data Collection
    Theory: 3 hours, Practice: 3 hours — web scraping for data analysis, web technologies, scraping toolbox, applications
  9. Putting it All Together
    Practice: 6 hours

The goal of this workshop is to provide an introduction to various concepts and algorithms used in A.I. and Data Science, as used in common programming environments. The application of these concepts will be illustrated through some examples ranging from simple to elaborate, along with discussions of common challenges and pitfalls.

Length: 20 x 3-hour sessions, weekly
Level: Halfway between a university course and the Applied Data Science and Visualization Training. The workshop combines traditional lectures with Jupyter notebook exercises.

Audience and Requirements: This workshop is for individuals who wish to understand the functionality and capabilities offered by different data science concepts and methods, even if they won’t be the ones implementing them. This workshop requires very little mathematical or computer programming knowledge. Some experience with quantitative ideas is assumed. Necessary concepts will be introduced in the course. Familiarity with the concepts in Advanced Data Science Training I is an asset.

Participants must provide a laptop with wi-fi connectivity and the ability to run R/Python notebooks from a web browser (typically Chrome, Firefox, or Edge; some issues have been encountered with Internet Explorer and the WebSockets driver). Exposure to programming (R, Python, SAS, etc.) is a plus.

Expectations: 60 hours of additional independent learning outside of class; suggested readings and exercises



Data Science Modules

Individual modules covered in the Advanced Data Science Training courses are also available, by request, as week-long 20-hour modules (Theory: 10 hours, Practice: 10 hours).


Data Science Topics

These one- to two-day topic-based workshops are available by request. Additional topics are also available by request.

Data Science Topics: A Crash Course in Data Science for Programmers and Statisticians
Data Science Topics: Classification and Clustering
Data Science Topics: An In-Depth Survey of Clustering Techniques
Data Science Topics: GeoSpatial Data
Data Science Topics: Categorical Data
Data Science Topics: Health Data
Data Science Topics: Data Visualization
Data Science Topics: Data Science Dashboards and Dynamic Reporting