Data Analysis Workshops

Introduction to Data Analysis
Applied Data Science and Visualization (with Power BI)
Best Practices in Creating Data Visualizations and Dashboards
Doing Data Science: Getting Started
Learning by Doing: a Data Science Project from A to Z
Getting Ready for Data Analysis: Cleaning and Shaping Data


Introduction to Data Analysis

This course will introduce participants to basic data analysis concepts using hands-on demonstrations with familiar tools (e.g. Excel, Power BI).

Details


When we do data analysis, we often miss key elements and approaches that could make the outputs richer and more valuable for consumers and end users. This course will introduce participants to basic data analysis concepts using hands-on demonstrations with familiar tools (e.g. Excel, Power BI). Possible uses and appropriate applications of these techniques will also be discussed. The course will conclude by showcasing some more advanced data analysis concepts to lay a foundation for future development.

Length: 1 day
Level: Novice

Outline: this course will help you to:

  • Set a baseline:
    • Understand what underpins data analysis
    • Understand what data analysis is
    • Take the first steps in modeling your data
    • See what types of analytical models exist and which apply to your
      organization
  • Get ready to start your analysis:
    • Set up the data analysis pipeline
    • Understand and communicate the importance of data validation
    • Understand the best practices for dealing with missing values
  • Do the analysis:
    • Handle both quantitative and qualitative analysis
    • Appreciate the importance of defining and developing measures and metrics
    • Understand and apply basic techniques for analysing categorical data
    • Understand and apply basic techniques for analysing numeric data
    • Draw appropriate conclusions from your analysis
  • See the possibilities: a showcase of advanced analysis techniques
    • Time series analysis for forecasting
    • Decision tree classifiers
    • Anomaly detection
    • Unsupervised learning techniques

Audience and Requirements: anybody who works with data and would like to increase their understanding of the types of analysis they can carry out on different data types. No technical background is required.


Applied Data Science and Visualization (with Power BI)

This course focuses on understanding, using, and presenting Data Science, Machine Learning, and Artificial Intelligence findings, and on building familiarity with fundamental DS/ML/AI tasks and concepts through the visualization and interpretation of real-world datasets and analyses.

Details


This course focuses on understanding, using, and presenting Data Science, Machine Learning, and Artificial Intelligence findings, and on building familiarity with fundamental DS/ML/AI tasks and concepts through the visualization and interpretation of real-world datasets and analyses. A secondary goal is to improve DS/ML/AI/Data Visualization literacy among participants. Participants do not necessarily need to work with data in their everyday activities, but may be called upon to interpret the results of analyses produced by employees, colleagues, consultants, etc.

Length: 10 sessions x 4 hours/session
Level: Intermediate

Outline:

  1. A Data Visualization Primer
  2. Introduction to Power BI/R/DAX
  3. Data Collection
  4. Data Shaping
  5. Data Cleaning
  6. Classification
  7. Clustering
  8. Text Mining and Sentiment Analysis
  9. Special Topics 1
  10. Special Topics 2

Special Topics selected from: Recommender Systems, Social Network Analysis, Image Mining, Time Series Analysis, Association Rules Mining, etc.
Data science tasks (blocks of 3 classes) include: background, case studies/applications, exploration, analysis, visualization, issues and challenges, etc.

Audience and Requirements: this workshop is intended for tech/analysis-savvy domain experts, managers, and/or policy analysts (participants who know how to program should only take this course IF they are also taking the Advanced Training course). The skill pre-requisite is comfort with Excel formulas (familiarity with basic quantitative methods is a plus).


Best Practices in Creating Data Visualizations and Dashboards

This course takes participants through the basics of data visualization and design, whether they are creating interactive reports in Power BI, generating charts in Excel, or building management presentations in PowerPoint.

Details
Poorly designed visualizations (graphs, reports, charts, slides, etc.) can lead to confusion and, in the worst case, to erroneous business decisions. End users are constantly seeking the best ways to understand the data behind the data, and the most effective way to help them is to make it visual. This course takes participants through the basics of data visualization and design, whether they are creating interactive reports in Power BI, generating charts in Excel, or building management presentations in PowerPoint. This course will help you to:

  • Effectively engage with the end users to properly define context
  • Understand the importance of narrative and storyboarding as part of the design process
  • Understand which design elements engage iconic, short-term, and long-term memory in the end user, increasing engagement
  • Match visualizations to data, including best practices and implementation hacks (Excel and Power BI) for:
    • Interactive text
    • Data tables
    • Data table heatmaps
    • Scatterplots and bubble plots
    • Line charts
    • Bar Charts (Vertical & Horizontal)
    • Stacked Bar Charts (Vertical & Horizontal)
    • 100% Bar Charts (Vertical & Horizontal)
    • Area Charts
    • Waterfall Charts
    • Treemaps
    • Funnel Charts
    • Key Performance Indicator Gauges
    • Data Geographical Maps and Choropleth Maps
    • Charts to avoid
  • Fully understand the basic rules of Design and Layout including:
    • Gestalt Principles
    • Preattentive Attributes
    • Decluttering your charts, dashboards and reports
    • Size and positioning
    • Basic colour rules and introduction to colour wheel calculations
  • Power BI Themes and Templates
    • Creation of .json theme files
    • Use of .pbit files

Length: One day
Level: Novice – Intermediate

Audience and Requirements: Anybody who creates graphs, charts, dashboards and presentations in Power BI, Excel, PowerPoint or any other Business Intelligence software tool.


Doing Data Science: Getting Started

Immerse yourself in the world of data science. This workshop will take you through the basics of data science concepts, algorithms and techniques.

Details


Length: 2 days
Level: Intermediate

Audience and Requirements: This workshop is a technical workshop. You must have access to a laptop on which open source software can be installed. You must have some familiarity with data concepts, some programming or scripting ability and some familiarity with statistical concepts. If you do not meet these pre-requisites, it is recommended that you start with the general and then the statistical and computer programming primer workshops, prior to taking this workshop. If you are already quite familiar with programming and/or have used or tried out data science techniques, this workshop will be too basic for you. It is recommended that you take an advanced technical workshop instead.


Learning by Doing: a Data Science Project from A to Z

In this very hands-on workshop we will walk you through a curated data science project, from start to finish, exposing you to some of the challenges and techniques involved in each step and component.

Details


As usual, the best way to understand is to learn by doing. In this very hands-on workshop we will walk you through a curated data science project, from start to finish, exposing you to some of the challenges and techniques involved in each step and component. At the end you will not be a ‘master data scientist’, but you will have a much better understanding of what is involved in a data science project, from start to finish.

Length: 2 days
Level: Intermediate

Outline: In this workshop we will:

  • Get access to a public data set
  • Store the data in a particular location
  • Access the stored data
  • Clean, evaluate, and work with the data
  • Choose one of several out-of-the-box analysis techniques
  • Apply the chosen technique
  • Visualize the result

Audience and Requirements: This workshop is a technical workshop. We will be using basic, publicly available (open source) tools during the workshop. Registrants will receive instructions on what to install on their laptops prior to the workshop. You must have access to a laptop on which open source software can be installed. You must have some familiarity with data concepts, some programming or scripting ability and some familiarity with statistical concepts. If you do not meet these pre-requisites, it is recommended that you start with the general and then the statistical and computer programming primer workshops, prior to taking this workshop.


Getting Ready for Data Analysis: Cleaning and Shaping Data

In this workshop we will talk about what it means to shape datasets, review different types of dataset shapes (e.g. long vs wide) and work through hands-on examples of applying a variety of techniques for slicing, dicing and reshaping data to some sample datasets.

Details


How you model, clean and shape your data is critical to your success, and can require the most effort as well. It has been estimated that 80% of the work of a data science project lies in preparing your data for analysis. In this workshop we will talk about what it means to shape datasets, review different types of dataset shapes (e.g. long vs wide) and work through hands-on examples of applying a variety of techniques for slicing, dicing and reshaping data to some sample datasets. In the process, we will also discuss how to develop concrete, explicit data models that are compatible with different types of data science algorithms. We will then turn our attention to different types of messy data and practice applying a variety of techniques that can be used to deal with typical dataset quality issues.
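
To give a flavour of the long vs wide distinction mentioned above, here is a minimal sketch in plain Python. It is not taken from the workshop materials: the sample rows, the column names (region, quarter, sales) and the helper functions long_to_wide and wide_to_long are all made up for illustration, and the tools actually used in the workshop may differ.

```python
from collections import defaultdict

# Hypothetical sample data in "long" shape: one row per (region, quarter) observation.
long_rows = [
    {"region": "East", "quarter": "Q1", "sales": 10},
    {"region": "East", "quarter": "Q2", "sales": 12},
    {"region": "West", "quarter": "Q1", "sales": 20},
    {"region": "West", "quarter": "Q2", "sales": 25},
]


def long_to_wide(rows, id_key, column_key, value_key):
    """Pivot long rows into one row per id, with one column per distinct column_key value."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[id_key]][row[column_key]] = row[value_key]
    # Dicts preserve insertion order, so ids appear in first-seen order.
    return [{id_key: id_value, **columns} for id_value, columns in wide.items()]


def wide_to_long(rows, id_key, column_key, value_key):
    """Unpivot wide rows back into one row per (id, column) pair."""
    long_result = []
    for row in rows:
        for column, value in row.items():
            if column != id_key:
                long_result.append(
                    {id_key: row[id_key], column_key: column, value_key: value}
                )
    return long_result


wide_rows = long_to_wide(long_rows, "region", "quarter", "sales")
for row in wide_rows:
    print(row)
```

In the wide shape each region occupies a single row with one column per quarter; calling wide_to_long on that result recovers the original long rows. Which shape is preferable depends on the analysis or algorithm that will consume the data.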

Length: 2 days
Level: Intermediate

Audience and Requirements: This workshop is a technical workshop. You must have access to a laptop on which open source software can be installed. You must have some familiarity with data concepts, some programming or scripting ability and some familiarity with statistical concepts. If you do not meet these pre-requisites, it is recommended that you start with the general and then the statistical and computer programming primer workshops, prior to taking this workshop. If you have already used programming languages or databases to manage and work with multiple datasets, you are too advanced for this workshop.