Workshops & Training

Through CCSS, discover and learn tools and methods for conducting research with various types of data!

  • Workshops

What Do We Offer?

Workshops

CCSS Workshops are given by our staff consultants, Senior Data Science Fellows, and Data Science Fellows. Learn more about our instructors here.


Training for Classes and Project Teams

 

Training for Classes

We provide class training tailored to your group’s specific needs on topics related to:

  • Data processing and management
  • Use of qualitative software packages such as Atlas.ti and NVivo
  • Use of statistical software packages such as SAS, SPSS, STATA, and R
  • Use of CCSS research and computing servers

 Training for Project Teams

We provide just-in-time training for project teams in:

  • Qualitative and quantitative research data management and processing
  • Targeted training at the particular phase of the project team's research process where extra help is needed

 

Summer 2022 Workshops (Coming Soon)

Summer workshops will start in June 2022. Check back here in mid-May for a list of sessions. 

Workshop Descriptions

Below is a full list of the topics that our consultants cover. Be on the lookout for these workshops in the coming semesters!

  • API

    Introduction to APIs

    Overview: An Application Programming Interface, or API, is a tool that allows you to request data from an online platform like Twitter or YouTube using a programming language like R or Python. An API is often the most reliable way to acquire data from social media platforms or scholarly repositories (like Cornell's very own arXiv), allowing you to run text analysis, social network analysis, and more!

    Workshop One Topics:

    • Understand what data you are able to collect with this technology
    • Understand what you can learn from this type of data
    • Successfully acquire some data from an API
    • Learn common ways to access these data

    Workshop Two Topics:

    • Learn how to set up RStudio for API data collection
    • Explore common API packages for requesting trace data
    • Practice pulling data based on a few common criteria (time, hashtag, user)
    • Review cleaning techniques and their effects
    • Identify potential use cases for studies
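
    The workshop itself works in RStudio. As a rough illustration in Python (the overview notes that either language can be used), a request to an API can be sketched as building a query URL and parsing the structured response. This sketch assumes the public arXiv query endpoint; the helper names and the sample feed are hypothetical and hand-written so the example runs without a network connection.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint of the public arXiv query API.
ARXIV_ENDPOINT = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(search_terms, max_results=5):
    """Return a query URL that searches all fields for the given terms."""
    params = {
        "search_query": f"all:{search_terms}",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_ENDPOINT}?{urlencode(params)}"


def parse_titles(atom_xml):
    """Extract entry titles from an Atom feed string, collapsing whitespace."""
    root = ET.fromstring(atom_xml)
    return [
        " ".join(entry.findtext("atom:title", default="", namespaces=ATOM_NS).split())
        for entry in root.findall("atom:entry", ATOM_NS)
    ]


# Hypothetical response, written by hand so the sketch runs offline.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Example paper one</title></entry>
  <entry><title>Example paper two</title></entry>
</feed>"""

url = build_query_url("social networks", max_results=2)
titles = parse_titles(SAMPLE_FEED)
print(url)
print(titles)
```

    To fetch live results instead of the sample feed, the URL could be passed to `urllib.request.urlopen` and the decoded response handed to `parse_titles`.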

  • Atlas.ti 

    Introduction to Atlas.ti

    Overview: Atlas.ti is a powerful workbench for qualitative data analysis. No matter your field, Atlas.ti will meet your qualitative analysis needs. Sophisticated tools help you to arrange, reassemble, and manage your material in creative ways.

    Workshop Topics:

    • Creating, saving, and exporting Atlas.ti Project Bundles
    • Coding and identifying themes
    • Overview of the Atlas.ti windows environment
    • Creating organized reports

  • Conjoint Analysis

    Using Conjoint Analysis in Analyzing Individuals' Underlying Preferences

    Overview: Conjoint Analysis helps researchers across different fields to identify individuals’ preferences and evaluate their choice trade-offs in the context of a survey. 

    Workshop Topics: 

    • Configure conjoint experiments in Qualtrics using the Choice-Based Conjoint application that Qualtrics provides
    • Analyze and use conjoint-based data
    • Design and implement your own questionnaire as a conjoint experiment, then access, analyze, and present the findings of your conjoint survey using R

    Software/tool/method: Qualtrics Choice-Based Conjoint and RStudio

    Area of expertise connects to: surveys, survey experiments, social sciences, business analytics, marketing

    Prerequisites: Basic understanding of Qualtrics, survey analysis, and R is helpful.
     

  • GitHub

    Introduction to GitHub

    Overview: Git is a tool that keeps track of changes made to project documents such as program files or source code, effectively versioning them, and allows teams to collaborate via a central repository hub such as GitHub, Bitbucket, or GitLab. Git is a free and open source system that handles everything from small to very large projects with speed and efficiency.

    Workshop Topics:

    • Creating and configuring a Git repository
    • Editing, staging and committing files
    • Retrieving previous versions of files
    • Branching

  • How to Make your Research Transparent and Reproducible

    Overview: Replication of results is a core requirement of the scientific method. Satisfying this requirement becomes increasingly complex when data from disparate sources are integrated and reused, and it can be time-intensive and intimidating, especially for individual researchers seeking to openly share their work. This workshop will walk you through the process of reviewing your manuscript, data, code, and output to ensure your results will reproduce correctly on all systems. We will also go over how to package reproduction materials for easy reuse.

    Workshop Topics:

    • Review your manuscript, data, code, and output to ensure that the reproduction materials you share with the public or submit to a journal will reproduce your results exactly
    • Discuss common mistakes to avoid in manuscripts and code
    • How to package reproduction materials for easy re-use and independent understandability
    • CCSS Data and Reproduction Archive as the institutional repository for your reproducibility package

  • Machine Learning

    Machine Learning 101

    Overview: What is Machine Learning? And why use it? This introductory workshop covers basic concepts of machine learning and offers hands-on experience with machine learning processes. It is a beginner-level workshop for people who have no prior experience with machine learning.

    Workshop Topics:

    • Common types of Machine Learning models
    • Distinction between linear models, tree-based models, and clustering
    • How you can apply Machine Learning techniques to your research

    Natural Language Processing (NLP)

    Overview: What is Natural Language Processing (NLP)? What examples of NLP are applied in the Social Sciences? This workshop explains basic concepts of NLP and walks through an NLP project step by step to show these concepts realized in code.

    Workshop Topics:

    • What is NLP
    • Flow chart of application of NLP
    • Examples of NLP in social sciences

    Machine Learning - Supervised Learning in Python

    Overview: This workshop offers hands-on experience with applying supervised learning by walking through a project step by step. The tools involved in this project include Python and Jupyter Notebook.

    Workshop Topics:

    • What is supervised learning
    • Examples of application in social sciences
    • The distinction between different types of models
    • Flow chart of application of supervised learning models
    • Brief introduction to data preprocessing - explain some basic concepts (standard ways to clean the data)
    • Walk through the mini projects (tie back to the flow chart and show how to do things step by step)
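
    As a rough illustration of the flow listed above (prepare data, split, fit, predict, evaluate), here is a minimal sketch in plain Python. It is not the workshop's project: the toy data are hypothetical, and a simple nearest-centroid classifier stands in for whatever models the workshop actually covers.

```python
from statistics import mean

# Hypothetical toy data: (hours_studied, hours_slept) -> passed (1) or not (0).
features = [
    (1.0, 4.0), (2.0, 5.0), (1.5, 4.5),
    (8.0, 7.0), (9.0, 8.0), (8.5, 7.5),
    (2.5, 5.5), (7.5, 6.5),
]
labels = [0, 0, 0, 1, 1, 1, 0, 1]

# Step 1: hold out the last two rows so the model is tested on unseen data.
train_X, test_X = features[:6], features[6:]
train_y, test_y = labels[:6], labels[6:]


def fit_centroids(X, y):
    """Step 2: learn one average point (centroid) per label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = tuple(mean(column) for column in zip(*rows))
    return centroids


def predict(centroids, point):
    """Step 3: assign the label whose centroid is closest to the point."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], point))


centroids = fit_centroids(train_X, train_y)
predictions = [predict(centroids, x) for x in test_X]

# Step 4: evaluate as the share of held-out rows predicted correctly.
accuracy = mean(int(p == t) for p, t in zip(predictions, test_y))
print(predictions, accuracy)
```

    The key idea is that labels are used during fitting, and performance is judged only on rows the model did not see while learning.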

    Machine Learning - Unsupervised Learning in Python

    Overview: This workshop offers hands-on experience with applying unsupervised learning by walking through a project step by step. The tools involved in this project include Python and Jupyter Notebook.

    Workshop Topics:

    • What is unsupervised learning
    • Examples of application in social sciences
    • The distinction between different types of models
    • Flow chart of application of unsupervised learning models
    • Brief introduction to data preprocessing - explain some basic concepts (standard ways to clean the data)
    • Walk through the mini projects (tie back to the flow chart and show how to do things step by step)
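
    To make the contrast with supervised learning concrete, here is a minimal plain-Python sketch of k-means clustering, one common unsupervised technique. It is an illustration only, not the workshop's project: the points are hypothetical, and the initial centers are simply the first k points so that the run is deterministic.

```python
from statistics import mean


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters without using any labels."""
    # Deterministic start (an assumption for this sketch): first k points.
    centers = list(points[:k])
    assignments = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest center.
        new_assignments = [
            min(range(k), key=lambda i: squared_distance(centers[i], p))
            for p in points
        ]
        if new_assignments == assignments:
            break  # Nothing moved, so the clustering has converged.
        assignments = new_assignments
        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                centers[i] = tuple(mean(column) for column in zip(*members))
    return centers, assignments


# Hypothetical two-dimensional observations forming two loose groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.5, 8.0)]
centers, assignments = k_means(points, k=2)
print(assignments)
```

    No labels appear anywhere in the input; the grouping comes entirely from distances between observations.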

  • MaxQDA

    Intro to MaxQDA

    Overview: MaxQDA is a qualitative and mixed-methods analysis software package that has become increasingly popular here at Cornell. Unlike Atlas.ti and NVivo, its Mac and Windows versions are identical, allowing for seamless cross-platform integration. This workshop will cover understanding the MaxQDA environment, creating a project, adding and working with documents, coding and organizing the code system, memos, lexical search and autocoding, MaxDictio, retrieving coded segments, and reporting of results.

    Workshop Topics:

    • Overview of the MaxQDA environment
    • Creating, saving, and exporting Projects
    • Coding and identifying themes
    • Creating reports

  • NVivo

    Using NVivo as a Research Tool

    Overview: NVivo is software that supports qualitative and mixed methods research. It is designed to help users organize, analyze, and find insights in unstructured or qualitative data such as interviews, open-ended survey responses, articles, social media, and web content.

    Workshop Topics:

    • Creating, saving, and exporting NVivo Projects
    • Coding and identifying themes
    • Overview of the NVivo windows environment
    • Creating reports

  • OpenRefine

    Overview: OpenRefine is a user-friendly tool for cleaning, transforming, and preparing your data for analysis. This workshop will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them months of work.

    Preparation for the Workshop:

    To follow the exercises during the workshop, first download and install OpenRefine and the data file used in the workshop.

  • Python

    Workshops offered:

    Introduction to Python 

    Understanding the Python Package Ecosystem

    Pandas

    Advanced Python

    Overview: Our Python Workshop series walks the inexperienced programmer through the basics of the Python language and interface and equips you with the tools to dig further into the potential of Python for managing and analyzing your data. The first part of the series offers a beginner-friendly introduction to using Python. The second part introduces the rich ecosystem of premade tools for data management and analysis available in Python. The third part goes into more detail by closely examining one powerful tool, pandas. The last part covers more advanced Python skills such as decorators, generators, and context managers.
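
    For readers unfamiliar with the three advanced features named above, here is a minimal sketch of each using only the standard library. The function names and sample values are hypothetical and chosen for illustration; they are not taken from the workshop materials.

```python
import time
from contextlib import contextmanager
from functools import wraps


def count_calls(func):
    """Decorator: wrap func so that each call is counted."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def clean(value):
    """Normalize one text value."""
    return value.strip().lower()


def read_in_batches(rows, batch_size):
    """Generator: lazily yield lists of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


@contextmanager
def timer(results, label):
    """Context manager: store elapsed seconds under label, even on error."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


raw = ["  Alice ", "BOB", " Carol", "dave  ", "Eve"]
timings = {}
with timer(timings, "cleaning"):
    batches = [[clean(v) for v in batch] for batch in read_in_batches(raw, 2)]

print(batches)
print(clean.calls)
```

    The decorator adds behavior around a function without editing it, the generator produces batches one at a time instead of building them all up front, and the context manager guarantees that its cleanup step (recording the time) runs when the `with` block exits.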

  • RMarkdown

    Introduction to RMarkdown

    Overview: RMarkdown allows you to have your code, output, text, formatting and personal notes all in one convenient platform. RMarkdown documents are fully reproducible and support dozens of output formats, like PDFs, Word files, slideshows, and more. RMarkdown documents are often designed for collaboration with data scientists who are interested in your results and how you reached them.

    Workshop Topics:

    • R code chunks for creating output
    • Inline code for placing output mid-sentence
    • Formatting an RMarkdown document and output for neat and organized reporting

  • R

    Data Visualization in R

    Overview: Data visualization is a crucial component of data analysis. This workshop will cover some of the basics of visualizing data in R. We will cover R's essential plot functions, ggplot2, plotly, and RShiny.

    Workshop Topics:

    • Design visualizations to communicate data insights
    • Understand the differences between the different types of plot functions
    • Use dashboard packages to display multiple visualizations

    R Package Ecosystem

    Overview: Given the numerous R packages, we will use this workshop to explore some popular packages for data analysis.

    Workshop Topics:

    • Understand how to search for R packages on CRAN
    • Understand how to interpret the help documentation for R packages
    • Experiment with multiple packages

    Introduction to Tidyverse

    Overview: A large part of the data analysis workflow is data cleaning, and the R packages in the Tidyverse are among the most popular for data cleaning. This workshop will cover the following Tidyverse packages: dplyr, tidyr, readr, purrr, tibble, stringr, and forcats. Using these packages, we will explore data cleaning tasks such as changing variable formats, creating new variables, summarizing, joining, and running basic regressions.

    Workshop Topics:

    • Differentiate between Tidyverse packages to understand which packages to use for which data cleaning functions
    • Apply multiple Tidyverse packages to an R dataset
    • Design a data cleaning workflow that culminates in a regression

  • SPSS

    Introduction to SPSS: Crash Course

    Overview: SPSS stands for Statistical Package for the Social Sciences. In SPSS 1 you will learn to navigate the SPSS windows environment and menu. You will learn how to open data in SPSS, create output, and understand your results.

    Workshop Topics:

    • Analyzing/understanding output
    • Understanding SPSS menu
    • Basic SPSS coding
    • Summarizing Data using SPSS

    Intermediate SPSS: Creating Statistical Reports with SPSS

    Overview: SPSS stands for Statistical Package for the Social Sciences. In SPSS 2 we will focus on creating statistical reports composed of results from the following statistical procedures: regressions, crosstabs, hypothesis tests comparing means, correlation coefficients, etc.

    Workshop Topics:

    • Analyzing/understanding output
    • Creating Statistical reports with SPSS
    • SPSS syntax for reproducibility

    Past Workshop Files:

    Introduction to SPSS workshop session 1 may also be viewed in the following 1-hour video:

  • Stata

    Introduction to Stata: Crash Course

    Overview: Stata is a popular statistical programming language among economists, sociologists, and human ecologists here at Cornell. In Stata 1 you will learn to navigate the Stata windows environment and menu. You will learn how to open data in Stata, create output, and understand your results.

    Workshop Topics:

    • Analyzing/understanding Stata output
    • Understanding Stata menu and windows environment
    • Basic Stata coding
    • Summarize data using Stata

    Intermediate Stata: Manipulating and Cleaning Data

    Overview: Stata is a popular statistical programming language among economists, sociologists, and human ecologists here at Cornell. In Stata 2 you will enhance your ability to organize your data and prepare it for analysis. We will combine and merge similar datasets to create longitudinal data.

    Workshop Topics:

    • Manipulating/reorganizing data
    • Merging and combining to create larger datasets
    • Removing data and targeting specific data groups

    Advanced Stata: Advanced Programming in Stata

    Overview: Stata is a popular statistical programming language among economists, sociologists, and human ecologists here at Cornell. In Stata 3 you will learn to use loops for repeated processes to create more efficient programs. Stata 3 will focus more on batch coding through the use of .do files. Macros will be discussed as a way to access stored results, key variables, and important file paths more easily.

    Workshop Topics:

    • Coding with loops
    • Batch coding with do files
    • Stata macros

    Past Workshop Files:

    The Introduction to Stata workshop may also be viewed in the following 2-hour video:

  • Structural Equation Modeling

    Structural Equation Modeling

    Overview: This is an introduction to Structural Equation Modeling (SEM). SEM is a broad umbrella term under which many statistical analyses fall, and it is really a way of thinking and theorizing about relationships between variables. Software that can be used includes R (especially lavaan), Mplus, and LISREL.

    Workshop Topics:

    • What is structural equation modeling?
    • What types of models are included in this category?
    • What can we do with structural equation modeling?
    • What types of analysis is SEM best for?
    • What software can we use for SEM?
    • How do we evaluate our model?
