Workshops

Workshop schedule for the current semester including descriptions for each session. 

Spring 2022 Workshops

CCSS Workshops are given by our staff consultants, Senior Data Science Fellows, and Data Science Fellows. Learn more about our instructors here.

All workshops will be held online via Zoom this semester. Students registered for workshops will be emailed the Zoom link prior to the start of the workshop. 

  • Introduction to Python | Feb. 2 | 3:30-5:00 pm

    Our Python Workshop series walks the inexperienced programmer through the basics of Python language and interface and equips you with the tools to dig further into the potential of Python for managing and analyzing your data. The first part of the series offers a beginner-friendly introduction to using Python. The second part of the series introduces the rich ecosystem of premade tools for data management and analysis available in Python. The third part goes into more detail by closely explaining one powerful tool, pandas. The last part of our series will cover more advanced Python skills such as decorators, generators, and context managers.

    Register Here

  • Introduction to R | Feb. 3 | 3:00-4:30 pm

    This workshop introduces R and will cover the basics of R and RStudio. In addition, we will cover data loading and outline the R workshops for the rest of the semester.

    Register Here

  • Introduction to SPSS, Crash Course | Feb. 4 | 1:00-2:30 pm

    In SPSS 1 you will learn how to move around in SPSS. You will open data in SPSS, create output and understand your results.

    Register Here

  • Version control with GitHub | Feb. 8 | 12:00-2:00 pm

    Git is a tool that keeps track of changes made to project documents such as program files or source code, effectively versioning them, and allows teams to collaborate via a central repository hub such as GitHub, Bitbucket, or GitLab. Git is a free and open-source system that handles everything from small to very large projects with speed and efficiency.

    Register Here

  • Why Use APIs for Social Science Research? | Feb. 9 | 1:00-2:00 pm

    Why, as a social scientist, would you want to include observational data from social media websites and other online communities? During this workshop, you will learn about the tools needed to acquire online behavioral data, what can be understood from these data, and explore real-world academic studies that have made use of these types of data. Optionally, you can learn how to apply for access to a major social media API.

    Register Here

  • Intro to Atlas.ti | Feb. 11 | 1:00-3:00 pm

    Atlas.ti is a powerful workbench for qualitative data analysis. No matter your field, Atlas.ti will meet your qualitative analysis needs. Sophisticated tools help you to arrange, reassemble, and manage your material in creative ways.

    Register Here

  • Introduction to Stata, Crash Course | Feb. 15 | 1:00-2:00 pm

    In Stata 1 you will learn how to move around in Stata. You will know how to open data inside of Stata, create output and understand your results. Stata is often associated with economics, sociology, and human ecology.

    Register Here

  • Introduction to Tidyverse | Feb. 16 | 2:00-3:30 pm

    A large part of the data analysis workflow is data cleaning, and the R packages in Tidyverse are the most popular for data cleaning. This workshop will cover the following Tidyverse packages: dplyr, tidyr, readr, purrr, tibble, stringr, and forcats. Using these packages, we will explore data cleaning tasks such as changing variable formats, creating new variables, summarizing data, joining datasets, and running basic regressions.

    Register Here

  • Understanding the Python Package Ecosystem | Feb. 18 | 3:30-5:00 pm

    Our Python Workshop series walks the inexperienced programmer through the basics of Python language and interface and equips you with the tools to dig further into the potential of Python for managing and analyzing your data. The first part of the series offers a beginner-friendly introduction to using Python. The second part of the series introduces the rich ecosystem of premade tools for data management and analysis available in Python. The third part goes into more detail by closely explaining one powerful tool, pandas. The last part of our series will cover more advanced Python skills such as decorators, generators, and context managers.

    Register Here

  • Practical applications of APIs | Feb. 21 | 1:00-1:30 pm

    During this workshop, you will be exposed to common software packages that ease the process of acquiring trace data from social media platforms. You will then discuss sampling strategies for common research questions using these data. Based on these discussions, you will pull data from the API. We will discuss the variables present, which are relevant for analysis, and strategies for cleaning/filtering based on a particular research question.

    Register Here

  • Intro to NVivo | Feb. 22 | 2:30-4:30 pm

    NVivo is software that supports qualitative and mixed methods research. It is designed to help users organize, analyze, and find insights in unstructured or qualitative data such as interviews, open-ended survey responses, articles, social media, and web content.

    Register Here

  • Pandas | Feb. 23 | 3:30–5:00 pm

    Our Python Workshop series walks the inexperienced programmer through the basics of Python language and interface and equips you with the tools to dig further into the potential of Python for managing and analyzing your data. The first part of the series offers a beginner-friendly introduction to using Python. The second part of the series introduces the rich ecosystem of premade tools for data management and analysis available in Python. The third part goes into more detail by closely explaining one powerful tool, pandas. The last part of our series will cover more advanced Python skills such as decorators, generators, and context managers.

    Register Here

  • Understanding the R Package Ecosystem | Feb. 24 | 3:00–4:30 pm

    Given the numerous R packages, we will use this workshop to explore some popular packages for data analysis.

    Register Here

  • Intermediate SPSS, Creating Statistical Reports with SPSS | Mar. 2 | 1:00–2:30 pm

    In SPSS 2 we will focus on creating statistical reports composed of results from the following statistical procedures: Regressions, Crosstabs, Hypothesis Tests comparing means, Correlation Coefficients, etc.

    Register Here

  • Intermediate Stata, Manipulating and Cleaning Data | Mar. 4 | 1:00-2:00 pm

    In Stata 2 you will enhance your ability to organize your data and prepare it for analysis. We will combine and merge similar datasets, removing undesired information.

    Register Here

  • Data visualization in R | Mar. 8 | 3:00–4:30 pm

    Data visualization is a crucial component of data analysis. This workshop will cover some of the basics of visualizing data in R. We will cover R's essential plot functions, ggplot2, plotly, and RShiny.

    Register Here

  • Advanced Python | Mar. 9 | 3:30–5:00 pm

    Our Python Workshop series walks the inexperienced programmer through the basics of Python language and interface and equips you with the tools to dig further into the potential of Python for managing and analyzing your data. The first part of the series offers a beginner-friendly introduction to using Python. The second part of the series introduces the rich ecosystem of premade tools for data management and analysis available in Python. The third part goes into more detail by closely explaining one powerful tool, pandas. The last part of our series will cover more advanced Python skills such as decorators, generators, and context managers.
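
    As a rough illustration of the three advanced skills named above, here is a minimal sketch using only the standard library. The names `count_calls`, `read_in_chunks`, `timer`, and `mean` are made up for this example and are not taken from the workshop materials.

```python
import time
from contextlib import contextmanager
from functools import wraps


def count_calls(func):
    """Decorator: wrap func so that each call increments a counter."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


def read_in_chunks(rows, size):
    """Generator: lazily yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


@contextmanager
def timer(results, label):
    """Context manager: store elapsed seconds in results[label] on exit."""
    started = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - started


@count_calls
def mean(values):
    return sum(values) / len(values)


timings = {}
with timer(timings, "chunk_means"):
    chunk_means = [mean(chunk) for chunk in read_in_chunks([1, 2, 3, 4, 5, 6], 2)]

print(chunk_means)  # [1.5, 3.5, 5.5]
print(mean.calls)   # 3
```

    The decorator adds behavior around a function without editing it, the generator produces one chunk at a time instead of building every chunk up front, and the context manager guarantees that the timing is recorded even if the block raises an error.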

    Register Here

  • Machine Learning 101 | Mar. 15 | 11:00 am–12:00 pm

    This workshop introduces basic concepts of machine learning and offers the audience some basic hands-on experience on the application of machine learning.

    Register Here

  • Advanced Programming in Stata | Mar. 18 | 1:00-2:00 pm

    In Stata 3 you will learn looping for repeated measures, accessing automatically stored results, defining macros for commonly used paths, objects, and variables, and using do-files to create programs.

    Register Here

  • Organize for transparent and reproducible research | Mar. 22 | 2:30-4:30 pm

    Replication of results is a core requirement of the scientific method. Satisfying this requirement becomes increasingly complex when data from disparate sources is integrated and reused. This workshop will walk you through the process of reviewing your manuscript, data, code, and output to ensure your results will reproduce correctly on all systems. We will go over how to package reproduction materials for easy re-use. This can be time-intensive and intimidating, especially for individual researchers seeking to openly share their work, but rewarding as it leads to open and collaborative research. Others can build off your work moving science further and faster.

    Register Here

  • NLP 101 | Mar. 23 | 1:00-2:30 pm

    This workshop explains basic concepts of NLP and walks through an NLP project step-by-step to show these concepts realized in code.

    Register Here

  • Documentation with RStudio/RMarkdown: Creating Reports in RMarkdown | Mar. 24 | 3:00-4:00 pm

    RMarkdown allows you to have your code, output, text, formatting, and personal notes all in one platform. An RMarkdown document is written in easy-to-write Markdown text and contains chunks of embedded R code that create output.

    Register Here

  • Structural Equation Modeling | Mar. 25 | 1:00-2:30 pm

    This is an introduction to Structural Equation Modeling (SEM). SEM is a broad umbrella term under which many statistical analyses fall; at its core it is a way of thinking and theorizing about relationships between variables. Software that can be used includes R (especially lavaan), Mplus, and LISREL.

    Register Here

  • Machine Learning - Supervised Learning in Python | Mar. 28 | 1:00–2:30 pm

    This workshop offers the audience hands-on experience on the application of supervised learning by walking through a project step-by-step. The tools involved in this project include Python and Jupyter Notebook.

    Register Here

  • Machine Learning - Unsupervised Learning in Python | Apr. 11 | 1:00–2:30 pm

    This workshop offers the audience hands-on experience on the application of unsupervised learning by walking through a project step-by-step. The tools involved in this project include Python and Jupyter Notebook.

    Register Here

  • Intro to MaxQDA | Apr. 14 | 1:00-3:00 pm

    MaxQDA is a qualitative and mixed-method analysis software package that has become increasingly popular here at Cornell. Unlike Atlas.ti and NVivo, its Mac and Windows versions are identical, allowing for seamless cross-platform integration. This workshop will cover understanding the MaxQDA environment, creating a project, adding and working with documents, coding and organizing the code system, memos, lexical search and autocoding, MaxDictio, retrieving coded segments, and reporting of results.

    Register Here

Workshop Descriptions

A full list of all of the topics that our consultants cover. Be on the lookout for these workshops to be offered in the coming semesters!

  • API

    Introduction to API

    Overview: An Application Programming Interface, or API, is a tool that allows you to request data from an online platform like Twitter or YouTube, using a programming language like R or Python. An API is the surest way to acquire data from social media platforms or academic repositories (like Cornell's very own arXiv), allowing you to run text analysis, social network analysis, and more!

    Workshop One Topics:

    • Understand what data you can collect with this technology
    • Understand what you can learn from this type of data
    • Successfully acquire some data from an API
    • Learn common ways to access these data

    Workshop Two Topics:

    • Learn how to set up RStudio for API scraping
    • Explore common API packages for requesting trace data
    • Practice pulling data based on a few common criteria (time, hashtag, user)
    • Overview of cleaning techniques and effects
    • Identify potential study uses for cases
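
    To make the "common criteria" above concrete, here is a minimal sketch of filtering already-downloaded records by time, hashtag, and user. The workshop itself uses RStudio; this sketch is in Python, makes no network requests, and uses invented records. The field names (`user`, `created_at`, `text`) and the helpers `hashtags` and `filter_posts` are assumptions for illustration, not the schema of any real API.

```python
from datetime import datetime, timezone

# Hypothetical posts, shaped like simplified API results (not real data).
posts = [
    {"user": "alice", "created_at": "2022-02-01T09:30:00+00:00",
     "text": "Polls open today #election"},
    {"user": "bob", "created_at": "2022-02-03T14:00:00+00:00",
     "text": "Lunch was great"},
    {"user": "alice", "created_at": "2022-02-10T18:45:00+00:00",
     "text": "Results are in #Election #news"},
]


def hashtags(text):
    """Return the lowercase hashtags in a post, without the leading '#'."""
    return {
        word[1:].lower().rstrip(".,!?")
        for word in text.split()
        if word.startswith("#") and len(word) > 1
    }


def filter_posts(posts, start=None, end=None, hashtag=None, user=None):
    """Keep posts that match every criterion that is not None."""
    kept = []
    for post in posts:
        created = datetime.fromisoformat(post["created_at"])
        if start is not None and created < start:
            continue
        if end is not None and created >= end:
            continue
        if hashtag is not None and hashtag.lower() not in hashtags(post["text"]):
            continue
        if user is not None and post["user"] != user:
            continue
        kept.append(post)
    return kept


# Posts tagged #election on or after Feb. 7, 2022 (UTC).
recent_election = filter_posts(
    posts,
    start=datetime(2022, 2, 7, tzinfo=timezone.utc),
    hashtag="election",
)
print([post["user"] for post in recent_election])  # ['alice']
```

    Applying filters like these after collection, rather than only in the request, keeps the raw pull intact so that cleaning decisions can be revisited for a different research question.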

  • Atlas.ti 

    Introduction to Atlas.ti

    Overview: Atlas.ti is a powerful workbench for qualitative data analysis. No matter your field, Atlas.ti will meet your qualitative analysis needs. Sophisticated tools help you to arrange, reassemble, and manage your material in creative ways.

    Workshop Topics:

    • Creating, saving and exporting Atlas.ti Project Bundles
    • Coding, identifying themes
    • Overview of Atlas.ti windows environment.
    • Creating organized reports

  • Conjoint Analysis

    Using Conjoint Analysis in Analyzing Individuals' Underlying Preferences

    Overview: Conjoint Analysis helps researchers across different fields to identify individuals’ preferences and evaluate their choice trade-offs in the context of a survey. 

    Workshop Topics: 

    • How to configure conjoint experiments in Qualtrics using the Conjoint Choice-based Application provided by Qualtrics
    • How to analyze and use Conjoint-based data
    • How to design and implement your own questionnaire as a Conjoint experiment, and how to access, analyze, and present the findings of your Conjoint survey using R

    Software/tool/method: Qualtrics Choice-Based Conjoint and RStudio

    Area of expertise connects to: surveys, survey experiments, social sciences, business analytics, marketing

    Prerequisites: Basic understanding of Qualtrics, survey analysis, and R is helpful.

  • GitHub

    Introduction to GitHub

    Overview: Git is a tool that keeps track of changes made to project documents such as program files or source code, effectively versioning them, and allows teams to collaborate via a central repository hub such as GitHub, Bitbucket, or GitLab. Git is a free and open-source system that handles everything from small to very large projects with speed and efficiency.

    Workshop Topics:

    • Creation and configuration of a Git repository
    • Editing, staging and committing files
    • Retrieving previous versions of files
    • Branching

  • How to Make your Research Transparent and Reproducible

    Overview: Replication of results is a core requirement of the scientific method. Satisfying this requirement becomes increasingly complex when data from disparate sources is integrated and reused. This workshop will walk you through the process of reviewing your manuscript, data, code and output to ensure your results will reproduce correctly on all systems. We will go over how to package reproduction materials for easy re-use. This can be time-intensive and intimidating, especially for individual researchers seeking to openly share their work.

    Workshop Topics:

    • Walk you through the process of reviewing your manuscript, data, code and output to ensure that the reproduction materials you will share to the public or submit to a journal for publication will reproduce your results exactly
    • Discuss common mistakes to avoid in manuscripts and code
    • How to package reproduction materials for easy re-use and independent understandability
    • CCSS Data and Reproduction Archive as the institutional repository for your reproducibility package

  • Machine Learning

    Machine Learning 101

    Overview: What is Machine Learning? And why use it? This workshop introduces basic concepts of machine learning and offers the audience hands-on experience with Machine Learning processes. It is a beginner-level workshop for people who have no prior experience with machine learning.

    Workshop Topics:

    • Common types of Machine Learning models
    • Distinction between linear models, tree-based models, and clustering
    • How you can apply Machine Learning techniques to your research

    Natural Language Processing (NLP)

    Overview: What is Natural Language Processing (NLP)? What examples of NLP are applied in the Social Sciences? This workshop explains basic concepts of NLP and walks through an NLP project step-by-step to show these concepts realized in code.

    Workshop Topics:

    • What is NLP
    • Flow chart of application of NLP
    • Examples of NLP in social sciences

  • NVivo

    Using NVivo as a Research Tool

    Overview: NVivo is software that supports qualitative and mixed methods research. It is designed to help users organize, analyze, and find insights in unstructured or qualitative data such as interviews, open-ended survey responses, articles, social media, and web content.

    Workshop Topics:

    • Creating, saving and exporting NVivo Projects
    • Coding, identifying themes.
    • Overview of NVivo windows environment.
    • Creating reports

  • OpenRefine

    Overview: OpenRefine is a user-friendly tool for cleaning, transforming, and preparing your data for analysis. This workshop will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them months of work.

    To do the exercises along with the workshop, you can first download and install OpenRefine along with the data file used in the workshop:

    Preparation for the Workshop:

  • RMarkdown

    Introduction to RMarkdown

    Overview: RMarkdown allows you to have your code, output, text, formatting and personal notes all in one convenient platform. RMarkdown documents are fully reproducible and support dozens of output formats, like PDFs, Word files, slideshows, and more. RMarkdown documents are often designed for collaboration with data scientists who are interested in your results and how you reached them.

    Workshop Topics:

    • R code chunks for creating output
    • Inline code for placing output mid-sentence
    • Formatting an RMarkdown document and output for neat and organized reporting

    Preparation for the Workshop:

  • SPSS

    Introduction to SPSS: Crash Course

    Overview: SPSS stands for the Statistical Package for the Social Sciences. In SPSS 1 you will learn to navigate the SPSS windows environment and menu. You will learn how to open data in SPSS, create output, and understand your results.

    Workshop Topics:

    • Analyzing/understanding output
    • Understanding SPSS menu
    • Basic SPSS coding
    • Summarizing Data using SPSS

    Intermediate SPSS: Creating Statistical Reports with SPSS

    Overview: SPSS stands for the Statistical Package for the Social Sciences. In SPSS 2 we will focus on creating statistical reports composed of results from the following statistical procedures: Regressions, Crosstabs, Hypothesis Tests comparing means, Correlation Coefficients, etc.

    Workshop Topics:

    • Analyzing/understanding output
    • Creating Statistical reports with SPSS
    • SPSS syntax for reproducibility

    Past Workshop Files:

    Introduction to SPSS workshop session 1 may also be viewed in the following 1-hour video:

  • Stata

    Introduction to Stata: Crash Course

    Overview: Stata is a popular statistical programming language among economists, sociologists and human ecologists here at Cornell. In Stata 1 you will learn to navigate the Stata windows environment and menu. You will know how to open data inside of Stata, create output and understand your results.

    Workshop Topics:

    • Analyzing/understanding Stata output
    • Understanding Stata menu and windows environment
    • Basic Stata coding
    • Summarize data using Stata

    Intermediate Stata: Manipulating and Cleaning Data

    Overview: Stata is a popular statistical programming language among economists, sociologists and human ecologists here at Cornell. In Stata 2 you will enhance your ability to organize your data and prepare it for analysis. We will combine and merge similar datasets to create longitudinal data.

    Workshop Topics:

    • Manipulating/reorganizing data
    • Merging and combining to create larger data
    • Removing data and targeting specific data groups

    Advanced Stata: Advanced Programming in Stata

    Overview: Stata is a popular statistical programming language among economists, sociologists and human ecologists here at Cornell. In Stata 3 you will learn looping for repeated processes to create more efficient programs. Stata 3 will focus more on batch coding through the use of .do files. Macros will be discussed as a way to access stored results, key variables, and important file paths more easily.

    Workshop Topics:

    • Coding with loops
    • Batch coding with do files
    • Stata macros

    Past Workshop Files:

    The Introduction to Stata workshop may also be viewed in the following 2-hour video:

Custom Training for Classes

We provide class training tailored to your group’s specific needs on topics such as:
  • Data processing and management
  • Use of qualitative software packages such as Atlas.ti and NVivo
  • Use of statistical software packages such as SAS, SPSS, Stata, and R
  • Use of CCSS-RS research and class computing servers

Custom Training for Project Teams

We provide just-in-time training for project teams in both qualitative and quantitative data management and processing. Research data management and processing is a long and complex process. We provide the training needed at a particular phase of the data lifecycle, which avoids the loss of knowledge and skills caused by a large gap between training and actual use. This approach also eliminates the need for refresher training when subject knowledge fades or when people leave the team before using the training they received.
