Through CCSS, discover valuable software, data, and computing tools for working with data!
Workshops are open to the Cornell community!
-
Web Scraping in Python (BeautifulSoup) | September 6 | 2:00-3:30pm
Dates Offered:
- September 6, 2:00-3:30 pm. Register Here
Additional Details:
This series applies web scraping techniques in Python to extract online data for social science research. We start with the Python library BeautifulSoup, which is well suited to learning the basics of extracting data from online sources. BeautifulSoup provides simple methods for navigating, searching, and extracting what you need. This demonstration will provide Python code; familiarity with Python is encouraged. Feel free to bring your laptop with Anaconda installed so you can follow along.
Instructor: Jacob Grippin
Pre-requisites: Introductory Python knowledge (Watch CCSS Python recording here)
Learning Objectives:
- Understanding HTML website structure (tags, attributes). Locating what you need to scrape.
- Python basics for scraping the data you want from static (non-changing) webpages.
- Cleaning up scraped data into an organized, readable format for further analysis.
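The objectives above can be illustrated with a minimal sketch. It deliberately uses only Python's standard-library `html.parser` rather than BeautifulSoup, so it runs without extra installs; it is not the workshop's code. The HTML snippet, the `row` class, and the names `TableScraper` and `scrape_workshops` are hypothetical. The sketch locates table cells by tag and attribute on a static page, then cleans the text into a list of records.

```python
from html.parser import HTMLParser

# Hypothetical static page: a small table of workshops.
SAMPLE_HTML = """
<html><body>
  <table id="workshops">
    <tr><th>Title</th><th>Date</th></tr>
    <tr class="row"><td> Web Scraping in Python </td><td>September 6</td></tr>
    <tr class="row"><td>Intermediate Web Scraping </td><td> September 8</td></tr>
  </table>
</body></html>
"""


class TableScraper(HTMLParser):
    """Collect the text of every <td> cell inside <tr class="row"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per data row
        self._in_row = False    # True while inside a <tr class="row">
        self._in_cell = False   # True while inside a <td> of such a row

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)     # attributes arrive as (name, value) pairs
        if tag == "tr" and attrs.get("class") == "row":
            self._in_row = True
            self.rows.append([])
        elif tag == "td" and self._in_row:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            self._in_row = False

    def handle_data(self, data):
        # Keep only text that sits inside a data cell.
        if self._in_cell:
            self.rows[-1][-1] += data


def scrape_workshops(html):
    """Parse the HTML and return cleaned records with stripped whitespace."""
    parser = TableScraper()
    parser.feed(html)
    return [{"title": t.strip(), "date": d.strip()} for t, d in parser.rows]


workshops = scrape_workshops(SAMPLE_HTML)
for record in workshops:
    print(record["date"], "|", record["title"])
```

BeautifulSoup wraps this kind of tag-and-attribute matching in a more convenient search interface, which is why it is a common starting point for static pages.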
-
Intermediate Web Scraping in Python (Selenium) | September 8 | 2:00-3:30pm
Dates Offered:
- September 8, 2:00-3:30 pm. Register Here
Additional Details:
This series applies web scraping techniques in Python to extract online data for social science research. We continue from the previous workshop by using the Python library Selenium, a more powerful web scraping tool. Traditional scraping tools struggle to collect data from websites that rely on JavaScript. Selenium handles such sites and adds functionality for interacting with a page as a human user would: clicking, scrolling, and filling out forms. This demonstration will provide Python code; familiarity with Python is encouraged. Feel free to bring your laptop with Anaconda installed so you can follow along.
Instructor: Jacob Grippin
Pre-requisites: Introductory Python knowledge (Watch CCSS Python recording here)
Learning Objectives:
- Python basics for scraping the data you want from dynamic (changing) webpages.
- Python basics for submitting mouse clicks, scrolling, searches, etc.
- Automate longer web scraping projects efficiently.
-
Geospatial Analysis: Introduction to Mapping in R | September 15 | 1:00-2:30pm
Dates Offered:
- September 15, 1:00-2:30 pm. Register Here
Additional Details:
Geospatial analytics gathers, manipulates, and displays geographic information system (GIS) data. Geospatial data analytics relies on geographic coordinates and specific identifiers such as street address and zip code. It is used to create geographic models and data visualizations for more accurate modeling and prediction of trends. Geospatial data analytics lets the eye recognize patterns like distance, proximity, contiguity, and affiliation that are hidden in massive datasets. The visualization of spatial data also makes it easier to see how things are changing over time and where the change is most pronounced. Join this workshop series to discover how to implement geospatial analysis in your research through the R programming language. This workshop covers introductory aspects of geospatial analysis in R. Familiarity with R is required. Feel free to bring your laptop with R installed so you can follow along.
Instructor: Kanika Khanna
Pre-requisites: Introductory R knowledge (Watch CCSS R recording here)
-
Geospatial Analysis: Mapping Census Data in R | September 18 | 1:00-2:30pm
Dates Offered:
- September 18, 1:00-2:30 pm. Register Here
Additional Details:
Geospatial analytics gathers, manipulates, and displays geographic information system (GIS) data. Geospatial data analytics relies on geographic coordinates and specific identifiers such as street address and zip code. It is used to create geographic models and data visualizations for more accurate modeling and prediction of trends. Geospatial data analytics lets the eye recognize patterns like distance, proximity, contiguity, and affiliation that are hidden in massive datasets. The visualization of spatial data also makes it easier to see how things are changing over time and where the change is most pronounced. Join this workshop series to discover how to implement geospatial analysis in your research through the R programming language. This workshop expands on the previous one by incorporating more advanced features for geospatial analysis in R. Familiarity with R is required. Feel free to bring your laptop with R installed so you can follow along.
Instructor: Kanika Khanna
Pre-requisites: Introductory R knowledge (Watch CCSS R recording here)
-
Machine Learning: Introduction to Machine Learning | September 28 | 3:00-4:30pm
This workshop series instructs users on how Machine Learning models can be applied within social science research. Best suited for social scientists with working Python proficiency and quantitative research experience. This workshop provides an overview of machine learning models and how they are currently being used in social science research.
Instructor: Jonathan Chang
Pre-Requisites: None
-
Machine Learning: Unsupervised Learning Python | October 12 | 3:00-4:30pm
This workshop series instructs users on how Machine Learning models can be applied within social science research. Best suited for social scientists with working Python proficiency and quantitative research experience. This workshop explores and represents emergent patterns within data by developing unsupervised Machine Learning models.
Instructor: Jonathan Chang
Pre-requisites: Introductory Python knowledge (Watch CCSS Python recording here)
-
Geospatial Analysis: Creating Interactive Maps and Data Visualizations | October 16 | 1:00-2:30pm
Geospatial analytics gathers, manipulates, and displays geographic information system (GIS) data. Geospatial data analytics relies on geographic coordinates and specific identifiers such as street address and zip code. It is used to create geographic models and data visualizations for more accurate modeling and prediction of trends. Geospatial data analytics lets the eye recognize patterns like distance, proximity, contiguity, and affiliation that are hidden in massive datasets. The visualization of spatial data also makes it easier to see how things are changing over time and where the change is most pronounced. Join this workshop series to discover how to implement geospatial analysis in your research through the R programming language. This workshop expands on the previous one by incorporating more advanced features for geospatial analysis in R. Familiarity with R is required. Feel free to bring your laptop with R installed so you can follow along.
Instructor: Kanika Khanna
Pre-requisites: Introductory R knowledge (Watch CCSS R recording here)
-
Machine Learning: Supervised Learning Python | October 19 | 3:00-4:30pm
This workshop series instructs users on how Machine Learning models can be applied within social science research. Best suited for social scientists with working Python proficiency and quantitative research experience. This workshop explores and represents emergent patterns within data by developing supervised Machine Learning models.
Instructor: Jonathan Chang
Pre-requisites: Introductory Python knowledge (Watch CCSS Python recording here)
-
Factor Analysis in R (psych, sem, cfa) | November 2 | 1:00-2:30pm
Factor analysis is a statistical method used to identify unobserved variables, called factors, that explain the correlations among a set of observed variables. Factor analysis is used in many areas of applied statistics, including marketing, the social sciences, and psychology. Join to learn how to implement factor analysis in R.
Instructor: Aishat Sadiq
Pre-requisites: Introductory R knowledge (Watch CCSS R recording here)
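For readers who want the definition above in symbols, one common way to write the (orthogonal) common factor model is sketched below. The notation is an assumption of this note, not taken from the workshop materials: x is the vector of observed variables, f the vector of unobserved factors, \Lambda the matrix of factor loadings, and \varepsilon the variable-specific error, assumed uncorrelated with f.

```latex
x = \mu + \Lambda f + \varepsilon, \qquad
\operatorname{Cov}(f) = I, \quad
\operatorname{Cov}(\varepsilon) = \Psi \ \text{(diagonal)}, \quad
\operatorname{Cov}(f, \varepsilon) = 0
\;\;\Longrightarrow\;\;
\operatorname{Cov}(x) = \Lambda \Lambda^{\top} + \Psi
```

Under these assumptions, the covariance among the observed variables is explained by the loadings, while \Psi captures the variance unique to each variable.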
-
Machine Learning: Natural Language Processing (NLP) Python | November 2 | 3:00-4:30pm
This workshop series instructs users on how Machine Learning models can be applied within social science research. Best suited for social scientists with working Python proficiency and quantitative research experience. This workshop dives into the intersection of ML and text data through constructing both supervised and unsupervised NLP models.
Instructor: Jonathan Chang
Pre-requisites: Introductory Python knowledge (Watch CCSS Python recording here)
-
Machine Learning: Supervised Learning in R (glmnet, caret, nlme) | November 9 | 1:00-2:30pm
This workshop series instructs users on how Machine Learning models can be applied within social science research. Best suited for social scientists with working R proficiency and quantitative research experience. This workshop explores and represents emergent patterns within data by developing supervised Machine Learning models.
Instructor: Aishat Sadiq
Pre-requisites: Introductory R knowledge (Watch CCSS R recording here)
-
Introduction to Publication Ready Tables in R (gtsummary, rmarkdown, kable, kableExtra) | September 28 | 1:30-2:30pm
Dates Offered:
- September 28, 1:30-2:30 pm. Register Here
Additional Details:
Replication of results is a core requirement of the scientific method. This workshop will demonstrate generating tables through R code that are publication ready. This is the best way to report results as the output is completely reproducible. This workshop goes over the code used in R for creating tables that can be inserted directly into your publications.
Instructor: Aishat Sadiq
Pre-requisites: Introductory R knowledge (Watch CCSS R recording here)
-
Research Opportunities at the Cornell Federal Statistical Research Data Center | October 11 | 3:30-4:45pm
Dates Offered:
- October 11, 3:30-4:45 pm. Register Here
Additional Details:
The Cornell Federal Statistical Research Data Center (FSRDC) provides access to confidential federal data from several agencies, including the U.S. Census Bureau. The Cornell FSRDC administrator, Nichole Szembrot, will give an overview of the available data and proposal process. This workshop is recommended for faculty and Ph.D. students.
Instructor: Nichole Szembrot
Pre-Requisites: None
-
Introduction to Publication Ready Tables in Stata (estout, putpdf, putexcel, tabout, outreg2) | October 16 | 3:00-4:00pm
Dates Offered:
- October 16, 3:00-4:00 pm. Register Here
Additional Details:
Replication of results is a core requirement of the scientific method. This workshop will demonstrate generating tables through Stata code that are publication ready. This is the best way to report results as the output is completely reproducible. This workshop goes over the code used in Stata for creating tables that can be inserted directly into your publications.
Instructor: Jacob Grippin
Pre-requisites: Introductory Stata knowledge (Watch CCSS Stata recording here)
-
Finding Open Datasets Online (NYC Open Data, Census.gov, American Community Survey (ACS)) | October 19 | 1:30-2:30pm
Dates Offered:
- October 19, 1:30-2:30 pm. Register Here
Additional Details:
The American Community Survey (ACS) is the premier source for detailed population and housing data for the United States. ACS releases new data every year that you can access with different data tools. Attend this workshop for a lesson on how to pull the ACS data you want through the interactive online tools of IPUMS and the Census website.
Instructor: Aishat Sadiq
Pre-Requisites: None
-
Replication Data and Code Preparation Training | November 10 | 1:30-4:00pm
Dates Offered:
- November 10, 1:30-4:00 pm. Register Here
Discussion topics:
- Process of reviewing the manuscript, data, code, output, and other documentation
- Preparing a Readme file
- Preparing a Data Availability Statement
- Preparing the replication package (consisting of data, code, and other documentation) to make it portable, independently understandable, easily reusable, and ready for publication, archiving, and sharing
- Discuss common mistakes in manuscripts and code so you can avoid them
- Present CCSS services to assist your research, including Data Archiving and Replication Service
Instructor: Florio Arguillas
Pre-Requisites: None
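One small, automatable piece of the package preparation described above is a file manifest. The sketch below is illustrative only and is not the CCSS procedure: the function names `build_manifest` and `missing_items`, the file name `README.md`, and the folder names `data` and `code` are assumptions chosen for the example. It records a checksum for every file, so a replicator can confirm that files are unchanged, and it lists any expected top-level items that are absent.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(package_dir):
    """Return one {path, bytes, sha256} entry per file in the package."""
    root = Path(package_dir)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            entries.append({
                # Relative POSIX-style paths keep the manifest portable.
                "path": path.relative_to(root).as_posix(),
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    return entries


def missing_items(package_dir, required=("README.md", "data", "code")):
    """List expected top-level items (hypothetical names) that are absent."""
    root = Path(package_dir)
    return [name for name in required if not (root / name).exists()]


# Demonstration on a throwaway package that has no data folder yet.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "code").mkdir()
    (root / "README.md").write_bytes(b"How to run the analysis.\n")
    (root / "code" / "analysis.py").write_bytes(b"print('hello')\n")

    manifest = build_manifest(root)
    missing = missing_items(root)

for entry in manifest:
    print(entry["path"], entry["bytes"], entry["sha256"][:12])
print("missing:", missing)
```

Saving the manifest alongside the README gives reviewers a quick way to verify that the archived package matches what the authors ran.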
-
Qualitative Analysis Using Atlas.ti | September 22 | 1:30-4:00pm
Dates Offered:
- September 22, 1:30-4:00 pm. Register Here
Additional Details:
Atlas.ti is a powerful workbench for qualitative data analysis. No matter your field, Atlas.ti will meet your qualitative analysis needs. Sophisticated tools help you to arrange, reassemble, and manage your material in creative ways.
Instructor: Jacob Grippin
Pre-Requisites: None
Learning Objectives:
- Uploading files (transcripts, surveys, recordings, images, etc.) and creating Atlas.ti projects.
- Coding, identifying themes.
- Analyzing results. Creating organized reports.
- Saving your work. Collaborating as a team.
-
Qualitative Analysis using MaxQDA | October 20 | 1:30-4:00pm
Dates Offered:
- October 20, 1:30-4:00pm. Register Here
Additional Details:
MaxQDA is a qualitative and mixed-method analysis software package that has become increasingly popular here at Cornell. Unlike Atlas.ti and NVivo, its Mac and Windows versions are identical, allowing for seamless cross-platform integration. This workshop will cover understanding the MaxQDA environment, creating a project, adding and working with documents, coding and organizing the code system, memos, lexical search and autocoding, MaxDictio, retrieving coded segments, and reporting of results.
Workshop Topics:
- Overview of the MaxQDA environment
- Creating, saving, and exporting Projects
- Coding and identifying themes
- Creating reports
Instructor: Florio Arguillas
Pre-Requisites: None
-
Web Scraping | September 8 | 4:00-5:00pm
Dates Offered:
- September 8, 4:00-5:00pm. Register Here
Additional Details:
This series applies web scraping techniques in Python to extract online data for social science research. We continue by working on sample web scraping problems using BeautifulSoup or Selenium to hone our Python scraping skills for later use. Please bring your laptop with Anaconda installed so you can participate in solving the sample problems.
Instructor: Jacob Grippin
Requirements:
- Laptop with Anaconda installed (download Anaconda here) or
- Account on CCSS Cloud Computing Solutions. Apply here.
Pre-Requisites:
- Introductory Web Scraping knowledge (watch CCSS recording here)
- Introductory Python knowledge (Watch CCSS Python recording here)
-
Python | September 21 | 2:30-5:00pm
Dates Offered:
- Thursday September 21, 2:30-5:00pm. Register Here
Instructor: Jonathan Chang
Requirements:
- Laptop with Anaconda installed (download Anaconda here) or
- Account on CCSS Cloud Computing Solutions. Apply here.
Pre-Requisites:
- Introductory Python knowledge (watch CCSS recording here)
-
Stata | September 29 | 2:30-5:00pm
Dates Offered:
- Friday September 29, 2:30-5:00pm. Register Here
Instructor: Jacob Grippin
Requirements:
- Laptop (Register for the workshop to be given free access to Stata for use during the workshop) or
- Account on CCSS Cloud Computing Solutions. Apply here.
Pre-Requisites:
- Introductory Stata knowledge (watch CCSS recording here)
-
R | October 13 | 2:30-5:00pm
Dates Offered:
- Friday October 13, 2:30-5:00pm. Register Here
Instructor: Kanika Khanna
Requirements:
- Laptop with R installed (install: R Windows, R Mac, RStudio) or
- Account on CCSS Cloud Computing Solutions. Apply here.
Pre-Requisites:
- Introductory R knowledge (watch CCSS recording here)
Workshops
CCSS Workshops are given by our staff consultants, Senior Data Science Fellows, and Data Science Fellows. Learn more about our instructors here.
- Suggest a workshop by emailing socialsciences@cornell.edu
Training for Classes and Project Teams
Request Training
Training for Classes
We provide class training tailored to your group’s specific needs on topics related to:
- Data processing and management
- Use of qualitative software packages such as Atlas.ti, MaxQDA and NVivo
- Use of statistical software packages such as SAS, SPSS, Stata, and R
- Use of CCSS research and computing servers
Training for Project Teams
We provide just-in-time training for project teams in:
- Qualitative and quantitative research data management and processing
- Targeted training for a particular phase/time in the project team's research process where extra help is needed