Description of Courses

A Virtually Syntax Free Practical Introduction to Web Scraping for Survey and Social Science Researchers-July 9-10, 2020

SurvMeth 988.400 (0.5 credit hours). To take this class for UM credit, you must take both SurvMeth 988.400 and SurvMeth 988.500, An Introduction to Big Data and Machine Learning for Survey Researchers and Social Scientists, for a total of 1.0 credit hours.

Instructor: Trent D. Buskirk

This short course offers a very practical introduction to web scraping geared toward social scientists and survey researchers.  The course begins with an overview of web scraping that covers some basic technical jargon, types of web data and various methods for scraping.  Some websites are designed to be easily accessible by web crawlers or scraping algorithms, while others require much more advanced, custom programming.  We will illustrate how participants can discern these differences and present several motivating examples of the ways web-scraped data can be used throughout a study’s lifecycle, from design to calibration to analysis.  We provide an extensive introduction to a suite of freeware programs that allow virtually syntax-free, but customizable, web scraping.  The course concludes with a specific focus on the import.io tool: we demonstrate its capabilities and provide several hands-on, practical examples so that participants can begin scraping websites of increasing complexity.

If you wish to take this course for academic credit you must also enroll in An Introduction to Big Data and Machine Learning for Survey Researchers and Social Scientists.

Prerequisite: A trial import.io account (this is a 7-day trial, so please plan to have the license active during the course).  Details can be found here: https://www.import.io/signup/.

2019 Syllabus (PDF)

 


 

An Introduction to Big Data and Machine Learning for Survey Researchers and Social Scientists-July 15-16, 2020

SurvMeth 988.500 (0.5 credit hours). To take this class for UM credit, you must take both SurvMeth 988.500 and SurvMeth 988.400, A Virtually Syntax Free Practical Introduction to Web Scraping for Survey and Social Science Researchers, for a total of 1.0 credit hours.

Instructor: Trent D. Buskirk

The amount of data generated as a by-product in society is growing fast, including data from satellites, sensors, transactions, social media and smartphones, just to name a few. Such data are often referred to as "big data", and can be used to create value in different areas such as health and crime prevention, commerce and fraud detection.  An emerging practice in many areas is to append or link big data sources with more specific and smaller scale sources that often contain much more limited information.  This practice has been used for some time by survey researchers in constructing frames by appending auxiliary information that is often not directly available on the frame, but can be obtained from an external source.  Using Big Data has the potential to go beyond the sampling phase for survey researchers and in fact has the potential to influence the social sciences in general.  Big Data is of interest to public opinion researchers and agencies that produce statistics as an alternative data source that can reduce costs, improve estimates or produce estimates in a more timely fashion. However, Big Data poses several interesting and new challenges to survey researchers and social scientists, among others, who want to extract information from data. As Robert Groves (2012) pointedly commented, the era is “appropriately called Big Data and not Big Information”, because there is a lot of work for analysts before information can be gained from “auxiliary traces of some process that is going on in society.”

This course offers participants a broad overview of big data sources, opportunities and examples motivated within the survey and social science contexts, including the use of social media data, paradata and other such sources.  This course also offers a detailed, practical introduction to four common machine learning methods that can be applied to big and small data alike at various stages of a study’s lifecycle, from design to nonresponse adjustments to propensity score matching to weighting, evaluation and analysis.  The machine learning methods will be demonstrated in R, and we will provide several different examples of using these methods along with multiple R packages that offer them.

If you wish to take this course for academic credit you must also enroll in A Virtually Syntax Free Practical Introduction to Web Scraping for Survey and Social Science Researchers.

Prerequisite: Basic proficiency in R (i.e., how to install and load a package, plus basic R syntax knowledge).

2019 Syllabus (PDF)

 


 

Analysis Methods for Complex Sample Survey Data-July 6-31, 2020

SurvMeth 614 (3 credit hours)

Instructors: Yajuan Si, University of Michigan and Brady West, University of Michigan

This course provides an introduction to specialized software procedures that have been developed for the analysis of complex sample survey data. The course begins by considering the sampling designs of specific surveys: the National Comorbidity Survey-Replication (NCS-R), the National Health and Nutrition Examination Surveys (NHANES), and the Health and Retirement Study (HRS). Relevant design features of the NCS-R, NHANES and HRS include weights that take into account differences in probability of selection into the sample and differences in response rates, as well as stratification and clustering in the multistage sampling procedures used in identifying the sampled households and individuals.

Prerequisite: Two graduate-level courses in statistical methods, familiarity with basic sample design concepts, and familiarity with data analytic techniques such as linear and logistic regression.

Why take this course? 

  • To gain an understanding of modern methods and software for the secondary analysis of survey data collected from large complex samples
  • To have the opportunity for one-on-one interaction with the instructors when walking through analyses of survey data
  • To see several examples of applied statistical analyses of survey data
  • To have the experience of writing a scientific paper that presents an analysis of complex sample survey data, and getting expert feedback on that paper

2019 Syllabus (PDF)

 


 

Applied Sampling/Methods of Survey Sampling-June 8-July 3, 2020

SurvMeth 625 (3 credit hours)

Instructors: James Wagner, University of Michigan and Raphael Nishimura, University of Michigan

A fundamental feature of many sample surveys is a probability sample of subjects. Probability sampling requires rigorous application of mathematical principles to the selection process. Methods of Survey Sampling is a moderately advanced course in applied statistics, with an emphasis on the practical problems of sample design, which provides students with an understanding of principles and practice in skills required to select subjects and analyze sample data. Topics covered include stratified, clustered, systematic, and multi-stage sample designs, unequal probabilities and probabilities proportional to size, area and telephone sampling, ratio means, sampling errors, frame problems, cost factors, and practical designs and procedures. Emphasis is on practical considerations rather than on theoretical derivations, although understanding of principles requires review of statistical results for sample surveys. The course includes an exercise that integrates the different techniques into a comprehensive sample design.

Prerequisite: Two graduate-level courses in statistical methods.

2018 Syllabus (PDF)

 


Data Collection Using Wearables, Sensors, and Apps in the Social, Behavioral, and Health Sciences-July 13-14, 2020

SurvMeth 988.400 (1 credit hour)

Instructors: Heidi Guyer, University of Michigan and Florian Keusch, University of Mannheim

The recent proliferation of mobile technology allows researchers to collect objective health and behavioral data at increased intervals, in real time, and may also reduce participant burden. In this course, we will provide examples of the utility of and integration of wearables, sensors, and apps in research settings. Examples will include the use of wearable health devices to measure activity, apps for ecological momentary assessment, and smartphone sensors to measure sound and movement, among others. Additionally, this course will consider the integration of these new technologies into existing surveys and the quality of the data collected from the total survey error perspective. We will discuss considerations for assessing coverage, participation, and measurement error when integrating wearables, sensors, and apps in a research setting as well as the costs and privacy considerations when collecting these types of data. Participants will work in groups to discuss a research study design using new technology and have the opportunity for hands-on practice with sensor data.

Prerequisite: You must have your own laptop to participate in this class.

 

2019 Syllabus (PDF)

 


 

Introduction to Data Collection Methods-July 16-17, 2020

SurvMeth 988.225 (1 credit hour)

Instructor: Florian Keusch, University of Mannheim

This 2-day workshop will introduce students to different methods of collecting data in the social sciences. Surveys are the most common form of collecting primary data in many disciplines, and this course will provide students with an overview of interviewer-administered (face-to-face and telephone) and self-administered (mail, web, mobile web, and SMS) survey data collection as well as the combination of multiple modes (mixed-mode surveys). The course will in particular discuss the implications of survey design decisions for data quality. In addition, students will receive an overview of alternative data sources (e.g., passive measurement, social media and administrative data) and how they can be used in combination with traditional survey data.

2019 Syllabus (PDF)

 


 

Introduction to the Health and Retirement Study (HRS) Workshop-June 8-12, 2020

Location: Institute for Social Research, Room 1070

Dates:  June 8-12, 2020

Time:  9:00am-4:00pm, Monday-Friday

Not for credit

Instructor: Amanda Sonnega, University of Michigan

The Health and Retirement Study (hrsonline.isr.umich.edu) Summer Workshop is intended to give participants an introduction to the study that will enable them to use the data for research. HRS is a large-scale longitudinal study with more than 20 years of data on the labor force participation and health transitions that individuals undergo toward the end of their work lives and in the years that follow. The HRS Summer Workshop features morning lectures on basic survey content, sample design, weighting, and restricted data files. Hands-on data workshops are held every afternoon in which participants learn to work with the data (including the user-friendly RAND version of the HRS data) under the guidance of HRS staff. Staff of the Gateway to Global Aging project (G2Aging.org), which harmonizes data across HRS international sister studies, conduct an afternoon training. At the end of the week, students have the opportunity to present their research ideas to the class and HRS research faculty and obtain feedback. Topics include (but are not limited to) in-depth information on HRS data about health insurance and medical care; biomarkers, physical measures, and genetic data; cognition; health and physical functioning; linkage to Medicare; employment, retirement, and pensions and linkage to Social Security records; psychosocial factors and well-being; family data; and international comparison data. The data training portion assumes some familiarity with SAS or Stata.

2018 Syllabus (PDF)

 


Introduction to Questionnaire Design-July 13-14, 2020

SurvMeth 988.223 (1 credit hour)

Instructor:  Jessica Broome, Jessica Broome Research

This course provides an overview of the art and science of questionnaire design. Topics will include basic principles of questionnaire design; factual and non-factual questions; techniques for asking about sensitive topics; designing scales and response options; survey mode considerations; and an introduction to pre-testing surveys. The course will consist of both lectures and hands-on activities.

2018 Syllabus (PDF)

 


 

Introduction to Survey Methodology-July 6-7, 2020

SurvMeth 988.208 (1 credit hour)

Instructor: Emilia Peytcheva, RTI International

This 2-day course will introduce participants to the basic principles of survey design, presented within the Total Survey Error framework.  The course provides an introduction to the skills and resources needed to design and conduct a survey, covering topics such as sampling frames and designs, mode of data collection and their impact on survey estimates, cognitive processes involved in answering survey questions, best questionnaire design practices, and pretesting methods.

2018 Syllabus (PDF)

 


 

Introduction to Survey Sampling-July 9-10, 2020

SurvMeth 988.219 (1 credit hour)

Instructor: Sunghee Lee, University of Michigan

This is a foundation course in sample survey methods and principles.  The instructors will present, in a non-technical manner, basic sampling techniques such as simple random sampling, systematic sampling, stratification, and cluster sampling.  The instructors will provide opportunities to implement sampling techniques in a series of exercises that accompany each topic. 

Participants should not expect to obtain sufficient background in this course to master survey sampling.  They can expect to become familiar with basic techniques well enough to converse with sampling statisticians more easily about sample design.

 


 

Responsive Survey Design:  A Research Education Program-June 22-July 17, 2020

Below is the list of the individual courses offered as part of the RSD Program.

For more information on this program, please visit the RSD Program web site: https://rsdprogram.si.isr.umich.edu/

Not for academic credit workshop (*Remote participation option ONLY)

RSD has financial support available to those who qualify

Responsive survey design (RSD) refers to a method for designing surveys that has been demonstrated to increase the quality and efficiency of survey data collection. RSD uses evidence from early phases of data collection to make design decisions for later phases. Beginning in the 2018 Summer Institute, we will offer a series of eleven one-day short courses in RSD techniques.

*Remote participation option: It is not necessary to be physically in Ann Arbor to participate in these workshops. Students who cannot be in Ann Arbor can enroll and join sessions via BlueJeans (https://www.bluejeans.com/).  Once enrollment is confirmed via email, indicate whether you will attend in person in Ann Arbor or via BlueJeans.  Survey Methodology for Randomized Controlled Trials does not have the remote participation option.

These courses will include:

1.  Basic Concepts and Theoretical Background-June 29, 2020

Instructors: James Wagner, Brady West, Andy Peytchev and Frauke Kreuter

This course will provide participants with an overview of the primary concepts underlying RSD. This will include discussion of the uncertainty in survey design, the role of paradata, or data describing the data collection process, in informing decisions, and potential RSD interventions. These interventions include timing and sequence of modes, techniques for efficiently deploying incentives, and combining two-phase sampling with other design changes. Interventions appropriate for face-to-face, telephone, web, mail and mixed-mode surveys will be discussed. Using the Total Survey Error (TSE) framework, the main concepts behind these designs will be explained with a focus on how these principles are designed to simultaneously control survey errors and survey costs. Examples of RSD in both large and small studies will be provided as motivation.  Small group exercises will help participants to think through some of the common questions that need to be answered when employing RSD.

 

2.  Case Studies in Responsive Design Research-July 1, 2020

Instructors: Brady West, William Axinn, Joe Murphy and Barry Schouten

This course will explore several well-developed examples of RSD. Dr. West will serve as a moderator of the course, and also introduce a case study from the National Survey of Family Growth (NSFG). The instructors will then provide independent examples of the implementation of RSD in different international surveys. All case studies will be supplemented with discussions of issues regarding the development and implementation of RSD. Case studies will include the NSFG, the Relationship Dynamics and Social Life (RDSL) survey, the University of Michigan Campus Climate (UMCC) Survey, and the Netherlands Survey of Consumer Satisfaction, among others. This variety of case studies will reflect a diversity of survey conditions. The NSFG (West) is a cross-sectional survey that is run on a continuous basis with in-person interviewing. The RDSL (Axinn) is a panel survey that employed a mixed-mode approach to collecting weekly journal data from a panel of young women. The UMCC survey is a web survey of students at UM that employed multiple modes of contact across the phases of the design. The Netherlands Survey of Consumer Satisfaction (Schouten) is a mixed-mode survey combining web and mail survey data collection with telephone interviewing. The focus of the course will be on practical tools for implementing RSD in a variety of conditions, including small-scale surveys.

 

3.  Responsive Survey Design for Web Surveys-July 3, 2020

Instructors: Stephanie Coffey, Scott Crawford and Julie Smith

Topics covered: Web surveys can be an inexpensive method for collecting data. This is especially true for designs that repeat measurement over several time periods. However, these relatively low-cost data collections may result in reduced data quality if the problem of nonresponse is ignored. This course will examine methods for using RSD to effectively deploy scarce resources in order to minimize the risk of nonresponse bias. Recent experience with the University of Michigan Campus Climate Survey and the National Survey of College Graduates is used to illustrate this point. These surveys are defined by phased designs and multiple modes of contact. This approach produced relatively high response rates and used alternative contact methods in later phases to recruit sample members from subgroups that were less likely to respond in earlier phases. In the case of the UM-CCS all of this was accomplished on a very small budget and with a small management team. Lessons from these experiences can be directly applied in many similar settings.

 

4.  Data Visualization for Active Monitoring-July 6 and 8, 2020

Instructors: Brad Edwards and Victoria Vignare

Topics covered: This course will cover basic concepts for the design and use of “dashboards” for monitoring survey data collection. We will begin with a detailed discussion of how to design dashboards from an RSD perspective. This will include concrete discussions of how relevant data may be collected and summarized across a variety of production environments. We will also discuss how these dashboards can be used to implement RSD interventions on an ongoing basis. We will demonstrate these points using examples from actual dashboards. We will briefly explore methods for modeling incoming paradata in order to detect outliers. In the afternoon, we will consider practical issues associated with the development of dashboards, including software alternatives. Finally, we will demonstrate how to update dashboards using data reflecting the results of ongoing fieldwork. Students will be provided with template spreadsheet dashboards as discussed earlier.

 

5.  Alternative Indicators Designed to Maximize Data Quality-July 10 and 13, 2020

Instructors: Barry Schouten and Natalie Shlomo

Topics covered: The response rate has been shown to be a poor indicator of data quality with respect to nonresponse bias. Several alternatives have been proposed, including the fraction of missing information (FMI), R-indicators and subgroup response rates. This course will explore the use of these indicators as guides for data collection when working within an RSD framework. We also explore optimization techniques that may be useful when designing a survey to maximize these alternative indicators. The consequences of optimizing a survey to other indicators will be explored. We will also consider how the response rate fits into this approach. We will end with a brief discussion of methods for post-data-collection evaluation of data quality.

6.  Implementing, Managing, and Analyzing Interventions in a Responsive Survey Design Framework-July 15 and 17, 2020

Instructor: Brady West  

Topics covered: This course will discuss a variety of potential RSD interventions. Many of these have been implemented experimentally, and the course will include evaluations of those experiments. The importance of experimental evaluations in early phases of RSD will be discussed. Methods for implementing interventions will also be discussed, including implementation of experiments aimed at evaluating new interventions. Strategies for implementing these interventions with both interviewer-mediated and self-administered (e.g., web and mail) surveys will be discussed. Methods for the evaluation of the results of the interventions (experimental and otherwise) will be considered. These evaluations will include measures of both costs and errors.

 


 

Workshop in Survey Sampling Techniques-June 3-July 31, 2020

SurvMeth 616 (6 credit hours)

Instructors: Steve Heeringa, Jim Lepkowski and Raphael Nishimura, University of Michigan

The Workshop in Sampling Techniques is a component of the Sampling Program for Survey Statisticians. The workshop can only be taken in conjunction with the sampling methods courses, Methods of Survey Sampling and Analysis of Complex Sample Survey Data. The workshop allows students the opportunity to implement methods studied in the companion methods courses such as segmenting and listing in area sampling; selection of a national sample of the U.S.; stratification; controlled selection; telephone sampling; national samples for developing countries; and sampling with microcomputers.

The workshop is a required class for the Sampling Program for Survey Statisticians (SPSS). The SPSS is an eight-week program. It consists of three courses: a methods course (SurvMeth 612), a course on the analysis of complex sample survey data (SurvMeth 614), and a hands-on daily workshop (SurvMeth 616). Students enrolled in these three courses are considered Fellows in the Program. The methods and the analysis courses may be taken without being a Fellow. However, the workshop cannot be taken alone. Fellows receive a certificate upon successful completion of the program.

2018 Syllabus (PDF)