Methods Bites

Blog of the MZES Social Science Data Lab

Events

Upcoming
2021-12-08 | Input Talk | Denis Cohen (MZES)
Getting the most out of comparative vote switching data: A new framework for studying dynamic multi-party competition

Online-only event [Zoom Meeting]
December 08, 2021, 13:45-15:15

Abstract

Large literatures on party competition and voting behavior focus on voter reactions to parties' policy strategies, agency, or legislative performance. While many inquiries make explicit assumptions about the direction and magnitude of voter flows between parties, comparative empirical analyses of vote switching remain rare. In this talk, I present a new approach that overcomes three challenges that have previously impeded the comparative study of dynamic party competition based on voter flows: A newly compiled data set that marries comparative vote switching data with information on party behavior and party systems in over 200 electoral contexts across 36 OECD countries, a novel conceptual framework for studying how party behavior affects voter retention, defection, and attraction in multi-party systems, and a statistical model that renders this framework operable. An applied walkthrough showcases the data set and a newly developed R package for the estimation of the newly developed statistical model, along with functions for the calculation and visualization of substantively meaningful quantities of interest.

Presenter(s)

Denis Cohen is a postdoctoral fellow in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim. His research focus lies at the intersection of political preference formation, electoral behavior and political competition. His methodological interests include quantitative approaches to the analysis of clustered data, measurement models, data visualization, strategies for causal identification, and Bayesian statistics.


2021-11-24 | Input Talk | Fabienne Lind (University of Vienna)
Multilingual Automated Text Analysis for Comparative Social Science Research

Online-only event [Zoom Meeting]
November 24, 2021, 13:45-15:15

Abstract

tba

Presenter(s)

Fabienne Lind is a research associate at the Department of Communication at the University of Vienna as a part of the H2020 project OPTED. Her research interests include political communication and quantitative methods with a focus on quantitative text analysis.


2021-11-03 | Roundtable | Ruben Bach, Jörg Dollmann, Jennifer Eck, Alejandro Ecker, Johanna Gereke
MZES Roundtable "Collection of Micro-level Data"

Internal online-only event. Open to MZES members and external MZES fellows only.
November 03, 2021, 13:45-15:15

Abstract

tba

Presenter(s)

Ruben Bach is a postdoctoral researcher at the University of Mannheim, focusing on social science quantitative research methods. His interests include topics related to big data in the social sciences, machine learning, causal inference, and survey research.

Jörg Dollmann is a research fellow at the Mannheim Centre for European Social Research (MZES) and the project coordinator of the panel survey CILS4EU-DE.

Jennifer Eck is a social psychologist at the University of Mannheim, School of Social Sciences. Her research interests include social exclusion, assimilation and contrast, as well as self-concept.

Alejandro Ecker is Assistant Professor in Politics and Communication in Ibero-America at the Heidelberg Center for Ibero-American Studies (HCIAS) and the Faculty of Economics and Social Sciences at Heidelberg University. Combining observational data with experimental and machine learning methods, his research focuses on the effects of political institutions on the behavior of multiparty governments, political parties, and individual politicians and their consequences for citizen behavior and voter attitudes.

Johanna Gereke is a postdoctoral research fellow at the Mannheim Centre for European Social Research (MZES). Her current research focuses on intergroup relations, migration, discrimination and cooperative behavior in modern societies and draws on a range of experimental and quasi-experimental methods, including original lab-in-the-field, survey and field experiments.


2021-10-13 | Input Talk | Will Lowe (Hertie School of Governance)
tba

Online-only event [Zoom Meeting]
October 13, 2021, 13:45-15:15

Abstract

tba

Presenter(s)

Will Lowe is Senior Research Scientist at the Hertie School. His research spans legislative politics, political economy, and public policy. Methodologically he is interested in statistical models of text and in causal inference.


2021-09-22 | Input Talk | Sarah Shugars (New York University)
Networks All the Way Down: Assessing Modeling Choices for Political Conversation

Online-only event [Zoom Meeting]
September 22, 2021, 13:45-15:15

Abstract

Political conversations, whether online or in person, are networked along multiple dimensions: people come into contact with each other through social networks, they spread messages and ideas using semantic networks, and conversational interactions themselves form a network of back-and-forth exchange. Each of these networked dimensions can be valuable in understanding the political implications of discourse and for developing appropriate interventions around the spread of misinformation and toxic speech. Yet it is rarely practical or meaningful to consider all of these networks simultaneously. Indeed, most studies focus on a single type of social, semantic, or conversational network and make explicit choices about the content of interest and the types of relationships examined. Research on Twitter, for example, may consider social networks formed by follower relationships, semantic networks formed by hashtag co-occurrence, or conversational networks of replies and interactions. Each of these networks is meaningful in its own right, but only captures a piece of the larger public discourse. This paper therefore examines the network modeling choices researchers must make when studying political conversations. Using diverse corpora including Twitter exchanges, Reddit threads, and U.S. Congressional debates, we present a framework for modeling the social, semantic, and conversational networks of political discourse in a range of contexts. We illustrate what can and cannot be inferred from individual network models, and assess the sensitivity of findings to various modeling choices. Ultimately, this paper presents a roadmap to assist researchers in identifying the network models most appropriate for different research questions related to political discourse.

Presenter(s)

Sarah Shugars is a computational political scientist, studying American political behavior and developing new methods in natural language processing, network analysis, and machine learning.


2020-2021
2021-06-16 | Input Talk | Anna M. Wilke (UC Berkeley)
Detecting Intra-Cluster Spillovers Using a Placebo-Controlled Design

Online-only event [Zoom Meeting]
June 16, 2021, 15:30-17:00

Abstract

Questions about the degree to which treatment effects diffuse through social networks are of great policy relevance. For example, philanthropic groups routinely deploy media interventions in developing countries to promote pro-social attitudes. Even though studies have shown that such interventions can change the minds of those who are directly exposed to media content, less is known about the existence of second-hand or spillover effects. If audience members convey the media message to others in their social network, the media campaign’s reach expands, perhaps by a sizable factor. Scholarly interest in such spillover effects has grown markedly in recent years, and experimental designs to detect them have become increasingly sophisticated. In this talk, I present a design-based strategy to assess intra-cluster diffusion of treatment effects in cluster-randomized trials. The key design innovation is a placebo condition that helps reveal the degree to which experimental subjects would have been exposed to treatment had they been assigned to it. I contrast the approach with other design-based ways to identify spillover effects and present results from two large cluster-randomized experiments that implement this strategy. Both studies are set in rural Uganda and assess the effect of video dramatizations on the topics of violence against women, teacher absenteeism and abortion stigma. We find several instances of sizable and highly significant direct effects on the attitudes of audience members, but little evidence that these effects diffused to others in the villages where the videos were aired. A paper that employs the proposed design can be found here.

Presenter(s)

Anna M. Wilke is a Ph.D. Candidate in Political Science at Columbia University and a Predoctoral Fellow at the University of California, Berkeley. Her work focuses on the comparative politics of developing countries, mainly in Sub-Saharan Africa. She has conducted field work in South Africa, Uganda and Ethiopia and employs experimental methods and formal theory in her research.

Materials

workshop materials
video recording


2021-05-26 | Workshop | Yannik Buhl (Stuttgarter Zeitung)
Telling Stories with Data: Insights into Data Journalism

Online-only event [Zoom Meeting]
May 26, 2021, 13:45-15:15

Abstract

Data journalism is all about using and presenting data in a way that readers intuitively understand. In this event, we will discuss examples of data-driven stories to demonstrate how journalists tell stories using data, what stands in the way of a good data-driven story, and what scientists and journalists can learn from each other about data storytelling.

Presenter(s)

Yannik Buhl is a data journalist at Stuttgarter Zeitung and Stuttgarter Nachrichten, where he writes primarily on mobility turnaround and transport policies using intuitive visualisations (example 1, example 2). Prior to joining the Stuttgarter Zeitung and Stuttgarter Nachrichten, he obtained his MA in Political Science with a focus on quantitative methods from the University of Mannheim.

Materials

workshop materials
video recording


2021-05-05 | Workshop | Marius Sältzer (University of Mannheim)
How to Read Tea Leaves: A hands-on Guide for Semantic Validation of Text Models using Oolong

Online-only event [Zoom Meeting]
May 05, 2021, 13:45-15:15

Abstract

The growing supply of unstructured text is a great opportunity, but also a challenge for social science. In many instances we want to classify, scale or compare text for which no prelabeled data is available. In this case, unsupervised learning techniques such as topic models or the use of dictionaries promise the automated analysis of text with little or no human input. But these models are notoriously difficult to evaluate. While the validation of statistical properties of topic models is well established, the substantive meaning of the categories uncovered is often less clear and their interpretation reliant on "intuition" or "eyeballing". Computer science scholars call it "reading tea leaves". The story for dictionary-based methods is not better. Researchers usually assume these dictionaries have built-in validity and use them directly in their research. Oolong gives applied users in disciplines such as political science and communication science a set of tools to objectively judge substantive interpretability. It allows standardized content-based testing of topic models as well as dictionary-based methods with clear numeric indicators of semantic validity. This session is a hands-on guide on how to create and administer your own tests.

Presenter(s)

Marius Sältzer is a doctoral researcher in political science at the University of Mannheim. His research revolves around the dimensions of political conflict, e.g., the question of which issues matter for the public, political parties and their constituencies. To answer these questions, he studies political communication of legislators, parties and other key political actors, with a special emphasis on political elites' use of social media.

Materials

workshop materials
video recording


2021-04-14 | Workshop | Frie Preu (CorrelAid)
Why to use Git and Git essentials workshop: An argument for adopting Git + GitHub/GitLab for academic research followed by a getting started workshop

Online-only event [Zoom Meeting]
April 14, 2021, 13:45-15:15

Presenter(s)

Frie Preu is a data scientist, a low-budget data engineer and COO of CorrelAid, a data4good network of over 1500 data scientists. Before, she studied political science and data science at the University of Konstanz and worked in IT consulting.

Materials

workshop materials
video recording


2021-03-17 | Workshop | Sara Stoudt (Smith College)
Generalized Additive Models: Allowing for some wiggle room in your models

Online-only event [Zoom Meeting]
March 17, 2021, 13:45-15:15

Abstract

In this workshop, we'll unpack GAMs as an extension of generalized linear models, learn about the role of splines in these models, and explore the many choices available to define and fit these models. We'll be using data on traffic stops to investigate racially-biased policing in South Carolina as a motivating example, and we'll get a chance to try out the related R code so that you have the basic tools needed to try out GAMs in your own research context.

Presenter(s)

Sara Stoudt is a lecturer in the Statistical & Data Sciences program at Smith College. She received her PhD in statistics from the University of California, Berkeley, where she was also a Berkeley Institute for Data Science fellow, and her BA in Mathematics with an emphasis on Statistics from Smith College. Her research focuses on ecological applications of statistics and statistics communication.

Materials

workshop materials
video recording
blog post


2021-03-03 | Workshop | Julia Schulte-Cloos (LMU Munich)
Reproducible and Dynamic Documents with RMarkdown

Online-only event [Zoom Meeting]
March 03, 2021, 13:45-15:15

Abstract

As demands for computational reproducibility in science are increasing, tools for literate programming are becoming ever more relevant. R Markdown offers a framework to generate reproducible research in various output formats. I present a new package (reproducr) that allows users without any prior knowledge of R Markdown to implement reproducible research practices in their scientific workflows. The reproducr package provides a single Rmd-template that is fully optimized for two different output formats, HTML and PDF. While in the stage of explorative analysis and when focusing on content only, researchers may rely on the ‘draft mode’ of the template that knits to HTML and allows them to interactively explore their data. When in the stage of research dissemination and when focusing on the presentation of results, in contrast, researchers may rely on the ‘manuscript mode’ that knits to PDF and allows them to circulate a publication-ready version of their working paper or submit it (blinded) for review.

Presenter(s)

Julia Schulte-Cloos is a Marie Sklodowska-Curie funded LMU Research Fellow at the Geschwister Scholl Institute of Political Science at LMU Munich. Her research lies at the intersection of comparative politics, political sociology and socio-psychology. An advocate of open science, she is a member of the LMU Open Science Center and part of the catalyst network of the Berkeley Initiative for Transparency in the Social Sciences (BITSS).

Materials

workshop materials


2020-12-09 | Input Talk | Carsten Sauer (Zeppelin University)
Factorial Survey Designs

Online-only event
December 09, 2020, 13:45-15:15

Abstract

The factorial survey (vignette analyses) is a method that integrates multi-factorial experimental designs into surveys. Respondents are asked to evaluate fictitious situations, objects or persons. By systematically varying attributes of the descriptions it is possible to determine their influence on respondents' stated attitudes, decisions, or choices. Due to experimental variation of stimuli researchers can estimate the influence of each attribute on respondents' evaluations. As the experiment is embedded in a survey questionnaire, it is possible to reach heterogeneous sample populations. This workshop provides insights into the steps that are necessary to design factorial survey experiments: (1) construction of vignettes and response scales, (2) selection of an experimental design, (3) programming of vignettes for implementation into questionnaires, (4) data management, (5) data analysis techniques. The workshop furthermore discusses (6) methodological issues and best practices and shows similarities and differences to (7) related methods like conjoint analysis and choice experiments.
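
Steps (1) and (2) above can be illustrated with a minimal sketch, which is not taken from the workshop materials. It builds the full set of vignettes from a few attribute dimensions and then draws a random sample for one questionnaire. The dimension names, their levels, the deck size, and the helper names `full_factorial` and `draw_deck` are all hypothetical, and simple random sampling is only one of several possible experimental designs.

```python
import itertools
import random

# Hypothetical vignette dimensions and levels, for illustration only.
dimensions = {
    "gender": ["female", "male"],
    "experience": ["2 years", "10 years", "20 years"],
    "income": ["1500 EUR", "3000 EUR", "4500 EUR"],
}


def full_factorial(dims):
    """Return every combination of attribute levels as a list of dicts."""
    names = list(dims)
    return [dict(zip(names, combo)) for combo in itertools.product(*dims.values())]


def draw_deck(universe, size, seed):
    """Draw a random sample of vignettes without replacement."""
    rng = random.Random(seed)
    return rng.sample(universe, size)


universe = full_factorial(dimensions)
deck = draw_deck(universe, size=6, seed=1)

print(len(universe))  # 2 * 3 * 3 = 18 possible vignettes
for vignette in deck:
    print(vignette)
```

Because every attribute varies independently of the others in the full vignette universe, the influence of each attribute on respondents' evaluations can in principle be estimated separately.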

Presenter(s)

Carsten Sauer is Full Professor of Sociology and Social Stratification in the Department of Political & Social Sciences at Zeppelin University, Friedrichshafen, Germany. His substantive work focuses on social inequality, social stratification, labor markets, empirical justice research, organizations, and health. Methodologically, he is interested in quantitative methods, survey experiments, and longitudinal data analysis.

Materials

workshop materials
video recording


2020-11-18 | Input Talk | Christine Choirat (Harvard) & Emma Jablonski (UC San Diego)
Enabling Collaborative and Reproducible Data Science with the Renku Platform

Online-only event
November 18, 2020, 15:30-17:00

Abstract

Communities and funding sources are increasingly demanding reproducibility in scientific work. There are now a variety of tools available to support reproducible data science, but choosing and using one is not always straightforward. In this tutorial, we present RENKU: an open-source platform integrating Git, Jupyter/RStudio Server, Docker, and analysis workflows linked with a queryable knowledge graph.

Presenter(s)

Christine Choirat is the Chief Innovation Officer of the Swiss Data Science Center and an Adjunct Lecturer on Biostatistics at the Harvard T.H. Chan School of Public Health and at the Harvard Extension School. Her research interests are data science and high-performance computing, reproducible research, and environmental policy and health policy.

Emma Jablonski is a doctoral researcher in the History of Science Program at UC San Diego, CA, USA. Previously, she worked on systems to facilitate computational molecular dynamics research at D. E. Shaw Research and on exoplanet climate modeling in the astrobiology group at NASA GISS, both in New York City. Her research interests include networks and complexity as applied to life in the universe and also to the flow of scientific information through academia and society.

Materials

workshop materials


2020-11-04 | Input Talk | Marcel Neunhoeffer (University of Mannheim)
Generative Adversarial Nets for Social Scientists

Online-only event
November 04, 2020, 13:45-15:15

Abstract

In this talk I introduce Generative Adversarial Networks (GANs) for Social Scientists. GANs are an innovative neural network architecture where two neural networks adversarially learn arbitrary target distributions. A Generator network learns to produce simulated samples that mimic real data. At the same time, a Discriminator network learns to distinguish between real and simulated data. A GAN is successful in producing simulated data if a Discriminator is maximally uncertain about the origins of the data (real or simulated). GANs achieve impressive results in producing synthetic samples from complex data like images (e.g. cats, faces) or audio data (e.g. voices, songs). In this talk, I introduce current applications of GANs and present my work on their use for Social Science research. In particular, I will cover applications to Multiple Imputation, Small Area Estimation and the Generation of fully Synthetic Data. All applications will be accompanied by hands-on code examples.
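
The "maximally uncertain" criterion can be made concrete with a small numerical sketch, which is not part of the talk's code examples. For the standard GAN objective with equal shares of real and simulated samples, the ideal Discriminator outputs p_real(x) / (p_real(x) + p_fake(x)). The normal distributions, their parameters, and the function names below are assumptions chosen purely for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def optimal_discriminator(x, real, fake):
    """Probability that x is real under the ideal Discriminator.

    `real` and `fake` are (mean, sd) tuples describing the real-data and
    Generator distributions; both are illustrative assumptions.
    """
    p_real = normal_pdf(x, *real)
    p_fake = normal_pdf(x, *fake)
    return p_real / (p_real + p_fake)


# Poor Generator: simulated data are centred far away from the real data,
# so the Discriminator is confident that x = 0 is real.
print(optimal_discriminator(0.0, real=(0.0, 1.0), fake=(3.0, 1.0)))

# Perfect Generator: both distributions coincide, so the output is 0.5.
print(optimal_discriminator(0.0, real=(0.0, 1.0), fake=(0.0, 1.0)))
```

When the two distributions coincide, the ideal Discriminator returns 0.5 at every point, which is the sense in which it can no longer tell real from simulated data.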

Presenter(s)

Marcel Neunhoeffer is a PhD Candidate and Research Associate at the chair of Political Science, Quantitative Methods in the Social Sciences, at the University of Mannheim. His research focuses on political methodology, specifically on the application of deep learning algorithms to social science problems. His substantive interests include data privacy, political campaigns, and forecasting elections.

Materials

workshop materials
video recording


2020-10-21 | Input Talk | Stefan Jünger (GESIS)
Management and Analysis of Georeferenced Survey Data

Online-only event
October 21, 2020, 13:45-15:15

Abstract

Geospatial methods have become an emerging field in social science survey research where Geographic Information Systems (GIS) facilitate enriching individual-level survey data with auxiliary geospatial information, such as road traffic noise. This development is due to researchers' increased general interest in the questions that can be answered via these methods, but also to the growing availability of data. However, applying GIS in social science survey research remains challenging, as it requires new analytical skills from diverse and unfamiliar disciplines, such as ecology and engineering. Data management issues, including technical procedures, data protection, and access to georeferenced survey data, must be resolved. Lastly, researchers are confronted with how these additional data build upon existing knowledge within the social sciences. In my talk, I give a general overview of the data management challenges of using GIS in social science survey research, including organizational, technical, and legal barriers. I show examples of different GIS methods for enriching survey data, and further demonstrate their analysis in 'real-life' social science applications. Nowadays, we can perform all these steps in the statistical language R. Since providing public access to georeferenced survey data is a bit tricky, I conclude with a small hands-on tutorial on wrangling geospatial data in R for creating maps.

Presenter(s)

Stefan Jünger is a postdoctoral researcher at GESIS, Leibniz Institute for the Social Sciences, where he provides services in the area of geocoding, georeferencing, and spatial linking. He is also deputy head of the GESIS Secure Data Center. His research focuses on the use, analysis and management of georeferenced data in social science (survey) research.

Materials

workshop materials



2019-2020
2020-05-12 | Workshop | Theresa Küntzler (University of Konstanz)
Extracting Emotions (and more) from Faces with Face++ and Microsoft Azure

Online-only event
May 12, 2020, 12:00-13:30

Abstract

Images are an increasingly used data source in the social sciences. One application is to extract features from human faces using machine learning algorithms. This workshop will provide a guide on how to use APIs for this task, specifically how to access the services offered by Face++ and the Microsoft Face API. While the talk focuses on extracting emotions from facial expressions, the method can also be used for other variables of interest such as gender or age. The talk starts with a short introduction on why we should care about emotions in social sciences, why APIs are useful for the task of facial expression recognition and where to apply caution with this method. The main part will be a walkthrough, to show (1) how to gain API access credentials, (2) how to call the API from R and (3) how to handle the output.

Presenter(s)

Theresa Küntzler is a PhD candidate at the Graduate School of Decision Sciences at the University of Konstanz. She specializes in information processing and statistical analysis. Her research focuses on the role of emotions in politics, especially in election behaviour.

Materials

workshop materials
video recording
blog post


2020-04-21 | Workshop | Lisa Lechner (University of Innsbruck)
Inferential Network Analysis (and Big Data): Challenges and Opportunities

Online-only event
April 21, 2020, 12:00-13:30

Abstract

Why should we take networks seriously? What are the gains and sacrifices of analysing our social world from a network perspective? The talk starts off with these general questions and continues with a brief navigation through the methodological world of network analysis. A hands-on guide (in the statistical software R) on how to create, describe, and test patterns in networks follows. The last part of the talk is dedicated to current challenges in network analysis. This encompasses the lack of good and interesting theories in network studies as well as the challenge (and opportunity) of estimating inferential network models on large datasets.

Presenter(s)

Lisa Lechner is Assistant Professor in Political Science Methodology at the University of Innsbruck. Her research interests are trade policy, tax policy, diffusion, and issue-linkage. Her methodological expertise includes automatic text analysis and network analysis.

Materials

workshop materials
video recording


2020-03-31 | Input Talk | Hendrik Winkhardt (University of Mannheim)
Remote Computing Services: bwCloud, bwHPC, and Beyond

Online-only event
March 31, 2020, 12:00-13:30

Abstract

Demand for computational resources in the social sciences increases steadily, often to the point where local solutions are no longer sufficient. Reasons for this may be long runtimes and large memory requirements, the need for specific hardware, the demand for an optimized software stack, or the desire to parallelize applications over a large number of cores. We will offer an introduction to bwHPC, the federal supercomputing project of the universities in Baden-Württemberg, as well as a comparison to cloud solutions. Specifically, we will focus on the architecture and use of the bwForCluster MLS&WISO, located in Mannheim and Heidelberg. The talk will be aimed at entry-level users, as the system and its logic can be difficult to understand for newcomers, but will leave room for advanced questions.

Presenter(s)

Hendrik Winkhardt is an IT staff member at the University of Mannheim where he works for the bwHPC-S5 project "High Performance Computing in Baden-Württemberg".

Materials

workshop materials
video recording


2020-02-18 | Workshop | Denis Cohen, Cosima Meyer, Marcel Neunhoeffer, Oliver Rittmann
Efficient Data Management in R

Room A-231, A5, 6, 68159 Mannheim
February 18, 2020, 12:00-13:30

Abstract

The software environment R is widely used for data analysis and data visualization in the social sciences and beyond. Additionally, it is becoming increasingly popular as a tool for data and file management. Focusing on the latter aspects, we present workflows and best practices for efficient data management in R. Through applied exercises and walkthroughs, participants will learn about (1) the workflow for organizing and conducting complex analyses in R, (2) creating, editing, and accessing directory hierarchies and their contents, (3) data merging, data management and data manipulation using tidy R and base R, and (4) the basics of programming and debugging.

Presenter(s)

Denis Cohen is a postdoctoral fellow in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim, and one of the organizers of the MZES Social Science Data Lab. His research focus lies at the intersection of political preference formation, electoral behavior, and political competition. His methodological interests include quantitative approaches to the analysis of clustered data, measurement models, data visualization, strategies for causal identification, and Bayesian statistics.

Cosima Meyer is a doctoral researcher and lecturer at the University of Mannheim and one of the organizers of the MZES Social Science Data Lab. Motivated by the continuing recurrence of conflicts in the world, her research interest in conflict studies became increasingly focused on post-civil war stability. In her dissertation, she analyzes leadership survival, in particular in post-conflict settings. Using a wide range of quantitative methods, she further explores questions on conflict elections, women's representation as well as autocratic cooperation.

Marcel Neunhoeffer is a PhD Candidate and Research Associate at the chair of Political Science, Quantitative Methods in the Social Sciences, at the University of Mannheim. His research focuses on political methodology, specifically on the application of deep learning algorithms to social science problems. His substantive interests include data privacy, political campaigns, and forecasting elections.

Oliver Rittmann is a PhD Candidate and Research Associate at the chair of Political Science, Quantitative Methods in the Social Sciences, at the University of Mannheim. His research focuses on legislative studies and political representation. His methodological expertise includes statistical modeling, automated text and video analysis, and subnational public opinion estimation.

Materials

workshop materials
blog post


2019-12-10 | Input Talk | Ruben Bach (University of Mannheim)
Using Web Logs and Smartphone Records for Social Research

Room A-231, A5, 6, 68159 Mannheim
December 10, 2019, 12:00-13:30

Abstract

In this talk, I will demonstrate how web logs (records of individuals' browsing behavior) and records of smartphone use can be used for social research, for example, to study political views and behaviors. First, I will discuss how to obtain such data and how to extract information about individuals' behavior from web logs. Second, I will present results of my own work (predicting political views and behaviors from web logs) and from other studies that work with similar data (e.g., studies of political polarization and echo chambers in the online world). I will conclude the talk with a short overview of ongoing projects and potentials for future research projects.

Presenter(s)

Ruben Bach is a postdoctoral researcher at the University of Mannheim, focusing on social science quantitative research methods. His interests include topics related to big data in the social sciences, machine learning, causal inference, and survey research.

Materials

video recording
blog post


2019-11-26 | Workshop | Cosima Meyer (University of Mannheim) & Dennis Hammerschmidt (University of Mannheim)
Introduction to LaTeX and Overleaf

Room A-231, A5, 6, 68159 Mannheim
November 26, 2019, 12:00-13:30

Abstract

The LaTeX workshop offers an introduction, hands-on practice and a template for scientific articles. The aim is to provide the participants with sufficient knowledge of the general set-up of LaTeX to write (future) papers and to cope with common problems. We cover the LaTeX environment, including packages, structure, and commands. This allows participants to substantially improve their academic workflow. We further provide a template made specifically for this workshop that participants can later use to get started easily with their projects in LaTeX.

Presenter(s)

Cosima Meyer is a PhD candidate at the Doctoral Center in Social and Behavioral Science of the Graduate School of Economics and Social Sciences, a research associate at the Chair of Political Science IV at the University of Mannheim, and a co-editor of Methods Bites. Her research focuses on conflict studies, particularly post-civil war stability.

Dennis Hammerschmidt is a PhD candidate at the Doctoral Center in Social and Behavioral Science of the Graduate School of Economics and Social Sciences and a research associate at the Chair of Empirical Democracy Research at the University of Mannheim. His research focuses on the alignment structure of states in the international system and the strategic application of foreign aid with a focus on vote-buying in international organizations. His methodological expertise includes general quantitative research, text analysis, and network analysis.

Materials

workshop materials
blog post


2019-11-05 | Workshop | Julian Schuessler (University of Konstanz)
Causal Graphs

Room A-231, A5, 6, 68159 Mannheim
November 05, 2019, 12:00-13:30

Abstract

This workshop discusses causal graphs as a fundamental modelling framework and highly useful tool for empirical researchers in the social sciences. Questions addressed in interaction with participants include drawing and interpreting a graph, understanding d-separation, the nature of post-treatment bias and other common mistakes in observational studies, the connection of causal graphs to structural models and potential outcomes, and using them to better understand instrumental variable and mediation analysis.

Presenter(s)

Julian Schuessler is a PhD Student at the Graduate School of Decision Sciences at the University of Konstanz, Germany, where he is also affiliated with the Center for Data and Methods. His research focuses on public support for the European Union, political economy, and quantitative methods. His methodological interests include non-parametric causal inference, especially using graphs, and Bayesian statistics.

Materials

workshop materials
video recording


2019-10-15 | Input Talk | Konstantin Gavras (MZES)
Shiny Apps: Development and Deployment

Room A-231, A5, 6, 68159 Mannheim
October 15, 2019, 12:00-13:30

Abstract

Shiny Apps allow developers and researchers to easily build interactive web applications using only the statistical software R. These apps allow R developers to communicate their work interactively to a broader audience in order to facilitate outreach. Since Shiny Apps come with an extensive backend setup, users do not need extensive web development skills to build and host standalone apps on a homepage. However, for those keen on building beautiful apps, Shiny Apps allow for CSS, HTML and JavaScript extensions. In this workshop, I introduce the Shiny environment and show important features for developing Shiny apps, which can be used for data presentation, as a communication tool for results, or even as an interactive analytical tool. Using the example data sets that ship with R, I introduce the distinction between the front-end ui.R and the back-end server.R required to build Shiny apps. Based on this, I will introduce important concepts and features for building an interactive app, including control widgets, reactivity and rendering. Participants will be able to build their own Shiny App after this workshop. In the last part of the workshop, I will show two ways of deploying Shiny Apps (i.e., making them available on the web): shinyapps.io and Shiny Server.

Presenter(s)

Konstantin Gavras is a Ph.D. candidate at the Graduate School of Economic and Social Sciences in Political Science, research associate at the Chair of Political Psychology at the University of Mannheim and doctoral researcher for the MZES project "Fighting together, moving apart? European common defence and shared security in an age of Brexit and Trump". His research interests comprise the intersection of Social Psychology and Political Behavior, focusing on the behavioral consequences and conditions underlying political attitudes regarding both domestic and foreign policies.

Materials

workshop materials
video recording
blog post


2019-09-16 | Input Talk | Florian Foos (LSE)
Randomized Experiments and Randomization Inference

Room A-231, A5, 6, 68159 Mannheim
September 16, 2019, 15:30-17:00

Abstract

Randomization inference is a design-based approach to hypothesis testing, which relies on minimal assumptions and enables the researcher to "analyse as you randomize". Randomization inference considers what would have happened under all possible random assignments (all possible ways of assigning N units to treatment and control). Against the backdrop of all possible random assignments, is the actual experimental result unusual, and how unusual is it? Randomization inference is flexible and allows for testing different sharp hypotheses, using a variety of test statistics to obtain p-values, which have an intuitive interpretation: the share of random assignments that produce a test statistic as large as or larger than the statistic obtained from the realised experiment. Randomization-inference-based p-values can differ from p-values obtained from conventional tests if samples are small and/or if test statistics are not normally distributed. During the workshop, building on the potential outcomes framework, I will introduce participants to the logic of randomization inference, and discuss applied examples both on the whiteboard and using the ri2 package in R.
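
The p-value logic described above can be illustrated with a minimal sketch. It is written in Python rather than with the ri2 package used in the talk, and the six outcomes and the realised assignment are made-up numbers chosen only for illustration. It assumes complete random assignment of three out of six units, the sharp null hypothesis of no effect for any unit, and the difference in means as the test statistic.

```python
from itertools import combinations

# Hypothetical outcomes for six units (made-up numbers, not from the talk).
outcomes = [5, 7, 6, 2, 3, 1]
# Units 0, 1 and 2 were treated in the realised (hypothetical) experiment.
treated_observed = (0, 1, 2)


def diff_in_means(treated, y):
    """Test statistic: mean outcome of treated minus mean outcome of control units."""
    treated = set(treated)
    y_treated = [v for i, v in enumerate(y) if i in treated]
    y_control = [v for i, v in enumerate(y) if i not in treated]
    return sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)


observed_stat = diff_in_means(treated_observed, outcomes)

# Under the sharp null of no effect for any unit, outcomes do not depend on
# assignment, so the statistic can be recomputed for every possible assignment.
all_assignments = list(combinations(range(len(outcomes)), len(treated_observed)))
null_stats = [diff_in_means(a, outcomes) for a in all_assignments]

# One-sided p-value: share of assignments with a statistic at least as large
# as the observed one (small tolerance guards against floating-point noise).
p_value = sum(s >= observed_stat - 1e-12 for s in null_stats) / len(null_stats)

print(f"observed difference in means: {observed_stat:.2f}")
print(f"number of possible assignments: {len(all_assignments)}")
print(f"one-sided p-value: {p_value:.3f}")
```

With only six units, all 20 possible assignments can be enumerated exactly. For larger experiments, a random sample of assignments is typically drawn instead.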

Presenter(s)

Florian Foos is an Assistant Professor in Political Behaviour in the Department of Government at the London School of Economics and Political Science (LSE). His research focuses on partisan election campaigns, including electoral mobilization, opinion change and political activism of politicians. His methodological expertise includes the design, conduct, and analysis of randomized field experiments as well as natural and quasi-experiments.

Materials

workshop materials


2019-09-10 | Input Talk | Denis Cohen (MZES)
Introduction to the Potential Outcomes Framework

Room A-231, A5, 6, 68159 Mannheim
September 10, 2019, 12:00-13:30

Abstract

This talk introduces participants to the potential outcomes framework, one of the primary approaches to causality in the social sciences and beyond. The talk covers the basic intuition of counterfactual causality as well as the fundamental problem of causal inference and relates core assumptions of frequently used identification strategies to the potential outcomes framework. A hands-on simulation exercise allows participants to apply the framework to artificial data and to further their understanding of biases in causal quantities of interest when core assumptions are violated.
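
As a rough illustration of the kind of simulation exercise mentioned above, the following Python sketch uses made-up potential outcomes for four hypothetical units. It is not the exercise from the talk. It assumes a constant treatment effect of one and a non-random assignment in which units with higher untreated outcomes receive treatment, and it compares the true average treatment effect with the naive difference in observed means.

```python
# Hypothetical potential outcomes for four units (illustrative numbers only).
y0 = [1.0, 2.0, 3.0, 4.0]   # outcome without treatment
y1 = [2.0, 3.0, 4.0, 5.0]   # outcome with treatment (effect of +1 for everyone)
d = [0, 0, 1, 1]            # non-random assignment: units with high y0 are treated

# True average treatment effect, computable only because both potential
# outcomes are known in simulated data.
ate = sum(a - b for a, b in zip(y1, y0)) / len(y0)

# Fundamental problem of causal inference: only one potential outcome
# per unit is ever observed.
observed = [a if t == 1 else b for a, b, t in zip(y1, y0, d)]

# Naive estimator: difference in observed means between treated and control.
treated = [y for y, t in zip(observed, d) if t == 1]
control = [y for y, t in zip(observed, d) if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)

print(f"true ATE: {ate}")
print(f"naive difference in means: {naive}")
print(f"bias from non-random assignment: {naive - ate}")
```

Because treatment status is related to the untreated outcome in this sketch, the naive comparison overstates the effect. Under random assignment, the two quantities would coincide in expectation.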

Presenter(s)

Denis Cohen is a postdoctoral fellow in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim. His research focus lies at the intersection of political preference formation, electoral behavior and political competition. His methodological interests include quantitative approaches to the analysis of clustered data, measurement models, data visualization, strategies for causal identification, and Bayesian statistics.

Materials

workshop materials



2018-2019
2019-05-06 | Workshop | Simon Munzert (Hertie School of Governance)
Studying Politics on and with Wikipedia

May 06, 2019

Abstract

The online encyclopedia Wikipedia and its sibling, the collaboratively edited knowledge base Wikidata, provide incredibly rich yet largely untapped sources for political research. In this hands-on workshop, I will show how these platforms can inform research on public attention dynamics, policies, political and other events, political elites, and parties, among other things. To that end, I will show how to use R and the packages WikipediR, WikidataR, pageviews, and wikipediatrend to connect with APIs from the Wikimedia Foundation and efficiently access and parse content. Furthermore, I will provide an overview of the legislatoR package, a fully relational individual-level data package that comprises political, sociodemographic, and Wikipedia-related data on elected politicians across the globe.

Presenter(s)

Simon Munzert is a lecturer in Political Data Science at Hertie School of Governance, Berlin. A former member of the MZES Data and Methods Unit, he is the originator of the Social Science Data Lab. His research focuses on public opinion, political representation and the role of new media for political processes.

Materials

workshop materials
blog post


2019-04-17 | Workshop | Denis Cohen (MZES)
Applied Bayesian Statistics using Stan and R

April 17, 2019

Abstract

This 90-minute workshop provides an applied introduction to Stan, a platform for statistical modeling and Bayesian statistical inference. Participants will get an overview of the programming language, the R interface RStan, and the workflow for Bayesian model building, inference, and convergence diagnosis. Applied exercises give participants the chance to write and run their own models.

Presenter(s)

Denis Cohen is a postdoctoral fellow in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim. His research focus lies at the intersection of political preference formation, electoral behavior and political competition. His methodological interests include quantitative approaches to the analysis of clustered data, measurement models, data visualization, strategies for causal identification, and Bayesian statistics.

Materials

workshop materials
blog post


2019-03-27 | Workshop | Simon Kühne (Universität Bielefeld)
Collecting and Analyzing Twitter Data Using R

March 27, 2019

Abstract

This 90-minute workshop provides an overview of Twitter data and how to collect and analyse it using R. Participants learn how to access Twitter's API in order to collect data for their own research projects. A number of examples illustrate how to preprocess and analyse the content and meta-information of Tweets.

Presenter(s)

Simon Kühne is a post-doc at Bielefeld University. He holds a BA in Sociology and an MA in Survey Methodology from the University of Duisburg-Essen and a PhD in Sociology from Humboldt University of Berlin. His research focuses on survey methodology, social media and online data, and social inequality.

Materials

workshop materials
blog post


2019-02-27 | Roundtable | Konstantin Gavras, Samuel Müller, Marius Sältzer
Roundtable on Text as Data (Part II)

February 27, 2019

Presenter(s)

  • Marius Sältzer: Sentiment Analysis for German Tweets by Election Candidates
  • Samuel Müller: Automated Extraction of Reasoning Using Topic Models
  • Konstantin Gavras: Inferring Policy Preferences from Strategy Papers on National Security in Europe using Unsupervised Machine Learning Technique


2019-01-30 | Workshop | Cornelius Puschmann (Leibniz Institute for Media Research Hamburg)
Advancing Text Mining with R and quanteda

January 30, 2019

Abstract

The usefulness of R for text mining and content analysis has greatly increased in recent years, especially following the release of specialized packages such as tm, stringr and tidytext. My interactive presentation will focus on quanteda, which has rapidly become an all-purpose framework for conducting text mining with R due to its high functionality, speed and quality of documentation. I will showcase a number of techniques from corpus compilation and cleaning to the application of dictionaries such as LIWC and Lexicoder Policy Agendas and the application of text scaling models such as Wordscores and Wordfish. I will also show how topic modeling and supervised machine learning for extrapolating content categories can be applied through the topicmodels, STM and RTextTools packages, and point to interfaces with external services such as the Microsoft Cognitive Services and Google Cloud Machine Learning API. My presentation will close with suggestions for improving the robustness and reproducibility of content analyses conducted with R.

Presenter(s)

Cornelius Puschmann is a senior researcher at the Leibniz Institute for Media Research in Hamburg.

Materials

workshop materials
blog post


2018-12-05 | Workshop | Julian Bernauer (MZES) & Denis Cohen (MZES)
Introduction to R

December 05, 2018

Abstract

This brief introduction to R covers the following topics:

  • Algebraic operators and transformation
  • Object types and conversions
  • Control structures (loops, conditions, etc.)
  • Writing simple functions
  • Installing, updating, and using packages
  • Getting help in R
  • Data import and export
  • A glimpse of the tidyverse package
  • A quick first self-authored package in R

Presenter(s)

Julian Bernauer and Denis Cohen are postdoctoral fellows in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim, and co-organizers of the Social Science Data Lab.

Materials

workshop materials


2018-11-28 | Input Talk | Gavin Abercrombie (University of Manchester)
Topic-centric Sentiment Analysis of UK Parliamentary Debates

November 28, 2018

Abstract

Debate transcripts from the UK House of Commons provide access to a wealth of information concerning the opinions and attitudes of politicians and their parties towards arguably the most important topics facing societies and their citizens, as well as potential insights into the democratic processes that take place within Parliament. In my PhD project, I apply natural language processing and machine learning methods to debate speeches with the aim of determining the attitudes and positions expressed by speakers towards the topics they discuss. In this talk, I will present research on speech-level sentiment analysis and opinion-topic/policy detection in debate motions, as well as ongoing work on compiling a comprehensive review of research from both computer science and social science in this area. I will also discuss the challenges presented and multidisciplinary approaches to the problem, and present ideas for the direction of future investigation.

Presenter(s)

Gavin Abercrombie is pursuing a PhD in natural language processing at the School of Computer Science, University of Manchester.

Materials

workshop materials


2018-11-21 | Roundtable | Julian Bernauer, Jason Eichorst, Dennis Hammerschmidt, Verena Kunz, Federico Nanni
Roundtable on Text as Data (Part I)

November 21, 2018

Presenter(s)

  • Dennis Hammerschmidt: "Talk and Action in the United Nations - How Text Analysis can Help to Uncover Vote-Buying in the International Arena"
  • Verena Kunz: "Position Blurring as a Response to Competing Principals? Assessing Speech Clarity in the European Parliament"
  • Jason Eichorst: "Political Competency Signals in Word Choice"
  • Julian Bernauer and Federico Nanni: "Cross-Lingual Topical Scaling of Sparse Political Text using Word Embeddings"


2018-05-12 | Input Talk | Chung-hong Chan (MZES)
Fast, cheap, but is it still good? An opinionated guide to crowdsourcing platforms in 2018

May 12, 2018

Abstract

In 2008, four prominent Stanford AI researchers published an article "Fast, Cheap - but is it good?" and claimed crowdsourcing can produce very high-quality data for scientific research. A decade has passed and social scientists are picking up the pace in deploying crowdsourcing to collect survey data and conduct content analysis. A new silver bullet is born. In this talk, I will share my experience of using a crowdsourcing platform to conduct a large-scale, multilingual content analysis (a.k.a. crowdcoding). I will briefly go through the promises of those platforms in the literature and then talk about the pitfalls. A realistic conclusion is: it is impossible to obtain data from those platforms that are fast, cheap, and good all at once. As in real life, it is only possible to take at most two out of the three. Sometimes you take none of them.

Presenter(s)

Chung-hong Chan is a Research Associate at the Mannheim Centre for European Social Research (MZES), University of Mannheim.

Materials

workshop materials



2017-2018
2018-05-09 | Input Talk | Katharina Meitinger (GESIS)
Dealing with the complexity of cross-national data: The method of web probing

May 09, 2018

Abstract

There has been a tremendous increase in cross-national data production in social science research in recent decades. Before drawing substantive conclusions based on cross-national survey data, researchers need to verify whether the measures are indeed comparable. Qualitative approaches, such as web probing, are an important addition to quantitative measurement invariance tests. In the first part, I will discuss why comparability of data should not be assumed but needs to be tested. I will briefly present the different approaches to test for and explain (in)comparability of data, introduce the method of web probing and present studies where web probing could shed light on incomparable data. In the second part of this talk, I will discuss different aspects of the implementation of web probing, such as sample size, nonresponse conversion, the optimal visual design (e.g., textbox size, order of probes) and how to analyze such data.


2018-04-11 | Input Talk | Federico Nanni (University of Mannheim)
Results from a Text Scaling Hackathon

April 11, 2018

Abstract

In my talk I'll offer an overview of a shared-task hackathon that took place as part of a research seminar bringing together a variety of experts and young researchers from the fields of political science, natural language processing and computational social science. The task looked at ways to develop novel methods for political text scaling to better quantify political party positions on European integration and Euroscepticism from the transcripts of speeches from three legislative terms of the European Parliament. I will also focus on the potential of hackathons for fostering interdisciplinary collaborations between computer science and the social sciences and the next steps of my research group in this direction.

Materials

This paper summarizes the results of the hackathon. Here is related code for cross-lingual classification and scaling.


2018-03-21 | Input Talk | Timo Lenzner, Cornelia Neuert, Patricia Hadler
Cognitive Pretesting Methods

March 21, 2018

Abstract

This talk highlights the general importance of carrying out cognitive pretests before fielding a questionnaire. This is done by presenting examples of untested as well as pretested and improved questions. With regard to cognitive pretesting methods, we provide an introduction to the traditional cognitive interview (e.g., f2f interviewing) and give an overview of current developments (e.g., combining f2f interviews with eye-tracking, conducting cognitive pretests over the Web). Finally, we discuss the pros and cons of these different cognitive pretesting methods and offer practical advice on how to conduct cognitive pretesting projects.


2018-03-14 | Workshop | Denise Traber (University of Lucerne)
Quantitative Analysis of Political Text: Tools and Applications

March 14, 2018

Abstract

The workshop introduces concepts and methods for the quantitative analysis of political text (QTA) in R. Speeches delivered by prime ministers during the Euro-Crisis (EUSpeech dataset) serve as an application for the demonstration of text preparation, visualization, scaling, topic models and sentiment analysis. After an introduction of the text corpus and a brief discussion of QTA methods, the participants have the opportunity to carry out some QTA themselves under the instructors' supervision.

Presenter(s)

Denise Traber is a Senior Research Fellow at the University of Lucerne, Switzerland, where she heads an Ambizione research grant project on "The divided people: polarization of political attitudes in Europe" funded by the Swiss National Science Foundation. She has a strong interest in quantitative text analysis, has co-organized the first "Zurich Summer School for Women in Political Methodology" in 2017 and has recently published the article "Estimating Intra-Party Preferences: Comparing Speeches to Votes" in PSRM, jointly with Daniel Schwarz and Ken Benoit.

Materials

workshop materials


2018-02-21 | Input Talk | Chung-hong Chan (MZES)
Introduction to Social Media's RESTful APIs and data collection with SocialMediaLab

February 21, 2018

Abstract

In this talk, I will demonstrate how to collect data from social media. I will walk through how RESTful APIs work and how to obtain API access rights from Facebook, Twitter and YouTube (optional topic: Sina Weibo). The R package SocialMediaLab will be introduced, which is an easy-to-use tool for social media data collection and data transformation.

Materials

workshop materials


2017-11-29 | Workshop | Nate Breznau (MZES) & Christiane Grill (MZES)
Introduction to Structural Equation Modeling

November 29, 2017

Abstract

In our talk we will introduce participants to the techniques of structural equation modeling (SEM). We will show how a theoretical model represented through measurement models and possibly causal relationships can be applied to empirical data. The talk presents basic models relevant for social scientists: we start with exploratory and confirmatory factor analysis (EFA and CFA) and then move on to path models, latent class models and measurement invariance. In our talk we will also show how to use the statistical software Mplus to perform SEM. No previous knowledge of Mplus is required. Workshop participants can download and install Mplus if they want to follow the examples in class. A demo version is available here.

Materials

workshop materials


2017-11-28 | Input Talk | Chung-hong Chan (MZES)
Social Network Analysis with igraph

November 28, 2017

Abstract

This talk introduces the nuts and bolts of social network analysis, and how to do it in R using the package igraph. In this talk, I will quickly walk through the concept of a graph (social network), the common scenarios of data collection and the usual analysis patterns. Getting up close and personal, I will use data scraped from the MZES website as an example to demonstrate how to collect, analyze and visualize the MZES collaboration network. Let's find out the most important researchers and factions in MZES, or not.

Materials

workshop materials


2017-10-18 | Workshop | Richard Traunmüller (University of Mannheim)
Visual Inference for the Social Sciences

October 18, 2017

Abstract

This talk introduces a remedy to the criticism frequently voiced against data visualization and exploration: that it may give rise to an over-interpretation of random patterns. A way to overcome this problem is the realization that "visual discoveries" correspond to the implicit rejection of "null hypotheses". The basic idea of visual inference is that graphical displays can be treated as "test statistics" and compared to a reference distribution of plots under the assumption of the null. Visual inference helps us answer the question "Is what we see really there?" In so doing, it seeks to overcome long-standing reservations against visualization as a merely "informal" approach to data analysis and the fear that beautiful pictures may in fact not correspond to any meaningful patterns of substantive scientific interest. The talk illustrates the application and benefits of this visual method by drawing on examples from the social sciences. A little lab exercise will encourage participants to try out visual inference in practice using the statistical programming language R.

Materials

workshop materials
blog post


2017-10-04 | Workshop | Sebastian Pink (MZES)
Using mainly Stata and increasingly R (and knitr)

October 04, 2017

Abstract

Like most of you, I am very familiar with Stata. Throughout my project and dissertation work, however, I came to increasingly incorporate R in my data analysis and even my data editing. In one instance, I had to run specific models for network analysis that I was not able to run in Stata. I then ran the analysis in R but kept doing the entire preceding data editing in Stata. In another instance, I ran a simulation model in R, which by nature slightly changed its results every time I ran it. As I wanted to avoid a time-consuming copy-and-paste marathon between R and Word, I wrote the manuscript describing this simulation model using knitr. The reason was that it automatically handed over the values, figures, and tables to a LaTeX processor producing a nice document. In this talk I simply describe these developments in my workflow to show you how you may gain from incorporating R or knitr in small doses in your Stata workflow.

Materials

workshop materials



2016-2017
2017-04-26 | Input Talk | Florian Keusch (MZES)
Introduction to Unipark

April 26, 2017

Abstract

In this Social Science Data Lab, I will give an introduction to the EFS Survey Software from Unipark (Questback). If you have never worked with the tool, then you will learn how to set up a first questionnaire to collect survey data over the Internet. We will discuss basic principles of participant recruitment, web questionnaire layout, and study design to conduct methodologically sound web surveys. This will also include taking into account the increasing number of respondents who participate in web surveys using their smartphones. For those who have already worked with Unipark before, we will have time to discuss more advanced features of the software such as working with quotas, lists, and loops.

Materials

workshop materials


2017-03-29 | Input Talk | Federico Nanni (University of Mannheim)
Topic-based and Cross-lingual Scaling of Political Text

March 29, 2017

Abstract

Political text scaling aims to linearly order parties and politicians across political dimensions (e.g., left-to-right ideology) based on textual content (e.g., politician speeches or party manifestos). Existing models, such as Wordscores and Wordfish, scale texts based on relative word usage; by doing so, they do not take into consideration topical information and cannot be used for cross-lingual analyses. In our talk, we present our efforts toward developing a topic-based and cross-lingual political text scaling approach. First we introduce our initial work, TopFish, a multi-level computational method that integrates topic detection and political scaling, and show its applicability for temporal aspect analyses of political campaigns (pre-primary elections, primary elections, and general elections). Next, we present a new text scaling approach that leverages semantic representations of text and is suitable for cross-lingual political text scaling. We also propose a simple and straightforward setting for quantitative evaluation of political text scaling.

Materials

workshop materials


2017-03-15 | Input Talk | Philipp Zumstein (University of Mannheim)
Building Infrastructure for Data-Driven Research
more

March 15, 2017

Abstract

Most methods for data-driven research (including Big Data, Data Science, and Digital Humanities) work primarily on text data or numbers. However, there is also a lot of information which is only available in printed books or newspapers. This information first has to be digitized and then further processed to extract the text or data. The main focus of the talk is optical character recognition (OCR). We will look at the OCR workflow in general, discuss some OCR software, and see how you can use these tools in practice. Building such an infrastructure or performing these initial steps may need a reasonable amount of time and resources, or may even be a project in itself. The Mannheim University Library runs several infrastructure projects in this area, which are briefly mentioned.

Materials

workshop materials


2017-02-15 | Input Talk | Sarah Brockhaus (LMU Munich)
Functional Data Analysis in a Nutshell
more

February 15, 2017

Abstract

Functional data analysis (FDA) is a field of statistics that deals with the analysis of data that have a functional character. Functional data include curves, images, surfaces and trajectories. In the following, we will focus on curves. Growth curves are an example of one-dimensional functional data observed over time. Other examples are spectrometric measures over wavelength or blood markers measured continuously over time. FDA is applied in diverse fields including biometry, demography, medicine, linguistics and finance. Instead of analyzing single points on the curves, FDA treats the curves as observation units. The talk will approach FDA rather intuitively to give an idea of functional data. The talk covers basic summary statistics, like mean and variance for functional data, and contains an outlook on more complex methods like regression with functional data.

Materials

workshop materials


2017-01-18 | Workshop | Simon Munzert (MZES)
Advanced R and Recent Advances in R
more

January 18, 2017

Abstract

This one-day course sets out to improve your R skills and make you a more efficient programmer. In particular, you will:

  • become better at file management with R
  • learn all about piping operators
  • understand what functional programming means
  • get an overview of string processing and regular expressions
  • get to know new tools that help you tidy data
  • learn how to manipulate data frames efficiently
  • be able to routinely split-apply-combine your data
  • learn to establish a debugging workflow

Materials

workshop materials


2016-12-16 | Workshop | Richard Traunmüller (University of Mannheim)
Data Visualization

December 16, 2016

Abstract

Data visualisation is one of the most powerful tools to explore, understand and communicate patterns in quantitative information. At the same time, good data visualisation is a surprisingly difficult task and demands three quite different skills: substantive knowledge, statistical skill, and artistic sense. The course is intended to introduce participants to a) key principles of graphical perception and analytic design, b) useful visualisation techniques for the exploration and presentation of various forms of data and c) new developments of data visualisation for the social sciences, such as visual inference and visualising statistical models.

Materials

workshop materials
blog post


2016-12-14 | Input Talk | Malte Schierholz (MZES)
Fundamentals in Bayesian Statistics

December 14, 2016

Abstract

Besides the frequentist approach to statistical inference, which was dominant in science in the 20th century, another school exists: Bayesian statistics. With modern computational techniques, Bayesian data analysis has a proven track record and has established itself as an alternative to frequentist procedures. Sometimes, Bayesian techniques can be applied to complex scientific questions where no frequentist solution exists. This talk gives an introduction to Bayesian statistics. While it is not possible to avoid central mathematical formulas and derivations, I concentrate on concepts, intuitive motivations, and interpretations that underlie the Bayesian view. Critical model assumptions are also discussed. Participants will learn when to mistrust a Bayesian analysis and in which situations it may provide new insights.

Materials

workshop materials


2016-11-16 | Input Talk | Eike M. Rinke (MZES)
An Open Science Primer for Social Scientists

November 16, 2016

Abstract

"Open Science" has become a buzzword in academic circles. However, exactly what it means, why you should care about it, and - most importantly - how it can be put into practice is often not very clear to researchers. In this session of the SSDL, we will provide a brief tour d'horizon of Open Science in which we touch on all of these issues and by which we hope to equip you with a basic understanding of Open Science and a practical tool kit to help you make your research more open to other researchers and the larger interested public. Throughout the presentation, we will focus on giving you an overview of tools and services that can help you open up your research workflow and your publications, all the way from enhancing the reproducibility of your research and making it more collaborative to finding outlets which make the results of your work accessible to everyone. Absolutely no prior experience with open science is required to participate in this talk which should lead into an open conversation among us as a community about the best practices we can and should follow for a more open social science.

Materials

workshop materials


2016-10-19 | Workshop | Sarah Brockhaus (LMU Munich)
Statistical Boosting with mboost

October 19, 2016

Abstract

The talk is about model-based boosting. Boosting originated as an algorithm in the field of machine learning. It was further developed to fit statistical regression models, like linear models, generalized linear models and quantile regression models. Boosting can be used in high-dimensional data settings and inherently performs variable selection. The first part of the talk will give some background information on boosting and explain the basic ideas. The second part will be on the practical use of the R package mboost, which provides a flexible toolbox to boost regression models.

Materials

workshop materials


2016-10-05 | Input Talk | Lars Kaczmirek (University of Vienna)
UCSP: Universal Client-Side Paradata

October 05, 2016

Abstract

The talk covers the collection of online paradata using the universal client-side paradata script (UCSP). To see which data are collected on the fly in the GESIS panel, check out the documentation at: http://kaczmirek.de/ucsp/ucsp.html. We will also hear about EvalAnswer, a tool that helps you automatically code nonresponse in open questions. This in turn can be used to trigger conversion attempts in online surveys as well as to assign nonresponse codes to open answers in existing survey data sets.


2016-06-15 | Workshop | Simon Munzert (MZES)
Three easy-to-learn tools to scrape data from the Web with R

June 15, 2016

Abstract

This workshop shows how to

  • use regular expressions to extract data from raw text (or websites)
  • use XPath for static webpage scraping
  • tap APIs from within R
  • scrape data from dynamic webpages (i.e. JavaScript-generated content) using AJAX and Selenium

Obviously, these are four tools, not three. However, regular expressions are never easy to learn, so the title is still valid.

Materials

workshop materials