Methods Bites

Blog of the MZES Social Science Data Lab

Events

Upcoming
2019-12-10 | Input Talk | Ruben Bach (University of Mannheim)
Using Web Logs and Smartphone Records for Social Research

Room A-231, A5, 6, 68159 Mannheim
December 10, 2019, 12:00-13:30

Abstract
In this talk, I will demonstrate how web logs (records of individuals' browsing behavior) and records of smartphone use can be used for social research, for example, to study political views and behaviors. First, I will discuss how to obtain such data and how to extract information about individuals' behavior from web logs. Second, I will present results of my own work (predicting political views and behaviors from web logs) and from other studies that work with similar data (e.g., studies of political polarization and echo chambers in the online world). I will conclude the talk with a short overview of ongoing projects and potential avenues for future research.

Presenter
Ruben Bach is a postdoctoral researcher at the University of Mannheim, focusing on social science quantitative research methods. His interests include topics related to big data in the social sciences, machine learning, causal inference, and survey research.


2019-11-26 | Workshop | Cosima Meyer (University of Mannheim)
Introduction to LaTeX and Overleaf

Room A-231, A5, 6, 68159 Mannheim
November 26, 2019, 12:00-13:30

Abstract
This skill training workshop is organized jointly by the Social Science Data Lab and the MZES Equal Opportunities Office.

Presenter
Cosima Meyer is a PhD candidate at the Doctoral Center in Social and Behavioral Science of the Graduate School of Economic and Social Sciences, a research associate at the Chair of Political Science IV at the University of Mannheim, and a co-editor of Methods Bites. Her research focuses on conflict studies, particularly post-civil war stability.


2019-11-05 | Workshop | Julian Schuessler (University of Konstanz)
Causal Graphs

Room A-231, A5, 6, 68159 Mannheim
November 05, 2019, 12:00-13:30

Abstract
This workshop discusses causal graphs as a fundamental modelling framework and highly useful tool for empirical researchers in the social sciences. Questions addressed in interaction with participants include drawing and interpreting a graph, understanding d-separation, the nature of post-treatment bias and other common mistakes in observational studies, the connection of causal graphs to structural models and potential outcomes, and using them to better understand instrumental variable and mediation analysis.

Presenter
Julian Schuessler is a PhD Student at the Graduate School of Decision Sciences at the University of Konstanz, Germany, where he is also affiliated with the Center for Data and Methods. His research focuses on public support for the European Union, political economy, and quantitative methods. His methodological interests include non-parametric causal inference, especially using graphs, and Bayesian statistics.


2019-2020
2019-10-15 | Input Talk | Konstantin Gavras (MZES)
Shiny Apps: Development and Deployment

Room A-231, A5, 6, 68159 Mannheim
October 15, 2019, 12:00-13:30

Abstract
Shiny Apps allows developers and researchers to easily build interactive web applications using only the statistical software R. These apps allow R developers to communicate their work interactively to a broader audience in order to facilitate outreach. Since Shiny Apps comes with an extensive backend setup, users do not need extensive web development skills to build and host standalone apps on a homepage. However, for those keen on building beautiful apps, Shiny Apps allows for CSS, HTML and JavaScript extensions. In this workshop, I introduce the Shiny environment and show important features for developing Shiny apps, which can be used for data presentation, as a communication tool for results, or even as an interactive analytical tool. Using the example data sets that ship with R, I introduce the distinction between the front-end ui.R and the back-end server.R required to build Shiny apps. Building on this, I will introduce important concepts and features for building an interactive app, including control widgets, reactivity and rendering. The participants will be able to build their own Shiny app after this workshop. In the last part of the workshop, I am going to show two ways of deploying Shiny apps (making them available on the World Wide Web): https://www.shinyapps.io and Shiny Server.

Presenter
Konstantin Gavras is a Ph.D. candidate at the Graduate School of Economic and Social Sciences in Political Science, research associate at the Chair of Political Psychology at the University of Mannheim and doctoral researcher for the MZES project "Fighting together, moving apart? European common defence and shared security in an age of Brexit and Trump". His research interests comprise the intersection of Social Psychology and Political Behavior, focusing on the behavioral consequences and conditions underlying political attitudes regarding both domestic and foreign policies.

Materials
workshop materials


2019-09-16 | Input Talk | Florian Foos (LSE)
Randomized Experiments and Randomization Inference

Room A-231, A5, 6, 68159 Mannheim
September 16, 2019, 15:30-17:00

Abstract
Randomization inference is a design-based approach to hypothesis testing, which relies on minimal assumptions and enables the researcher to "analyse as you randomize". Randomization inference considers what would have happened under all possible random assignments (all possible ways of assigning N units to treatment and control). Against the backdrop of all possible random assignments, is the actual experimental result unusual, and how unusual is it? Randomization inference is flexible and allows for tests of different sharp hypotheses, using a variety of test statistics to obtain p-values, which have an intuitive interpretation: the share of random assignments that produce a test statistic as large as or larger than the statistic obtained from the realised experiment. Randomization-inference-based p-values can differ from p-values obtained from conventional tests if samples are small and/or if test statistics are not normally distributed. During the workshop, building on the potential outcomes framework, I will introduce participants to the logic of randomization inference, and discuss applied examples both on the whiteboard and using the ri2 package in R.
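The logic described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than with the ri2 package used in the workshop, it is not taken from the workshop materials, and the data, the function name randomization_p_value and the choice of the difference in means as test statistic are assumptions made purely for illustration. The sketch enumerates every possible assignment of three out of six units to treatment and reports the share of assignments whose test statistic is at least as large as the observed one.

```python
from itertools import combinations
from statistics import mean


def randomization_p_value(outcomes, treated_idx):
    """One-sided p-value under the sharp null of no effect for any unit."""
    n = len(outcomes)
    m = len(treated_idx)

    def diff_in_means(treated):
        treated = set(treated)
        t = [outcomes[i] for i in range(n) if i in treated]
        c = [outcomes[i] for i in range(n) if i not in treated]
        return mean(t) - mean(c)

    observed = diff_in_means(treated_idx)
    # Under the sharp null, outcomes do not depend on assignment,
    # so every possible assignment can be re-scored with the same data.
    stats = [diff_in_means(a) for a in combinations(range(n), m)]
    share = sum(s >= observed for s in stats) / len(stats)
    return observed, share


# Made-up outcomes for six units; units 0, 1 and 2 were treated.
outcomes = [8, 7, 9, 3, 4, 2]
treated = (0, 1, 2)
observed, p = randomization_p_value(outcomes, treated)
print(observed, p)
```

With six units there are only 20 possible assignments, so full enumeration is feasible. For larger experiments one would typically sample assignments at random instead of enumerating them all.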

Presenter
Florian Foos is an Assistant Professor in Political Behaviour in the Department of Government at the London School of Economics and Political Science (LSE). His research focuses on partisan election campaigns, including electoral mobilization, opinion change and political activism of politicians. His methodological expertise includes the design, conduct, and analysis of randomized field experiments as well as natural and quasi-experiments.

Materials
workshop materials


2019-09-10 | Input Talk | Denis Cohen (MZES)
Introduction to the Potential Outcomes Framework

Room A-231, A5, 6, 68159 Mannheim
September 10, 2019, 12:00-13:30

Abstract
This talk introduces participants to the potential outcomes framework, one of the primary approaches to causality in the social sciences and beyond. The talk covers the basic intuition of counterfactual causality as well as the fundamental problem of causal inference and relates core assumptions of frequently used identification strategies to the potential outcomes framework. A hands-on simulation exercise allows participants to apply the framework to artificial data and to further their understanding of biases in causal quantities of interest when core assumptions are violated.
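As a rough illustration of the kind of exercise described in the abstract (not the exercise used in the talk), the following Python sketch builds a tiny artificial data set in which both potential outcomes are known for every unit. The numbers and variable names are assumptions chosen for illustration. Because units with higher untreated outcomes select into treatment, the naive comparison of observed group means overstates the effect by exactly the selection bias.

```python
from statistics import mean

# Hypothetical units: (y0, y1, d) = outcome without treatment,
# outcome with treatment, and actual treatment status.
units = [
    (3, 6, 1), (4, 7, 1), (5, 8, 1),   # treated units
    (1, 4, 0), (2, 5, 0), (3, 6, 0),   # control units
]

# Causal quantities that need both potential outcomes (unobservable in practice).
ate = mean(y1 - y0 for y0, y1, d in units)
att = mean(y1 - y0 for y0, y1, d in units if d == 1)

# Fundamental problem of causal inference: only one outcome per unit is observed.
observed = [(y1 if d == 1 else y0, d) for y0, y1, d in units]
naive = (mean(y for y, d in observed if d == 1)
         - mean(y for y, d in observed if d == 0))

# Difference in untreated outcomes between the treated and the controls.
selection_bias = (mean(y0 for y0, _, d in units if d == 1)
                  - mean(y0 for y0, _, d in units if d == 0))

print(ate, att, naive, selection_bias)
```

In this example the naive difference equals the average treatment effect on the treated plus the selection bias, which is the decomposition that breaks down the comparison when treatment is not randomly assigned.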

Presenter
Denis Cohen is a postdoctoral fellow in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim. His research focus lies at the intersection of political preference formation, electoral behavior and political competition. His methodological interests include quantitative approaches to the analysis of clustered data, measurement models, data visualization, strategies for causal identification, and Bayesian statistics.

Materials
workshop materials



2018-2019
2019-05-06 | Workshop | Simon Munzert (Hertie School of Governance)
Studying Politics on and with Wikipedia

Abstract
The online encyclopedia Wikipedia and its sibling, the collaboratively edited knowledge base Wikidata, provide incredibly rich yet largely untapped sources for political research. In this hands-on workshop, I will show how these platforms can inform research on public attention dynamics, policies, political and other events, political elites, and parties, among other things. To that end, I will show how to use R and the packages WikipediR, WikidataR, pageviews, and wikipediatrend to connect with APIs from the Wikimedia Foundation and efficiently access and parse content. Furthermore, I will provide an overview of the legislatoR package, a fully relational individual-level data package that comprises political, sociodemographic, and Wikipedia-related data on elected politicians across the globe.

Presenter
Simon Munzert is a lecturer in Political Data Science at Hertie School of Governance, Berlin. A former member of the MZES Data and Methods Unit, he is the originator of the Social Science Data Lab. His research focuses on public opinion, political representation and the role of new media for political processes.

Materials
workshop materials
blog post


2019-04-17 | Workshop | Denis Cohen (MZES)
Applied Bayesian Statistics using Stan and R

Abstract
This 90-minute workshop provides an applied introduction to Stan, a platform for statistical modeling and Bayesian statistical inference. Participants will get an overview of the programming language, the R interface RStan, and the workflow for Bayesian model building, inference, and convergence diagnosis. Applied exercises provide participants with the chance to write and run their own models.

Presenter
Denis Cohen is a postdoctoral fellow in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim. His research focus lies at the intersection of political preference formation, electoral behavior and political competition. His methodological interests include quantitative approaches to the analysis of clustered data, measurement models, data visualization, strategies for causal identification, and Bayesian statistics.

Materials
workshop materials
blog post


2019-03-27 | Workshop | Simon Kühne (Universität Bielefeld)
Collecting and Analyzing Twitter Data Using R

Abstract
This 90-minute workshop provides an overview of Twitter data and how to collect and analyse it using R. Participants learn how to access Twitter's API in order to collect data for their own research projects. A number of examples illustrate how to preprocess and analyse the content and meta-information of Tweets.

Presenter
Simon Kühne is a post-doc at Bielefeld University. He holds a BA in Sociology and an MA in Survey Methodology from the University of Duisburg-Essen and a PhD in Sociology from Humboldt University of Berlin. His research focuses on survey methodology, social media and online data, and social inequality.

Materials
workshop materials
blog post


2019-02-27 | Roundtable
Roundtable on Text as Data (Part II)
Presenters
  • Marius Sältzer: Sentiment Analysis for German Tweets by Election Candidates
  • Samuel Müller: Automated Extraction of Reasoning Using Topic Models
  • Konstantin Gavras: Inferring Policy Preferences from Strategy Papers on National Security in Europe using Unsupervised Machine Learning Technique


2019-01-30 | Workshop | Cornelius Puschmann (Leibniz Institute for Media Research Hamburg)
Advancing Text Mining with R and quanteda

Abstract
The usefulness of R for text mining and content analysis has greatly increased in recent years, especially following the release of specialized packages such as tm, stringr and tidytext. My interactive presentation will focus on quanteda, which has rapidly become an all-purpose framework for conducting text mining with R due to its high functionality, speed and quality of documentation. I will showcase a number of techniques, from corpus compilation and cleaning to the application of dictionaries such as LIWC and Lexicoder Policy Agendas and the application of text scaling models such as Wordscores and Wordfish. I will also show how topic modeling and supervised machine learning for extrapolating content categories can be applied through the topicmodels, STM and RTextTools packages, and point to interfaces with external services such as the Microsoft Cognitive Services and Google Cloud Machine Learning API. My presentation will close with suggestions for improving the robustness and reproducibility of content analyses conducted with R.

Presenter
Cornelius Puschmann is a senior researcher at the Leibniz Institute for Media Research in Hamburg.

Materials
workshop materials
blog post


2018-12-05 | Workshop | Julian Bernauer & Denis Cohen (MZES)
Introduction to R

Abstract
This brief introduction to R covers the following topics:

  • Algebraic operators and transformation
  • Object types and conversions
  • Control structures (loops, conditions, etc.)
  • Writing simple functions
  • Installing, updating, and using packages
  • Getting help in R
  • Data import and export
  • A glimpse of the tidyverse package
  • A quick first self-authored package in R

Presenters
Julian Bernauer and Denis Cohen are postdoctoral fellows in the Data and Methods Unit at the Mannheim Centre for European Social Research (MZES), University of Mannheim, and co-organizers of the Social Science Data Lab.

Materials
workshop materials


2018-11-28 | Input Talk | Gavin Abercrombie (University of Manchester)
Topic-centric Sentiment Analysis of UK Parliamentary Debates

Abstract
Debate transcripts from the UK House of Commons provide access to a wealth of information concerning the opinions and attitudes of politicians and their parties towards arguably the most important topics facing societies and their citizens, as well as potential insights into the democratic processes that take place within Parliament. In my PhD project, I apply natural language processing and machine learning methods to debate speeches with the aim of determining the attitudes and positions expressed by speakers towards the topics they discuss. In this talk, I will present research on speech-level sentiment analysis and opinion-topic/policy detection in debate motions, as well as ongoing work on compiling a comprehensive review of research from both computer science and social science in this area. I will also discuss the challenges presented and multidisciplinary approaches to the problem, and present ideas for the direction of future investigation.

Presenter
Gavin Abercrombie pursues a PhD in natural language processing at the School of Computer Science, University of Manchester.

Materials
presentation


2018-11-21 | Roundtable
Roundtable on Text as Data (Part I)

Presenters

  • Dennis Hammerschmid: "Talk and Action in the United Nations - How Text Analysis can Help to Uncover Vote-Buying in the International Arena"
  • Verena Kunz: "Position Blurring as a Response to Competing Principals? Assessing Speech Clarity in the European Parliament"
  • Jason Eichorst: "Political Competency Signals in Word Choice"
  • Julian Bernauer and Federico Nanni: "Cross-Lingual Topical Scaling of Sparse Political Text using Word Embeddings"


2018-05-12 | Input Talk | Chung-hong Chan (MZES)
Fast, cheap, but is it still good? An opinionated guide to crowdsourcing platforms in 2018

Abstract
In 2008, four prominent Stanford AI researchers published an article "Fast, Cheap - but is it good?" and claimed crowdsourcing can produce very high-quality data for scientific research. A decade has passed and social scientists are picking up the pace to deploy crowdsourcing to collect survey data and conduct content analysis. A new silver bullet is born. In this talk, I will share my experience of using a crowdsourcing platform to conduct a large-scale, multilingual content analysis (a.k.a. crowdcoding). I will briefly go through the promises of those platforms in the literature and then talk about the pitfalls. A realistic conclusion is: it is impossible to obtain data from those platforms that are fast, cheap, and good all at once. As in real life, it is only possible to take at most two out of the three. Sometimes you take none of them.

Presenter
Chung-hong Chan is a Research Associate at the Mannheim Centre for European Social Research (MZES), University of Mannheim.

Materials
presentation


2017-2018
2018-05-09 | Input Talk | Katharina Meitinger (GESIS)
Dealing with the complexity of cross-national data: The method of web probing

Abstract
There has been a tremendous increase in cross-national data production in social science research in recent decades. Before drawing substantive conclusions based on cross-national survey data, researchers need to verify whether the measures are indeed comparable. Qualitative approaches, such as web probing, are an important addition to quantitative measurement invariance tests. In the first part, I will discuss why comparability of data should not be assumed but needs to be tested. I will briefly present the different approaches to test for and explain (in)comparability of data, introduce the method of web probing and present studies where web probing could shed light on incomparable data. In the second part of this talk, I will discuss different aspects of the implementation of web probing, such as sample size, nonresponse conversion, the optimal visual design (e.g., textbox size, order of probes) and how to analyze such data.


2018-04-11 | Input Talk | Federico Nanni (University of Mannheim)
Results from a Text Scaling Hackathon

Abstract
In my talk I'll offer an overview of a shared-task hackathon that took place as part of a research seminar bringing together a variety of experts and young researchers from the fields of political science, natural language processing and computational social science. The task looked at ways to develop novel methods for political text scaling to better quantify political party positions on European integration and Euroscepticism from the transcripts of speeches from three legislative terms of the European Parliament. I will also focus on the potential of hackathons for fostering interdisciplinary collaborations between computer science and the social sciences and the next steps of my research group in this direction.

Materials
This paper summarizes the results of the hackathon: https://ub-madoc.bib.uni-mannheim.de/44172/1/findings-hackathon-understanding.pdf. Here is related code for cross-lingual classification and scaling: https://github.com/codogogo/topfish.


2018-03-21 | Input Talk | Timo Lenzner, Cornelia Neuert & Patricia Hadler (GESIS)
Cognitive Pretesting Methods

Abstract
This talk highlights the general importance of carrying out cognitive pretests before fielding a questionnaire. This is done by presenting examples of untested as well as pretested and improved questions. With regard to cognitive pretesting methods, we provide an introduction to the traditional cognitive interview (e.g., f2f interviewing) and give an overview of current developments (e.g., combining f2f interviews with eye-tracking, conducting cognitive pretests over the Web). Finally, we discuss the pros and cons of these different cognitive pretesting methods and offer practical advice on how to conduct cognitive pretesting projects.


2018-03-14 | Workshop | Denise Traber (University of Lucerne)
Quantitative Analysis of Political Text: Tools and Applications

Abstract
The workshop introduces concepts and methods for the quantitative analysis of political text (QTA) in R. Speeches delivered by prime ministers during the Euro-Crisis (EUSpeech dataset) serve as an application for the demonstration of text preparation, visualization, scaling, topic models and sentiment analysis. After an introduction of the text corpus and a brief discussion of QTA methods, the participants have the opportunity to carry out some QTA themselves under the instructors' supervision.

Presenter
Denise Traber is a Senior Research Fellow at the University of Lucerne, Switzerland, where she heads an Ambizione research grant project on "The divided people: polarization of political attitudes in Europe" funded by the Swiss National Science Foundation. She has a strong interest in quantitative text analysis, has co-organized the first "Zurich Summer School for Women in Political Methodology" in 2017 and has recently published the article "Estimating Intra-Party Preferences: Comparing Speeches to Votes" in PSRM, jointly with Daniel Schwarz and Ken Benoit.

Materials
blog post


2018-02-21 | Input Talk | Chung-hong Chan (MZES)
Introduction to Social Media's RESTful APIs and data collection with SocialMediaLab

Abstract
In this talk, I will demonstrate how to collect data from social media. I will walk through how RESTful APIs work and how to obtain API access rights from Facebook, Twitter and YouTube (optional topic: Sina Weibo). The R package SocialMediaLab will be introduced, which is an easy-to-use tool for social media data collection and data transformation.

Materials
workshop materials


2017-11-29 | Workshop | Nate Breznau & Christiane Grill (MZES)
Introduction to Structural Equation Modeling

Abstract
In our talk we will introduce participants to the techniques of structural equation modeling (SEM). We will show how a theoretical model represented through measurement models and possibly causal relationships can be applied to empirical data. The talk presents basic models relevant for social scientists: we start with exploratory and confirmatory factor analysis (EFA and CFA) and then move on to path models, latent class models and measurement invariance. In our talk we will also show how to use the statistical software Mplus to perform SEM. No previous knowledge of Mplus is required. Workshop participants can download and install Mplus if they want to follow the examples in class. A demo version is available at https://www.statmodel.com/demo.shtml.

Materials
workshop materials


2017-11-08 | Input Talk | Chung-hong Chan (MZES)
Social Network Analysis with igraph

Abstract
This talk introduces the nuts and bolts of social network analysis, and how to do it in R using the package igraph. In this talk, I will quickly walk through the concept of a graph (social network), the common scenarios of data collection and the usual analysis patterns. Getting up close and personal, I will use the data scraped from the MZES website as an example to demonstrate how to collect, analyze and visualize the MZES collaboration network. Let's find out the most important researchers and factions in MZES! ... or not.

Materials
presentation


2017-10-18 | Workshop | Richard Traunmüller (University of Mannheim)
Visual Inference for the Social Sciences

Abstract
This talk introduces a remedy to the criticism frequently voiced against data visualization and exploration: that it may give rise to an over-interpretation of random patterns. A way to overcome this problem is the realization that "visual discoveries" correspond to the implicit rejection of "null hypotheses". The basic idea of visual inference is that graphical displays can be treated as "test statistics" and compared to a reference distribution of plots under the assumption of the null. Visual inference helps us answer the question "Is what we see really there?" By so doing, it seeks to overcome long-standing reservations against visualization as a merely "informal" approach to data analysis and the fear that beautiful pictures may in fact not correspond to any meaningful patterns of substantive scientific interest. The talk illustrates the application and benefits of this visual method by drawing on examples from the social sciences. A little lab exercise will encourage participants to try out visual inference in practice using the statistical programming language R.
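One common way to build such a reference distribution of plots is a "lineup": the plot of the real data is hidden among plots of data generated under the null, and a viewer is asked to pick the one that stands out. The following Python sketch (the workshop itself used R) only prepares the data for such a lineup and leaves the plotting aside. The function name make_lineup, the permutation-based null and the example data are assumptions made for illustration, not the presenter's code.

```python
import random


def make_lineup(x, y, n_panels=20, seed=1):
    """Hide the real (x, y) data among null panels with permuted y values."""
    rng = random.Random(seed)
    real_position = rng.randrange(n_panels)
    panels = []
    for k in range(n_panels):
        if k == real_position:
            panels.append(list(zip(x, y)))
        else:
            y_null = list(y)
            rng.shuffle(y_null)  # permuting y breaks any association with x
            panels.append(list(zip(x, y_null)))
    return panels, real_position


# Made-up data with a clear linear relationship.
x = list(range(10))
y = [2 * v + 1 for v in x]
panels, real_position = make_lineup(x, y)
print(len(panels), real_position)
```

If there were no relationship between x and y, a viewer would single out the real panel among 20 with probability 1/20, which is what gives the visual comparison its test-like interpretation.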

Materials
workshop materials
blog post


2017-10-04 | Workshop | Sebastian Pink (MZES)
Using mainly Stata and increasingly R (and knitr)

Abstract
Like most of you, I am very familiar with Stata. Throughout my project and dissertation work, however, I came to increasingly incorporate R into my data analysis and even my data preparation. In one instance, I had to run specific models for network analysis that I was not able to run in Stata. I then ran the analysis in R but kept doing all of the preceding data preparation in Stata. In another instance, I ran a simulation model in R, which by nature slightly changed its results every time I ran it. As I wanted to avoid a time-consuming copy-and-paste marathon between R and Word, I wrote the manuscript describing this simulation model using knitr, which automatically handed over the values, figures, and tables to a LaTeX processor that produced a nice document. In this talk, I simply describe these developments in my workflow to show you how you may gain from incorporating R or knitr in small doses into your Stata workflow.

Materials
workshop materials


2016-2017
2017-04-26 | Input Talk | Florian Keusch (MZES)
Introduction to Unipark

Abstract
In this Social Science Data Lab, I will give an introduction to the EFS Survey Software from Unipark (Questback). If you have never worked with the tool, then you will learn how to set up a first questionnaire to collect survey data over the Internet. We will discuss basic principles of participant recruitment, web questionnaire layout, and study design to conduct methodologically sound web surveys. This will also include taking into account the increasing number of respondents who participate in web surveys using their smartphone. For those who have already worked with Unipark, we will have time to discuss more advanced features of the software such as working with quotas, lists, and loops.

Materials
presentation


2017-03-29 | Input Talk | Federico Nanni (University of Mannheim)
Topic-based and Cross-lingual Scaling of Political Text

Abstract
Political text scaling aims to linearly order parties and politicians across political dimensions (e.g., left-to-right ideology) based on textual content (e.g., politician speeches or party manifestos). Existing models, such as Wordscores and Wordfish, scale texts based on relative word usage; by doing so, they do not take into consideration topical information and cannot be used for cross-lingual analyses. In our talk, we present our efforts toward developing a topic-based and cross-lingual political text scaling approach. First we introduce our initial work, TopFish, a multi-level computational method that integrates topic detection and political scaling, and show its applicability for temporal aspect analyses of political campaigns (pre-primary elections, primary elections, and general elections). Next, we present a new text scaling approach that leverages semantic representations of text and is suitable for cross-lingual political text scaling. We also propose a simple and straightforward setting for quantitative evaluation of political text scaling.

Materials
presentation


2017-03-15 | Input Talk | Philipp Zumstein (University of Mannheim)
Building Infrastructure for Data-Driven Research
Abstract
Most methods for data-driven research (including Big Data, Data Science, and Digital Humanities) work primarily on text data or numbers. However, there is also a lot of information which is only available in printed books or newspapers. This information has to be first digitized and then further processed to extract the text or data. The main focus of the talk is optical character recognition (OCR). We will look at the OCR workflow in general, discuss some OCR software, and see how you can use these tools in practice. Building such an infrastructure or performing these initial steps may require a considerable amount of time and resources, or may even be a project in itself. The Mannheim University Library runs several infrastructure projects in this area, which will be briefly mentioned.

Materials
presentation


2017-02-15 | Input Talk | Sarah Brockhaus (LMU Munich)
Functional Data Analysis in a Nutshell

Abstract
Functional data analysis (FDA) is a field of statistics that deals with the analysis of data that have a functional character. Functional data include curves, images, surfaces and trajectories. In the following, we will focus on curves. Growth curves are an example of one-dimensional functional data observed over time. Other examples are spectrometric measures over wavelength or blood markers measured continuously over time. FDA is applied in diverse fields including biometry, demography, medicine, linguistics and finance. Instead of analyzing single points on the curves, FDA treats the curves as observation units. The talk will approach FDA rather intuitively to give an idea of functional data. The talk covers basic summary statistics, like mean and variance for functional data, and contains an outlook on more complex methods like regression with functional data.

Materials
presentation


2017-01-18 | Workshop | Simon Munzert (MZES)
Advanced R and Recent Advances in R

Abstract
This one-day course sets out to improve your R skills and make you a more efficient programmer. In particular, you will:

  • become better at file management with R
  • learn all about piping operators
  • understand what functional programming means
  • get an overview of string processing and regular expressions
  • get to know new tools that help you tidy data
  • learn how to manipulate data frames efficiently
  • be able to routinely split-apply-combine your data
  • learn to establish a debugging workflow

Materials
workshop materials


2016-12-16 | Workshop | Richard Traunmüller (University of Mannheim)
Data Visualization

Abstract
Data visualisation is one of the most powerful tools to explore, understand and communicate patterns in quantitative information. At the same time, good data visualisation is a surprisingly difficult task and demands three quite different skills: substantive knowledge, statistical skill, and artistic sense. The course is intended to introduce participants to a) key principles of graphical perception and analytic design, b) useful visualisation techniques for the exploration and presentation of various forms of data and c) new developments of data visualisation for the social sciences, such as visual inference and visualising statistical models.

Materials
workshop materials
blog post


2016-12-14 | Input Talk | Malte Schierholz (MZES)
Fundamentals in Bayesian Statistics

Abstract
Besides the frequentist approach to statistical inference, which was dominant in science in the 20th century, another school exists: Bayesian Statistics. With modern computational techniques, Bayesian data analysis has a proven track-record and established itself as an alternative to frequentist procedures. Sometimes, Bayesian techniques can be applied to complex scientific questions where no frequentist solution exists. This talk gives an introduction to Bayesian statistics. While it is not possible to avoid central mathematical formulas and derivations, I concentrate on concepts, intuitive motivations, and interpretations that underlie the Bayesian view. Critical model assumptions are also discussed. Participants will learn when to mistrust a Bayesian analysis and in which situations it may provide new insights.

Materials
presentation


2016-11-16 | Input Talk | Eike M. Rinke (MZES)
An Open Science Primer for Social Scientists

Abstract
"Open Science" has become a buzzword in academic circles. However, exactly what it means, why you should care about it, and - most importantly - how it can be put into practice is often not very clear to researchers. In this session of the SSDL, we will provide a brief tour d'horizon of Open Science in which we touch on all of these issues and by which we hope to equip you with a basic understanding of Open Science and a practical tool kit to help you make your research more open to other researchers and the larger interested public. Throughout the presentation, we will focus on giving you an overview of tools and services that can help you open up your research workflow and your publications, all the way from enhancing the reproducibility of your research and making it more collaborative to finding outlets which make the results of your work accessible to everyone. Absolutely no prior experience with open science is required to participate in this talk which should lead into an open conversation among us as a community about the best practices we can and should follow for a more open social science.

Materials
presentation


2016-10-19 | Workshop | Sarah Brockhaus (LMU Munich)
Statistical Boosting with mboost

Abstract
The talk will be about model-based boosting. Boosting originated as an algorithm in the field of machine learning. It was further developed to fit statistical regression models, like linear models, generalized linear models and quantile regression models. Boosting can be used in high-dimensional data settings and inherently performs variable selection. The first part of the talk will give some background information on boosting and explain the basic ideas. The second part will be on the practical use of the R package mboost, which provides a flexible toolbox to boost regression models.
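To give an idea of why boosting performs variable selection, here is a minimal sketch of component-wise boosting for squared-error loss, written in plain Python. It is not the mboost implementation and not taken from the workshop materials. The function name componentwise_l2_boost, the default number of steps and step length, and the toy data are assumptions for illustration, and the sketch further assumes that all predictor columns are centered. In each step, every predictor is fitted to the current residuals on its own, and only the best-fitting one receives a small coefficient update.

```python
def componentwise_l2_boost(X, y, n_steps=100, nu=0.1):
    """Component-wise L2 boosting with simple linear base learners.

    Assumes the columns of X are centered. Returns (intercept, coefficients).
    """
    n, p = len(y), len(X[0])
    intercept = sum(y) / n
    coef = [0.0] * p
    fitted = [intercept] * n
    for _ in range(n_steps):
        # Residuals are the negative gradient of the squared-error loss.
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        best = None
        for j in range(p):
            xj = [row[j] for row in X]
            sxx = sum(v * v for v in xj)
            if sxx == 0:
                continue
            beta = sum(v * r for v, r in zip(xj, resid)) / sxx
            rss = sum((r - beta * v) ** 2 for v, r in zip(xj, resid))
            if best is None or rss < best[0]:
                best = (rss, j, beta)
        _, j, beta = best
        # Update only the selected component, shrunken by the step length nu.
        coef[j] += nu * beta
        fitted = [f + nu * beta * row[j] for f, row in zip(fitted, X)]
    return intercept, coef


# Toy data: y depends on the first predictor only; the second is irrelevant.
x0 = [-2, -1, 0, 1, 2]
x1 = [1, -2, 2, -2, 1]
X = [list(row) for row in zip(x0, x1)]
y = [3 * a + 1 for a in x0]
intercept, coef = componentwise_l2_boost(X, y)
print(intercept, coef)
```

Predictors that are never selected keep a coefficient of exactly zero, so stopping the algorithm early yields a sparse model. In this toy example the irrelevant second predictor is never chosen.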

Materials
workshop materials


2016-10-05 | Input Talk | Lars Kaczmirek (University of Vienna)
UCSP: Universal Client-Side Paradata

Abstract
The talk covers the collection of online paradata using the universal client-side paradata script (UCSP). To see which data are collected on the fly in the GESIS panel, check out the documentation at http://kaczmirek.de/ucsp/ucsp.html. We will also hear about EvalAnswer, a tool that helps you automatically code non-response in open questions. This in turn can be used to trigger conversion attempts in online surveys as well as to assign nonresponse codes to open answers in existing survey data sets.


2016-06-15 | Workshop | Simon Munzert (MZES)
Three easy-to-learn tools to scrape data from the Web with R

Abstract
This workshop shows how to

  • use regular expressions to extract data from raw text (or websites)
  • use XPath for static webpage scraping
  • tap APIs from within R
  • scrape data from dynamic webpages (i.e. JavaScript-generated content) using AJAX and Selenium
Obviously, these are four not three tools. However, regular expressions are never easy to learn, so the title is still valid.
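As a small illustration of the first of these tools, the sketch below uses a regular expression to turn raw text into structured records. It is written in Python rather than R and is not part of the workshop materials. The pattern and the field names are assumptions tailored to header lines formatted like the event entries on this page.

```python
import re

# Raw text in the "date | type | presenter (affiliation)" layout.
raw = """2016-10-05 | Input Talk | Lars Kaczmirek (University of Vienna)
2016-06-15 | Workshop | Simon Munzert (MZES)"""

# Named groups capture each field; re.MULTILINE lets ^ and $ match per line.
pattern = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) \| (?P<type>[^|]+?) \| "
    r"(?P<name>.+?) \((?P<affiliation>[^()]+)\)$",
    re.MULTILINE,
)

events = [m.groupdict() for m in pattern.finditer(raw)]
for event in events:
    print(event)
```

Lines that deviate from the assumed layout, such as entries without an affiliation in parentheses, are silently skipped by this pattern, which is one reason regular expressions take some care to get right.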

Materials
workshop materials