Statistical methods for linguistic research: Advanced Tools

Language & Computation

Statistical methods have become central to almost every data-driven research problem, both in computational linguistics and in linguistic research more broadly. This is true in industry as well as in academia. And yet, many users of statistical tools have only a vague understanding of the central ideas that underpin statistical theory. For example, many researchers do not understand what a p-value tells you about the research hypothesis, and even professional users of statistics, with many years of practical experience behind them, cannot accurately explain what a confidence interval is. Understanding these concepts is crucial for drawing correct inferences from data, so learning them is vitally important for students of language and computation. In this course (which presupposes the content covered in the course Statistical methods for linguistic research: Foundational Ideas in Week 1), we will move on to Bayesian Data Analysis (BDA). The focus is on multiple regression and linear (mixed) modeling, as these tools are of central importance in linguistics. We will build the story up to the point where the student can fit linear (mixed) models in Stan. For further details, see the course web page.

Second week
17:00 - 18:30 - slot 4