Contents

1 Introduction: Statistics Meets Corpus Linguistics 1
2 Vocabulary: Frequency, Dispersion and Diversity 38
3 Semantics and Discourse: Collocations, Keywords and Reliability of Manual Coding 66
4 Lexico-grammar: From Simple Counts to Complex Models 102
5 Register Variation: Correlation, Clusters and Factors 139
6 Sociolinguistics and Stylistics: Individual and Social Variation 183
7 Change over Time: Working with Diachronic Data 219
8 Bringing Everything Together: Ten Principles of Statistical Thinking, Meta-analysis and Effect Sizes 257
Final Remarks 283
References 285
Index 294