HarvardX Working Papers

The HarvardX Working Papers consist of papers and reports that give researchers and the broader public access to current research findings from the HarvardX research organization. They are generally written by HarvardX researchers or their collaborators, and they use HarvardX data to inform the learning of students on and off campus.

  1. HarvardX and MITx: The First Year of Open Online Courses

    The first two HarvardX open online courses launched on the edX platform in October 2012; four more courses launched in Spring 2013. This report and its companion course reports examine these initial six course offerings alongside the initial 11 MITx courses. Now that data has been delivered and analyzed, it is an ideal time to examine these initial offerings in order to inform ongoing course design and research. This summary report addresses simple questions across multiple courses: Who registered? What did they do? Where are they from? We strongly encourage reading these reports as a package to understand the full story of the HarvardX and MITx initiative in its first year.

  2. PH207x: Health in Numbers and PH278x: Human Health and Global Environmental Change 

    In the 2012-2013 academic year, the first two Harvard School of Public Health courses were offered through HarvardX on the edX platform: PH207x: Health in Numbers and PH278x: Human Health and Global Environmental Change. They were taught by Professors Earl Francis Cook and Marcello Pagano, and Aaron Bernstein and Jack Spengler, respectively. This report describes the structure of these two courses, the demographic characteristics of registrants, and the activity of students. This report was prepared by researchers external to the course teams and is based on examination of the courseware, analyses of the data collected by the edX platform, and interviews and consultations with the course faculty and team members.

  3. CB22x: HeroesX 

    CB22x was offered as a HarvardX course on the edX platform in Spring 2013. It was taught by Professor Greg Nagy. The report is based on examination of the courseware, analyses of the data collected by the edX platform, and interviews and consultations with the course faculty and team members.

  4. ER22x: JusticeX 

    ER22x was offered as a HarvardX course on the edX platform in Spring 2013. It was taught by Professor Michael Sandel. The report was prepared by researchers external to the course team, based on an examination of the courseware, analyses of data collected by the edX platform, and interviews with the course faculty and team members.

  5. HLS1x: CopyrightX

    This report describes the first Harvard Law School open online course, first offered through HarvardX on the edX platform in Spring 2013. The course was taught by Professor William Fisher, who also prepared this report. Modified versions will be offered in the spring semesters of 2014 and 2015. This document describes and evaluates the 2013 version and outlines plans for the 2014 version.

  6. Computer-Assisted Reading and Discovery for Student Generated Text in Massive Open Online Courses

    Dealing with the vast quantities of text that students generate in a Massive Open Online Course (MOOC) is a daunting challenge. Computational tools are needed to help instructional teams uncover themes and patterns as MOOC students write in forums, assignments, and surveys. This paper introduces to the learning analytics community the Structural Topic Model, an approach to language processing that can (1) find syntactic patterns with semantic meaning in unstructured text, (2) identify variation in those patterns across covariates, and (3) uncover archetypal texts that exemplify the documents within a topical pattern. We show examples of computationally aided discovery and reading in three MOOC settings: mapping students’ self-reported motivations, identifying themes in discussion forums, and uncovering patterns of feedback in course evaluations.

  7. Staggered Versus All-at-Once Content Release in Massive Open Online Courses: Evaluating a Natural Experiment

    We report on an experiment testing the effects of releasing all of the content in a Massive Open Online Course (MOOC) at launch versus a staggered release. In 2013, HarvardX offered two “runs” of the HeroesX course: in the first, content was released weekly over four months; in the second, all content was released at once. We develop three operationalizations of “ontrackedness” to measure how closely students participated in sync with the recommended syllabus. Ontrackedness in both versions was low, though in the second run, mean ontrackedness was approximately one-half of the level in the first. We find few differences in persistence, participation, and completion between the two runs. Controlling for a student’s number of active weeks, we estimate modest positive effects of ontrackedness on certification. The revealed preferences of students for flexibility and the minimal benefits of ontrackedness suggest that releasing content all at once may be a viable strategy for MOOC designers.

  8. Socioeconomic Status and MOOC Enrollment: Enriching Demographic Information with External Datasets

    To minimize barriers to entry, massive open online course (MOOC) providers collect minimal demographic information about users. In isolation, this data is insufficient to address important questions about socioeconomic status (SES) and MOOC enrollment and performance. In this paper, we demonstrate how third-party datasets can be used to enrich demographic portraits of MOOC students and answer fundamental questions about SES and MOOC enrollment.

  9. Addressing Common Analytic Challenges to Randomized Experiments in MOOCs: Attrition and Zero-Inflation

    Massive open online course (MOOC) platforms increasingly allow easily implemented randomized experiments. The heterogeneity of MOOC students, however, leads to two methodological obstacles in analyzing interventions to increase engagement. (1) Many MOOC participation metrics have distributions with substantial positive skew from highly active users as well as zero-inflation from high attrition. (2) High attrition means that in some experimental designs, most users assigned to the treatment never receive it; analyses that do not consider attrition result in “intent-to-treat” (ITT) estimates that underestimate the true effects of interventions. We address these challenges in analyzing an intervention to improve forum participation in the 2014 JusticeX course offered on the edX MOOC platform. We compare the results of four ITT models (OLS, logistic, quantile, and zero-inflated negative binomial regressions) and three “treatment-on-treated” (TOT) models (Wald estimator, 2SLS with a second-stage logistic model, and instrumental variables quantile regression). A combination of logistic, quantile, and zero-inflated negative binomial regressions provides the most comprehensive description of the ITT effects. TOT methods then adjust the ITT underestimates. Substantively, we demonstrate that self-assessment questions about forum participation encourage more students to engage in forums and increase the participation of already active students.
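
As a rough illustration of the ITT-versus-TOT distinction described in paper 9, the sketch below computes a Wald estimate: the ITT effect on the outcome divided by the difference in treatment receipt between the two randomized arms. It is not the paper's code. The function name `wald_tot`, the variable names, and the toy data are all invented for illustration, and the estimate is only meaningful under the usual assumption that random assignment affects the outcome solely through receipt of the treatment.

```python
from statistics import mean


def wald_tot(outcomes, assigned, received):
    """Return (ITT effect, take-up difference, Wald TOT estimate).

    outcomes: outcome per user (e.g. number of forum posts)
    assigned: 1 if the user was randomly assigned to treatment, else 0
    received: 1 if the user actually received the treatment, else 0
    """
    # Split outcomes and treatment receipt by randomized arm.
    y1 = [y for y, z in zip(outcomes, assigned) if z == 1]
    y0 = [y for y, z in zip(outcomes, assigned) if z == 0]
    d1 = [d for d, z in zip(received, assigned) if z == 1]
    d0 = [d for d, z in zip(received, assigned) if z == 0]

    itt = mean(y1) - mean(y0)      # effect of being assigned
    take_up = mean(d1) - mean(d0)  # share of users moved into treatment
    if take_up == 0:
        raise ValueError("no difference in treatment receipt between arms")
    return itt, take_up, itt / take_up


# Invented toy data: 8 users per arm; only 2 of the 8 assigned users
# stay long enough to receive the treatment (high attrition).
assigned = [1] * 8 + [0] * 8
received = [1, 1, 0, 0, 0, 0, 0, 0] + [0] * 8
outcomes = [4, 4, 0, 0, 0, 0, 0, 0] + [2, 2, 0, 0, 0, 0, 0, 0]

itt, take_up, tot = wald_tot(outcomes, assigned, received)
print(f"ITT = {itt}, take-up = {take_up}, TOT = {tot}")
```

With these invented numbers, the ITT effect is 0.5 posts per assigned user, but only a quarter of assigned users received the treatment, so the Wald estimate is 2.0 posts per treated user. This is the sense in which an ITT estimate can understate the effect on users who actually receive an intervention.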