Appendix A: Syllabus

A.1 Canvas & Recorded Lectures

We will use the learning management system Canvas to conduct some course business, including distributing and submitting assignments. Lectures will also be recorded and made available in Canvas for later viewing.

A.2 INF511 Book Website

I have compiled a course website that has supplemental text and coded examples that we will walk through in class. This website essentially serves as the course textbook and is required reading. There will be other required reading materials (see Section A.8).

A.3 Course Purpose

INF 511 Modern Regression I is the first course in a two-semester sequence required for the MS and PhD in Informatics and Computing (INF). (See INF 512 Modern Regression II.) These courses are designed to serve the computationally oriented statistical analysis needs of the INF graduate program. Through a series of hands-on individual or team-based assignments, students will master the full workflow of a statistical analysis: preparing data, exploring data using numerical and/or graphical methods, modeling data, diagnosing model assumptions, remodeling, and drawing final inferences. This course will provide INF graduate students with the necessary foundation for the more specialized statistical methods and applications that they will encounter in subsequent INF courses, such as INF 626 Applied Bayesian Modeling and the more prediction-oriented INF 504 Data Mining and Machine Learning. More generally, INF 511 provides skills widely applicable to the analysis of data across science and engineering.

INF 511 Modern Regression I covers fundamental probability models and their use in the analysis of independent data with linear models within both frequentist and Bayesian statistical frameworks. Random variables, expectation, variance, covariance, correlation. Joint, conditional and marginal distributions. Linear combinations of random variables; central limit theorem; matrices, vectors, basic matrix arithmetic, matrix formulation of linear statistical models (regression and ANOVA) for independent data, normal likelihood, least squares, Gauss-Markov theorem, \(t\) and \(F\) sampling distribution-based inference for linear combinations of parameters. Corresponding Bayesian analysis including prior and posterior distributions, introductory Markov chain Monte Carlo methods. Diagnostics, including graphical residual analysis. Scope of inference (randomization and causality, random sampling and population).

A.4 Course Student Learning Outcomes

The overall learning outcome for this course is a demonstrated acquisition of skills and conceptual understanding of statistical methods to enable complete and valid statistical analyses of primarily independent data modeled with relatively traditional linear statistical models from both a frequentist and Bayesian perspective at a level commensurate with high expectations of a well-trained graduate student in the quantitatively and computationally intensive field of informatics. For these data, and using the methods and concepts detailed in the Course Purpose, students should be able to:

  • Use numerical and graphical exploratory data analysis tools to prepare data and to develop conceptual models of data
  • Transform conceptual models of data into formal linear statistical models of data
  • Implement linear model methods in modern software, such as R or Stan, to analyze data
  • Demonstrate an understanding that models are an abstracted simplification of real processes by effectively using linear model diagnostics, remedial measures, remodeling and final model validation/confirmation methods.
  • Demonstrate an understanding of the limitations of data and methods by communicating how the data and methods relate to randomization and causality, random selection and population, over-fitting and exploratory/confirmatory analyses, and the trade-off between efficiency of inference and robustness to departures from method assumptions
  • Demonstrate an ability to work effectively in a team environment to solve realistic problems, which may be beyond any one individual's ability to address, as indicated by peer review or other assessments of teamwork.

A.5 Assessments of Course Student Learning Outcomes

There will be three assessment strategies: problem sets, homework, and quizzes. Problem sets will have dedicated in-class time to complete, whereas homework assignments will be done entirely outside of class. Problem sets will be primarily assessed as complete/incomplete, whereas homework assignments will be graded in full. The assignment format is designed in part to mimic and reinforce the presentation of analyses in class and in the notes, and to encourage discussion among students. There will be approximately 5 in-class quizzes, roughly one every 3 weeks, in lieu of exams. Attendance & Participation will also be used to assess your course performance.

A.6 Grading System

| Problem Sets | Homework | Quizzes | Attendance |
|---|---|---|---|
| 10% | 50% | 35% | 5% |
  • Attendance & Participation. In-person attendance is required. See the University Policy on excused absences. You are responsible for arranging with your classmates to obtain any in-class material you miss due to an absence. (Recall that lectures will be recorded and will be available in Canvas.) Participation, in the form of responding to questions in class, asking questions, and attending office hours, may be used to decide “borderline” grade cases.
Prior notification of absence

A student must notify the instructor prior to absence. Students should notify the instructor of an upcoming absence via email, and the instructor will evaluate whether the absence will be counted as excused or unexcused.

  • Assignments. There are two categories of assignments: Problem Sets and Homework. See Section A.5 for the distinctions and Section A.9 for due dates. Assignments will be posted periodically via Canvas and are to be submitted electronically, via Canvas, by the due date/time indicated in Canvas.
Late assignments

Late assignments will not be accepted and will receive a score of zero.

  • Quizzes. There will be approximately five in-class quizzes that are all cumulative. Each quiz will be designed to take approximately 20-30 minutes, and each will be scaled to 100 points so that they are equally weighted.
In-class quizzes

If a student does not attend a class when a quiz is given, that quiz will receive zero points. The only exception is if a student notifies the instructor, before class, of an impending absence. The absence must be formally excused in writing by the instructor before class for a make-up quiz to be considered. Therefore, the notification of absence must be received at least several hours prior to class time.

A.7 Course Grades

Overall course grades will follow a typical scale:

| To earn the letter grade | A | B | C | D | F |
|---|---|---|---|---|---|
| You need at least this score | 90 | 80 | 70 | 60 | 0 |

While you should be able to compute an estimate of your current grade using the information above, I will attempt to use the Grade feature in Canvas so that you are able to check your grades. Grading mistakes may occur, and students are encouraged to discuss such concerns with the instructor during office hours or by appointment.
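As a rough illustration of that estimate, and assuming each category is expressed as a percentage (0 to 100) and combined as a simple weighted average using the weights in Section A.6, the calculation would look like the following. The scores used here are hypothetical and chosen only to show the arithmetic; the grade reported in Canvas may differ if, for example, not all assignments have been entered yet.

\[
\text{Course score} = 0.10\,(\text{Problem Sets}) + 0.50\,(\text{Homework}) + 0.35\,(\text{Quizzes}) + 0.05\,(\text{Attendance})
\]

\[
0.10\,(100) + 0.50\,(88) + 0.35\,(80) + 0.05\,(100) = 10 + 44 + 28 + 5 = 87
\]

Under the scale above, a course score of 87 falls between 80 and 90 and would correspond to a B.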

A.8 Readings and Materials

  • Lecture Materials: Lecture topics, course notes, readings, and assignments will be made available as the semester progresses. For each lecture, a document will be posted on Canvas. During lecture, we will work together to fill out this document with written notes presented on the (virtual) whiteboard. Therefore, it is essential to download and print these materials prior to attending class.

  • Required Text:

  • Suggested Text:

  • Computing. Each student must bring their laptop to class with the following (freely available) software pre-installed:

    • Latest version of RStudio Desktop IDE
    • Compatible version of R software environment
    • Quarto publishing system (for documents with integrated code).
    • You must have a functional PDF Engine to render Quarto (.qmd) documents into PDF. See this section on PDF Engines, and be sure to test whether you can render an example .qmd file into a PDF.
    • Stan programming language, via the rstan package for R.
    • We may also use the R package rstanarm, which can be installed with the install.packages() function.

A.9 Living Course Schedule

This schedule will be updated continually throughout the course. Check back often.

| Week | Date | Topic | Reading Due | Assignment Due | Quiz |
|---|---|---|---|---|---|
| Week 1 | 16-Jan | Introduction | Syllabus, JM(Ch1&2), JB(1-20, AppA) | | |
| Week 1 | 18-Jan | Probability distributions | JB(pg1-20), JM(Ch3), FAR(Ch1) | PS-0 | |
| Week 2 | 23-Jan | Probability distributions | | | |
| Week 2 | 25-Jan | Least Squares | JB(pg20-38, AppB), JM(Ch4) | PS-1, HW-1 | |
| Week 3 | 30-Jan | Least Squares | FAR(Ch2) | | Quiz 1 |
| Week 3 | 1-Feb | Least Squares | JB(pg41-60) | PS-2 | |
| Week 4 | 6-Feb | Least Squares | | HW-2 | |
| Week 4 | 8-Feb | Snow Day - NO CLASS | | | |
| Week 5 | 13-Feb | Least Squares | JB(pg157-170), FAR(Ch3) | | |
| Week 5 | 15-Feb | Hypothesis Testing | JB(pg61-79), JM(Ch5), FAR(Ch10) | PS-3 | |
| Week 6 | 20-Feb | Hypothesis Testing | JB(pg80-127), FAR(Ch4) | | |
| Week 6 | 22-Feb | Hypothesis Testing | | HW-3 | Quiz 2 |
| Week 7 | 27-Feb | Maximum Likelihood | JB(pg38-40), JM(Ch6) | | |
| Week 7 | 29-Feb | Maximum Likelihood | FAR(Ch6), JB(skimCh6) | | |
| Week 8 | 5-Mar | Maximum Likelihood | FAR(Ch6), JB(skimCh6) | PS-4 | |
| Week 8 | 7-Mar | Maximum Likelihood | FAR(Ch7) | | |
| Week 9 | 12-Mar | Spring Break - NO CLASS | | | |
| Week 9 | 14-Mar | Spring Break - NO CLASS | | | |
| Week 10 | 19-Mar | Maximum Likelihood | FAR(Ch8) | HW-4 | Quiz 3 |
| Week 10 | 21-Mar | CLASS CANCELLED | | | |
| Week 11 | 26-Mar | Model Comparison | JB(refreshCh3.2), FAR(Ch10, Ch 11) | | |
| Week 11 | 28-Mar | Model Comparison | JB(pg170-223) | PS-5 | |
| Week 12 | 2-Apr | ANOVA | JB(Ch14), JM(Ch7), FAR(Ch14) | | |
| Week 12 | 4-Apr | ANOVA | JB(Ch15.1-15.3) | HW-5 | |
| Week 13 | 9-Apr | ANOVA | JB(Ch15.4-15.5), FAR(Ch15) | | Quiz 4 |
| Week 13 | 11-Apr | ANOVA | | PS-6 | |
| Week 14 | 16-Apr | ANOVA | FAR(Ch16) | | |
| Week 14 | 18-Apr | Bayesian Inference | | HW-6 | |
| Week 15 | 23-Apr | Bayesian Inference | | | |
| Week 15 | 25-Apr | Bayesian Inference | | PS-7 | |
| Week 16 | 30-Apr | Bayesian Inference | | | |
| Week 16 | 2-May | Bayesian Inference | | HW-7 | Quiz 5 |

A.10 Course Policies

  • Students are encouraged to attend the instructor's office hours. If a student cannot attend regular office hours, an appointment may be considered if requested via email with sufficient advance notice.
  • Emails addressed to the instructor must be respectful and professional. The instructor will respond to emails promptly, within 2 business days. The instructor will generally not respond to emails on weekends or after working hours (i.e., in the evenings), so please plan accordingly.
  • Cheating, including plagiarism of writing or computer code, will not be tolerated. All academic integrity violations are treated seriously. Academic integrity violations will result in penalties including, but not limited to, a zero on the assignment, a failing grade in the class, or expulsion from NAU. The University’s Academic Integrity policies (Section A.11) will be strictly enforced.
  • The paramount policy of this course is that each student is required to demonstrate respect towards their peers and the instructor. The behavior of the instructor is held to the same standard. Students and instructors come from all walks of life, and may identify with a variety of ethnic, racial, religious, gender and sexual identities. Diversity of thought and perspective enhances our science.
  • Attendance is required, and repeated unexcused absences may affect the student’s grade.
  • The instructor will not provide copies of course notes. These materials should be sought from the students’ peers or by watching the recorded lectures.
  • Electronic device usage must support learning in the class. All cell phones, PDAs, music players and other entertainment devices must be turned off (or put on silent) during lecture.
  • Grades will be entered in Canvas. Please check LOUIE for your final grade.

A.11 University Policies

Please see this document for all of the required Syllabus Policy Statements that equally apply to this course.