Coding for Data Science and Data Management, DSE UniMI – 2019/2020

The course aims to provide technical skills in coding and scripting for data analysis, and in managing the persistent storage of the sources and results involved in that analysis. On the one hand, the Python programming language and the R framework are illustrated, with the goal of covering the essential notions of data structures and control structures in both. On the other hand, the course presents the core notions of relational databases, such as keys, integrity, and primary/foreign key constraints, as well as the SQL language for data definition, manipulation, and querying. Recent NoSQL solutions are also discussed, with a special focus on MongoDB, a document-oriented system.
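As a rough illustration of the database notions named above (primary keys, foreign keys, and integrity constraints), the following minimal sketch uses Python's built-in sqlite3 module. It is not taken from the course material: the tables student and exam, their columns, and the sample rows are hypothetical, and SQLite is assumed only because it needs no server setup.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Data definition: each table has a primary key; exam.student_id is a
    # foreign key that must match an existing student.id.
    conn.execute(
        "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE exam ("
        " id INTEGER PRIMARY KEY,"
        " student_id INTEGER NOT NULL REFERENCES student(id),"
        " grade INTEGER)"
    )
    # Data manipulation: insert one student and one exam for that student.
    conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ada')")
    conn.execute("INSERT INTO exam (student_id, grade) VALUES (1, 30)")
    return conn


def try_orphan_exam(conn):
    """Try to insert an exam for a student that does not exist."""
    try:
        conn.execute("INSERT INTO exam (student_id, grade) VALUES (99, 18)")
    except sqlite3.IntegrityError:
        # The foreign key constraint rejected the row.
        return False
    return True


conn = build_demo()
# Query: join the two tables through the key relationship.
rows = conn.execute(
    "SELECT s.name, e.grade FROM student s JOIN exam e ON e.student_id = s.id"
).fetchall()
orphan_accepted = try_orphan_exam(conn)
```

Here rows holds the joined student and grade pairs, while orphan_accepted records whether the database accepted an exam that points to a missing student; with the constraint enforced, the insert is rejected.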

Course Structure

  1. R
  2. Python
  3. Databases

Syllabus (R)

Coding for Data Science and Data Management (R), DSE UniMI – 2019/2020

Lectures (R)

Introduction to the R framework and RStudio (html)

Basic Data Types (html)

Basic Data Structures (html)

Basic operations (html)

Time Series (html)

Control Structures (html)

User-Defined Functions (html)

Performance Optimization (html)

Data Acquisition (html)

Data visualization (ggplot2) (plotly)

Building interactive interfaces, documents and websites (shiny) (rmarkdown)

Building R packages (structure) (metadata) (data)

Midterm Exam (R)

Midterm exam: R package (pdf) (grades). The final grades will be available in the first week of March.