Course Information
The course gives an introduction to obtaining, transforming, exploring and visualising data using modern tools, which is at the core of data science.
We will specifically be using the following tools: Python/R for programming, Jupyter Notebook for analysing data, and Git together with GitHub for version control. This course will give an introduction to these tools and how they are used in data science.
An important aspect of data science, and of science in general, is reproducibility. There will be great emphasis on creating reproducible code and reports, and the tools mentioned above will be used for this.
This course is designed for self-study. Each week there will be course material that you should read up on. We will not have traditional lectures in this course. Rather, we will meet in person once a week, where we will give a live demo of the material covered the week before. In previous iterations of the course, the live demos have been much appreciated. After the live demo we will be available until 17.00 for help with assignments or questions. In addition, we will be available on Fridays between 13 and 16 via Zoom for questions.
The main literature of the course will be:
- R for Data Science by Grolemund and Wickham, which is specific to R.
- Python for Data Analysis by Wes McKinney, which is for Python.
Both have open access versions, which can be accessed through the links above. You can buy the books if you want, but most of the information can be found on the web!
The teacher of the course is Taariq Nazar.
At the bottom of each page on this website you can make suggestions if you feel that anything on that page is unclear or plain wrong. It can be anything from spelling mistakes to unclear formulations. We recommend that you do this.
Finally, we should give credit where it is due. This course is based on ideas & material developed by Martin Sköld, Erik Thorsén, Michael Höhle and Felix Günther.
Examination
Examination of the course consists of three parts: weekly home assignments, a project and an exam.
Home assignments
There will be a home assignment every week with strict deadlines. Read more on Homework.
Project
There will be an individual project, where you use the tools in this course to analyse a dataset of your own choice. Read more on project.
Exam
The last part of this course is an oral exam where you will apply methods taught in this course to a given problem. Read more on exam.
Contact
The contact info for the TAs can be found on the Moodle page.