14249
Proseminar
Summer semester 2023 (SoSe 23): Data collection and analysis with Python: Survey and Introduction
Jesse Paul Lehrke
Information for students
Please note: class times vary!
Additional information / Prerequisites
Students are requested to have a basic familiarity with Python 3, acquired through any short (circa 10 modules), free introductory course. Codecademy, Udemy, and even YouTube have excellent options. The last of these requires you to install Python and an IDE (Integrated Development Environment), and thus has a higher entry barrier. You do not need to remember everything. It is simply beneficial if this course is not the first time you encounter ideas such as data structures (lists, dictionaries, strings, tuples), control flow (loops, try/except), and user-defined functions. Please make sure you choose a course on Python 3. If your course includes lessons on SQL, you may skip those.
Please note that while data science has mathematical (statistical) elements, increasingly so at more advanced levels, there is no math requirement for this course, and students of any mathematical aptitude are encouraged to participate.
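As a rough self-check, the prerequisite ideas named above (lists, tuples, dictionaries, strings, loops, try/except, and user-defined functions) fit into a few lines of Python 3. The sketch below is only an illustration: the `videos` data and the `parse_views` helper are made up for this example and are not course material.

```python
# Hypothetical mini-dataset: a list of (title, view count as text) tuples.
videos = [
    ("Intro to Python", "1200"),
    ("Web scraping basics", "n/a"),
    ("Text analysis", "860"),
]

def parse_views(raw):
    """Convert a view-count string to an int, or return None if it is not a number."""
    try:
        return int(raw)
    except ValueError:
        return None

# Loop over the list and build a dictionary keyed by the lowercased title string.
views_by_title = {}
for title, raw in videos:
    views_by_title[title.lower()] = parse_views(raw)

print(views_by_title)
```

If each line here looks at least vaguely familiar, your preparation is probably sufficient.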
Preparation
Laptops are required for the course, as it aims for learning-by-doing. Please bring your machine to every class.
Please set up a Google account and use it to sign in to Google Colab (https://colab.research.google.com/). Colab will be our main tool, as it lets you use Python with cloud-based processing power and without installing anything on your personal machine. Early in the course we will explore other installation options, as these can be required to use certain tools.
For our tasks, a YouTube account and possibly a Weibo account may be beneficial. We will use these later in the course, and the extent to which we use them will depend on student learning.
Lastly, to facilitate your learning, it is recommended (but not strictly required) to set up a Stack Overflow account (https://stackoverflow.com/).
Comment
Data science is rapidly becoming an essential method for research (applied, scientific, and investigatory), even within the social sciences. The purpose of this course is to enable students to begin their journey into data science. The course is structured first as a survey course, because data science is a large field that takes a lifetime to learn. It will therefore introduce students to the possibilities data science offers, the questions it can answer, and how to think and structure work like a coder. To enable students to begin this lifelong learning, the course also has an introductory component, so students will leave with the practical ability to conduct basic data collection, preparation, and analysis. These skills will serve as a solid foundation on which to build.
Upon successfully completing the course students will:
1. Be familiar with the primary Python tools for data analysis/science and how to structure a data-centric project,
2. Be able to efficiently collect, manage, and prepare a large quantity of data from web-based sources,
3. Be able to perform standard Natural Language Processing (NLP) tasks on textual data,
4. Be familiar with how to perform standard Machine Learning classification tasks on textual data,
5. Be familiar with how to perform basic audio-visual data preparation and analysis.
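To give a flavor of what a "standard NLP task" in the list above can look like, here is a minimal sketch of tokenizing a text and counting word frequencies using only the Python standard library. It is an illustration under stated assumptions, not the course's actual method: the `word_frequencies` function and the sample sentence are invented for this example, and the course may well use dedicated NLP libraries instead.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=3):
    """Lowercase the text, split it into word tokens, and return the most common ones."""
    # Simplifying assumption: a "word" is a run of ASCII letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)

# Made-up sample sentence for illustration only.
sample = "Data science is a journey. Data collection comes first, then data analysis."
print(word_frequencies(sample))
```

Note that this simple pattern only suits English-like text. Chinese is written without spaces between words, so it would need a dedicated word segmentation step before counting.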
The course will be conducted in English, as noted, but the data we work with will be in English and Chinese (no particular language skills are required, though).
The course will consist of short lectures and live-code/code walkthroughs, but will also have a large interactive buddy and group work element.
The course will move fast and be challenging, but we will be open to continuous feedback and adjustment so that nobody is left behind and we all enjoy the learning experience.
6 sessions
Regular sessions of the course
Fri, 26.05.2023 14:00 - 17:00
Fri, 02.06.2023 12:00 - 18:00
Fri, 09.06.2023 14:00 - 17:00
Fri, 16.06.2023 12:00 - 18:00
Fri, 23.06.2023 12:00 - 18:00
Fri, 30.06.2023 14:00 - 17:00