• Genome Data Science

    We develop methods and tools to work with tens of thousands of genomes and analyze and integrate the corresponding data.


Learning in Big Data Analytics


392232 Schönhuth Winter 2020/21 Tue 16:15 - 17:45 (S) in Zoom

Contents

The recent surge of machine learning (ML) has opened up new opportunities for analyzing big datasets. Basic big data techniques that do not rely on ML include, for example, identifying similar items in large datasets or distributing jobs across large compute clusters. Beyond these, ML-based techniques make it possible to extract a wide variety of knowledge from large datasets with high accuracy.

The seminar will start with a mini lecture. First, the lecture will explain how to cluster datasets. Clustering is an 'unsupervised' machine learning technique that can be used, for example, to mine social network graphs. Second, the lecture will discuss 'supervised' machine learning techniques (of which 'deep learning' is probably the most prominent recent example) and their use in analyzing big data. The mini lecture will be followed by seminar presentations, given in small groups of 2-3 students.
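To give a rough idea of what 'unsupervised' clustering means in practice, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is only an illustration, not necessarily the method covered in the lectures. The function names `kmeans` and `sq_dist` and the toy data are assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Cluster `points` (a list of tuples) into k groups.

    Returns (centers, labels), where labels[i] is the index of the
    center closest to points[i]. No class labels are needed as input,
    which is what makes the method 'unsupervised'.
    """
    rng = random.Random(seed)
    # Assumption for this sketch: pick k input points as initial centers.
    centers = rng.sample(points, k)

    def nearest(p):
        return min(range(k), key=lambda i: sq_dist(p, centers[i]))

    for _ in range(iterations):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centers[i]  # keep the center of an empty cluster
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers

    labels = [nearest(p) for p in points]
    return centers, labels


# Toy data (hypothetical): two well-separated groups of 2D points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = kmeans(points, k=2)
print(sorted(centers))
print(labels)
```

A supervised method would instead be given a known label for each training point and learn to predict labels for new points.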

Literature

Time table

Date Topic
10.11.2020 Logistics (slides)
17.11.2020 -
24.11.2020 Introduction ML + SVMs (slides)
01.12.2020 Web Advertisements I (slides)
08.12.2020 Web Advertisements II (slides)
15.12.2020 Social Network Analysis I (slides)
22.12.2020 Social Network Analysis II (slides)
23.12.2020 - 04.01.2021 Christmas Break
05.01.2021 -
12.01.2021 Logistics of presentations
19.01.2021 -
26.01.2021 Adaptive random forests for evolving data stream classification; Time-Aware Prospective Modeling of Users for Online Display Advertising
02.02.2021 DeepFM: A Factorization-Machine based Neural Network for CTR Prediction; Scalable K-Means++
09.02.2021 Wide and Deep Recommender Systems