Speaker:
Hagen Langer (Universität Bremen)
Course Description:
Data Mining, often used as a synonym for Knowledge Discovery in Databases (KDD), is the application of symbolic and statistical methods to extract knowledge from large, unstructured, and often previously unexplored data collections. Typical results are the detection of structures, e.g., groups of similar or related data subsets, patterns, associations, and trends. The course will cover
- data preparation and preprocessing,
- partitional and hierarchical clustering algorithms, density-based clustering, clustering validation,
- classification methods,
- time series analysis, and
- data mining applications.
A special focus of this course will be on Text Mining, i.e., Data Mining techniques applied to large-scale textual databases. Mining natural language documents usually requires additional, language-specific preprocessing, such as part-of-speech tagging, stemming, compound analysis, and chunking, in order to achieve high-quality results.
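As a rough illustration of what such preprocessing can look like, the following Python sketch tokenizes a text, drops stop words, and applies a very crude suffix-stripping stemmer before counting terms. It is not taken from the course material: the stop-word list, the suffix list, and all function names are hypothetical and chosen only for this example, and the stemmer is a deliberately naive stand-in for a real stemming algorithm. Part-of-speech tagging, compound analysis, and chunking are not covered.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are"}

# Hypothetical suffix list; real stemmers use far more careful rules.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into purely alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, and stem the remaining tokens."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def term_frequencies(text):
    """Count how often each stemmed term occurs in the text."""
    return Counter(preprocess(text))
```

Calling `preprocess("The miners are mining clustered documents")` under these assumptions yields `["miner", "min", "cluster", "document"]`. The over-aggressive stem "min" hints at why more careful, language-aware preprocessing tends to matter for result quality.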
Disciplines/Research Areas:
Computer Science, Artificial Intelligence, Computational Linguistics, Applied Statistics.
CV:
Hagen Langer is a senior researcher in the Artificial Intelligence group of the Department of Computer Science at the University of Bremen (Germany). Areas of research include knowledge representation, multiagent systems, machine learning, and natural language processing.