## Natural Language Processing & Information Retrieval

**CS 546 Structured Prediction: Theories and Applications to Natural Language Processing** *Spring 2011 - Roth*

Making decisions in natural language processing problems often involves assigning values to sets of interdependent variables where the expressive dependency structure can influence, or even dictate, what assignments are possible. Structured learning problems such as semantic role labeling provide one such example, but the setting is broader and includes a range of problems such as named entity and relation recognition and co-reference resolution. The setting is also appropriate for cases that may require a solution to make use of multiple models (possibly pre-designed or pre-learned components) as in summarization, textual entailment and question answering.

This semester, we will devote the course to the study of *structured learning problems* in natural language processing.

**CS 410 / LIS 410 Introduction to Text Information Systems** *Spring 2010 - Zhai*

Introduction to the theory, design, and implementation of text-based information systems. Text analysis, retrieval models (e.g. Boolean, vector space, probabilistic), text categorization, text filtering, clustering, retrieval system design and implementation, and applications to web information management.

**LING 400 Introduction to Linguistic Structure** *Fall 2009-Lasersohn, Spring 2010-Buesa Garcia*

Introduction to the theory and methodology of the science of linguistics with special reference to phonology, morphology, syntax and semantics.

**LING 406 Introduction to Computational Linguistics** *Spring 2010 - Girju*

Introduces the field of natural language processing and computational linguistics. Topics include finite-state methods, parsing, probabilistic methods, machine learning in NLP, computational semantics and applications of NLP technology. The course is mostly about concepts rather than programming, though some programming assignments will be given.

**CS 498JH Special Topics -- Introduction to Natural Language Processing** *Fall 2009-Hockenmaier*

This course will provide an introduction to computational linguistics, from morphology (word formation) and syntax (sentence structure) to semantics (meaning) and natural language processing applications such as parsing, machine translation, generation and dialog systems. Prerequisites: Formal language and automata theory (CS373 or equivalent). Programming experience is necessary for the assignments. Prior exposure to linguistics is not required.

**CS 598JH Special Topics - Advanced Natural Language Processing** *Spring 2010-Hockenmaier*

Theory and applications of Bayesian models. In recent years, Bayesian techniques have been applied to a number of natural language processing tasks. The aim of this course is to provide students with an understanding of the theory behind these models, and to enable them to apply these techniques in their own research. We will study Bayesian models such as Latent Dirichlet Allocation (topic models) and (Hierarchical) Dirichlet Processes and their applications to various natural language processing tasks. We will review both variational and sampling-based inference algorithms. The course will consist of a research project and a mixture of lectures and seminar-style presentations.

**LING 591A Seminar in Linguistic Analysis** *Spring 2010-Girju*

Discussion of advanced topics of current interest.

**LING 402 Tools and Techniques for Speech and Language Processing** *Fall 2009-Girju*

Introduction to aspects of the tools and methods of studies in speech and natural language processing (NLP), with a focus on programming for NLP and speech applications, statistical methods for data analysis, and tools for displaying and manipulating speech data.

**LIS 456 Information Storage and Retrieval** *Spring 2010-Efron*

Introduces problems of document representation, information need specification, and query processing. Describes the theories, models, and current research aimed at solving those problems. Primary focus is on bibliographic, text, and multimedia records.

**ECE 594 / LING 594 Mathematical Models of Language** *Spring 2010-Levinson*

Mathematical models of linguistic structure and their implementation in computational algorithms used in automatic speech understanding and speech synthesis. Statistical and automata-theoretic techniques are studied allowing a quantitative description of acoustic-phonetics, phonology, phonotactics, lexicons, syntax, and semantics. The methods are used to build components of a speech understanding system.

## Probability, Statistics, and Optimization

**STAT 451 / MATH 461 Probability Theory** *Fall 2009-var., Spring 2010-var.*

Introduction to mathematical probability; includes the calculus of probability, combinatorial analysis, random variables, expectation, distribution functions, moment-generating functions, and central limit theorem.

**MATH 482 Linear Programming** *Spring 2010-Vijay*

Rigorous introduction to a wide range of topics in optimization, including a thorough treatment of basic ideas of linear programming, with additional topics drawn from numerical considerations, linear complementarity, integer programming and networks, polyhedral methods.

**MATH 484 Nonlinear Programming** *Fall 2009-Vijay*

Iterative and analytical solutions of constrained and unconstrained problems of optimization; gradient and conjugate gradient solution methods; Newton's method, Lagrange multipliers, duality and the Kuhn-Tucker theorem; and quadratic, convex, and geometric programming.

**CS 578 / ECE 563 / STAT 563 Information Theory** *Fall 2009-Blahut*

Mathematical models for channels and sources; entropy, information, data compression, channel capacity, Shannon's theorems, and rate-distortion theory.

**STAT 525 Computational Statistics** *Spring 2010-Chen*

Various topics, such as ridge regression; robust regression; jackknife, bootstrap, cross-validation and resampling plans; E-M algorithm; projection pursuit; all with a strong computational flavor.

**STAT 510 Mathematical Statistics I** *Fall 2009-Martinsek*

Distributions, transformations, order-statistics, exponential families, sufficiency, delta-method, Edgeworth expansions; uniformly minimum variance unbiased estimators, Rao-Blackwell theorem, Cramer-Rao lower bound, information inequality; equivariance.

**STAT 511 Mathematical Statistics II** *Spring 2010-Qu*

Bayes estimates, minimaxity, admissibility; maximum likelihood estimation, consistency, asymptotic efficiency; testing and confidence intervals; Neyman-Pearson lemma, uniformly most powerful tests; likelihood ratio tests and large-sample approximation; nonparametrics.

**ECE 580 / MATH 58 Optimization by Vector Space Methods** *Fall 2009-Basar*

Introduction to normed, Banach, and Hilbert spaces; applications of the projection theorem and the Hahn-Banach theorem to problems of minimum norm, least squares estimation, mathematical programming, and optimal control; the Kuhn-Tucker theorem and Pontryagin's maximum principle; introduction to iterative methods.

## Machine Learning

**CS 440 / ECE 448 Artificial Intelligence** *Fall 2009-DeJong, Spring 2010-Amir*

Major topics in and directions of research in artificial intelligence: AI languages (LISP and PROLOG), basic problem solving techniques, knowledge representation and computer inference, machine learning, natural language understanding, computer vision, robotics, and societal impacts.

**CS 446 Machine Learning** *Fall 2009-Roth*

Theory and basic techniques in machine learning. Major theoretical paradigms and key concepts developed in machine learning in the context of applications such as natural language and text processing, computer vision, data mining, adaptive computer systems and others. Review of several supervised and unsupervised learning approaches: methods for learning linear representations; on-line learning; Bayesian methods; decision trees; features and kernels; clustering and dimensionality reduction.

**CS 546 Machine Learning in Natural Language Processing** *2010-Roth*

Introduction to the central learning frameworks and techniques that have emerged in the field of natural language processing and found applications in several areas in text and speech processing: from information retrieval and extraction, through speech recognition to syntax, semantics and language understanding related tasks. Presents the theoretical paradigms - learning theoretic, probabilistic, and information theoretic - and the relations among them, as well as the main algorithmic techniques developed within these and in key natural language applications.

**STAT 542 Statistical Learning** *Fall 2009-Liang*

Modern techniques of predictive modeling, classification, and clustering are discussed. Examples of these are linear regression, nonparametric regression, kernel methods, regularization, cluster analysis, classification trees, neural networks, boosting, discrimination, support vector machines, and model selection. Applications are discussed as well as computation and theory.

## Databases

**CS 411 Database Systems** *Fall 2009-Winslett/Minami, Spring 2010-Chang*

Examination of the logical organization of databases: the entity-relationship model; the hierarchical, network, and relational data models and their languages. Functional dependencies and normal forms. Design, implementation, and optimization of query languages; security and integrity; concurrency control, and distributed database systems.

**CS 412 Introduction to Data Mining** *Fall 2009-Han*

Introduction to the concepts, techniques, and systems of data warehousing and data mining. Design and implementation of data warehouse and on-line analytical processing (OLAP) systems; data mining concepts, methods, systems, implementations, and applications.

**CS 512 Data Mining Principles** *Spring 2010-Han*

An advanced course on principles and algorithms of data mining. Data cleaning and integration; descriptive and predictive mining; mining frequent, sequential, and structured patterns; clustering, outlier analysis and fraud detection; stream data, web, text, and biomedical data mining; security and privacy in data mining; research frontiers.

## Computer Vision

**CS 543 / ECE 549 Computer Vision** *Spring 2010-Forsyth/Hoiem*

Information processing approaches to computer vision, algorithms, and architectures for artificial intelligence and robotics systems capable of vision: inference of three-dimensional properties of a scene from its images, such as distance, orientation, motion, size, and shape; acquisition and representation of spatial information for navigation and manipulation in robotics.

## Other AI Courses

**CS 548 Models of Cognitive Processes** *Spring 2010-DeJong*

Formal models and concepts in vision and language; detailed analysis of computer vision, language, and learning problems; relevant psychological results and linguistic systems; survey of the state-of-the-art in artificial intelligence.

**CS 498AE Reasoning in Artificial Intelligence** *Fall 2009-Amir*

This class concerns reasoning techniques used and developed in Artificial Intelligence. It will include topics from reasoning with graphical probabilistic representations, sampling and variational inference, logical inference (propositional and FOL), combinations of logical and probabilistic inference techniques, and applications in natural-language processing, vision, robotics, and others. The class is suitable for graduate students and undergraduate students interested in AI and machine learning.

**ECE 470 / CS 443 / GE 421 / ME 445 Introduction to Robotics** *Fall 2009-Bretl*

Fundamentals of robotics, rigid motions, homogeneous transformations, forward and inverse kinematics, velocity kinematics, motion planning, trajectory generation, sensing, vision, and control.