In today’s computerized and information-based society, people are inundated with vast amounts of text data, ranging from news articles, social media posts, and scientific publications to textual information from various vertical domains (e.g., corporate reports, advertisements, legal acts, medical reports). Turning such massive, unstructured text data into structured, actionable knowledge, and enabling effective and user-friendly access to that knowledge, is a grand challenge.
Networked data arises in a number of application domains, ranging from IoT, cloud computing, software analysis, neuroscience, biology, and geography to the social sciences. Accordingly, network/graph analysis has emerged as a major paradigm for exploring complex processes behind observed data. Compared to high-dimensional data, analysis of network data is more challenging due to interdependencies between entities, the presence of attributes, and the natural evolution of networks over time.
Multimodal Big Data
Unstructured multimodal data (1D time sequences, including audio, and 2D/3D/4D/5D images) are routinely collected but are hard to analyze. Ranging from social sciences to biology and from remote sensing to materials research, such data come from a variety of sources: web pages, camera networks, mobile sensors, smart phones, microscopes, satellite and aerial imagery, and medical instruments, to name a few. These data are typically dynamic, spatial, and heterogeneous; in other words, they are disparate and obtained with non-uniform sampling in space and time. Researchers often collate multiple datasets, collected in a variety of different ways, which quantify different aspects of the same scientific process of interest. Thus, it is increasingly rare that hard questions can be answered with experimental data of a single type. To address these challenges, we need new techniques that combine multiple data sources effectively in order to deliver new scientific insights.
Human Agent Teams
The recent convergence of research in social and psychological sciences, dynamic and quantitative modeling, and network science has led to a re-examination of collective team behavior from a quantitative and systems-oriented viewpoint. Teams cannot be understood fully by studying their components (members) in isolation: team performance is not simply a sum of individual performances, and a diversity of opinions among members leads to better group outcomes. However, it is not yet understood how patterns of interactions and relationships among team members (i.e., team networks) impact performance. Understanding these patterns is critical, as the resolution of complex issues requires deliberative within-group interaction processes in which alternative courses of action are surfaced, evaluated, and acted upon.
Data science aims to leverage knowledge and insights from data in order to better understand and solve complex problems. For data science to effectively inform human judgment, people must be able to visualize and conceptualize information at every step of the cycle of data preprocessing, exploration, selection, transformation, analysis, and interpretation, and to provide input, guidance, and expertise to the algorithms governing the data analysis process. Key capabilities of this human component of data science include data visualization, intelligent and multimodal interfaces, gestural control, and augmented and virtual reality.
Social networks not only reflect our modern society but also have the capacity to fundamentally rewire it. They impact business, subsume media, offer immense opportunities for learning and exploration, create powerful ad hoc organizations, and serve as a catalyst for cultural change. Along with the power of positive change, however, comes the risk of manipulation of public opinion, amplification of biases, and impersonal communication.
Environmental Data Science
Environmental problems are becoming increasingly complex, requiring multi-disciplinary approaches to address them. These problems can no longer be solved solely with a disciplinary focus, and they increasingly demand data-driven solutions. The rise of big data and new technologies for observing earth systems and the human actions that rapidly change and respond to the environment demand that resource management and conservation decisions be informed by data in a fully transparent and repeatable way. These demands are shifting the landscape of the skills that are needed to tackle environmental problems.
Affiliated Projects and Centers
- Broom Demography Center
- National Center for Ecological Analysis and Synthesis (NCEAS)
- Earth Research Institute (ERI)
- Brain Initiative
- Quantitative Biology Initiative and BMSE
- IGERT on Network Science
- Center for Spatial Studies
- Center for Information Technology and Society (CITS)
- Institute for Energy Efficiency (IEE)
- Center for Bioengineering
- Center for Bioimage Informatics
- Mellon-funded WhatEvery1Says project (WE1S)
- UCSB Smart Farm
- Where's the Bear? (WTB)
- Aristotle Project
- Center for Responsible Machine Learning