IDT Lab

At the Innovative Data Technologies (IDT) Lab, we conduct research in all aspects of data management for science, including storage and I/O, file systems, metadata management, data quality assessment and improvement, performance analysis, performance tuning, data security, and energy efficiency. Our emphasis is on developing systems and tools that make managing scientific data efficient and easy for scientists using high-performance computing (HPC), cloud, and edge computing systems.

Our research covers various aspects of data management

Data Management Systems for Science

Design and implementation of data management systems for scientific applications, including data-intensive workflows.

Efficient Parallel and Distributed I/O

Development of efficient parallel and distributed I/O systems, including data containers and storage hierarchies.

Data Readiness and Security

Research on data readiness for AI applications, ensuring data quality and compliance, and addressing security challenges in data management.

Featured projects

AI Data Readiness Inspector

IDT-led · Funded

AIDRIN is a framework for assessing the readiness of data for AI applications across centralized and decentralized (e.g., federated learning) workflows, ensuring that datasets meet quality and compliance standards.

Drishti: I/O Insights for All

IDT-led · Funded · Open Source

Drishti is a novel interactive web-based analysis framework to visualize I/O traces, highlight bottlenecks, and help understand the I/O behavior of scientific applications.

Proactive Data Containers

IDT-led · Funded · Open Source

Formulation of object-oriented Proactive Data Containers (PDCs) and their mapping to different levels of the exascale storage hierarchy. Efficient strategies for moving data in deep storage hierarchies using PDCs. Techniques for transforming and reorganizing data based on application requirements. Novel analysis paradigms for enabling data transformations and user-defined analysis on data in PDCs.

S2-D2: Securing Self-describing Data, Formats, and Libraries

IDT-led · Funded

This project will apply comprehensive testing, evaluation, issue identification, hardening, and validation to correct security deficiencies in self-describing file formats and libraries. The specific R&D tasks include: (1) assessing and fixing file format vulnerabilities, (2) protecting data access libraries, (3) exploring security solutions for metadata and data, and (4) constructing a security framework, called S2-D2.