Ansible
An open-source IT automation tool for configuration management and orchestration, widely used for HPC cluster and data center provisioning.
A comprehensive glossary of terms and definitions covering IT infrastructure, high-performance computing, enterprise storage, networking, and data center technologies.
BeeGFS
A high-performance parallel file system with distributed metadata, simple administration, and built-in high availability for HPC and AI workloads.
Burst Buffer
A fast intermediate storage layer using NVMe SSDs that absorbs I/O bursts between compute nodes and the parallel file system.
Ceph
An open-source distributed storage platform providing unified object, block, and file storage with automatic data distribution and self-healing capabilities.
Compute Express Link — an open interconnect standard enabling high-speed communication and memory sharing between CPUs, accelerators, and memory devices.
Data Lake
A centralized repository storing raw data in its native format at scale, supporting diverse analytics workloads from batch to real-time processing.
Deduplication
A data reduction technique that eliminates duplicate data blocks, significantly reducing storage consumption in backup and primary storage systems.
Erasure Coding
A data protection method that encodes data with redundant fragments across multiple locations, offering fault tolerance with less storage overhead than replication.
Fibre Channel
A high-speed, lossless networking protocol for Storage Area Networks, operating at up to 64 Gbps and using credit-based flow control so that frames are not dropped under congestion.
Floating Point Operations Per Second — the standard measure of computing performance for scientific and HPC workloads, from GFLOPS to EFLOPS.
General Parallel File System (IBM Spectrum Scale, since renamed IBM Storage Scale): an enterprise parallel file system with advanced data management features for HPC and AI.
GPU
A massively parallel processor with thousands of cores, essential for AI training, scientific simulation, and high-performance computing workloads.
Hyperconverged Infrastructure — a software-defined IT framework that combines compute, storage, and networking in a single system for simplified management.
InfiniBand
A high-bandwidth, low-latency networking technology with native RDMA support, widely used in HPC clusters and AI training infrastructure.
Input/Output Operations Per Second: a key storage performance metric counting the read and write operations a system completes each second, most often quoted for small random I/O.
Internet Small Computer Systems Interface — a protocol enabling block-level SAN storage access over standard TCP/IP Ethernet networks.
Kubernetes
An open-source container orchestration platform that automates deployment and scaling of containerized applications across clusters.
Latency
The time delay between an I/O request and its response, measured in microseconds for flash storage; a critical metric for real-time applications.
Lustre
An open-source parallel file system that powers the majority of the world's top supercomputers, designed for maximum throughput at scale.
NAND Flash
The non-volatile memory technology in SSDs, storing data in cells organized into pages and blocks with 3D stacking for increased density.
Non-Volatile Memory Express — a high-performance storage protocol designed for flash SSDs, communicating directly over PCIe for maximum throughput and minimal latency.
NVMe over Fabrics — an extension of the NVMe protocol across network fabrics, enabling remote NVMe storage access with near-local performance.
Object Storage
A storage architecture managing data as objects with rich metadata, designed for massive scale and accessed via HTTP/S3 APIs.
Parallel File System
A distributed file system that stripes data across multiple servers, enabling thousands of clients to access storage concurrently with linear bandwidth scaling.
Peripheral Component Interconnect Express — the high-speed serial bus standard connecting GPUs, NVMe SSDs, and network adapters to the CPU.
Portable Operating System Interface: a family of IEEE standards for operating system interfaces, whose file I/O semantics ensure application compatibility across different storage systems.
Quad-Level Cell NAND flash storing four bits per cell, delivering the highest density and lowest cost per gigabyte for capacity-oriented storage.
Redundant Array of Independent Disks — a method of combining multiple drives for improved performance, redundancy, or both.
Remote Direct Memory Access: a technology enabling direct memory-to-memory data transfer between computers that bypasses the operating system kernel and CPU on the data path, critical for low-latency HPC networking.
RDMA over Converged Ethernet — a protocol enabling high-performance RDMA communication over standard Ethernet networks.
Storage Area Network — a dedicated high-speed network providing block-level access to shared storage for databases, virtualization, and enterprise workloads.
Software-Defined Storage — an approach that separates storage intelligence from hardware, enabling storage services on commodity infrastructure.
Singularity
A container platform for HPC that runs without root privileges, enabling reproducible scientific computing on shared clusters.
Simple Linux Utility for Resource Management — the dominant open-source job scheduler for HPC clusters and supercomputers.
Snapshot
A space-efficient, point-in-time copy of data, typically implemented with copy-on-write or redirect-on-write techniques, enabling rapid backup and recovery with minimal storage overhead.
Thin Provisioning
A storage technique that allocates disk space on demand rather than upfront, improving capacity utilization through overcommitment.
Tiering
A strategy that automatically moves data between fast, medium, and archival storage tiers based on access frequency to optimize cost and performance.
Write Amplification
A flash storage phenomenon where physical writes exceed logical writes due to block-level erase requirements, affecting SSD endurance and performance.
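
Several of the definitions above are quantitative, and a short calculation may make them more concrete. The Python sketch below is illustrative only: the function names and the sample figures (an 8+2 erasure code, three-way replication, 4 KiB operations) are assumptions chosen for the example, not values taken from this glossary or from any particular product.

```python
from fractions import Fraction


def erasure_coding_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw capacity consumed per unit of user data for a k+m erasure code."""
    if data_fragments <= 0 or parity_fragments < 0:
        raise ValueError("need at least one data fragment and no negative parity count")
    return Fraction(data_fragments + parity_fragments, data_fragments)


def replication_overhead(copies: int) -> Fraction:
    """Raw capacity consumed per unit of user data when keeping full copies."""
    if copies <= 0:
        raise ValueError("need at least one copy")
    return Fraction(copies, 1)


def write_amplification_factor(physical_bytes: int, logical_bytes: int) -> float:
    """Ratio of bytes written to flash to bytes written by the host."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return physical_bytes / logical_bytes


def throughput_mb_per_s(iops: int, block_size_kib: int) -> float:
    """Bandwidth implied by an IOPS figure at a fixed I/O size (decimal MB/s)."""
    return iops * block_size_kib * 1024 / 1_000_000


if __name__ == "__main__":
    # Hypothetical layouts: 8 data + 2 parity fragments vs. three full copies.
    print("8+2 erasure coding overhead:", float(erasure_coding_overhead(8, 2)))
    print("3-way replication overhead:", float(replication_overhead(3)))
    # Hypothetical SSD counters: 300 units written to flash for 100 host units.
    print("write amplification factor:", write_amplification_factor(300, 100))
    # Hypothetical device: 100,000 IOPS at 4 KiB per operation.
    print("throughput in MB/s:", throughput_mb_per_s(100_000, 4))
```

Under these assumptions, an 8+2 layout consumes 1.25 units of raw capacity per unit of user data while surviving the loss of any two fragments, whereas three-way replication consumes 3 units to survive the loss of two copies. The last function shows why an IOPS figure only translates into bandwidth once the I/O size is stated.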