Distributed Training Estimator of LLMs
This component implements a time cost estimator for distributed training of large language models (LLMs). It predicts the time required to train one batch across multiple GPUs. The predictor module needs only a CPU. The computation sampling module needs one or more GPUs, while the communication sampling module requires multiple GPUs, depending on your computing platform.
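To make the split between computation sampling and communication sampling concrete, here is a minimal sketch of how sampled timings could be combined into a per-batch estimate. It is not the estimator's actual method: the function name `estimate_batch_time`, the operator names, and the purely additive model (no overlap between computation and communication) are all assumptions made for illustration.

```python
from statistics import fmean


def estimate_batch_time(compute_samples, comm_samples):
    """Return an estimated time (seconds) for one training batch.

    Hypothetical additive model: average the sampled timings of each
    operator, then sum computation and communication costs. Real
    training often overlaps the two, so this is an upper-bound style
    approximation under that assumption.
    """
    compute_s = sum(fmean(samples) for samples in compute_samples.values())
    comm_s = sum(fmean(samples) for samples in comm_samples.values())
    return compute_s + comm_s


# Illustrative (made-up) samples, in seconds, keyed by operator name.
compute_samples = {"attention": [0.5, 0.5], "mlp": [0.25, 0.75]}
comm_samples = {"all_reduce": [0.125, 0.375]}

batch_time = estimate_batch_time(compute_samples, comm_samples)
```

Only the final summation step needs no accelerator, which matches the note above that the predictor itself runs on a CPU while the sampling modules need GPUs.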
Explanation
- This dataset captures both tabular metadata and graph representations from deep learning training workflows, extracted via TensorFlow's XLA compiler.
Explanation
Convention and Usage
Explanation
Graph Neural Networks for Food Trade Flow Prediction
HARP - HPC Application Runtime Predictor
Overview
HLO Feature Dataset for AI Resource Estimation
A dataset designed to support AI-driven resource estimation, such as runtime prediction, for HPC scheduling optimization. It leverages compiler-level High-Level Optimizer (HLO) graph features and deep learning workload metadata.
How-To Guide
Ways to configure HARP to set up applications for profiling:
How-To Guide
System Requirements
How-To Guides
How to Predict Training Time Using Metadata
How-To Guides
How to Implement a Hurdle Model for Trade Prediction
iSpLib - An Intelligent Sparse Library
iSpLib is an accelerated sparse kernel library with a PyTorch interface. It includes an auto-tuner that generates optimized custom sparse kernels based on the user's environment. The goal of the library is to provide efficient sparse operations for Graph Neural Network implementations. It currently supports efficient CPU-based sparse-dense matrix multiplication (SpMM, sum reduction only) with autograd.
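For readers unfamiliar with the operation, the sketch below shows what SpMM with sum reduction computes, written as a plain-Python reference over a CSR-encoded sparse matrix. It does not use or reproduce iSpLib's API; the function name `spmm_sum` and the toy graph are assumptions chosen for illustration, and the code makes no attempt at the tuning an optimized kernel would apply.

```python
def spmm_sum(indptr, indices, data, dense):
    """Multiply a CSR sparse matrix by a dense matrix (list of rows).

    For each sparse row i, every stored entry (j, a) adds a * dense[j]
    into output row i. In a GNN this sums the neighbours' feature rows.
    """
    n_rows = len(indptr) - 1
    n_cols = len(dense[0]) if dense else 0
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Stored entries of row i live in positions indptr[i]..indptr[i+1]-1.
        for k in range(indptr[i], indptr[i + 1]):
            j, a = indices[k], data[k]
            for c in range(n_cols):
                out[i][c] += a * dense[j][c]
    return out


# Adjacency of a 3-node path graph 0 - 1 - 2, stored in CSR form:
# [[0, 1, 0],
#  [1, 0, 1],
#  [0, 1, 0]]
indptr = [0, 1, 3, 4]
indices = [1, 0, 2, 1]
data = [1.0, 1.0, 1.0, 1.0]

# One 2-dimensional feature row per node.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

result = spmm_sum(indptr, indices, data, features)
# result == [[3.0, 4.0], [6.0, 8.0], [3.0, 4.0]]
```

Node 1 has two neighbours, so its output row is the sum of the feature rows of nodes 0 and 2; this neighbour aggregation is the step a sparse kernel library accelerates.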
Multi-Scale Food Flow Prediction using Graph Neural Networks
A project leveraging Graph Neural Networks (GNNs) to predict food flows between counties and FAF zones for economic planning, infrastructure development, and policy-making.
Tutorial
SpMM Example
Tutorials
Environment
Tutorials
Getting Started with the HLO Feature Dataset
Tutorials
Getting Started with Food Flow Prediction