
16 docs tagged with "AI4CI"


Distributed Training Estimator of LLMs

This component implements a time cost estimator for distributed training of large language models (LLMs). It predicts the time required to train one batch across multiple GPUs. The predictor module requires only a CPU; the computation sampling module needs one or more GPUs, and the communication sampling module requires multiple GPUs, depending on your computing platform.
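As a rough illustration of what such an estimator computes, the sketch below models per-batch time as measured compute time plus ring all-reduce communication time. The function name, parameters, and numbers are illustrative assumptions, not the component's actual API.

```python
# Illustrative sketch only: per-batch time as compute time plus the cost of a
# ring all-reduce over the gradients. Not the estimator's real interface.

def estimate_batch_time(compute_time_s: float,
                        grad_bytes: float,
                        num_gpus: int,
                        bandwidth_bytes_per_s: float) -> float:
    """Estimate wall-clock time of one data-parallel training batch."""
    if num_gpus <= 1:
        return compute_time_s
    # A ring all-reduce moves roughly 2 * (N - 1) / N of the gradient bytes per GPU.
    allreduce_bytes = 2 * (num_gpus - 1) / num_gpus * grad_bytes
    comm_time_s = allreduce_bytes / bandwidth_bytes_per_s
    return compute_time_s + comm_time_s

# Hypothetical example: ~14 GB of fp16 gradients over 8 GPUs on a 100 GB/s link.
print(estimate_batch_time(0.9, 14e9, 8, 100e9))
```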

Explanation

This dataset captures both tabular metadata and graph representations from deep learning training workflows, extracted via TensorFlow's XLA compiler.

Explanation

Graph Neural Networks for Food Trade Flow Prediction

HLO Feature Dataset for AI Resource Estimation

A dataset designed to support AI-driven resource estimation, such as runtime prediction for HPC scheduling optimization, by leveraging compiler-level High-Level Optimizer (HLO) graph features and deep learning workload metadata.
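For illustration, a minimal sketch of the intended use case: training a regressor to predict runtime from tabular workload features. The file name and column names below are placeholders, not the dataset's actual schema.

```python
# Sketch only: fit a runtime-prediction model on tabular HLO/workload features.
# "hlo_features.csv" and "runtime_s" are placeholder names, not the real schema.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("hlo_features.csv")       # placeholder path to tabular metadata
X = df.drop(columns=["runtime_s"])         # numeric graph/workload features
y = df["runtime_s"]                        # measured runtime target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("R^2 on held-out workloads:", model.score(X_test, y_test))
```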

How-To Guide

Ways to configure HARP to set up applications for profiling.

How-To Guides

How to Implement a Hurdle Model for Trade Prediction
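For context, a hurdle model combines two stages: a classifier for whether any trade occurs between a pair, and a regressor, fit only on positive flows, for how much is traded. The sketch below is a generic illustration with synthetic placeholder data, not the guide's own implementation.

```python
# Generic two-stage hurdle model sketch with synthetic placeholder data.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                    # placeholder pair features
flow = np.where(rng.random(1000) < 0.6, 0.0,      # ~60% zero trade flows
                np.exp(rng.normal(size=1000)))    # skewed positive flows

clf = LogisticRegression().fit(X, flow > 0)       # stage 1: does any trade occur?
reg = LinearRegression().fit(X[flow > 0],         # stage 2: volume, given trade
                             np.log(flow[flow > 0]))

p_trade = clf.predict_proba(X)[:, 1]
expected_flow = p_trade * np.exp(reg.predict(X))  # combine the two stages
print(expected_flow[:5])
```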

iSpLib - An Intelligent Sparse Library

iSpLib is an accelerated sparse kernel library with a PyTorch interface. It includes an auto-tuner that generates optimized custom sparse kernels for the user's environment. The goal of the library is to provide efficient sparse operations for Graph Neural Network implementations. It currently supports efficient CPU-based sparse-dense matrix multiplication (spmm-sum only) with autograd.
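For reference, the operation iSpLib accelerates can be expressed in plain PyTorch as a sparse-dense matrix multiplication with sum aggregation and autograd; the snippet below uses PyTorch's built-in sparse support, not iSpLib's own API.

```python
# Reference SpMM-sum in plain PyTorch (not iSpLib's API), as used in GNN
# message passing: aggregate neighbor features with a sparse adjacency matrix.
import torch

adj = torch.sparse_coo_tensor(                    # toy 3-node graph adjacency
    indices=torch.tensor([[0, 1, 2], [1, 2, 0]]),
    values=torch.ones(3),
    size=(3, 3),
)
features = torch.randn(3, 4, requires_grad=True)  # node feature matrix

out = torch.sparse.mm(adj, features)              # SpMM with sum aggregation
out.sum().backward()                              # gradients flow back to features
print(features.grad.shape)
```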

Tutorials

Getting Started with the HLO Feature Dataset

Tutorials

Getting Started with Food Flow Prediction