14 docs tagged with "Release 2025-07"

Agricultural Routing Synthetic Data Generation

This repository provides a Python script that generates synthetic location and vehicle data (CSV files) for developers and researchers aiming to model and solve agricultural logistics problems such as:

ArrayMorph

ArrayMorph is software for efficiently managing array data stored in cloud object storage. It supports both the HDF5 C++ API and the h5py API. The h5py API returns data as NumPy arrays, so users can read array data stored in the cloud and feed it directly into machine learning pipelines.

camera-traps

The Camera Traps application is both a simulator and IoT device software for using machine learning at the edge in field research. The first implementation specializes in applying computer vision (detection and classification) to wildlife images for animal ecology studies. Two operational modes are supported: (1) an input dataset of images that stands in for the images an IoT camera device would generate, or (2) an input video file, as would be captured by a camera, which is processed by an image-detecting plugin that saves the frames containing motion. In both modes, the resulting images drive the simulation.

CT Controller

The ctcontroller tool can be used to manage the provisioning and releasing of edge hardware as well as running and shutting down the camera-traps application.

Cyberinfrastructure Knowledge Network

The Cyberinfrastructure Knowledge Network (CKN) is an extensible and portable distributed framework designed to optimize AI at the edge—particularly in dynamic environments where workloads may change suddenly (for example, in response to motion detection). CKN enhances edge–cloud collaboration by using historical data, graph representations, and adaptable deployment of AI models to satisfy changing accuracy‑and‑latency demands on edge devices.

Distributed Training Estimator of LLMs

This component implements a time cost estimator for distributed training of large language models (LLMs). It predicts the time required to train one batch across multiple GPUs. The predictor module requires only a CPU. The computation sampling module needs one or more GPUs, while the communication sampling module requires multiple GPUs, depending on your computing platform.
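To make the idea of a per-batch time estimate concrete, here is a minimal sketch that is not taken from the component itself. It assumes data-parallel training, a measured compute time per batch, and a textbook ring all-reduce cost model with no overlap between computation and communication. All names (`BatchTimeEstimate`, `ring_allreduce_seconds`, `estimate_batch_time`) and parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BatchTimeEstimate:
    """Hypothetical container for a per-batch time prediction."""
    compute_s: float  # time spent in forward/backward computation
    comm_s: float     # time spent synchronizing gradients

    @property
    def total_s(self) -> float:
        # Assumption: computation and communication do not overlap.
        return self.compute_s + self.comm_s


def ring_allreduce_seconds(num_bytes: float, num_gpus: int,
                           bandwidth_bytes_per_s: float,
                           latency_s: float = 0.0) -> float:
    """Textbook ring all-reduce cost: 2*(N-1) steps, each moving 1/N of the data."""
    if num_gpus < 2:
        return 0.0  # a single GPU has nothing to synchronize
    steps = 2 * (num_gpus - 1)
    chunk_bytes = num_bytes / num_gpus
    return steps * (latency_s + chunk_bytes / bandwidth_bytes_per_s)


def estimate_batch_time(compute_s_per_batch: float, grad_bytes: float,
                        num_gpus: int, bandwidth_bytes_per_s: float,
                        latency_s: float = 0.0) -> BatchTimeEstimate:
    """Combine a sampled compute time with a modeled communication time."""
    comm = ring_allreduce_seconds(grad_bytes, num_gpus,
                                  bandwidth_bytes_per_s, latency_s)
    return BatchTimeEstimate(compute_s=compute_s_per_batch, comm_s=comm)


if __name__ == "__main__":
    # Illustrative numbers only: 0.5 s compute, 8 GB of gradients,
    # 4 GPUs, 10 GB/s link bandwidth.
    est = estimate_batch_time(0.5, 8e9, 4, 1e10)
    print(f"compute={est.compute_s:.2f}s comm={est.comm_s:.2f}s total={est.total_s:.2f}s")
```

In this sketch the compute term would come from timing samples on a GPU and the bandwidth and latency terms from communication samples across GPUs, while the final arithmetic needs only a CPU. The actual estimator may use a different model.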

FAF-API-ICICLE

API access to the US Bureau of Transportation Statistics' Freight Analysis Framework dataset

FafFrontend

FafFrontend is a front end for a REST API to the US Bureau of Transportation Statistics (BTS) Freight Analysis Framework (FAF) dataset. It was developed by the Data To Insight Center (D2I) at Indiana University as part of the NSF ICICLE AI Institute, in collaboration with the US Department of Transportation, Bureau of Transportation Statistics, and the University of Wisconsin-Madison. See FAF-API-ICICLE. This project was generated with Angular CLI version 18.2.12.

Food Waste Ontology Chatbot

Overview

Harvest

Harvest is a tool designed to help different kinds of stakeholders in the digital agriculture space pursue their own goals, from research to improving the bottom line. Harvest supports building pipelines in which users preprocess their data, train models on HPC resources, run inference to gain insights on farm fields, and view visualizations that give an at-a-glance understanding of what is happening in the field.

Organization-SIC-Classifier-for-Smart-Foodsheds

This repository contains code for training and evaluating models that classify organizations into Standard Industrial Classification (SIC) codes based on different types of descriptive text. This model is designed for researchers and data scientists who need to categorize unknown or newly listed organizations by business type. It can be applied to tasks such as food systems research, analyzing supply chains, and regional economic mapping, particularly in scenarios where structured corpora are unavailable. Given only an organization’s name and its description, the model predicts a high-level SIC category.

Patra Knowledge Base

The Patra Knowledge Base is a system designed to manage and track AI/ML models, with the objective of making them more accountable and trustworthy. It's a key part of the Patra ModelCards framework, which aims to improve transparency and accountability in AI/ML models throughout their entire lifecycle. This includes the model's initial training phase, subsequent deployments, and ongoing usage, whether by the same or different individuals.

Patra Model Card Toolkit

The Patra Toolkit is a component of the Patra ModelCards framework designed to simplify the process of creating and documenting AI/ML models. It provides a structured schema that guides users in providing essential information about their models, including details about the model's purpose, development process, and performance. The toolkit also includes features for semi-automating the capture of key information, such as fairness and explainability metrics, through integrated analysis tools. By reducing the manual effort involved in creating model cards, the Patra Toolkit encourages researchers and developers to adopt best practices for documenting their models, ultimately contributing to greater transparency and accountability in AI/ML development.

ScienceAgent Interface

ScienceAgent Interface provides a web interface for conducting data-driven scientific tasks using ScienceAgent. The interface connects to a Python backend which allows users to execute generated programs in an isolated Docker environment and view the results.