composer Public
Forked from mosaicml/composer
library of algorithms to speed up neural network training
Python · Apache License 2.0 · Updated Jan 8, 2025
flashinfer Public
Forked from flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
Cuda · Apache License 2.0 · Updated Oct 15, 2024
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
TransformerEngine Public
Forked from NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
Python · Apache License 2.0 · Updated May 2, 2024
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python · Apache License 2.0 · Updated Apr 15, 2024
lm-evaluation-harness Public
Forked from EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Python · MIT License · Updated Mar 22, 2024
tensorflow Public
Forked from tensorflow/tensorflow
Computation using data flow graphs for scalable machine learning
C++ · Apache License 2.0 · Updated Jan 28, 2024
grouped_gemm Public
Forked from tgale96/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
llm-analysis Public
Forked from cli99/llm-analysis
Latency and Memory Analysis of Transformer Models for Training and Inference
ffcv Public
Forked from libffcv/ffcv
FFCV: Fast Forward Computer Vision (and other ML workloads!)
Lux-Design-S2 Public
Forked from Lux-AI-Challenge/Lux-Design-S2
Repository for the Lux AI Challenge, season 2
examples Public
Forked from mosaicml/examples
Fast and flexible reference benchmarks
Python · Apache License 2.0 · Updated Jan 25, 2024
diffusion-benchmark Public
Forked from mosaicml/diffusion-benchmark
Python · Apache License 2.0 · Updated Jan 25, 2024
frustum-pointnets Public
Forked from charlesq34/frustum-pointnets
Frustum PointNets for 3D Object Detection from RGB-D Data
search-stack Public
Forked from Appleseed/search-stack
Appleseed Search Stack Docker composition. Uses Solr, Elasticsearch, MongoDB, Mono, DotNet, ASPNet, NGINX, MySQL, PostgreSQL
keras-frcnn Public
Forked from leriomaggio/keras-frcnn
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
AITemplate Public
Forked from facebookincubator/AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
pytest-codeblocks Public
Forked from nschloe/pytest-codeblocks
📄 Test code blocks in your READMEs