Chapter 1 Deep Learning Ecosystem

Research
- PyTorch
- TensorFlow
- Keras: a complete framework for building and training models, not just a collection of neural network modules
- JAX
- MLX: Apple
- PyTorch Lightning
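
A quick way to see which of the research frameworks above are available in a given Python environment is to look them up without importing them. This is a minimal sketch using only the standard library. The import names in the table are assumptions about how each package is usually installed (for example, PyTorch Lightning has shipped as both `lightning` and `pytorch_lightning`), and `installed_frameworks` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Display name -> candidate top-level import names.
# These import names are assumptions about typical installs.
RESEARCH_FRAMEWORKS = {
    "PyTorch": ("torch",),
    "TensorFlow": ("tensorflow",),
    "Keras": ("keras",),
    "JAX": ("jax",),
    "MLX": ("mlx",),
    "PyTorch Lightning": ("lightning", "pytorch_lightning"),
}


def installed_frameworks(frameworks=RESEARCH_FRAMEWORKS):
    """Map each framework name to True if any candidate module can be found."""
    return {
        name: any(find_spec(module) is not None for module in modules)
        for name, modules in frameworks.items()
    }


if __name__ == "__main__":
    for name, available in installed_frameworks().items():
        print(f"{name}: {'installed' if available else 'not found'}")
```

`find_spec` only locates a package and does not import it, so the check stays fast even for heavy frameworks.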
Production
- Inference-only
  - vLLM
  - TensorRT
- Triton: by OpenAI; CUDA-like GPU kernel programming, but written in Python
- torch.compile
- TorchScript
- ONNX Runtime: cross-platform engine for running ONNX models; can also accelerate training on multi-node NVIDIA GPUs
- Detectron2: supports training and inference; computer vision project started at Facebook (Meta); for detection and segmentation algorithms
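
To make the TorchScript entry above concrete, here is a minimal sketch of tracing a small model and checking that the traced version reproduces the eager output. The two-layer model and the helper name `traced_matches_eager` are made up for illustration. The code assumes PyTorch may not be installed, in which case it returns `None` instead of failing.

```python
import importlib.util


def traced_matches_eager():
    """Return True if a TorchScript-traced model matches eager output,
    or None when PyTorch is not installed in this environment."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    torch.manual_seed(0)
    # A tiny illustrative model; any traceable nn.Module would do.
    model = torch.nn.Sequential(
        torch.nn.Linear(4, 8),
        torch.nn.ReLU(),
        torch.nn.Linear(8, 2),
    ).eval()
    example = torch.randn(3, 4)

    # Record the operations executed on the example input as a TorchScript graph.
    traced = torch.jit.trace(model, example)

    with torch.no_grad():
        return bool(torch.allclose(model(example), traced(example)))


if __name__ == "__main__":
    print(traced_matches_eager())
```

Tracing records only the operations run for the example input, so models with data-dependent control flow generally need `torch.jit.script` or a different export path.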
Low-Level
- CUDA
- ROCm: for AMD GPUs
- OpenCL: a more general, open standard for parallel computing; targets CPUs, GPUs, digital signal processors, and other hardware
Inference for Edge Computing and Embedded Systems
- Edge computing: low-latency, highly efficient local computing in real-world distributed systems such as fleets.
- Core ML: for deploying pre-trained models on Apple devices
- PyTorch Mobile
- TensorFlow Lite
Easy to Use
- FastAI
- ONNX: Open Neural Network Exchange, an intermediate format
- wandb: short for Weights & Biases
Cloud Providers
- AWS: EC2 instances, SageMaker
- Google Cloud: Vertex AI, VM instances
- Microsoft Azure: DeepSpeed
- OpenAI
- VastAI
- Lambda Labs: cheap
Compilers
- XLA
- LLVM
- MLIR
- NVCC
Hub
- Hugging Face

Chapter 3 C/C++ Review