This part basically does not say much yet; it lays out the problem setting. The big goal is to get machine learning models running on all kinds of hardware, which is presumably what every AI Compiler company is working on.
Key Questions to answer:
- What level of abstraction to use? (See the sketch after this list.)
    - Too high: little reuse; with such coarse granularity, each new operator type has to be rebuilt from scratch.
    - Too low: too verbose, and high-level information (e.g. control flow) becomes harder to recover.
- How to address the path from "Development" to "Deployment"?
    - When developing, ML engineers train models in a language that is easy to configure: Python. Mainstream machine learning frameworks such as TensorFlow, PyTorch, and JAX are all Python-based.
    - When deploying, the target environment varies widely: from end devices such as handheld devices, tiny cameras, and tiny microphones, to a single GPU, or even large-scale computing farms. The environments and hardware differ in multiple ways, and deployment is the next big question of this machine learning era.
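To make the abstraction-level trade-off concrete, here is a minimal sketch (my own illustration, not from the lecture; the `relu_high_level` / `relu_low_level` names and the use of NumPy are assumptions) contrasting a coarse-grained operator call with a loop-level implementation of the same computation:

```python
import numpy as np

# High-level abstraction: one coarse-grained operator call.
# Compact and easy to read, but every new operator variant needs
# its own hand-written implementation.
def relu_high_level(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0)

# Low-level abstraction: explicit loops over elements.
# Exposes primitives a compiler can transform, but is verbose, and
# high-level structure (that this is "just a ReLU") is harder to recover.
def relu_low_level(x: np.ndarray) -> np.ndarray:
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = x[i, j] if x[i, j] > 0 else 0.0
    return out

x = np.random.randn(4, 4).astype("float32")
assert np.allclose(relu_high_level(x), relu_low_level(x))
```

The high-level form is what frameworks expose during development; the low-level form is closer to what actually runs on the deployment target, which is exactly the gap the abstraction choice has to straddle.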
Machine learning compilation goals
- Integration and dependency minimization: achieve deployment on the end device with the fewest possible resources and dependencies.
- Leverage hardware native acceleration: exploit hardware characteristics for speedups; SIMD lanes, multiple cores, and cache-friendly tiling are all things done at this stage (a tiling sketch follows below). The accumulation of constant-factor optimizations should not be underestimated.
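As a rough illustration of cache-friendly tiling (a sketch of the general technique in plain NumPy, not the course's actual code; the `matmul_tiled` name and the tile size of 32 are assumptions), a blocked matrix multiply keeps each small block in cache while it is reused:

```python
import numpy as np

def matmul_tiled(A: np.ndarray, B: np.ndarray, tile: int = 32) -> np.ndarray:
    """Blocked (tiled) matrix multiply: operate on tile x tile blocks so the
    working set of the inner computation stays within the cache."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m), dtype=A.dtype)
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # The small block product reuses data while it is still hot
                # in cache and maps naturally onto SIMD lanes.
                C[i0:i0+tile, j0:j0+tile] += (
                    A[i0:i0+tile, k0:k0+tile] @ B[k0:k0+tile, j0:j0+tile]
                )
    return C

A = np.random.randn(128, 128).astype("float32")
B = np.random.randn(128, 128).astype("float32")
assert np.allclose(matmul_tiled(A, B), A @ B, atol=1e-3)
```

The arithmetic is identical to a plain matrix multiply; only the loop order and blocking change, which is precisely the kind of constant-factor, hardware-aware transformation an ML compiler automates.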
MLC is all about bridging the gap. A compiler is the chase after the Tower of Babel that stands between humans and 0s and 1s.