High-Performance Optimizations on Tiled Manycore Embedded Systems: A Matrix Multiplication Case Study* | part of Modeling and Optimization of Parallel and Distributed Embedded Systems | Wiley-IEEE Press books