This paper presents a multi-core architecture of programmable processor cores for real-time image processing in embedded applications. The authors propose a simple 8-bit processor core dedicated to low- and intermediate-level image operations. Several cores are connected through multiplexers to a common 128-bit data bus, and their operation is synchronized. The image data on the data bus is processed in parallel by all the processor cores. Each core executes its own part of the image processing algorithm, which significantly improves the frame rate of the whole system. Apart from low-level image processing, such as background subtraction, moving object extraction, or geometrical transformation of the image, higher-level information can also be processed and analysed, e.g. object indexing, blob size and shape estimation, or basic trajectory analysis. A system consisting of 9 processor cores has been implemented in FPGA hardware and verified. An assembler has also been written as a tool for software development. Compared to a typical hardware approach, the proposed idea is very flexible and enables the realization of a wide range of low- and intermediate-level image processing algorithms.
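
The shared-bus idea can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: it assumes each 128-bit bus word carries sixteen 8-bit pixels packed least significant byte first, and the names `Core`, `unpack_word`, `broadcast`, and the three example operations are hypothetical. Each modeled core sees the same bus word in the same step and applies its own operation to it.

```python
from typing import Callable, List

BUS_WIDTH_BITS = 128                            # bus width stated in the text
PIXEL_BITS = 8                                  # matches the 8-bit core
PIXELS_PER_WORD = BUS_WIDTH_BITS // PIXEL_BITS  # 16 pixels per word (assumed packing)


def unpack_word(word: int) -> List[int]:
    """Split one 128-bit bus word into 16 pixels, least significant byte first."""
    return [(word >> (PIXEL_BITS * i)) & 0xFF for i in range(PIXELS_PER_WORD)]


class Core:
    """Hypothetical model of one core: applies its own operation to each bus word."""

    def __init__(self, name: str, op: Callable[[List[int]], int]) -> None:
        self.name = name
        self.op = op
        self.results: List[int] = []

    def step(self, pixels: List[int]) -> None:
        self.results.append(self.op(pixels))


def broadcast(words: List[int], cores: List[Core]) -> None:
    """Present every bus word to all cores in the same (synchronized) step."""
    for word in words:
        pixels = unpack_word(word)
        for core in cores:
            core.step(pixels)


# Example operations (assumptions chosen for illustration only).
def count_foreground(pixels: List[int], background: int = 20, threshold: int = 50) -> int:
    """Background subtraction against a flat background, then count foreground pixels."""
    return sum(1 for p in pixels if abs(p - background) > threshold)


def brightest(pixels: List[int]) -> int:
    return max(pixels)


def intensity_sum(pixels: List[int]) -> int:
    return sum(pixels)


cores = [
    Core("foreground", count_foreground),
    Core("max", brightest),
    Core("sum", intensity_sum),
]

# One bus word holding the pixel values 0, 10, 20, ..., 150.
word = int.from_bytes(bytes(range(0, 160, 10)), "little")
broadcast([word], cores)

for core in cores:
    print(core.name, core.results)
```

In this model the loop over cores is sequential, whereas in the described system the cores operate simultaneously on the bus data; the sketch only shows that one pass over the image can feed several independent operations.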