The growing trend in current complex embedded systems is to deploy a multiprocessor system-on-chip (MPSoC). A MPSoC consists of multiple heterogeneous processing elements, a memory hierarchy, and input/output components which are linked together by an on-chip interconnect structure. Such an architecture provides the flexibility to meet the performance requirements of multimedia applications while respecting the constraints on memory, cost, size, time, and power. Many embedded systems employ software-managed memories known as scratch-pad memories (SPM). Unlike caches, SPMs are software-controlled and hence the execution time of applications on such systems can be accurately predicted. Scheduling the tasks of an embedded application on the processors and partitioning the available SPM budget among these processors are two critical issues in such systems. Often, these are considered separately; such a decoupled approach may miss better quality schedules. In this paper, we present an integrated approach to task scheduling and SPM partitioning to further reduce the execution time of embedded applications. Results on several real-life benchmarks show the significant improvement from our proposed technique.