Many embedded computers are distributed systems, composed of several heterogeneous processors and communication links of varying speeds and topologies. This paper describes a new, heuristic algorithm which simultaneously synthesizes the hardware and software architectures of a distributed system to meet a performance goal and minimize cost. The hardware architecture of the synthesized system consists of a network of processors of multiple types and arbitrary communication topology; the software architecture consists of an allocation of processes to processors and a schedule for the processes. Most previous work in co-synthesis targets an architectural template, whereas this algorithm can synthesize a distributed system of arbitrary topology. The algorithm works from a technology database which describes the available processors, communication links, I/O devices, and implementations of processes on processors. Previous work had proposed solving this problem by integer linear programming (ILP); our algorithm is much faster than ILP and produces high-quality results.