Partitioning an embedded system application between a microprocessor and custom hardware has been shown to improve performance, power, or energy in numerous examples. The advent of single-chip microprocessor/FPGA platforms makes such partitioning even more attractive. Previous approaches have partitioned sequential program source code, such as C or C++. We introduce a new approach that partitions at the software binary level. Although source-level partitioning is preferable from a purely technical viewpoint, binary-level partitioning provides several very practical benefits for commercial acceptance. We demonstrate that binary-level partitioning yields speedups competitive with source-level partitioning: an average speedup of 1.4 versus 1.5, respectively, for eight benchmarks partitioned on a single-chip microprocessor/FPGA device.