Because of its good image quality and moderate computational requirements, error diffusion has become a popular halftoning solution for desktop printers, especially inkjet printers. By making the weights and thresholds tone-dependent and using a predesigned halftone bitmap for tone-dependent threshold modulation, it is possible to achieve image quality very close to that obtained with far more computationally complex iterative methods. However, implementing error diffusion in very low-cost or large-format products is hampered by two memory requirements: the tone-dependent parameters and halftone bitmap must be stored, and error information for an entire row of the image must be held at every point during the halftoning process. For the first problem, we replace the halftone bitmap with deterministic bit flipping, which has previously been applied to halftoning, and we linearly interpolate the tone-dependent weights and thresholds from a small set of knot points. We call this implementation a reduced lookup table. For the second problem, we introduce a new serial block-based approach to error diffusion. This approach depends on a novel intrablock scan path and the use of different parameter sets at different points along that path. We show that serial block-based error diffusion reduces off-chip memory access by a factor equal to the block height. With both of these solutions, satisfactory image quality can be obtained only with new cost functions that we have developed for the training process. With these new cost functions and a moderate block size, we can obtain image quality that is very close to that of the original tone-dependent error diffusion algorithm.
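
As a rough illustration of the reduced lookup table idea, the sketch below linearly interpolates a threshold and a set of diffusion weights from a handful of knot points, rather than storing one entry per gray level. It is a minimal sketch under stated assumptions, not the trained tables from this work: the knot positions, the threshold and weight values, the four-neighbor weight layout, and the names `KNOTS` and `interpolate_params` are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical knot points: (gray level, threshold, four diffusion weights).
# The values are illustrative placeholders, not trained parameters.
KNOTS = [
    (0,   64.0,  (1.0, 0.0, 0.0, 0.0)),
    (64,  100.0, (0.5, 0.1, 0.3, 0.1)),
    (128, 127.5, (7 / 16, 3 / 16, 5 / 16, 1 / 16)),
    (255, 191.0, (1.0, 0.0, 0.0, 0.0)),
]


def interpolate_params(gray):
    """Return (threshold, weights) for a gray level by linear
    interpolation between the two neighboring knot points."""
    levels = [knot[0] for knot in KNOTS]
    if not levels[0] <= gray <= levels[-1]:
        raise ValueError("gray level outside the knot range")

    # Index of the knot at or below `gray`, clamped so that the last
    # gray level still has a right-hand neighbor.
    i = min(bisect_right(levels, gray) - 1, len(KNOTS) - 2)
    g0, t0, w0 = KNOTS[i]
    g1, t1, w1 = KNOTS[i + 1]

    # Fractional position of `gray` between the two knots.
    a = (gray - g0) / (g1 - g0)

    threshold = (1 - a) * t0 + a * t1
    weights = tuple((1 - a) * x + a * y for x, y in zip(w0, w1))
    return threshold, weights


if __name__ == "__main__":
    for g in (0, 32, 128, 200, 255):
        print(g, interpolate_params(g))
```

Because each interpolated weight set is a convex combination of two knot weight sets, the weights continue to sum to one whenever the knot weights do, so the interpolation itself does not create or lose error. In this sketch only the four knots are stored instead of 256 table entries.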