We introduce BURN, a methodology to create customized benchmarks for testing multitier applications under time-varying resource usage conditions. Starting from a set of preexisting test workloads, BURN finds a policy that interleaves their execution to stress the multitier application and generate controlled burstiness in resource consumption. This is useful to study, in a controlled way, the robustness of software services to sudden changes in the workload characteristics and in the usage levels of the resources. The problem is tackled by a model-based technique which first generates Markov models to describe resource consumption patterns of each test workload. Then, a policy is generated using an optimization program which sets as constraints a target request mix and user-specified levels of burstiness at the different resources in the system. Burstiness is quantified using a novel metric called overdemand, which describes in a natural way the tendency of a workload to keep a resource congested for long periods of time and across multiple requests. A case study based on a three-tier application testbed shows that our method is able to control and predict burstiness for session service demands at a fine-grained scale. Furthermore, experiments demonstrate that for any given request mix our approach can expose latency and throughput degradations not found with nonbursty workloads having the same request mix.