Designers of data centers and Web servers aim to allocate resources to clients on demand in order to lower the deployment cost of hosted services. They must also minimize operating costs, such as energy consumption, by matching resource supply to service-capacity demand. However, because the term "capacity" is typically defined vaguely or inadequately, resource needs are difficult to assess, and the server clusters that are deployed are usually several times larger than needed at runtime. The time-varying nature of the workload further complicates the problem and necessitates an online capacity-estimation solution. To address this overprovisioning problem, we first define the capacity of a server cluster as the sustainable throughput subject to a constraint on the request retransmission ratio, and then analyze different approaches to estimating capacity in a running system. Various capacity-estimation mechanisms, such as offline benchmarking and CPU-utilization evaluation, are discussed and compared with our queue-monitoring method. We also employ several data-collection methods (application instrumentation, user-space tools, the Simple Network Management Protocol (SNMP), and kernel modules) to compare their effects on estimation accuracy. Among the estimation mechanisms, queue monitoring is found to provide a good and stable estimate of server capacity. To validate this finding, we propose a simple cluster-resizing mechanism and evaluate its energy-conservation performance. A good combination of data collection and online capacity estimation is found to yield significantly greater energy savings than traditional approaches (that is, static estimation and scheduled capacity). Our experimental results show that more than 40 percent of energy can be saved for regular daily usage patterns without any prior knowledge of the workload, and that long start-up and shutdown delays affect energy savings considerably.
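To make the capacity definition above concrete, the following is a minimal Python sketch, not the estimation or resizing mechanism evaluated in this work. It assumes capacity can be approximated as the highest observed throughput whose retransmission ratio stays within the constraint, and that a cluster is sized by dividing demand by per-server capacity. The function names, the sample format, and the `headroom` parameter are hypothetical and introduced only for illustration.

```python
import math
from typing import Iterable, Tuple


def estimate_capacity(samples: Iterable[Tuple[float, float]],
                      max_retx_ratio: float) -> float:
    """Return the highest observed throughput that meets the constraint.

    Each sample is a hypothetical (throughput, retransmission_ratio) pair
    measured on a running server. Returns 0.0 if no sample qualifies.
    """
    sustainable = [tput for tput, retx in samples if retx <= max_retx_ratio]
    return max(sustainable, default=0.0)


def servers_needed(demand: float, per_server_capacity: float,
                   headroom: float = 0.1) -> int:
    """Return how many servers to keep powered on for the given demand.

    `headroom` is an assumed safety margin (fraction of demand), not a
    value taken from the text. At least one server always stays on.
    """
    if per_server_capacity <= 0:
        raise ValueError("per_server_capacity must be positive")
    return max(1, math.ceil(demand * (1 + headroom) / per_server_capacity))
```

For example, with samples `[(100, 0.0), (200, 0.01), (300, 0.08)]` and a constraint of 0.05, the third sample violates the constraint, so the estimate falls back to the second sample's throughput. A real online estimator would also have to smooth noisy measurements and account for start-up and shutdown delays before resizing.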