Peak power reduction has been a critical challenge in the design of integrated circuits impacting the chip's performance and reliability. The reduction of peak power also reduces the power density of integrated circuits. Due to large IR-voltage drops in circuits, transistor switching slows down giving rise to timing violations and logic failures. In this paper, we present a new clock control strategy for peak-power reduction in VLSI circuits. In the proposed method, the simultaneous switching of combinational paths is minimized by taking advantage of the delay slacks among the paths and clustering the paths with similar slack values. Once the paths are identified based on the path delays and their slack values, the clustering algorithm determines the ideal number of clusters for the given circuit and for each cluster the maximum possible phase shift that can be applied to the clock. The paths are assigned to clusters in a load balanced manner based on the slack values and each cluster will have a phase shift possible on its clock depending on the slack. Thus, the proposed register-transfer level (RTL) method takes advantage of the logic-path timing slack to re-schedule circuit activities at optimal intervals within the unaltered clock period. When switching activities are redistributed more evenly across the clock period, the IC supply-current consumption is also spread across a wider range of time within the clock period. This has the beneficial effect of reducing peak-current draw in addition to reducing RMS power draw without having to change the operating frequency and without utilizing additional power supply voltages as in dual or multi VT approaches. The proposed method is implemented and tested through simulations using an experimental setup with Synopsys Tools Suite and Cadence Tools on the ISCAS'85 benchmark circuits, OpenCore circuits and LEON processor multiplier circuit. Experimental results indicate that peak power can be reduced significantly to at- least 72% depending on the number of clusters and the phase-shifted clock identified as suitable for the given circuit by the proposed algorithms. Although the proposed method incurs some power overhead compared to the traditional clocking method, the overhead can be made negligible compared to the peak-power reduction as seen in the experimental results presented.