In forming temporal sequences of notes played by the same instrument (referred to as music streams), the timbre of musical instruments may be a predominant feature. In polyphonic music, however, the performance of timbre extraction based on power-related features deteriorates, because such features are blurred when two or more frequency components are superimposed at the same frequency. To cope with this problem, we previously integrated timbre similarity and direction proximity with success, but left the use of other features as future work. In this paper, we investigate four features (timbre similarity, direction proximity, pitch transition, and pitch relation consistency) to clarify the precedence among them in music stream formation. Experimental results with quartet music show that direction proximity is the most dominant feature and that pitch transition is the second most dominant. In addition, the performance of music stream formation improved from 63.3% with timbre similarity alone to 84.9% when all four features were integrated.