Evaluation of CD-HIT for constructing non-redundant databases | IEEE Conference Publication | IEEE Xplore

Evaluation of CD-HIT for constructing non-redundant databases


Abstract:

CD-HIT is one of the most popular tools for reducing sequence redundancy, and is considered to be the state-of-the-art method. It tries to minimise redundancy by reducing an input database into several representative sequences, under a user-defined threshold of sequence identity. We present a comprehensive assessment of the redundancy in the outputs of CD-HIT, exploring the impact of different identity thresholds and new evaluation data on the redundancy. We demonstrate that the relationship between threshold and redundancy is surprisingly weak. Applications of CD-HIT that set low identity threshold values may also suffer from substantial degradation in both efficiency and accuracy.
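
To illustrate what reducing a database to representative sequences under an identity threshold means, here is a minimal Python sketch of greedy incremental clustering. It is not CD-HIT's implementation: the function names `toy_identity` and `greedy_cluster` are hypothetical, and the identity measure (matched characters from `difflib.SequenceMatcher` divided by the shorter sequence's length) is a simplified stand-in for an alignment-based identity.

```python
from difflib import SequenceMatcher


def toy_identity(a: str, b: str) -> float:
    """Toy identity: matched characters / length of the shorter sequence.

    Assumption: a stand-in for an alignment-based identity, not CD-HIT's own.
    """
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / shorter


def greedy_cluster(sequences: list[str], threshold: float) -> dict[str, list[str]]:
    """Map each representative sequence to the members of its cluster."""
    clusters: dict[str, list[str]] = {}
    # Process longest sequences first, so each representative is the
    # longest member of its cluster.
    for seq in sorted(sequences, key=len, reverse=True):
        for rep in clusters:
            if toy_identity(seq, rep) >= threshold:
                clusters[rep].append(seq)  # redundant: join an existing cluster
                break
        else:
            clusters[seq] = [seq]  # no match: seq becomes a new representative
    return clusters


if __name__ == "__main__":
    demo = ["MKTAYIAKQ", "GGGGGGGG", "MKTAYIAKQR"]
    for rep, members in greedy_cluster(demo, threshold=0.9).items():
        print(rep, members)
```

In this sketch the keys of the returned dictionary play the role of the non-redundant database. Lowering `threshold` merges more sequences into each cluster, and the threshold is the parameter whose effect on output redundancy the abstract describes as weak.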
Date of Conference: 15-18 December 2016
Date Added to IEEE Xplore: 19 January 2017
Conference Location: Shenzhen
