Evaluation of CD-HIT for constructing non-redundant databases

Abstract:

CD-HIT is one of the most popular tools for reducing sequence redundancy, and is considered to be the state-of-the-art method. It tries to minimise redundancy by reducing an input database into several representative sequences, under a user-defined threshold of sequence identity. We present a comprehensive assessment of the redundancy in the outputs of CD-HIT, exploring the impact of different identity thresholds and new evaluation data on the redundancy. We demonstrate that the relationship between threshold and redundancy is surprisingly weak. Applications of CD-HIT that set low identity threshold values may also suffer from substantial degradation in both efficiency and accuracy.
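
As an illustration of what reducing a database to representative sequences under an identity threshold can look like, the following is a minimal sketch of greedy representative selection. It is not CD-HIT's implementation: the function names `identity` and `greedy_representatives` are hypothetical, and `difflib.SequenceMatcher.ratio()` is used only as a crude stand-in for an alignment-based sequence identity.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; an assumed stand-in for sequence identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_representatives(seqs, threshold):
    """Keep a sequence only if it is below `threshold` identity to every
    representative chosen so far. Longer sequences are considered first."""
    reps = []
    for s in sorted(seqs, key=len, reverse=True):
        if all(identity(s, r) < threshold for r in reps):
            reps.append(s)
    return reps


# Hypothetical toy input: two near-identical sequences and one unrelated one.
toy = ["MKTAYIAKQR", "MKTAYIAKQK", "GGGGGGGGGG"]
low = greedy_representatives(toy, 0.8)    # near-duplicate is absorbed
high = greedy_representatives(toy, 0.95)  # near-duplicate is kept
```

In this toy case a lower threshold yields fewer representatives, because more sequences count as redundant with an existing representative.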

Published in: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Date of Conference: 15-18 December 2016

Date Added to IEEE Xplore: 19 January 2017

DOI: 10.1109/BIBM.2016.7822604

Conference Location: Shenzhen
