Chapter Abstract:
Summary
Data scientists notoriously spend most of their time on the nitty‐gritty of scrutinizing, cleaning, and organizing data. This chapter gives a practical guide to this all-important step, including common pathologies to watch for and insights into what they might mean. Complementing this is a discussion of string processing, the ultimate tool for data cleaning that can work when all other methods fail. Unconventionally, this chapter discusses regular expressions in detail, which are the most powerful tools for working with patterns in text.
Page(s): 43 - 54
Copyright Year: 2025
Edition: 2