Abstract:
Web scraping, often referred to as web crawling, is the use of software to gather data from websites automatically. It is a crucial procedure in domains such as business intelligence in the digital era. The technique enables the extraction of structured data from text, including HTML. Web scraping is especially helpful when data is not available in a machine-readable format such as XML or JSON. It can also be used to gather intelligence on illegal businesses. Data obtained with a web scraping application has been found to be significantly more comprehensive, precise, and reliable than data entered by hand. Web scraping is therefore a vital tool in contemporary disciplines and in the information era more broadly. It relies on various technologies, including pattern matching and spidering, which this paper covers. The paper examines web scraping's definition, methods, stages, and technologies, and its relationships to cyber security, business intelligence, artificial intelligence, data science, big data, and cyber science. It also looks at how web scraping can be done using the Python programming language, some of its main advantages, and potential future developments. Particular attention is given to the ethical and legal issues surrounding web scraping.
Published in: 2024 International Conference on Healthcare Innovations, Software and Engineering Technologies (HISET)
Date of Conference: 18-19 January 2024
Date Added to IEEE Xplore: 11 November 2024