Data wrangling is the process of cleansing, transforming, and reshaping raw data into a more usable form. It is the stage of a data pipeline that prepares data for the steps that follow.
The process of data wrangling consists of three steps:
- Data transformation: converting data from one representation to another
- Data cleansing: removing or fixing erroneous, misleading, or missing data entries
- Data reshaping: restructuring data into the layout that later steps expect
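
The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a definitive implementation: the `RAW` records, the field names (`city`, `month`, `temp_c`), and the helper functions `transform`, `cleanse`, `reshape`, and `wrangle` are all hypothetical and chosen only for this example. It also assumes that dropping rows with a missing measurement is an acceptable cleansing policy; real data may call for fixing or imputing values instead.

```python
from collections import defaultdict

# Hypothetical raw records: every value arrives as a string,
# and some measurements are missing or unparseable.
RAW = [
    {"city": " Oslo ", "month": "jan", "temp_c": "-4.3"},
    {"city": "oslo", "month": "feb", "temp_c": ""},
    {"city": "Lima", "month": "jan", "temp_c": "23.1"},
    {"city": "Lima", "month": "feb", "temp_c": "n/a"},
    {"city": "Lima", "month": "feb", "temp_c": "23.6"},
]


def transform(rows):
    """Transformation: convert string fields to a normalised, typed representation."""
    out = []
    for row in rows:
        try:
            temp = float(row["temp_c"])
        except ValueError:
            temp = None  # unparseable values are marked missing; cleanse() handles them
        out.append({
            "city": row["city"].strip().title(),
            "month": row["month"].strip().lower(),
            "temp_c": temp,
        })
    return out


def cleanse(rows):
    """Cleansing: drop entries whose measurement is missing (assumed policy)."""
    return [row for row in rows if row["temp_c"] is not None]


def reshape(rows):
    """Reshaping: pivot long records into a {city: {month: temp_c}} table."""
    table = defaultdict(dict)
    for row in rows:
        table[row["city"]][row["month"]] = row["temp_c"]
    return dict(table)


def wrangle(rows):
    """Run the three steps in the order listed above."""
    return reshape(cleanse(transform(rows)))


if __name__ == "__main__":
    print(wrangle(RAW))
```

Keeping each step in its own function makes it possible to inspect the intermediate result after any step, which helps when deciding whether an entry should be fixed or removed.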