I am a data engineer with 3+ years of experience designing and building ETL pipelines for big data and fast data. I apply best-practice design patterns and architecture-centered processes. Your data can be extracted from a number of (typically) heterogeneous data sources, including business systems, APIs, sensor data, marketing tools, and transaction databases. Some of these sources are likely to produce the structured outputs of widely used systems, while others may produce semi-structured data such as JSON server logs. Furthermore, different sources are likely to produce vastly different volumes of data at vastly different rates, all of which your system will have to handle.
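
To make the structured versus semi-structured distinction concrete, here is a minimal Python sketch of the extract step, using only the standard library. Everything in it is an assumption for illustration: the sample CSV export, the sample JSON log lines, the field names (`created_at`, `ts`), the source labels, and the common record shape (`source`, `event_time`, `payload`) are all hypothetical and not taken from any particular system.

```python
import csv
import io
import json
from typing import Iterator

# Hypothetical sample inputs; field names are assumptions for illustration.
CSV_EXPORT = "order_id,amount,created_at\n1001,19.99,2024-01-05T10:00:00Z\n"
JSON_LOGS = (
    '{"ts": "2024-01-05T10:00:02Z", "level": "INFO", "msg": "checkout ok"}\n'
    "not valid json\n"
    '{"ts": "2024-01-05T10:00:03Z", "level": "ERROR"}\n'
)


def extract_csv(text: str, source: str) -> Iterator[dict]:
    """Yield one common-shape record per CSV row (structured source)."""
    for row in csv.DictReader(io.StringIO(text)):
        event_time = row.pop("created_at")  # every row has the same columns
        yield {"source": source, "event_time": event_time, "payload": row}


def extract_json_lines(text: str, source: str) -> Iterator[dict]:
    """Yield one common-shape record per parseable JSON log line."""
    for line in text.splitlines():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skipped here; a real pipeline would likely quarantine it
        if not isinstance(event, dict):
            continue  # ignore valid JSON that is not an object
        # Semi-structured: keys can vary per line, so read them defensively.
        yield {"source": source, "event_time": event.pop("ts", None), "payload": event}


# Both sources end up in one list of records with the same top-level shape.
records = list(extract_csv(CSV_EXPORT, "orders_db"))
records += list(extract_json_lines(JSON_LOGS, "web_logs"))
```

The structured source can rely on a fixed set of columns, while the log reader has to tolerate malformed lines and missing keys. Mapping both into one record shape early is one possible design choice; it keeps the later transform and load steps independent of where the data came from.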