Fundamentals of Data Engineering: Plan and Build Robust Data Systems by Joe Reis and Matt Housley, released in July 2022, is a comprehensive guide to data engineering. Moving from foundational concepts to future trends, the book presents a strategic approach to building robust data systems, covering data architecture, technology selection, and security and privacy. With a keen eye on the evolving landscape of data engineering, it is a useful resource for both novices and seasoned professionals in the field.

  1. Chapter 1, "Data Engineering Described," introduces the concept of data engineering, outlining the necessary skills and activities, along with the roles of data engineers within an organization.
  2. Chapter 2, "The Data Engineering Lifecycle," explains the phases within the data engineering lifecycle, emphasizing the key elements and stages involved in the process.
  3. Chapter 3, "Designing Good Data Architecture," explores the principles and concepts essential for creating effective data architecture, along with the various types and examples of data architecture and the key stakeholders involved in the design process.
  4. Chapter 4, "Choosing Technologies Across the Data Engineering Lifecycle," focuses on the factors influencing technology choices during the data engineering process, considering team capabilities, cost optimization, future scalability, and other relevant criteria.
  5. Chapter 5, "Data Generation in Source Systems," delves into where data originates and how source systems generate it, covering the main ideas and the practical details of working with source systems.
  6. Chapter 6, "Storage," covers the fundamentals of data storage, storage systems, and storage abstractions, and discusses big ideas and trends in storage, along with the relevant collaborators and undercurrents.
  7. Chapter 7, "Ingestion," examines data ingestion, including key engineering considerations and approaches to both batch and message/stream ingestion, along with the relevant collaborators and undercurrents.
  8. Chapter 8, "Queries, Modeling, and Transformation," discusses queries, data modeling, and transformations, along with the collaborators involved in these stages and the undercurrents that influence decision-making.
  9. Chapter 9, "Serving Data for Analytics, Machine Learning, and Reverse ETL," covers general considerations for serving data for analytics and machine learning, introduces the concept of reverse ETL, and outlines the relevant collaborators and undercurrents.
  10. Chapter 10, "Security and Privacy," examines security and privacy in data engineering in terms of people, processes, and technology.
  11. Chapter 11, "The Future of Data Engineering," presents insights into the future of the data engineering lifecycle, emphasizing the decline of complexity, the rise of user-friendly data tools, the role of a cloud-scale data OS, and the evolving nature of titles and responsibilities within the field.