
Book review: Data Management at Scale (Piethein Strengholt)

This blog is a review of the book "Data Management at Scale" (see also at bol.com).


Data management is a hot topic nowadays, and this book adds real value to it. It is a must-read and one of the few technical books I finished in a single weekend.

The book gives a fantastic overview of how to implement a Data Mesh architecture. The Data Mesh concept is explained on Martin Fowler's website here. The book strikes a good balance between conceptual and implementation-level architecture. It gives many examples of how this architecture can work at scale, for both small and large companies. It is practical: I used it to implement this architecture at one of my customers.

The book describes an architecture in which the focus is on the DIAL (Data- and Integration Access Layer). 


On a high level the book covers the following topics:

  • The key principles for data management at scale
    - Domain-Driven Design 
    - Domain Data Stores
    - Metadata management
  • Read-Only Data Store
    The concept of serving data to your consumers in a fit-for-purpose form, using the CQRS design pattern to separate the operational data store from the read-only data store. This is a nice concept, but it is sometimes difficult to implement with SaaS solutions. SaaS applications do not expose their database, only APIs, which are often hard to use.
  • API Architecture
    The most remarkable notion is that a Common Data Model (CDM) is not advised. Each (functional) domain is responsible for its own data and its own domain model. In my own work I have also found that a CDM is hard to achieve and very time-consuming. Furthermore, the domain itself knows the most about its data, so it is best that the domain defines and is responsible for the data and the data definitions as well.
  • Streaming Architecture
    Because data must be available as soon as possible, Event-Driven Architectures are becoming popular (again). This is also related to the rise of microservices, serverless architectures and complex event processing, in which events are the basis of computation.
  • Metadata
    Metadata is key to a central data catalog that lets consumers find the data they need and learn how to get it.
  • Data governance and security is also covered
  • Master Data Management
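
To make the CQRS idea from the list above a bit more concrete, here is a minimal Python sketch. It is not taken from the book: the class names (`OperationalStore`, `ReadOnlyStore`), the `OrderPlaced` event and the sample data are all hypothetical, and a real implementation would put a message broker and persistent storage between the two sides.

```python
from dataclasses import dataclass, field


@dataclass
class OrderPlaced:
    """Hypothetical event emitted by the write side."""
    order_id: str
    customer: str
    amount: float


@dataclass
class OperationalStore:
    """Write side: validates commands and records the resulting events."""
    events: list = field(default_factory=list)

    def place_order(self, order_id: str, customer: str, amount: float) -> OrderPlaced:
        if amount <= 0:
            raise ValueError("amount must be positive")
        event = OrderPlaced(order_id, customer, amount)
        self.events.append(event)
        return event


@dataclass
class ReadOnlyStore:
    """Read side: a projection shaped for one consumer's question."""
    total_per_customer: dict = field(default_factory=dict)

    def apply(self, event: OrderPlaced) -> None:
        # Update the consumer-specific view from the event.
        current = self.total_per_customer.get(event.customer, 0.0)
        self.total_per_customer[event.customer] = current + event.amount

    def total_for(self, customer: str) -> float:
        return self.total_per_customer.get(customer, 0.0)


def demo() -> dict:
    operational = OperationalStore()
    read_only = ReadOnlyStore()
    orders = [("o1", "alice", 10.0), ("o2", "bob", 5.0), ("o3", "alice", 2.5)]
    for order_id, customer, amount in orders:
        # In this sketch the event is handed over directly; in practice
        # it would typically travel over a stream or queue.
        read_only.apply(operational.place_order(order_id, customer, amount))
    return read_only.total_per_customer


if __name__ == "__main__":
    print(demo())
```

The point of the split is that consumers only ever query the read-only store, so its shape can be tuned to their needs without touching the operational side.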
Eye-openers and remarks from my side:
  1. No more use of a Common Data Model but the use of Domain Models

  2. Fit-for-purpose APIs for consumers. There can be a lot of different data consumers with different requirements. I think serving all those consumers well can be a challenge.

  3. Exposing data becomes the responsibility of each domain.
    What you see within organisations is that integration is often a separate unit. I would rather see cross-functional teams that deliver data as a product and deliver value to customers that way. Data engineers will need to know more about the domain they work for. Still, there are data engineers who just want to build data pipelines and discuss them with fellow engineers.

  4. Logical Warehouse
    The data stays in the domain data stores, and the reporting applications use these stores to generate reports. The data is decentralized, which works better because the domain knows the most about its data. DWH management is slow, and this can be the next step towards exposing data much faster. The harmonization and rationalisation of (master) data must also sit on the domain side and not within a BI team, because that team will always need to get this information from the domain itself. That introduces extra communication structures, which always slows things down.

  5. More responsibilities for the teams/domains
    This is a good thing! They know, or should know, everything about the data for which they are responsible.
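
The combination of domain ownership and a central data catalog can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the book's design: `DataProduct`, `DataCatalog`, the example domains and the `example.invalid` endpoints are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Metadata a domain publishes about data it owns (hypothetical fields)."""
    name: str
    domain: str
    description: str
    endpoint: str
    tags: tuple = ()


class DataCatalog:
    """Central catalog: holds only metadata, the data stays in the domain."""

    def __init__(self) -> None:
        self._products = {}

    def register(self, product: DataProduct) -> None:
        key = (product.domain, product.name)
        if key in self._products:
            raise ValueError(f"{product.domain}/{product.name} is already registered")
        self._products[key] = product

    def search(self, term: str) -> list:
        """Find products whose description contains the term or that carry it as a tag."""
        term = term.lower()
        matches = [
            p for p in self._products.values()
            if term in p.description.lower()
            or any(term == tag.lower() for tag in p.tags)
        ]
        return sorted(matches, key=lambda p: (p.domain, p.name))


def build_demo_catalog() -> DataCatalog:
    catalog = DataCatalog()
    # Each domain registers its own product; the endpoints are placeholders.
    catalog.register(DataProduct(
        "orders", "sales", "Confirmed customer orders",
        "https://example.invalid/sales/orders", ("customer", "revenue")))
    catalog.register(DataProduct(
        "vacancies", "hr", "Open job vacancies",
        "https://example.invalid/hr/vacancies", ("jobs",)))
    return catalog


if __name__ == "__main__":
    for product in build_demo_catalog().search("customer"):
        print(product.domain, product.name, product.endpoint)
```

A consumer searches the catalog, finds the owning domain and the endpoint, and then fetches the data from that domain directly.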

Happy reading! And if you have any questions about my experience, please leave a comment.

Comments

  1. I liked the way you put together everything, there is certainly no need to go any further to look for any additional information. You mentioned each and everything about Data Mesh.

