What is involved in Data Lake Architecture
Find out which related areas Data Lake Architecture connects with, correlates with, or affects, and which of them require thought, deliberation, analysis, review, and discussion. This checklist is not designed to give answers, but to engage the reader and lay out a framework for thinking about Data Lake Architecture.
To help you address the criteria in this checklist for your organization, selected resources for further research and information are listed at the end.
Below is a quick checklist designed to help you think about which Data Lake Architecture-related domains to cover, with 145 essential questions to check off across those domains.
The following domains are covered:
Data Lake Architecture, Data lake, Amazon.com, Amazon S3, Apache Hadoop, Azure Data Lake, Big data, Cambridge Semantics, Cloudera, Data analytics, Data mart, Data reporting, Data visualization, Data warehouse, Google, Hortonworks, Information silo, Machine learning, Microsoft, Pentaho, PricewaterhouseCoopers, Teradata, Zaloni.
Data Lake Architecture Critical Criteria:
Paraphrase Data Lake Architecture visions and probe the present value of growth of Data Lake Architecture.
– What may be the consequences for the performance of an organization if all stakeholders are not consulted regarding Data Lake Architecture?
– Do we address the daunting challenge of Big Data: how to make easy use of highly diverse data and provide knowledge?
– Do we need an enterprise data warehouse, a Data Lake, or both as part of our overall data architecture?
– Risk factors: what are the characteristics of Data Lake Architecture that make it risky?
– Can the data be obtained at no cost, or is there a charge associated with access?
– Was the data exported and, if so, when, where, and how will it be used (organizationally)?
– How would you arrive at the decomposition without such knowledge?
– What kinds of use are permitted/prohibited by the license?
– Big Data: what is different from large databases?
– Can I connect this data to data I already have?
– What processes touched my data?
– Where did my data come from?
– How is this data represented?
– Why perform analysis inside a DBMS?
– What is data governance?
– What is geostatistics?
– Where did it come from?
– How old is this data?
– MapReduce: forgotten?
– What method should we use?
Data lake Critical Criteria:
Investigate Data lake engagements and forecast involvement of future Data lake projects in development.
– Does Data Lake Architecture include applications and information with regulatory compliance significance (or other contractual conditions that must be formally complied with) in a new or unique manner for which no approved security requirements, templates or design models exist?
– Which customers cannot participate in our Data Lake Architecture domain because they lack skills, wealth, or convenient access to existing solutions?
– Looking at Hadoop big data in the rearview mirror, what would you have done differently after implementing a Data Lake?
– What sources do you use to gather information for a Data Lake Architecture study?
– What data is being licensed, and how or where is it being made available?
– What is regulatory compliance?
– Where is the data located?
– What is the environment?
Amazon.com Critical Criteria:
Apply Amazon.com leadership and overcome Amazon.com skills and management ineffectiveness.
– Will Data Lake Architecture have an impact on current business continuity, disaster recovery processes and/or infrastructure?
– Think about the functions involved in your Data Lake Architecture project. What processes flow from these functions?
– How do we go about comparing Data Lake Architecture approaches/solutions?
Amazon S3 Critical Criteria:
Debate over Amazon S3 engagements and acquire concise Amazon S3 education.
– At what point will vulnerability assessments be performed once Data Lake Architecture is put into production (e.g., ongoing Risk Management after implementation)?
– Does our organization need more Data Lake Architecture education?
Apache Hadoop Critical Criteria:
Study Apache Hadoop failures and get going.
– In the case of a Data Lake Architecture project, the criteria for the audit derive from implementation objectives. An audit of a Data Lake Architecture project involves assessing whether the recommendations outlined for implementation have been met. In other words, can we track that any Data Lake Architecture project is implemented as planned, and is it working?
– Have all basic functions of Data Lake Architecture been defined?
– Are we assessing Data Lake Architecture and risk?
Azure Data Lake Critical Criteria:
Exchange ideas about Azure Data Lake decisions and describe the risks of Azure Data Lake sustainability.
– Is maximizing Data Lake Architecture protection the same as minimizing Data Lake Architecture loss?
– Do Data Lake Architecture rules make a reasonable demand on a user's capabilities?
Big data Critical Criteria:
Survey Big data tasks and observe effective Big data.
– What are the particular research needs of your organization on big data analytics that you find essential to adequately handle your data assets?
– Do you see the need to address the issues of data ownership or access to non-personal data (e.g. machine-generated data)?
– Future: given the focus on Big Data, where should the chief executive for these initiatives report?
– Does big data threaten the traditional data warehouse business intelligence model stack?
– To what extent does data-driven innovation add to the competitive advantage (CA) of your company?
– Do you see areas in your domain or across domains where vendor lock-in is a potential risk?
– What would be needed to support collaboration on data sharing in your sector?
– Are there any best practices or standards for the use of Big Data solutions?
– How will systems and methods evolve to remove Big Data solution weaknesses?
– How close to the edge can we push the filtering and compression algorithms?
– What new security and privacy challenges arise from new Big Data solutions?
– What are the new applications that are enabled by Big Data solutions?
– When we plan and design, how well do we capture previous experience?
– How fast can we determine changes in the incoming data?
– Where is this Big Data coming from?
– Is Big Data different?
– What is different about Big Data?
– What's limiting the task?
– What is in scope?
Cambridge Semantics Critical Criteria:
Adapt Cambridge Semantics tactics and assess what counts with Cambridge Semantics that we are not counting.
– What other jobs or tasks affect the performance of the steps in the Data Lake Architecture process?
– Do we all define Data Lake Architecture in the same way?
– How do we keep improving Data Lake Architecture?
Cloudera Critical Criteria:
Communicate about Cloudera tasks and change contexts.
– Are there any easy-to-implement alternatives to Data Lake Architecture? Sometimes other solutions are available that do not carry the cost implications of a full-blown project.
– What are the key enablers to make this Data Lake Architecture move?
– What are the usability implications of Data Lake Architecture actions?
Data analytics Critical Criteria:
Brainstorm over Data analytics engagements and overcome Data analytics skills and management ineffectiveness.
– What are the potential areas of conflict that can arise between an organization's IT and marketing functions around the deployment and use of business intelligence and data analytics software services, and what is the best way to resolve them?
– Can we be rewired to use the power of data analytics to improve our management of human capital?
– What is the difference between data analytics, data analysis, data mining, and data science?
– Which departments in your organization are involved in using data technologies and data analytics?
– Which core Oracle Business Intelligence or Big Data Analytics products are used in your solution?
– Social data analytics: are you integrating social into your business intelligence?
– What is the difference, if any, between data analytics and business analytics?
– Which individuals, teams or departments will be involved in Data Lake Architecture?
– Is there any existing Data Lake Architecture governance structure?
– Does your organization have a strategy on big data or data analytics?
– What are our tools for big data analytics?
– Why are Data Lake Architecture skills important?
Data mart Critical Criteria:
Cut a stake in Data mart leadership and arbitrate Data mart techniques that enhance teamwork and productivity.
– How do your measurements capture actionable Data Lake Architecture information for use in exceeding your customers' expectations and securing your customers' engagement?
– What is the purpose of data warehouses and data marts?
Data reporting Critical Criteria:
Give examples of Data reporting issues and gather Data reporting models.
– Do the Data Lake Architecture decisions we make today help people and the planet tomorrow?
– Have the types of risks that may impact Data Lake Architecture been identified and analyzed?
– How can the value of Data Lake Architecture be defined?
Data visualization Critical Criteria:
Survey Data visualization decisions and research ways we can become the Data visualization company that would put us out of business.
– What are the best places or schools to study data visualization, information design, or information architecture?
– What are your most important goals for the strategic Data Lake Architecture objectives?
Data warehouse Critical Criteria:
Graph Data warehouse outcomes and define Data warehouse competency-based leadership.
– What tier data server has been identified for the storage of decision support data contained in a data warehouse?
– What does a typical data warehouse and business intelligence organizational structure look like?
– Is data warehousing necessary for our business intelligence service?
– What is the difference between a database and data warehouse?
– What are alternatives to building a data warehouse?
– Do we offer a good introduction to data warehouses?
– How would one define Data Lake Architecture leadership?
– Data Warehouse versus Data Lake (Data Swamp)?
– Do you still need a data warehouse?
– Centralized data warehouse?
Google Critical Criteria:
Design Google outcomes and inform on and uncover unspoken needs and breakthrough Google results.
– We keep records of data and store them in cloud services, for example Google Suite. Data protection tools are provided and security rules can be set. But who has the responsibility for securing them – us or Google?
– What is the best design framework for a Data Lake Architecture organization now that, in a post-industrial age, the top-down, command-and-control model is no longer relevant?
– How well does our CRM collaboration software integrate with Google services like Google Apps and Google Docs?
– Is Data Lake Architecture required?
Hortonworks Critical Criteria:
Value Hortonworks adoptions and innovate what needs to be done with Hortonworks.
– What is the source of the strategies for Data Lake Architecture strengthening and reform?
– What are specific Data Lake Architecture rules to follow?
Information silo Critical Criteria:
Use past Information silo engagements and find out what it really means.
– Will Data Lake Architecture deliverables need to be tested and, if so, by whom?
– What is our Data Lake Architecture strategy?
Machine learning Critical Criteria:
Depict Machine learning strategies and finalize specific methods for Machine learning acceptance.
– What are the long-term implications of other disruptive technologies (e.g., machine learning, robotics, data analytics) converging with blockchain development?
– Who is the main stakeholder, with ultimate responsibility for driving Data Lake Architecture forward?
– How does the organization define, manage, and improve its Data Lake Architecture processes?
– What are the barriers to increased Data Lake Architecture production?
Microsoft Critical Criteria:
Infer Microsoft leadership and describe the risks of Microsoft sustainability.
– Do we have or need click-to-call CTI functions that automatically register and track outbound phone calls and prevent internal sales staff from forgetting to update the database?
– Is there an integrated FAQ structure already in the exchange that can be tapped and expanded into the CRM for the agents?
– Does the current system allow for service cases to be opened in the CRM directly from the exchange site?
– What platforms are you unable to measure accurately, or able to provide only limited measurements from?
– Culture: how can we help with cultural issues relating to loss of control, constant change, and mistrust?
– Do you have a mechanism in place to quickly respond to visitor/customer inquiries and orders?
– Which customers just take up resources and should be considered competitors?
– How have you defined R.O.I. from a social media perspective in the past?
– What is the potential value of increasing the loyalty of our customers?
– How are we handling the risk of garbage in and garbage out with e-CRM?
– Are short calls factored out of the denominator in your service level?
– Does customer knowledge affect how loyalty is formed?
– Is there an IVR abandon rate; if so, what is it?
– Is the user still a member of the organization?
– Is the e-mail tagging performance acceptable?
– How do we evolve from CRM to Social CRM?
– Is the address book size acceptable?
– Can customers place orders online?
– Who are my customers?
– What do they buy?
Pentaho Critical Criteria:
Adapt Pentaho visions and diversify disclosure of information – dealing with confidential Pentaho information.
– Which open source ETL tool is easier to use and more agile: Pentaho Kettle, Jitterbit, Talend, Clover, Jasper, or Rhino?
– To what extent does management recognize Data Lake Architecture as a tool to increase the results?
– What is our formula for success in Data Lake Architecture?
– How will you measure your Data Lake Architecture effectiveness?
PricewaterhouseCoopers Critical Criteria:
Exchange ideas about PricewaterhouseCoopers tasks and use obstacles to break out of ruts.
– What are the business goals Data Lake Architecture is aiming to achieve?
Teradata Critical Criteria:
Confer re Teradata decisions and diversify disclosure of information – dealing with confidential Teradata information.
– What tools and technologies are needed for a custom Data Lake Architecture project?
– Why should we adopt a Data Lake Architecture framework?
Zaloni Critical Criteria:
Have a meeting on Zaloni visions and report on developing an effective Zaloni strategy.
– How do senior leaders' actions reflect a commitment to the organization's Data Lake Architecture values?
– How do we secure Data Lake Architecture?
This quick readiness checklist is a selected resource to help you move forward.
Author: Gerard Blokdijk
CEO at The Art of Service | http://theartofservice.com
Gerard is the CEO at The Art of Service. He has been providing information technology insights, talks, tools and products to organizations in a wide range of industries for over 25 years. Gerard is a widely recognized and respected information expert. Gerard founded The Art of Service consulting business in 2000. Gerard has authored numerous published books to date.
To address the criteria in this checklist, the following selected resources are provided for further research and information:
Data Lake Architecture External links:
Enterprise Data Lake Architecture | Data Warehouse Design
Data Lake Architecture will explain how to build a useful data lake, where data scientists and data analysts can solve business challenges and identify new business opportunities. Learn how to structure data lakes as well as analog, application, and text-based data ponds to provide maximum business value. Understand the role of the …
Data lake External links:
Enterprise Data Lake Platform & Big Data Solutions – Zaloni
Data Lake Analytics | Microsoft Azure
What is a Data Lake? – Amazon Web Services (AWS)
Amazon.com External links:
Amazon.com: Interesting Finds
Amazon S3 External links:
Amazon S3 HTML Title – Stack Overflow
Amazon S3 – Official Site
Apache Hadoop External links:
GitHub – apache/hadoop: Mirror of Apache Hadoop
Apache Hadoop open source ecosystem | Cloudera
Hortonworks Apache Hadoop and Big Data Certifications
Azure Data Lake External links:
Azure Data Lake Tools for Visual Studio – microsoft.com
Microsoft Azure Data Lake Store – An Introduction
What Is Azure Data Lake? – Developer.com
Cambridge Semantics External links:
Cambridge Semantics (@CamSemantics) | Twitter
The Smart Data Company® | Cambridge Semantics
Cambridge Semantics – PitchBook
Cloudera External links:
Apache Hadoop training from Cloudera University
StreamSets was founded by former Cloudera and …
CLDR : Summary for Cloudera, Inc. – Yahoo Finance
Data analytics External links:
What is Data Analytics? – Definition from Techopedia
Data Analytics | Clarkson University
Data mart External links:
[PDF]Institutional Research Data Mart: Instructor Guide …
MPR Data Mart
[PDF]Meta-data and Data Mart solutions for better …
Data reporting External links:
Grant Applications and Data Reporting
Product Data Reporting and Evaluation Program
Data Reporting Analyst Jobs, Employment | Indeed.com
Data visualization External links:
AstroNova | Data Visualization Technology & Solutions
Power BI | Interactive Data Visualization BI Tools
geothinQ – On-demand Land Mapping & Data Visualization
Data warehouse External links:
Title 2 Data Warehouse – Data.gov
Enterprise Data Warehouse | IT@UMN
Condition Categories – Chronic Conditions Data Warehouse
Google External links:
Google SketchUp 3D Warehouse
Google Compute Engine control panel – Google Cloud …
Hortonworks External links:
Hortonworks (@hortonworks) | Twitter
Hortonworks, Inc. (HDP) Pre-Market Trading – NASDAQ.com
Hortonworks – Official Site
Information silo External links:
Information Silo – investopedia.com
What is an Information Silo (IT Silo)? Webopedia Definition
Information silo – Revolvy
Machine learning External links:
Amazon EC2 P3 – Ideal for Machine Learning and HPC – AWS
Microsoft Azure Machine Learning Studio
DataRobot – Automated Machine Learning for Predictive …
Microsoft External links:
Microsoft OneNote – Official Site
Microsoft PowerPoint Online – Work together on …
Free Templates for Microsoft Office Suite – Office Templates
Pentaho External links:
Data Integration Platform & Software | Pentaho
Pentaho Reporting | Hitachi Vantara Community
Pentaho User Console – Login
PricewaterhouseCoopers External links:
PricewaterhouseCoopers – The New York Times
PricewaterhouseCoopers on the Forbes Best Employers …
PricewaterhouseCoopers: #56 on 100 Best Companies to …
Teradata External links:
title | Teradata Downloads
Teradata – Talent Portal Landing Page
column title | Teradata Downloads
Zaloni External links:
Zaloni (@zaloni) | Twitter
Zaloni – Official Site
Data Management Platform for the Data Lake – Zaloni