Architecting the Multi-Purpose Data Lake with Data Virtualization

April 1, 2018

The Data Lake and the Data Scientist – Although it is still a relatively young concept, the data lake has already been adopted by many organizations. Its primary role is to store raw structured and unstructured data in one central place, making it easy for data scientists and other investigative and exploratory users to analyze that data. Without a data lake, these users spend much of their time hunting for data before they can begin their actual work: analytics. In short, the data lake is meant to shorten the data selection process that precedes the analytical work. However, the original data lake architecture has two severe drawbacks. One relates to the physical nature of the data lake, and the other to the restricted use of the investment: the data lake is designed exclusively for data scientists.

Spotlight

Plotly

Plotly offers Dash Enterprise, the premier data app platform for Python that enables organizations to collaboratively develop and deploy apps in a secure, scalable, managed environment. We're also the stewards of the open-source graphing libraries behind our namesake, bringing interactive data visualization to your web browser. With 89,000+ GitHub stars, 12,600,000+ downloads per month of Plotly open-source libraries, and 326,000,000+ total open-source downloads, Dash is the leading low-code platform for AI apps. These Dash apps give a point-and-click interface to models written in Python, vastly expanding the notion of what's possible in a traditional dashboard.

OTHER WHITEPAPERS

VM Insight: The Critical Path to App-aware Infrastructure

whitePaper | April 27, 2022

As virtual environments scale, understanding how each virtual machine impacts the underlying infrastructure becomes a necessity for efficient data center design and timely issue diagnosis. Capturing this information, however, typically requires complex and costly manual processes. These issues are often particularly challenging across the network, generating a “fog” that hides data congestion and impedes identifying a resolution.

Intel Page Modification Logging, a hardware virtualization feature: study and improvement for virtual machine working set estimation

whitePaper | January 7, 2020

Intel Page Modification Logging (PML) is a novel hardware feature for tracking the memory pages accessed by a virtual machine (VM). This task is essential in today’s data centers since it enables, among other things, checkpointing, live migration, and working set size (WSS) estimation. Relying on the Xen hypervisor, this paper studies PML from three angles: power consumption, efficiency, and performance impact on user applications. Our findings are as follows. First, PML does not incur any power consumption overhead. Second, PML reduces both VM live migration time and checkpointing time by up to 10.18%. Third, PML slightly reduces, by up to 0.95%, the performance degradation that live migration and checkpointing impose on applications.

Versa SD-WAN Solution Use Case for Satellite ISPs

whitePaper | July 19, 2022

Satellite networks offer their customers several advantages over other types of connectivity. They are easily deployed, reliable, and allow a wide degree of mobility, making them a perfect fit for Disaster Recovery Plans (DRP). They are an essential asset in places where other connectivity methods are not available, such as oil rigs, vessels, or even planes. However, they have several characteristics that make them harder to manage when compared to other kinds of networks. This document will discuss those challenges and explore how the Versa Operating System (VOS™) can help you extract better performance out of your Satellite links.

Virtualization Best Practices

whitePaper | May 27, 2022

This best practice guide provides advice for making the right choices in your environment. It recommends or discourages the use of options depending on your workload. Fixing configuration issues and performing tuning tasks can bring VM guest performance close to bare metal.

Top Ten Considerations for Your Next-Generation SD-WAN

whitePaper | September 7, 2022

The need for organizations to change to meet evolving business environments is not a new concept. However, what is new in our current moment is the pace of change. Organizations across all industries now have to adapt to a rapidly evolving IT and application environment. These changes have been accelerated by a global pandemic that is forcing employees to work remotely and by the adoption of modern applications that are distributed across corporate data centers, multiple public clouds, and increasingly, edge locations. Unfortunately, both of these efforts create more IT complexity.

Achieving pervasive security above and below the OS

whitePaper | October 3, 2022

Keeping business data secure is a challenging task, complicated by the proliferation of endpoints operating outside of the organizational network and the constant evolution of threat vectors.

