TechSlice - The Technology View

Welcome to TechSlice - The Technology View. This blog is my commentary on new developments in the world of technology, especially when they affect our markets and customers.

Steve Yaskin, Chief Technology Officer

MDM Meets Small and Mid-Sized Business

Written by Steven Yaskin Tuesday, September 20, 2011 08:38 PM

An analysis of recent trends in the data management space points to the emergence of SMB-driven master data management (MDM) adoption. MDM projects have always been the prerogative of Global 5000 companies; however, this has been changing throughout 2011. The tidal wave of information heading to SMBs, driven by their increasing participation in the social network ecosystem, requires smaller companies to start looking at Data Quality, Data Governance and MDM within their organizations. With SaaS pricing going down and usability going up, SaaS applications like NetSuite, Salesforce and Jive are spreading through the market segment like wildfire. As SMB CIOs and CFOs start to gain the benefits of solid (but inexpensive) back-office and sales force management systems, they inadvertently start experiencing data management problems well ahead of their company's traditional growth needs.
In a recent article for Information Management, the MDM Institute's chief research officer, Aaron Zornes, outlines future MDM trends through 2013.

For example, trend number two is “MDM Market Momentum,” and it includes all of the following predictions:
1. While Global 5000 enterprises will spend an average of $1 million on MDM software, they'll spend $3-4 million more for integration services.
2. IBM, Oracle, SAP and Informatica offer SMBs entry-level MDM for $250,000 to $500,000.
3. Mergers and acquisitions, the drive for sales leads, and compliance will be the drivers for funding MDM.
4. IT-initiated MDM projects will struggle to justify the business value.
5. There will be a skill shortage for MDM and data governance projects, leading to more work for systems integrators through 2012.

We confirm these trends as we interact daily with Queplix prospects and customers. We sell our persistent metadata server to both SMBs and larger companies, and what we observe is that SMB and Global-1000 requests for data management are moving towards each other and starting to overlap. Driven by data management needs, similar features are being requested at both ends of the market. While SMBs have a lot more interest in measuring their social outreach and networking than their larger counterparts, very similar trends and feature requirements emerge around data quality and data and application integration. Larger companies are starting to pay very close attention to their brand recognition and the social “chatter” around their products and marketing activities, measuring people's perceptions and even “friending” them directly. All of this data needs to be merged and analyzed within the corporate structure. Naturally, larger companies have more data silos and larger data volumes to handle, but essentially this is where the differences stop.
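
To make the "merged and analyzed" step concrete, here is a minimal Python sketch of combining records from several systems into one view per customer. Everything in it is an assumption made for illustration: the sample records, the field names, the choice of email as the matching key and the helper names (`normalize_key`, `merge_sources`) are hypothetical and do not describe how any Queplix product works.

```python
from collections import defaultdict

# Hypothetical sample records; field names and values are invented for illustration.
crm_records = [{"email": "Ann@Example.com", "name": "Ann Lee", "account_owner": "jsmith"}]
accounting_records = [{"email": "ann@example.com ", "open_invoices": 2}]
social_records = [{"email": "ann@example.com", "mentions_last_30_days": 5}]


def normalize_key(email: str) -> str:
    """Trim whitespace and lowercase so the same address matches across systems."""
    return email.strip().lower()


def merge_sources(**sources):
    """Merge records from named sources into one dict per customer key.

    A field already set by an earlier source is never overwritten by a later
    one; each merged record also lists which sources contributed to it.
    """
    merged = defaultdict(lambda: {"sources": []})
    for source_name, records in sources.items():
        for record in records:
            target = merged[normalize_key(record["email"])]
            target["sources"].append(source_name)
            for field, value in record.items():
                if field != "email":
                    target.setdefault(field, value)
    return dict(merged)


unified = merge_sources(crm=crm_records, accounting=accounting_records, social=social_records)
print(unified)
```

Even this toy version shows where the data quality work hides: the three systems spell the same email address three slightly different ways, and nothing merges until the key is normalized.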


Both SMBs and Global-1000s need to integrate their data from social networks, marketing campaigns, salesforce and internal accounting systems to have an agile and holistic view of their data at any moment in time. Is it possible for them to continue using 20-year-old technologies like data warehousing or ETL to address these problems? The answer is a flat-out no. The volume of data, the ever-changing nature of data feeds and the constant need to bring in even more data for analysis render traditional ETL and rigid mapping tools useless.
Large data management vendors such as IBM, Oracle and Informatica are dropping prices to make MDM and data quality tools more affordable to the SMB segment, while trying to adapt SaaS and cloud platforms to their traditional large-enterprise offerings. Their product management teams had to come to this realization by observing market trends; they are seeing what we are seeing, and their reaction confirms our analysis. However, lowering prices and adapting cloud platforms to products that were not designed for them is not the best approach to solving current data management problems. The MDM and data management industry overall is in desperate need of the next qualitative leap on its technology S-curve.


We see customer environments, at both SMBs and larger companies, where several ETL tools are deployed to feed a data warehouse (or two) and to create a datamart (or two) for BI, mostly just moving and shuffling data around and laying pipes. IT teams are stretched, and consulting teams are brought in to close the gaps. Unfortunately, requirements constantly change, and IT departments armed with traditional tools fall behind, producing over-budget runaway projects that never reach their destination. In the same article cited above, the author mentions that more than a quarter of MDM expense is services. I think this is a conservative number because it is based on comparing initial estimates to the current point in time; as MDM projects progress, the amount of services required to implement traditional systems only increases. To learn more about how persistent metadata servers are addressing the SMB MDM market's needs, please go to http://www.queplix.com/solutions.html or download our software for free and give it a try. We will even configure it for you and teach you how to bake the bread, all at no charge.




 

Thoughts on Big Data and Data Virtualization

Written by Steven Yaskin Wednesday, March 23, 2011 08:30 AM

Big Data Analysis in Relationship to Queplix Data Virtualization Solution.

"On the plus side for obtaining IT and business alignment, more companies are beginning to combine business and information management responsibilities in a single role, carried out by a single person, rather than a “business and IT partnership” with two people, two hierarchies and two sets of reporting relationships. Gartner expects 20 percent of companies to employ business information managers by 2013, compared with 5 per cent in 2009."
- Massive Data News in the report from April 2010

Here are the next ten things you should know about big data:
1. Big data means the amount of data you’re working with today will look trivial within five years.
2. Huge amounts of data will be kept longer and have way more value than today’s archived data.
3. Business people will covet a new breed of alpha geeks. You will need new skills around data science, new types of programming, more math and statistics skills and data hackers…lots of data hackers.
4. You are going to have to develop new techniques to access, secure, move, analyze, process, visualize and enhance data; in near real time.
5. You will be minimizing data movement wherever possible by moving function to the data instead of data to function. You will be leveraging or inventing specialized capabilities to do certain types of processing- e.g. early recognition of images or content types – so you can do some processing close to the head.
6. The cloud will become the compute and storage platform for big data which will be populated by mobile devices and social networks.
7. Metadata management will become increasingly important.
8. You will have opportunities to separate data from applications and create new data products.
9. You will need orders of magnitude cheaper infrastructure that emphasizes bandwidth, not iops and data movement and efficient metadata management.
10. You will realize sooner or later that data and your ability to exploit it is going to change your business, social and personal life; permanently.
--David Vellante in Big Data on February 16, 2011

Queplix® Virtual Data Manager™ provides a data management solution continuum, from data integration of multiple dispersed data sources through to Master Data Management, all in a single “dashboard” view. We have an automated, NoSQL, object-oriented representation of business objects, abstracted from multiple sources and described in a metadata repository. Queplix Virtual Data Manager is a persistent solution that operates in an automated fashion with minimum disruption to the sources.

Queplix Virtual Data Manager today offers enhanced Data Alignment, Data Quality and Data Enrichment, a proactive Data Stewardship interface and a Global Data Dictionary. As such, it is a full-spectrum data management solution. One of our strengths is our ability to intelligently identify and describe business objects from a variety of data sources, using our Application Software Blades™. In doing so, we eliminate the need to deal with proprietary data storage formats and the need to copy large amounts of data in order to make it available for analytics.
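
The general idea of describing a business object as metadata, and resolving it against the sources on demand rather than copying the data first, can be sketched in a few lines of Python. This is only an illustration of the concept under stated assumptions: `FieldMapping`, `CUSTOMER_ENTITY`, `SOURCES` and `resolve` are hypothetical names invented here, the in-memory dictionaries stand in for live systems, and none of it reflects the actual Queplix metadata format or Blade interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    attribute: str     # name on the abstract business object
    source: str        # which system holds the value
    source_field: str  # the field name inside that system


# Hypothetical metadata describing a "Customer" business object.
CUSTOMER_ENTITY = [
    FieldMapping("name", "crm", "AccountName"),
    FieldMapping("phone", "crm", "Phone"),
    FieldMapping("balance", "erp", "OUTSTANDING_AMT"),
]

# Stand-ins for live source systems, keyed by a shared customer id (an assumption).
SOURCES = {
    "crm": {"C-1": {"AccountName": "Acme Corp", "Phone": "555-0100"}},
    "erp": {"C-1": {"OUTSTANDING_AMT": 1250.0}},
}


def resolve(entity, sources, record_id):
    """Build one business object by looking up each attribute in its own source."""
    result = {}
    for mapping in entity:
        row = sources[mapping.source].get(record_id, {})
        result[mapping.attribute] = row.get(mapping.source_field)
    return result


customer = resolve(CUSTOMER_ENTITY, SOURCES, "C-1")
print(customer)
```

The consumer only ever sees `name`, `phone` and `balance`; the proprietary field names stay behind the mapping, and adding a new source means adding metadata rather than rewriting the consumer.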

The Big Data industry has been developing rapidly of late, even though the underlying technology was created years ago (Google Bigtable, etc.). The goal of Big Data is to store and process large amounts of data for analytical purposes using Map/Reduce technology. Big data technology does not by itself provide an analytical engine, but rather enables one to operate on large volumes of data. Big data storage platforms (e.g. Hadoop) provide a flat, non-relational storage facility and a multi-processing engine to address BI queries over large data volumes. It is not possible to achieve the same performance at these volumes using a traditional RDBMS.
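
For readers who have not met Map/Reduce before, the pattern itself is small enough to sketch in plain Python. The example below runs in a single process and uses invented sales data, so it only illustrates the three phases (map, shuffle, reduce); a framework like Hadoop runs the map and reduce tasks in parallel across many machines, which this sketch does not attempt.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sales events, pre-split into chunks the way a cluster splits input files.
chunks = [
    [("east", 100), ("west", 40)],
    [("east", 25), ("north", 10)],
]


def map_phase(chunk):
    """Emit (key, value) pairs; here each record becomes a (region, amount) pair."""
    return [(region, amount) for region, amount in chunk]


def shuffle(mapped_chunks):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


totals = reduce_phase(shuffle(map(map_phase, chunks)))
print(totals)
```

Because each chunk is mapped independently and each key is reduced independently, both phases can be spread over as many workers as there are chunks and keys, which is where the scalability comes from.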

The most obvious synergy between Queplix Virtual Data Manager and Big Data technologies is in the NoSQL approach to data management. Big Data vendors pursue a common goal, which is to enable BI solutions to work on large data volumes. Let’s consider the example of Jaspersoft BI:
“Jaspersoft’s vision goes well beyond Big Data. Our modern architecture and agnostic data source support is tailored for the cloud, from IaaS to PaaS, either public or private variations. In particular, NoSQL support puts us in the driver’s seat to become the de facto embedded standard for reporting and analysis within PaaS cloud environments.”
— Brian Gentile, CEO of Jaspersoft.

Jaspersoft’s BI engine can be deployed on top of Hadoop, and in order to work effectively with large data sets it needs an abstraction of business entities. Queplix Virtual Data Manager is built on a NoSQL architecture and can provide the abstraction required by Jaspersoft and other BI vendors today.

The Queplix Virtual Data Manager architecture works natively with Big Data engines by abstracting the data from the siloed sources, and can optionally be used to migrate the data to Big Data storage like Hadoop. With Queplix Virtual Data Manager it is now possible to create the abstracted metadata layer and then use it to recreate the business objects in Hadoop in order to copy or move the data, and therefore enable BI to work on Big Data engines. In other words, through the Queplix Hadoop application Blade, we can now enable BI vendors to virtualize Big Data repositories and provide persistence.

 

Queplix Data Virtualization and Hadoop - Marriage Made in Heaven

Written by Steven Yaskin Thursday, February 03, 2011 05:46 PM

Recently, there has been a lot of news covering advances in parallel processing frameworks such as Hadoop. Some innovative data warehouse software vendors are starting to research the new development strategies that parallel processing offers. So far, the majority of these efforts have targeted query performance and optimization within traditional physical data warehouse architectures. For example, traditional data warehouse vendors like Teradata joined the Hadoop movement and applied parallel processing to their physical DW infrastructures. Companies like Yahoo and Amazon are also spearheading map/reduce Hadoop adoption for large-scale data analytics.

I have been monitoring advances on the Hadoop front in particular, as I believe it will provide grounds for convergence with our products and a new development direction for Queplix Data Virtualization. Data virtualization and Hadoop are born from the same premise: provide data storage scalability and ease of information access and sharing. I see the two technologies complementing each other perfectly.

Hadoop’s data warehouse infrastructure (Hive) is what we are now researching for integration with Queplix Data Virtualization products. Hive is built on top of Hadoop and provides tools for easy data summarization, ad hoc querying and analysis of large datasets stored in Hadoop files. It provides a mechanism to put structure on this data, along with a simple query language called Hive QL.

Queplix Data Virtualization will soon combine the flexibility of its object-oriented data modeling with the massive power of Hadoop parallel processing to build virtual data warehouse solutions. Imagine the analytical performance of a virtual data warehouse that uses the Virtual Metadata Catalog and Virtual Entities as its organizational and hierarchical units (instead of traditional tables, columns and SQL-driven access). Such “virtual” data warehouse solutions would be a perfect fit for large-scale operational and analytical processing, data quality and data governance projects, with the full power of Queplix heuristic and semantic data analysis.

Today, data virtualization solutions are deployed by many larger enterprises to gain visibility into dispersed application data silos without disrupting the original sources and applications. In the near future, Data Virtualization and Hadoop-based virtual data warehouse solutions will be deployed in tandem to implement the full spectrum of enterprise data management solutions, ranging from larger-scale data integration projects (e.g. massive application data store mergers resulting from M&As between large companies) all the way to Virtual Master Data Management pioneered by Queplix. Such solutions will not only provide better abstraction and business continuity for enterprise applications but will also use the full power of parallel processing and bring immense scalability to Queplix semantic data analytics and data alignment products.
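
As a rough illustration of what "putting structure on the data" means in Hive, and of how an entity description could drive it, here is a small Python sketch that renders a Hive QL `CREATE TABLE` statement from a dictionary of attributes. The entity definition, the table name and the `hive_create_table` helper are all hypothetical, chosen for this example; the sketch only builds the statement text, does not connect to Hive, and is not a description of the integration we are researching.

```python
# Hypothetical entity description: attribute name -> Hive column type.
CUSTOMER_ENTITY = {
    "customer_id": "STRING",
    "name": "STRING",
    "balance": "DOUBLE",
}


def hive_create_table(table_name, columns):
    """Render a Hive QL CREATE TABLE statement for a tab-delimited text table."""
    column_sql = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns.items())
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} (\n  {column_sql}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'\n"
        "STORED AS TEXTFILE"
    )


ddl = hive_create_table("customer", CUSTOMER_ENTITY)
print(ddl)
```

Once a table like this is declared over files in Hadoop, an analyst can run an ordinary-looking query such as `SELECT name, balance FROM customer WHERE balance > 1000`, and Hive turns it into parallel jobs behind the scenes.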

Here are some new and exciting ideas Queplix is working on now:

  • utilizing Hadoop for Virtual CEP (Complex Event Processing) within Queplix Virtual Metadata Catalog
  • generating “data steward” real-time alerts using predictive data lineage analysis before data quality problems start to affect your enterprise applications
  • implementing Hadoop-based virtual data warehouse solutions to provide high availability for large application stores that require massive analytics and semantic data processing
  • large-scale Virtual Master Data Management initiatives involving enterprise-wide customer or product catalog building
  • large-scale business intelligence projects based on Queplix Virtual Metadata Catalog

Watch this blog for new developments and advances of Queplix technology integrating Hadoop and Data Virtualization as we make announcements throughout the year!

   

Social Networks are coming to the Enterprise!

Written by Steven Yaskin Thursday, January 27, 2011 07:04 AM

I am happy to write that Queplix just announced our social Blades for Facebook, LinkedIn and Google Contacts. What we are trying to do here is bridge social network information with enterprise data stores. When we started working on this, we had a lot of skepticism internally about the potential use case for social information within the enterprise. Having worked within large corporate structures and implemented hundreds of large-scale secure enterprise information systems, I had my own doubts. It all changed in a heartbeat when we got our justification from a potential customer. One of the leading US banks approached Queplix and requested... social Blades. Yes... THE large F-500 BANK!! I guess the time for this is now, and forward-looking companies are starting to recognize the value of the information contained in social ranking, chatter and people's opinions, spread out across the social universe but still very much interconnected. Enterprises are becoming increasingly sensitive to their public image and to how their customers and employees perceive them. The delicate point of balance here, of course, is privacy. It is one thing to share your personal information, and what you had for breakfast this morning, with your family and friends; it changes completely when you know someone at your company will look at it. So lines must be drawn around what enterprises can see, share and use when they get access to social network information. This work is still in progress, and we are treading very carefully here. In practice, we always rely on each social network's own privacy rules and enforce them within our Blades. You, as the information owner, have to specifically approve the Queplix application "Blade" to be connected to you, and you can then see exactly what is being shared and how.
I think the biggest value for the enterprise right now is in monitoring its own image within social networks, not in looking at people. There is a lot of information out there about the company itself and its products, and this is where we focus first when connecting the enterprise with social information. This is what Queplix social Blades are all about.
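
The owner-approval rule described above boils down to a simple principle: nothing leaves a profile unless its owner has approved that specific field. A minimal Python sketch of the principle follows; the profile fields, the `approved` set and the `shareable_view` function are hypothetical and are not the actual Blade implementation or any social network's API.

```python
def shareable_view(profile, approved_fields):
    """Return only the profile fields the owner has explicitly approved."""
    return {field: value for field, value in profile.items() if field in approved_fields}


# Hypothetical profile and approval list, invented for illustration.
profile = {"name": "Ann Lee", "employer": "Acme Corp", "breakfast_post": "pancakes"}
approved = {"name", "employer"}

visible = shareable_view(profile, approved)
print(visible)
```

The default matters here: a field that was never approved is simply absent from the result, so new kinds of personal data stay private until the owner opts in.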

   


Industry Analyst ESG

  • “Data migration and integration solutions such as those offered by Queplix, address a real need that enterprises face today. Organizations need simpler, lower cost alternatives to existing ETL methods so they spend less time integrating and migrating data, and more time leveraging it.”

    Julie Lockner

    VP and Senior Analyst

    Enterprise Strategy Group

    February, 2011

Industry Analyst The 451 Group

  • “Each wave of virtualization has provided strong cost savings and technology advantage. Data Virtualization offers a fundamental change in the way application and data integration problems can be solved. Once virtualized the data is available for secure re-use, with potential incremental return on investment with each application.”

    John Abbott

    Chief Analyst

    The 451 Group

    October, 2010