by Mary Branscombe
These days, business intelligence means much more than drilling into database cubes.
HardCopy Issue: 55 | Published: February 1, 2012
For much of the last decade, each new year has been heralded as the year in which Business Intelligence will be democratised, becoming accessible to the ordinary user and not just a small group of data-modelling experts working with highly specialised tools that offer little interactivity. In fact, there’s been software available for some time now that makes it possible for ordinary business users to extract information and gain useful insight from the mass of data exposed by their business. Giving end users timely access to such information was one of the major improvements offered by SQL Server 2008, and the R2 release went further by adding ‘self-service’ analysis through PowerPivot for Excel and an improved Report Builder. These went hand-in-hand with master data management and data quality tools which help ensure that users are accessing data that is synchronised and accurate across multiple systems. As these tools have evolved, attention has returned to the back office, with the aim of ensuring that the underlying information sources can respond to queries in good time.
Any data analysis that returns information that is at all interesting is bound to provoke more questions, and new ideas on how to explore the data in an effort to understand what is going on. This is a spontaneous process. You need to be able to ask questions and get answers in real time, and not when someone in the IT department happens to find the time to craft and run the queries you have requested. This is self-service BI, and it’s how the majority of business analysis is now being done. Furthermore, these days we want to go beyond traditional databases and spreadsheets. Whether it’s emails, survey responses, documents, video, sound, the fire hose of Twitter and other social networks, the growing number of ‘big data’ datasets that are increasingly available in the cloud, or the flood of sensor data generated by the nascent ‘Internet of things’, the information that is relevant to your business is now coming from more disparate sources than ever. When it comes to external information out on the Web, we take for granted the universal interface of search engines, which bring us results from a wide range of sites and services. Business Intelligence needs to take the same universal approach, giving access to different sources that we can mine and compare through a single simple interface, rather than forcing us to learn multiple tools and interfaces.
CareGroup Healthcare in Boston used PowerPivot to analyse how many patients were being readmitted to hospital within 30 days, and why. So-called ‘bounce-back’ readmissions are bad news for patients and expensive for the hospital. CareGroup quickly discovered it was a particular problem for elderly patients, where it turned out to often be related to issues with taking medications. Using PowerPivot it took four hours to develop a reusable report that pulled data from Oracle, SQL and Access databases – a process that had previously taken two days every month to build by hand. As a result, nurses now call recently discharged elderly patients to talk them through their prescriptions, so avoiding many readmissions.
So-called ‘traditional’ BI reporting tools have become simpler and more powerful in recent years, and they are beginning to work with a wider range of data sources than traditional numerical data stored in structured databases and spreadsheets. You can expect not just dashboards and reports but pivot tables and multi-dimensional models that let you explore data visually and interactively, nesting variables and navigating hierarchies of information, or animating data to see what trends develop. There are also more specialist tools for handling less traditional data sources, although these inevitably offer less sophisticated options. It’s also important to remember that not all information is created equal: opinions on Twitter might be influential, but they’re not easy to translate into the hard numbers required for analysing past trends and market results. Extracting information from video takes either serious back-end voice recognition processing, or someone on hand to transcribe or annotate manually before you can use it for analysis. Be realistic about what you can achieve from unstructured and semi-structured data. Look for tools that let you check back to the underlying data source so that you can make a judgement on the reliability of your insights against the original data.
Slice and dice
SAP Crystal Reports is a classic BI reporting and analysis tool and, as with the reporting services built into SQL Server, the focus is on pre-built reports, designed by an IT team using tools such as Adobe Flash and Flex, that allow users to sort, filter and reformat. The filtering options are powerful enough to make the reports flexible, but this is about structured reports rather than the spontaneous exploration of data. Crystal Interactive Analysis is a simpler tool that focuses more on self-service data access to multiple data sources, and on ad-hoc tools for creating interactive reports with tables and charts that allow you to drill down or refine with ranking and filtering. Automatically replacing the more obscure database terminology with the appropriate business terms makes it more accessible to the ordinary user. For many, Excel is the BI tool of choice, and with the PowerPivot add-in, it’s not limited to number-crunching and pivot tables. PowerPivot lets you access data from relational databases, text files, public and commercial data in SQL Azure, or any service supporting the Open Data Protocol (OData), and work with them together. You effectively get the power of an OLAP PivotTable without having to create a SQL cube, or connect to SQL Server.
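To give a sense of how simple OData access is under the covers, here’s a minimal Python sketch that builds an OData query URL and reads rows out of a sample JSON response. The service root, entity set and field names are invented for illustration; PowerPivot does all of this for you behind its import dialog.

```python
import json
from urllib.parse import quote

def odata_query_url(service_root, entity_set, **options):
    """Build an OData query URL; keyword arguments become $-prefixed
    system query options such as $filter, $top and $select."""
    query = "&".join(f"${key}={quote(str(value))}"
                     for key, value in options.items())
    return f"{service_root}/{entity_set}?{query}"

# Hypothetical service root and entity set, for illustration only
url = odata_query_url(
    "https://example.com/odata.svc",
    "MonitoringStations",
    filter="Country eq 'UK'",
    top=10,
    select="Name,Reading",
)
print(url)

# OData services can return JSON; in this sample payload the
# results sit in a top-level array of records
sample_response = json.loads('{"value": [{"Name": "Station A", "Reading": 42}]}')
for row in sample_response["value"]:
    print(row["Name"], row["Reading"])
```

The point is that an OData feed is just a predictable URL returning predictable records, which is exactly why so many different tools can consume the same sources.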
PowerPivot gives you friendly layouts and the full power of Excel’s data visualisations, including horizontal and vertical slicers for interacting with the data. You can also share PowerPivot with your colleagues through SharePoint, visualisations and all. PowerPivot offers excellent performance even with large data sets containing millions of rows, so you really can explore data without grinding your system to a halt (although for the largest files you want to be using the 64-bit version of Excel). The mashups of Web data that have been popular in recent years are good examples of a different way to explore data, organising it geographically so it can be taken in at a glance. The European Environment Agency’s Eye on Earth site overlays a large data-stream of water, air and noise quality information coming from 22,000 water monitoring points in 27 countries, 1,000 air monitoring stations in 32 countries, user reports of noise levels and air quality models onto an interactive map of Europe. Programming a mashup like that is complicated, but PowerPivot makes it relatively straightforward to connect data to maps. If you’re going to do a lot of this kind of work, tools like Panorama NovaView make it much easier to add grids, charts, heat maps and other components and graphical objects onto PowerPivot data.
Managing customer support and sentiment analysis (to see whether people talking about your company are angry, happy, frustrated or satisfied) are the commonest uses of the streams of social network data, and tools like HootSuite, TweetFeel, Dialogix, Radian6 and NetBase Insight Workbench can help you here. But although they are improving, most don’t yet match the features or the interfaces of more familiar BI tools.
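At their simplest, sentiment tools score text against lists of positive and negative words. The Python sketch below shows that idea; the word lists are tiny and purely illustrative, and the commercial products named above use far more sophisticated language models.

```python
# A minimal sketch of lexicon-based sentiment scoring, the simplest
# form of the analysis these tools perform. Word lists are illustrative.
POSITIVE = {"love", "great", "happy", "satisfied", "excellent"}
NEGATIVE = {"angry", "broken", "frustrated", "terrible", "hate"}

def sentiment(text):
    """Score a message by counting positive and negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love the great new support line"))         # positive
print(sentiment("Still angry that my order arrived broken"))  # negative
```

Even this crude approach shows why such scores resist translation into hard numbers: sarcasm, negation and context all defeat simple word counting, which is worth remembering when weighing social data against sales figures.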
Analysing data in the cloud
More and more data is becoming available out in the cloud, either publicly or commercially through the likes of Azure DataMarket. Microsoft has said it intends to put all the features of SQL Server in SQL Azure, including reporting and analysis services. Currently in beta is a DataMarket add-in for Excel that lets you purchase and import data, but Azure is far from the only cloud service offering datasets, and the types of data you can work with in the cloud are often different to those you are used to.
These data sources involve far less structured data in much higher volumes, and often require you to experiment with the model rather than fill in a familiar template. Intel’s Compute Continuum GM Peter Biddle suggested recently that in five years, a fire hydrant could have a terabyte of data associated with it, ranging from water pressure history, to which fire engine last connected to it, to which films used it as a location. When everyday objects have that much data available, we’re going to need tools to filter and navigate to what’s relevant.
While every ‘big data’ source has its own tools, learning and using multiple interfaces is time consuming and can lead to errors if you can’t compare sources directly. A better solution would be connections for familiar tools that let you work with unstructured and semi-structured ‘big data’ sources alongside internal company data and other data sources that you already have. Such a tool chain for big-data analytics is something Microsoft’s Information Services team is working on under the codenames Isotope and Roswell.
According to a recent job advert, Roswell is “a new service for information workers to easily discover and publish data and applications in the enterprise.” The job description talks about accessing and unifying cloud data services and we already know that Isotope – the Hadoop service expected for both Azure and Windows Server this summer – will integrate with tools like Flume, which aggregates log data from sources like sensor networks, as well as Sqoop, a tool for transferring data between Hadoop and relational databases.
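To make concrete what a tool like Sqoop automates, here’s a Python sketch of the core of such a transfer job: reading rows from a relational source and writing them out as delimited files of the kind Hadoop jobs consume. An in-memory SQLite database stands in for the source, and the table and column names are invented; Sqoop itself does this at scale, in parallel, directly against HDFS.

```python
# Sketch of a relational-to-Hadoop transfer, the job Sqoop automates.
# SQLite stands in for the source database; the tab-delimited output
# file stands in for a file landed in HDFS.
import os
import sqlite3
import tempfile

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (station TEXT, value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [("A", 1.5), ("B", 2.25)])

out_path = os.path.join(tempfile.mkdtemp(), "readings.tsv")
with open(out_path, "w") as out:
    for station, value in conn.execute("SELECT station, value FROM readings"):
        out.write(f"{station}\t{value}\n")   # one delimited record per row

print(open(out_path).read())
```

The value of Sqoop (and of Flume for log streams) is that this plumbing, plus parallelism, restarts and type mapping, comes ready-made rather than hand-written.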
Put all that together and Microsoft will have a set of data services that scale up and down so that you can work with familiar datasets, unstructured data from the public cloud or your company servers and private cloud, all from the same familiar tools. This shift to big data and new sources is going to be the next big step in business intelligence, and Microsoft’s approach should help you continue to get value from the BI investments you’re making today.
But when it comes to information from social networks, don’t assume that analysing how people feel about your brand is all you can really do. Applications like MicroStrategy’s Wisdom help you analyse promotions and feedback received through Facebook applications, giving you detailed demographic information about your fans and your customers so that you can do some very targeted marketing, or automate the sending of personal messages such as birthday greetings from your CMS. For more powerful analytics, SAS Social Network Analysis combines sentiment analysis, customer segmentation and fraud detection. That said, for most of us, the truth about what’s happening in the projects we work on is to be found as much in email as in any formal document. The Outlook Social Connector in Outlook 2010 makes it easier to see social updates and files you’ve exchanged with contacts as well as meetings you’ve both been in, but if you want to dig into the information that is concealed in your inbox, then check out Xobni. Xobni adds links to information you’ve exchanged through services like Dropbox and salesforce.com, as well as files, links, appointments and social network updates sent by email. It also lets you see useful statistics such as when particular people most often reply to your email, when you read or send the most messages, and how often the boss emails you.
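A statistic like ‘when does each contact most often reply’ is straightforward to compute once timestamps have been extracted from a mailbox. This Python sketch shows the kind of counting involved on an invented message log; it is an illustration of the technique, not how Xobni itself is implemented.

```python
# Sketch of an inbox statistic: the hour of day at which each contact
# most often sends mail. The sample message log is invented.
from collections import Counter, defaultdict

# (sender, hour-of-day received) pairs, as might be extracted from a mailbox
messages = [("alice", 9), ("alice", 9), ("alice", 14),
            ("bob", 17), ("bob", 17), ("bob", 8)]

by_sender = defaultdict(Counter)
for sender, hour in messages:
    by_sender[sender][hour] += 1

# most_common(1) gives each sender's single busiest hour
peak = {s: c.most_common(1)[0][0] for s, c in by_sender.items()}
print(peak)   # {'alice': 9, 'bob': 17}
```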
Mine and map
If you’re working with a mix of documents, survey results, interviews and other qualitative research data, NVivo lets you analyse and structure your information without losing track of where your references come from. You can tag audio, video and image files alongside text and data, and assign concepts and relationships to ‘nodes’ that you use to build hierarchies or mind maps. The textual analysis automatically links synonyms and words with similar meanings so that you can search for one term and be sure you’re seeing all the relevant results. Video, pictures and audio aren’t automatically analysed or transcribed, so you have to annotate them manually, but as you do so you get a visual representation of the notes on screen as well as searchable, tracked text. You can chart the data you annotate in NVivo even if it’s not based on numbers, or visualise data using radar charts, tag clouds and tree maps to get an overview of the patterns and connections in your data. This can also help you identify where your results have been skewed by input from any specific source, such as a particularly chatty interview subject.
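The synonym-folding search described above can be sketched very simply: expand the search term into its thesaurus group, then match documents against the whole group. The Python below uses a tiny hand-built thesaurus; the word groups and sample documents are invented, and NVivo’s own linguistic analysis is of course far richer.

```python
# Sketch of synonym-aware search: a query term is expanded through a
# thesaurus so results include documents using related words.
# The synonym groups and documents below are illustrative only.
SYNONYMS = [
    {"doctor", "physician", "gp"},
    {"medicine", "medication", "prescription"},
]

def expand(term):
    """Return the synonym group containing the term, or the term alone."""
    term = term.lower()
    for group in SYNONYMS:
        if term in group:
            return group
    return {term}

def search(documents, term):
    """Return documents containing the term or any of its synonyms."""
    wanted = expand(term)
    return [d for d in documents if wanted & set(d.lower().split())]

docs = ["The physician reviewed the chart",
        "Patients often confuse one medication with another"]
print(search(docs, "doctor"))   # matches via the synonym 'physician'
```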
NVivo is best known in academic and research circles but it’s becoming popular with economists, market researchers, analysts and consultants. Australian firm Banarra is using it to measure corporate sustainability: making sure the improvements that corporations claim are real rather than just PR puffery means analysing a wide range of source data, both qualitative and quantitative.
Organising information in a more free-form way is also useful, whether you clip details from Web pages into OneNote and apply custom tags, or build a visual representation using a mind-mapping tool like MindManager to organise key results or topics you want to research. MindManager’s new Connect SP plugin can automatically show SharePoint document libraries and collaborations as mind maps, helping you to explore semi-structured data.
BI tools are vital for anyone who needs to make business decisions, and the advantages are both obvious and dramatic. As Microsoft’s Ted Kummert puts it, “You make people look better. They now know something they didn’t know before. They have some insight they didn’t have before. They’re going to be more effective at their jobs tomorrow. Business is going to be better.”