Digitally Speaking

Justin Price
November 8, 2017

Year by year we are generating ever larger volumes of data, which require more complex and powerful tools to analyse in order to produce meaningful insights.

What is machine learning?

Anticipating the need for more efficient ways of spotting patterns in large datasets en masse, machine learning was developed to give computers the ability to learn without being explicitly programmed.

Today, it largely remains a human-supervised process, at least during development. This consists of monitoring a computer’s progress as it works through a number of “observations” in a data set arranged to train the computer to spot patterns between attributes as quickly and efficiently as possible. Once the computer has started to build a model representing the patterns identified, it goes through a looping process, seeking to develop a better model with each iteration.
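As a toy illustration of that looping process (a minimal sketch only, not a description of any particular product), the Python snippet below repeatedly adjusts a simple linear model so that each pass over the observations fits them a little better; the observations and learning rate are invented for the example.

```python
# A toy illustration of iterative model improvement: simple gradient descent
# on a one-variable linear model. The observations and settings are invented
# purely for the example.
observations = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1)]  # (attribute, outcome)

weight, bias, learning_rate = 0.0, 0.0, 0.01

for iteration in range(1000):
    # Measure how wrong the current model is on every observation.
    grad_w = grad_b = 0.0
    for x, y in observations:
        error = (weight * x + bias) - y
        grad_w += 2 * error * x
        grad_b += 2 * error
    # Nudge the model so the next pass through the data fits a little better.
    weight -= learning_rate * grad_w / len(observations)
    bias -= learning_rate * grad_b / len(observations)

print(f"learned model: outcome ~ {weight:.2f} * attribute + {bias:.2f}")
```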

How is it useful?

The aim of this is to allow computers to learn for themselves, anticipating fluctuations between variables, which in turn helps us forecast what may happen in the future. With a computer model trained on a specific data problem or relationship, data professionals can produce reliable decisions and results, leading to the discovery of insights that would have remained hidden without these analytical techniques.

Real-world Examples

Think this sounds like rocket science? Every time you’ve bought something from an online shop and had recommendations based on your purchase – that’s machine learning at work. Over thousands of purchases the website has been able to aggregate the data, spot correlations in real users’ buying patterns, and then present the most relevant patterns back to you based on what you viewed or bought. You may see these as “recommended for you” or “this was frequently bought with that”. Amazon and eBay have been doing this for years, and more recently, Netflix.
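As a rough sketch of how a “frequently bought with” list can fall out of aggregated purchase data – the real Amazon, eBay and Netflix systems are of course far more sophisticated – here is a minimal Python example using invented baskets:

```python
# A minimal sketch of "frequently bought with": count how often pairs of
# items appear in the same basket, then recommend the most common partners.
# The baskets below are invented for illustration.
from collections import Counter
from itertools import combinations

baskets = [
    {"kettle", "mug", "tea"},
    {"kettle", "mug"},
    {"mug", "tea", "biscuits"},
    {"kettle", "tea"},
]

pair_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1

def frequently_bought_with(item, top_n=3):
    partners = Counter()
    for (a, b), count in pair_counts.items():
        if item == a:
            partners[b] += count
        elif item == b:
            partners[a] += count
    return [name for name, _ in partners.most_common(top_n)]

print(frequently_bought_with("kettle"))  # e.g. ['mug', 'tea']
```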

This sounds fantastic – but where can this help us going forward?

Deep learning

Deep learning is distinguished from other data science practices by its use of deep neural networks. This means that the data passes through networks of nodes, in a structure which mimics the human brain. Structures like this are able to adapt to the data they are processing, in order to execute in the most efficient manner.
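To give a feel for what “networks of nodes” means in practice, here is a minimal sketch of a tiny two-layer network written in Python with NumPy, trained on an invented toy problem; real deep learning frameworks layer many more nodes and optimisations on top of this basic idea.

```python
# A minimal two-layer neural network sketch in NumPy: inputs flow through a
# hidden layer of nodes, each applying a weighted sum and a non-linearity.
# The XOR-style toy data and layer sizes are chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output layer
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10000):
    hidden = sigmoid(X @ W1 + b1)          # forward pass through the nodes
    output = sigmoid(hidden @ W2 + b2)
    # Backward pass: propagate the error back and adjust the weights.
    d_out = (output - y) * output * (1 - output)
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_hidden;   b1 -= 0.5 * d_hidden.sum(axis=0)

print(output.round(2).ravel())  # approaches [0, 1, 1, 0]
```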

Using these leading techniques, some applications now look ready to have a profound impact on how we live and interact with each other. We are currently looking at the imminent launch of commercially available real-time language translation, which requires a speed of analysis and processing never available before. Similar innovations have evolved in handwriting-to-text conversion with “smartpads” such as the Bamboo Spark, which bridge the gap between technology and traditional note taking.

Other applications mimic the human components of understanding: classify, recognise, detect and describe (according to SAS.com). This has now entered mainstream use with anti-spam measures on website contact forms, where the software knows which squares contain images of cars or street signs.

Huge leaps are being made in the healthcare industry in particular: at Szechwan People’s Hospital, China, models have been “taught” to spot the early signs of lung cancer in CT scan images. This meets a pressing need, as there is a shortage of trained radiologists to examine patients.

In summary, there have been huge leaps in data analysis and science in the last couple of years. The future looks bright as we apply ever more sophisticated techniques to a wider range of real-world issues and tackle previously impossible challenges. Get in touch and let’s see what we can do for you.

Anis Makeriya
August 21, 2017

It’s always the same scenario: someone gives me some data files that I just want to dive straight into and start exploring ways to visually depict, but I can’t.

I’d fire up a reporting tool only to step right back, realising that before data can take visual shape, it needs to be in shape first! One pattern has appeared consistently over the years: the time spent on ETL/ELT (Extract, Transform and Load, in varying sequences) and the speed with which you are sent from the reporting layer back to data prep are negatively correlated.

Data preparation for the win

‘80% of time goes into data prep’ and ‘garbage in, garbage out (GIGO)’ are sayings that have been around for some time, but they don’t really hit you until you face them in practice and they suddenly translate into ‘backward progress’. Data quality issues range from inconsistent date formats and multiple spellings of the same value to values not existing at all, in the form of nulls. So how can they all be dealt with? A data prep layer is the answer.
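As a rough sketch of what that prep layer actually does with the issues above – mixed date formats, variant spellings, nulls – here is a minimal pandas example on invented data:

```python
# A minimal data prep sketch: fix mixed date formats, unify variant spellings
# of the same value, and deal with nulls. The data is invented for illustration.
import pandas as pd

raw = pd.DataFrame({
    "order_date": ["2017-03-01", "01/04/2017", None],
    "region": ["U.K.", "United Kingdom", "uk"],
    "amount": [120.0, None, 80.0],
})

clean = raw.copy()
# Parse whatever date format arrives into one consistent type (unparseable
# values become NaT so they can be reviewed rather than silently dropped).
clean["order_date"] = clean["order_date"].apply(
    lambda v: pd.to_datetime(v, dayfirst=True, errors="coerce")
)
# Map the multiple spellings of the same value onto a single label.
clean["region"] = (
    clean["region"].str.lower().str.replace(".", "", regex=False)
    .map({"uk": "UK", "united kingdom": "UK"})
)
# Decide what nulls should become rather than letting them leak into reports.
clean["amount"] = clean["amount"].fillna(0.0)
print(clean)
```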

Often, with complex transformations or large datasets, analysts find themselves turning to IT to perform the ETL process. Thankfully, over the years, vendors have recognised the need to include commonly used transformations in the reporting tools themselves. Tools such as Tableau and Power BI, to name a couple, have successfully passed this power on to analysts, making time to analysis a flash. Features such as pivoting, editing aliases, and joining and unioning tables are available within a few clicks.

There may also be times when multiple data sources need joining, such as when matching company names. Whilst Excel and SQL fuzzy look-ups have existed for some time, dedicated ETL tools such as Paxata have embedded further intelligence that enables them to go a step further and recognise that the solution lies beyond the names merely having similar spellings.
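Dedicated tools keep their matching logic under the hood, but as an illustration of the basic idea behind a fuzzy company-name lookup, here is a minimal sketch using Python’s standard-library difflib; the names and threshold are invented, and real products go well beyond simple string similarity.

```python
# A minimal fuzzy-matching sketch using the standard library: score how
# similar two company names are and pick the best candidate above a threshold.
# Real tools such as Paxata apply far richer logic than string similarity.
from difflib import SequenceMatcher

def normalise(name):
    # Strip common suffixes and punctuation that inflate the difference.
    name = name.lower().replace(".", "").replace(",", "")
    for suffix in (" ltd", " limited", " plc", " inc"):
        name = name.removesuffix(suffix)
    return name.strip()

def best_match(name, candidates, threshold=0.8):
    scored = [
        (candidate, SequenceMatcher(None, normalise(name), normalise(candidate)).ratio())
        for candidate in candidates
    ]
    candidate, score = max(scored, key=lambda pair: pair[1])
    return candidate if score >= threshold else None

master_list = ["Acme Widgets Ltd", "Globex Corporation", "Initech PLC"]
print(best_match("ACME Widgets Limited", master_list))  # Acme Widgets Ltd
```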

All the tasks mentioned above belong to the ‘T’ (Transform) of ETL, and that is only the second or third step in the ETL/ELT process! If data can’t be extracted as part of the ‘E’ in the first place, there is nothing to transform. When information lies in disparate silos, it often cannot be ‘merged’ unless the data is migrated or replicated across stores. Following the data explosion of the past decade, Cisco Data Virtualisation has gained traction for its core capability of creating a ‘merged virtual’ layer over multiple data sources, enabling quick time to access as well as the added benefits of data quality monitoring and a single version of the truth.

These recent capabilities are even more useful with the rise of data services such as Bloomberg and forex feeds, and APIs that can return weather information; if we also want to know how people feel about the weather, the Twitter API works too.
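As a rough sketch of what consuming such a data service looks like in practice, here is a minimal Python example; the endpoint, parameters and response fields are hypothetical placeholders rather than any real provider’s API.

```python
# A minimal sketch of pulling data from a web service into a file the prep or
# reporting layer can pick up. The URL, parameters and response fields are
# hypothetical placeholders -- substitute your provider's documented API.
import csv
import requests

response = requests.get(
    "https://api.example-weather.com/v1/observations",  # hypothetical endpoint
    params={"city": "London", "apikey": "YOUR_API_KEY"},
    timeout=10,
)
response.raise_for_status()
observations = response.json()  # assumed to be a list of {"time": ..., "temp_c": ...}

with open("weather_observations.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["time", "temp_c"])
    writer.writeheader()
    writer.writerows(observations)
```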

Is that it..?

Finally, after the extraction and transformation of the data, the load process is all that remains… but even that comes with its own challenges: load frequencies, load types (incremental vs. full loads, depending on data volumes), change data capture and slowly changing dimensions to give an accurate picture of events, and storage and query speeds from the source, to name a few.
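To make the incremental-versus-full distinction concrete, here is a minimal sketch of a watermark-based incremental load in Python with SQLite; the table names and columns are hypothetical.

```python
# A minimal incremental-load sketch: read a high-water mark, pull only rows
# changed since the last run, and upsert them into the target. The `sales`
# table and its columns are hypothetical.
import sqlite3

source = sqlite3.connect("source.db")
target = sqlite3.connect("warehouse.db")

# Read the high-water mark from the last successful load (None on first run).
target.execute("CREATE TABLE IF NOT EXISTS load_watermark (loaded_up_to TEXT)")
row = target.execute("SELECT MAX(loaded_up_to) FROM load_watermark").fetchone()
watermark = row[0] or "1900-01-01 00:00:00"

# Pull only rows changed since the last load, instead of a full extract.
rows = source.execute(
    "SELECT id, amount, last_updated FROM sales WHERE last_updated > ?",
    (watermark,),
).fetchall()

target.execute(
    "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount REAL, last_updated TEXT)"
)
target.executemany(
    "INSERT OR REPLACE INTO sales (id, amount, last_updated) VALUES (?, ?, ?)", rows
)
if rows:
    target.execute("INSERT INTO load_watermark VALUES (?)", (max(r[2] for r in rows),))
target.commit()
```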

Whilst a capable analyst with best-practice knowledge will suffice for quick analysis, scalable, complex solutions need the right team from both the IT and non-IT sides, in addition to the tools and hardware to support them smoothly going forward. Contact us today to help you build a solid Data Virtualisation process customised to your particular needs.

Fanni Vig
April 20, 2017

Finally, it’s out!

With acquisitions like Composite, ParStream, Jasper and AppDynamics, we knew something was bubbling away in the background for Cisco with regards to edge analytics and IoT.

Edge Fog Fabric – EFF

The critical success factor for IoT and analytics solution deployments is to provide the right data, at the right time, to the right people (or machines).

With the exponential growth in the number of connected devices, the marketplace requires solutions that simultaneously provide data-generating devices, communication, data processing, and data-leveraging capabilities.

To meet this need, Cisco recently launched a software solution (predicated on hardware devices) that encompasses all the above capabilities and named it Edge Fog Fabric aka EFF.

What is exciting about EFF?

To implement high performing IoT solutions that are cost effective and secure, a combination of capabilities need to be in place.

  • Multi-layered data processing, storage and analytics – given the rate of growth in the number of connected devices and the volume of data, bringing data back from devices to a central data virtualisation (DV) environment can be expensive. Processing information on the EFF makes this a lot more cost effective.
  • Micro services – a standardised framework for data processing and communication services that can be programmed in standard languages such as Python or Java (a minimal, generic sketch follows this list).
  • Message routers – effective communication between the various components and layers. Without state-of-the-art message brokerage, no IoT system could be secure and scalable in providing real-time information.
  • Data leveraging capabilities – ad hoc, embedded or advanced analytics capabilities to support BI and reporting needs. With the acquisitions of Composite and AppDynamics, EFF will enable an IoT platform to connect to IT systems and applications.
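To make the micro-services point above concrete, here is a minimal, generic Python sketch of edge-style processing – summarise raw readings locally and forward only the compact result. It does not use EFF’s actual APIs (which are not shown here); it simply illustrates the pattern of processing information before it leaves the edge.

```python
# A generic sketch of edge-style processing (not EFF's actual API): summarise
# raw sensor readings locally and forward only the compact result upstream,
# instead of shipping every reading back to a central environment.
import json
import random
import statistics

def read_sensor():
    # Placeholder for a real device reading (e.g. a temperature probe).
    return 20.0 + random.uniform(-1.5, 1.5)

def publish(message):
    # Placeholder for handing the message to a broker / message router.
    print("forwarding:", json.dumps(message))

# Sample locally for a window, then send only the summary and any alert flag.
window = [read_sensor() for _ in range(60)]
publish({
    "device": "sensor-42",
    "mean": round(statistics.mean(window), 2),
    "max": round(max(window), 2),
    "alert": max(window) > 24.0,   # only the insight leaves the edge
})
```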

What’s next?

Deploying the above is no mean feat. According to Gartner’s perception of the IoT landscape, no organisation has yet achieved the panacea of connecting devices to IT systems and vice versa, combined with the appropriate data management and governance capabilities embedded. So there is still a long road ahead.

However, with technology advancements such as the above, I have no doubt that companies and service providers will be able to accelerate progress and deliver further use cases sooner than we might think.

Based on this innovation, the two obvious next steps are:

  • Further automation – automating communication, data management and analytics services including connection with IT/ERP systems
  • Machine made decisions – once all connections are established and the right information reaches the right destination, machines could react to information that is shared with ‘them’ and make automated decisions.

Scott Hodges
April 18, 2017

Attending a recent IBM Watson event, somebody in the crowd asked the speaker, “So, what is Watson?” It’s a good question – and one there isn’t really a straightforward answer to. Is it a brand? A supercomputer? A technology? Something else?

Essentially, it is an IBM technology that combines artificial intelligence and sophisticated analytics to provide a supercomputer named after IBM’s founder, Thomas J. Watson. While interesting enough, the real question, to my mind, is this: “What sort of cool stuff can businesses do with the very smart services and APIs provided by IBM Watson?”

IBM provides a variety of services, available through Application Programming Interfaces (APIs), that developers can use to take advantage of the cognitive elements and power of Watson. The biggest challenge in taking advantage of these capabilities is to “think cognitively” and imagine how they could benefit your business or industry to give you a competitive edge – or, for not-for-profit organisations, how they could help you make the world a better place.

I’ve taken a look at some of the APIs and services available to see some of the possibilities with Watson. It’s important to think of them collectively rather than individually, as while some use-cases may use one, many will use a variety of them, working together. We’ll jump into some use-cases later on to spark some thoughts on the possibilities.

Natural Language Understanding

Extract meta-data from content, including concepts, entities, keywords, categories, sentiment, emotion, relations and semantic roles.

Discovery

Identify useful patterns and insights in structured or unstructured data.

Conversation

Add natural language interfaces such as chat bots and virtual agents to your application to automate interactions with end users.

Language Translator

Automate the translation of documents from one language to another.

Natural Language Classifier

Classify text according to its intent.

Personality Insights

Extract personality characteristics from text, based on the writer’s style.

Text to Speech and Speech to Text

Process natural language text to generate synthesised audio, or render spoken words as written text.

Tone Analyser

Use linguistic analysis to detect the emotional (joy, sadness, etc.), linguistic (analytical, confident, etc.) and social (openness, extraversion, etc.) tones of a piece of text.

Trade-off Analytics

Make better choices when analysing multiple, even conflicting goals.

Visual Recognition

Analyse images for scenes, objects, faces, colours and other content.

All this is pretty cool stuff, but how can it be applied to work in your world? You could use the APIs to “train” your model to be more specific to your industry and business, and to help automate and add intelligence to various tasks.

Aerialtronics offers a nice example use-case of visual recognition in particular: the company develops, produces and services commercial unmanned aircraft systems. Essentially, it teams drones, an IoT platform and Watson’s Visual Recognition service to help identify corrosion, serial numbers, loose cables and misaligned antennas on wind turbines, oil rigs and mobile phone towers. This helps them automate the process of identifying faults and defects.

Further examples showing how Watson APIs can be combined to drive powerful, innovative services can be found on the IBM Watson website’s starter-kit page.

At this IBM event, a sample service was created, live in the workshop. This application would stream a video, convert the speech in the video to text, and then categorise that text, producing an overview of the content being discussed. The application used the speech-to-text and natural language classifier services.
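As a rough sketch of how that workshop pipeline might be wired up, here is a minimal Python example assuming the ibm-watson Python SDK; package, class and method names have changed across SDK versions, and the API keys, file name and classifier ID below are placeholders.

```python
# A rough sketch of the workshop pipeline using the ibm-watson Python SDK.
# Class and method names have varied across SDK versions, and the API keys,
# file name and classifier ID below are placeholders.
from ibm_watson import NaturalLanguageClassifierV1, SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

stt = SpeechToTextV1(authenticator=IAMAuthenticator("YOUR_STT_API_KEY"))
nlc = NaturalLanguageClassifierV1(authenticator=IAMAuthenticator("YOUR_NLC_API_KEY"))

# 1. Turn the video's audio track into text.
with open("talk_audio.wav", "rb") as audio:
    stt_result = stt.recognize(audio=audio, content_type="audio/wav").get_result()
transcript = " ".join(
    result["alternatives"][0]["transcript"] for result in stt_result["results"]
)

# 2. Categorise the transcript with a pre-trained classifier (NLC accepts
#    roughly 2,000 characters per request, hence the slice).
classification = nlc.classify(
    classifier_id="YOUR_CLASSIFIER_ID", text=transcript[:2000]
).get_result()
print("Topic:", classification["top_class"])
```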

Taking this example further with a spot of blue sky thinking, for a multi-lingual organisation, we could integrate the translation API, adding the resulting service to video conferencing. This could deliver near real-time multiple dialect video conferencing, complete with automatic transcription in the correct language for each delegate.

Customer and support service chat bots could use the Conversation service to analyse tone. Processes such as flight booking could be fulfilled by a virtual agent using the ‘Natural Language Classifier’ to derive the intent in the conversation. Visual recognition could be used to identify production line issues, spoiled products in inventory or product types in retail environments.

Identification of faded colours or specific patterns within scenes or on objects could trigger remedial services. Detection of human faces, their gender and approximate age could help enhance customer analysis. Language translation could support better communication with customers and others in their preferred languages. Trade-off Analytics could help optimise the balancing of multiple objectives in decision making.

This isn’t pipe-dreaming: the toolkit is available today. What extra dimensions and capabilities could you add to your organisation, and the way you operate? How might you refine your approach to difficult tasks, and the ways you interact with customers? Get in contact today to discuss the possibilities.

Alastair Broom
December 10, 2016

I was recently asked what I think will be three things making an impact on our world in 2017, with a few permutations of course:

  • A maximum of 3 technologies that will be significant for enterprises in terms of driving value and transforming business models and operations in 2017
  • Innovations that are most likely to disrupt industries and businesses

I’ve put my three below – it would be great to hear your thoughts and predictions in the comments!

Internet of Things

The Internet of Things is a big one for 2017. Organisations will move from exploring ideas around what IoT means for them in theory, to rolling out sensors across key opportunity areas and starting to gather data from what were previously “dark assets”. The reason IoT is so important is the amount of data the things will generate, and the new insight this gives organisations, including things like physical asset utilisation & optimisation and proactive maintenance. Those organisations that take the IoT seriously are going to see their customers, their data, and their opportunities in completely new ways. Being able to add more and more data sources into the “intelligence in stream” means decisions are backed by more facts. It’s Metcalfe’s Law – the value of the network is proportional to the square of the number of users. Data is the network, and each thing is another user.

Being prepared to exploit the IoT opportunity, though, especially at scale, will take proper planning and investment. Organisations will need a strategy to address the IoT, one that identifies quick wins that help build the business case for further IoT initiatives. The correct platform is key: an infrastructure for things. The platform that forms the basis for connecting the things to the network will need to be robust, will likely be a mix of wired and wireless, and, because it’s unlikely to be a separate infrastructure, it needs to have the required visibility and control to ensure data is correctly identified, classified and prioritised.
Security too will be fundamental. Today the things are built for user convenience, security being a secondary concern. What the IoT then represents is a massively increased attack surface, one that is particularly vulnerable to unsophisticated attack. The network will therefore need to be an integral part of the security architecture.

Edge Analytics

Edge analytics is another one to look out for. As the amount of data we look to analyse grows exponentially, the issue becomes twofold. One, what does it cost to move that data from its point of generation to a point of analysis? Bandwidth doesn’t cost what it used to, but paying to transport TB and potentially PB of information to a centralised data processing facility (data centre that is) is going to add significant cost to an organisation. Two, having to move the data, process it, and then send an action back adds lag. The majority of data we have generated to this point has been for systems of record. A lag to actionable insight in many scenarios here may very well be acceptable. But as our systems change to systems of experience, or indeed systems of action, lag is unacceptable.
Analytics at the edge equates to near real-time analytics. The value of being able to take data in real time, with its context, analyse it alongside potentially multiple other sources of data, and then present back highly relevant, in-the-moment intelligence – that’s amazing. Organisations once again need to ensure the underlying platform is up to the task: the ability to capture the right data, maintain its integrity, conform to privacy regulations and manage the data throughout its lifecycle. Technology will be needed to analyse the data at its point of creation; essentially you will need to bring compute to the data (and not the other way round, as is typically done today).
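A minimal sketch of “bringing compute to the data”: keep a short rolling window of context at the point of generation and only send an action upstream when a reading looks anomalous. The readings and thresholds below are invented for illustration.

```python
# A minimal edge-analytics sketch: analyse each reading where it is generated
# and only transmit an action when something noteworthy happens, rather than
# shipping every raw value to a central data centre first. Values are invented.
from collections import deque
from statistics import mean, stdev

window = deque(maxlen=30)            # recent context kept at the edge

def on_new_reading(value):
    if len(window) >= 10:
        mu, sigma = mean(window), stdev(window)
        if sigma and abs(value - mu) > 3 * sigma:
            # Near real-time action: only this small message leaves the edge.
            return {"action": "inspect", "value": value, "baseline": round(mu, 2)}
    window.append(value)
    return None

for reading in [20.1, 20.3, 19.9, 20.2, 20.0, 20.4, 19.8, 20.1, 20.2, 20.0, 35.7]:
    decision = on_new_reading(reading)
    if decision:
        print(decision)
```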

Cognitive Systems

Lastly, cognitive systems. Computers to this point have been programmed by humans to perform pretty specific tasks. Cognitive systems will now “learn” what to do not only from human interaction, but from the data they generate themselves, alongside the data from other machines. Cognitive systems will be continually reprogramming themselves, each time getting better and better at what they do. And what computers do is help us do things humans can do, but faster. Cognitive systems will expand our ability to make better decisions, to help us think better. They move us from computing systems that have essentially been built to calculate really fast, to systems built to analyse data and draw insights from it. This extends to being able to predict outcomes based on current information and the consequences of actions. And because it’s a computer, we can use a far greater base of information from which to draw insight. We humans are really bad at remembering a lot of information at the same time, but computers (certainly for the short term) are only constrained by the amount of data we can hold in memory to present to a compute node for processing.
