Embed Reports and visualize Data in your Bluemix Applications

By Niklas Heidloff, posted on May 21, 2015

Here is a quick introduction to another powerful service in IBM Bluemix - Embeddable Reporting (beta). With this service developers can easily build reports using a graphical tool (or API) to visualize data and embed the reports either as HTML or JSON in their web applications.

There are two good tutorials describing the service - Leverage IBM Cognos on IBM Bluemix using the Embeddable Reporting service and Embed rich reports in your applications. I've tried the sample that is documented in the Bluemix documentation. Below is a quick summary of the key concepts.

To get started you need two database services in your Bluemix app. The report artifacts are stored in either Cloudant or Mongo. The data that you want to visualize needs to reside in a Bluemix DB2 database or dashDB. In order to create reports you can use either a graphical tool (Embeddable Reporting console) or APIs.

The data is read from databases via SQL.



Via the graphical tool you can tweak the data, e.g. filter data, join data or add additional data items to summarize data. These changes do not require schema changes to the underlying databases; they are applied on top of the databases in the service.
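To make the idea of filtering, joining and summarizing "on top" of the data more concrete, here is a minimal sketch in plain SQL. It is not the service's own tooling: it uses Python's built-in sqlite3 module as a stand-in for DB2 or dashDB, and the tables `regions` and `sales` as well as both function names are made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE regions (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales (id INTEGER PRIMARY KEY, region_id INTEGER, amount INTEGER);
        INSERT INTO regions VALUES (1, 'EMEA'), (2, 'APAC');
        INSERT INTO sales VALUES (1, 1, 250), (2, 1, 50), (3, 2, 400), (4, 1, 150);
    """)
    return conn


def summarize_sales(conn):
    """Join, filter and summarize in one query; the tables are not altered."""
    query = """
        SELECT r.name AS region, SUM(s.amount) AS total
        FROM sales AS s
        JOIN regions AS r ON r.id = s.region_id   -- join data
        WHERE s.amount >= 100                     -- filter data
        GROUP BY r.name                           -- summarize data
        ORDER BY r.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(summarize_sales(build_demo_db()))
```

The schema stays untouched; the shaping lives entirely in the query, which is roughly the role the service's data layer plays.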



The reports are defined graphically in an HTML-like editor. The editor provides components like tables and repeat controls to define the layout as well as several charting types out of the box. Additionally you can download and import further visualization components from IBM's AnalyticsZone. The data can easily be bound to the visual components via drag and drop or property boxes.



This is the resulting HTML page with data and images from the generated report.



Web applications can embed the reports by either reading the data as JSON or by reading an HTML representation of the report that can be injected into the DOM (without iframes). The HTML contains data and formatting as text and the graphical elements as images. The web applications read the reports via AJAX REST calls. Server-side proxies can be used to read the Bluemix credentials that are necessary to access the reports.
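Here is a minimal sketch of what the credential handling in such a server-side proxy could look like. Cloud Foundry exposes the credentials of bound services in the VCAP_SERVICES environment variable as JSON. Everything service-specific below is an assumption and not taken from the service documentation: the service label, the credential field names `url`, `userid` and `password`, the report path and the use of HTTP Basic authentication.

```python
import base64
import json
import os
import urllib.request


def find_credentials(vcap_services_json, label_prefix):
    """Return the credentials of the first bound service whose label matches."""
    services = json.loads(vcap_services_json)
    for label, instances in services.items():
        if label.startswith(label_prefix) and instances:
            return instances[0]["credentials"]
    raise LookupError("no bound service with label starting with %r" % label_prefix)


def build_report_request(credentials, report_path):
    """Build (but do not send) an authenticated request for a report.

    Assumption: the credentials contain 'url', 'userid' and 'password'
    and the service accepts HTTP Basic authentication.
    """
    pair = "%s:%s" % (credentials["userid"], credentials["password"])
    token = base64.b64encode(pair.encode("utf-8")).decode("ascii")
    url = credentials["url"].rstrip("/") + "/" + report_path.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


if __name__ == "__main__":
    # "example-reporting" is a hypothetical label; check VCAP_SERVICES for the real one.
    creds = find_credentials(os.environ.get("VCAP_SERVICES", "{}"), "example-reporting")
    print(build_report_request(creds, "/reports/demo").full_url)
```

The point of the proxy is that the browser only ever talks to your own application, so the credentials never leave the server.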

Here is another quick introduction to one of the cool services in IBM Bluemix - the DataWorks service that IBM announced at the end of last year.

In order to leverage and analyze large amounts of data from various sources, the data typically has to be cleansed and prepared first. This includes activities like joining data from multiple sources, filtering out unnecessary parts, sorting and classifying data and so forth. The actual processing and analysis of this data is then simpler when using tools like dashDB, Hadoop or Watson Analytics. The DataWorks service in Bluemix provides various functionality to do this.

The service comes with a graphical tool, Forge (beta), to load data from various sources. These can be sources that are available in the cloud, like Salesforce or Amazon Redshift, or databases that run on-premises, like DB2 and Oracle. To access the on-premises sources from the cloud, the DataWorks service comes with a secure gateway that you can also use separately on Bluemix. The loaded data can then be shaped easily with Forge, e.g. sorted or filtered.



After this step, activities are created that can be triggered on a scheduled basis to move the shaped data into targets like Cloudant, dashDB, Watson Analytics or SQL databases. These targets provide built-in analytics functionality and/or can be used by application developers to query the data they are interested in.

The same functionality that is available in Forge (and more) can also be invoked by application developers via APIs. There is a Data Load REST API to load data (sample) and a Data Profiling REST API to analyze your cloud-based data source to understand the structure and content of the data (sample). Additionally there is an Address Cleansing REST API to verify US addresses (sample).

Check out this video to see some of this in action.

Bluemix Object Storage Service to store Files in the Cloud

By Niklas Heidloff, posted on May 19, 2015

IBM Bluemix provides object storage services to store and manage various types and large amounts of files. The storage is cloud-based, highly available, scalable, secure and cost-effective. The Bluemix services use SoftLayer Object Storage and make it easy to consume for Bluemix developers by provisioning accounts automatically when services are added to apps. SoftLayer Object Storage is based on the OpenStack Swift project so that you can use standard APIs to access your files.
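As a rough illustration of what "standard APIs" means here: in the OpenStack Swift API an object is addressed as the account's storage URL followed by the container and the object name, and requests carry an X-Auth-Token header. The sketch below only builds such a request without sending it. The storage URL, token, container and file names are placeholders; in a real app they would come from the authentication step and the service credentials.

```python
import urllib.request
from urllib.parse import quote


def swift_object_url(storage_url, container, object_name):
    """Build the URL of an object: <storage URL>/<container>/<object>."""
    return "/".join([
        storage_url.rstrip("/"),
        quote(container, safe=""),     # container names must not contain '/'
        quote(object_name, safe="/"),  # '/' in object names acts as a pseudo-folder
    ])


def swift_download_request(storage_url, token, container, object_name):
    """Build (but do not send) a GET request for an object."""
    return urllib.request.Request(
        swift_object_url(storage_url, container, object_name),
        headers={"X-Auth-Token": token},
        method="GET",
    )


if __name__ == "__main__":
    # All values below are placeholders.
    req = swift_download_request(
        "https://swift.example.com/v1/AUTH_demo", "my-token", "my docs", "reports/2015 q1.pdf"
    )
    print(req.get_method(), req.full_url)
```

Libraries like Apache jclouds wrap exactly this kind of call, so in practice you rarely have to build the URLs yourself.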

Bluemix offers two different versions of the object storage as beta - Object Storage version 1 and Object Storage version 2. I originally assumed that version 2 was a successor of version 1, but the two versions actually address slightly different needs. Version 1 is similar to most other Bluemix services where you get your own account provisioned when adding a service to your Cloud Foundry app. For example, you could even have two accounts within one space. Version 2 provides a flexible usage model to allow usage of the same service not only from Cloud Foundry apps, but also from containers or VMs or even outside of the public Bluemix context. When adding version 2 to an app, the service creates a public cloud tenant and Swift account per Bluemix organization. You can then also access your account via the Cloud Management Dashboard to manage your containers, folders, files, etc.



There are several tutorials available on how to use the object storage services. The tutorial Build a cloud storage application describes how to develop a Node.js application, how to do the authentication, how to list and upload files, etc. The tutorial Fine-grained access control for the Bluemix Object Storage service explains how to use the Single Sign On Service in a Node.js application to ensure that users of your app can only access the files they are authorized to access.

This morning when I was trying to build a Java sample I ran into a deck from my colleague Joseph Chang who has done this already and published his sample. He uses the Java library Apache jclouds and reads the credentials from the Bluemix environment.

I just quickly tried a beta service in IBM Bluemix that was announced earlier this year. The Static Analyzer service helps find potential vulnerabilities in your Java code, like cross-site scripting issues and missing encodings, and displays the results in a report with descriptions and mitigation strategies.

There are different ways to run the tool. I chose the Eclipse plugin.



The reports can be accessed via a dashboard.



Here is a sample of a reported issue.



For a quick demo check out this video.

What are IPython Notebooks and how to use them on Bluemix

By Niklas Heidloff, posted on May 13, 2015

Last week at StrataHadoop Rod Smith and David Fallside demonstrated project Nitro from IBM Emerging Technologies, which allows business users to analyze large amounts and different types of data including real-time data. In the optimal case business users can do this without any help, but sometimes they need to collaborate with data scientists who provide the necessary code that can then be consumed in Nitro.

Because I started to learn about big data and analytics only recently, I still have a lot to learn about these topics, for example about other roles, other technologies and other programming languages. In the big data world there is the role of a Data Scientist. Here is how Wikipedia defines this role:
"Data scientists use the ability to find and interpret rich data sources; manage large amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure consistency of datasets; create visualizations to aid in understanding data; build mathematical models using the data."

Data scientists use a variety of programming languages, including languages that are also familiar to application developers like Java and Scala. As far as I can see, though, the two most widely used languages are Python and R, each of which has pros and cons.

Python code can be written, documented and run easily in IPython Notebooks, which are essentially web-based IDEs for data scientists and very popular these days. Many universities are using them and the number of notebooks on GitHub is growing very fast. As examples, check out how to use Python to see how the Times writes about men and women and how to find out how clean restaurants are in San Francisco.

To learn more about IPython Notebooks check out these tutorials, A Programmer's Guide to Data Mining or the videos on the IPython website.

There are several ways to run IPython Notebooks on IBM Bluemix. My colleague Jean Francois Puget blogged recently about how to deploy notebooks to Bluemix via Docker within minutes. I've followed his instructions and it was really easy. So if you want to learn more about this technology just set it up on Bluemix and give it a try.


Hi, my name is Niklas Heidloff. I work for IBM as an IBM Bluemix Developer Advocate. The blog contains information about IBM Bluemix and articles about my previous work in IBM Collaboration Solutions, esp. IBM Connections and XPages.

@nheidloff

Disclaimer

The postings on this site are my own and don't necessarily represent my employer IBM's positions, strategies or opinions.