Tuesday, August 31, 2010

Open Source R Language May Supercharge BI

Analyzing very large data sets, like those generated by major direct commerce merchants (sometimes known as "Big Data"), has traditionally posed significant programming challenges. One widely used tool for this work is the R programming language, developed in the early 1990s by Ross Ihaka and Robert Gentleman (known as "R & R"), two academics at the Department of Statistics at the University of Auckland, New Zealand. R has since become the language of choice for large-scale statistical analysis among many students, scientists, programmers and data managers.

"R is a GNU project similar to the S statistical programming language and environment, which is often the vehicle of choice for analytic statistics," notes Herman Mehling in "Open Source R Language Could Revolutionize Business Intelligence" on eCRMGuide.com. "R provides an open source route to S and adds some unique capabilities. One of R's greatest strengths is the ease with which it can create well-designed publication-quality plots with mathematical symbols and formulae."

R has been brought to the business world by vendors such as SAS, Netezza, Revolution Analytics and IBM (which acquired SPSS). The most active of these has been Revolution Analytics, led by Norman Nie, who co-invented the Statistical Package for the Social Sciences (SPSS), one of the earliest analytic and predictive statistical software packages.

Notes Mehling: "Earlier this month, Revolution Analytics introduced 'Big Data' analysis to its Revolution R Enterprise software, taking R to what it claims are unprecedented levels of capacity and performance for analyzing very large data sets. The company says R users will be able to process, visualize and model terabyte-class data sets in a fraction of the time of legacy products, without the need for expensive or specialized hardware.

"This Big Data scalability will help R transition from a research and prototyping tool to a production-ready platform for enterprise applications such as quantitative finance and risk management, social media, bioinformatics and telecommunications data analysis, said Nie."

A new version of Revolution R Enterprise introduces an add-on package called RevoScaleR that provides a new framework for fast and efficient multi-core processing of large data sets, a capability that Revolution Analytics claims sets it apart from other vendors.
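As a rough illustration of the general idea behind processing a data set in pieces, the sketch below splits the data into chunks, summarizes each chunk independently, and combines the summaries at the end. It is not RevoScaleR's API, which this post does not describe. It is written in Python purely for illustration, and every name in it (`chunks`, `partial_stats`, `chunked_mean_variance`) is hypothetical. It uses a thread pool to stay simple; spreading the work across cores would need a process pool or similar.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(values, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(values)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def partial_stats(block):
    """Summarize one chunk as (count, sum, sum of squares)."""
    return len(block), sum(block), sum(x * x for x in block)


def chunked_mean_variance(values, chunk_size=1000, workers=4):
    """Mean and population variance built from per-chunk summaries.

    Simplifications (assumptions of this sketch): pool.map submits every
    chunk up front, so this does not bound memory use, and the
    sum-of-squares formula is not the most numerically stable choice.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_stats, chunks(values, chunk_size)))
    n = sum(p[0] for p in parts)
    if n == 0:
        raise ValueError("no data")
    total = sum(p[1] for p in parts)
    total_sq = sum(p[2] for p in parts)
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, variance
```

Because each chunk reduces to three numbers, the final combination step costs almost nothing regardless of how large the original data set is. That property is what makes this style of computation easy to spread across workers.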

Revolution R Enterprise works with Hadoop, NoSQL databases, relational databases and data warehouses.

Revolution Analytics also offers inside-R.org, a site for the open source R statistics community.

Netezza TwinFin DWH Appliance
Mehling reports that "Netezza, a maker of data warehouse, analytic and monitoring appliances, recently announced its TwinFin data warehousing appliance, which integrates with R. The new appliance includes Netezza's i-Class analytics capabilities and a new release of Netezza Performance Software.

"The vendor says its i-Class technology provides extensions for the development and execution of advanced analytics, including support for Java, C/C++, Fortran, Python, MapReduce, Hadoop, SAS and R. The i-Class technology eliminates the need to move data into specialized systems for advanced analytics, accelerating application performance and simplifying their deployment.

"Recently, Kelley Blue Book selected the Netezza warehouse appliance to gain deeper insights into advertising performance and site traffic, and to increase customer satisfaction and advertising revenue. kbb.com, with more than 18 million monthly visits, dramatically reduced the amount of time it took to process data for a variety of purposes, said Dan Ingle, vice president of analytic insights at Kelley Blue Book," reports Mehling.

