Big Data

We have gathered a set of resources found around the internet. This section is based on the Big Data Wiki from DataAnalysis.


  1. Data Sources
    1. Global
    2. Local
      1. Australia
      2. United States
      3. United Kingdom
      4. Europe
  2. Educational Resources
    1. Videos
    2. Online Tutorials
  3. Technologies
    1. Free Software
    2. Proprietary Software
  4. Useful Links
    1. Blogs
    2. News
    3. Articles and Books
    4. Others


Data Sources



United States

United Kingdom


Educational Resources


Online Tutorials


Free Software

  • R – top-notch free statistical software (once it is installed, you might like to add a graphical user interface called R Commander)
  • RStudio – an integrated development environment for R
  • Hadoop – Apache’s software for handling large quantities of data
  • Hortonworks HDP – Hadoop distribution for Windows
  • RHadoop – a way of interfacing Hadoop (ability to handle great volumes of data) and R (ability to apply statistical analysis)
  • Apache – a collection of open source projects including Hadoop, Cassandra and Accumulo (a NoSQL solution)
  • IBM DB2 Express-C – a free relational database from IBM that allows unlimited data sizes
  • Toad for DB2 Freeware – an excellent way to analyse data in a DB2 database
  • Toad for Data Analysts/DataPoint Freeware – ODBC only, but an excellent way to analyse data across multiple databases from different vendors. The non-freeware version connects natively to Oracle, DB2 and SQL Server (which makes it faster)
  • Karmasphere Studio Professional plug-in for Eclipse – a big data development environment that connects Eclipse to Hadoop
  • Eclipse – a Java-based development environment
  • Talend – a suite of useful big data management tools from Talend
  • RapidMiner and RapidAnalytics – free Data Mining Tools from Rapid-I
  • Weka – our NZ friends' excellent machine learning package
  • Google Chart Tools – free data visualisation tools
  • Tableau Public – an easy charting/data visualisation tool for the web
  • D3 – Data-Driven Documents – a JavaScript library which can be used to visualise data
  • Wax – map visualisation JavaScript library (you’ll also need an API like ModestMaps and a map server like MapBox)

Not Free Software

  • SAS – (almost) everything a Big Data/Data Analytics/Data Mining professional could imagine they’d need
  • SPSS – Statistical Package for the Social Sciences (yes, I’m old enough to remember that)
  • Oracle Advanced Analytics – Oracle Data Miner + Oracle’s integration with R (I love it!) – needs 11g
  • ACL – audit- and fraud-focused analytics software – not everything is about customers!

Useful Links Wiki

Thought Leaders

  • Kurt Thearling – Data Mining specialist
  • Thomas Davenport – Fact-based Decision Making
  • Ed Tufte – Visual Display of Information
  • Daniel Kahneman – cognitive psychologist who (with Tversky and others) showed that economics could not rely on the assumption of fully rational humans

Big Data News

Academic Articles and Books


  • KDnuggets – a data mining site that’s been around since at least the late 90s.
  • Ayasdi – a leading academic-based big data company focussed on combining cluster and tree type algorithms with intuitive, human-readable visual displays. It should be interesting to see if they can find some truly valuable and generally applicable uses for this approach
  • IBM Big Data Hub – a web site dedicated to informing readers about Big Data and PureData for Hadoop in particular

