Friday, November 29, 2013

How I learned D3


Everything genius is simple, and D3.js proves it. I know of no more efficient way to build an unconventional plot than this tool. It has a steep learning curve, but you gain a lot once you understand the approach.

The best way to learn D3 is to read the tutorials several times and then try to replicate as many examples as possible. You can try my tutorial as well; it doesn't cover everything, but it is good material for sure.

And here is a checklist to keep in mind before you start doing something really serious:
  • margin conventions
  • data enter/exit
  • JavaScript callback functions
  • transitions
  • D3 helpers

Check out the latest talk by Mike Bostock, the father of D3.


Eyeo 2013 - Mike Bostock from Eyeo Festival on Vimeo.

That's it. Good luck and have fun!

Wednesday, November 27, 2013

Redis for Data Analysis

Attractor does a lot of realtime calculations. In this post I would like to share my insights about working with streaming data using Redis. There are many challenging issues related to realtime data analysis. Among them are the following:
  • How to store data?
  • Which data structures should be used?
  • How to keep data persistent?
  • How to make queries fast? 
The list is not exhaustive. However, it shows what you should keep in mind while building your own data warehouse with realtime data processing.

You should definitely look through these presentations:





High-Volume Data Collection and Real Time Analytics Using Redis from cacois


My personal insights:

You should not use Redis like HDFS or other "big data" storage. Raw data can be stored in Redis, but not for a long time. Transfer raw data to other data storage solutions regularly.
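As a rough illustration of moving raw data out regularly, here is a minimal sketch of a batch archiver. It assumes raw events are appended to a Redis list with RPUSH, that only one archiver runs at a time, and that `r` is a redis-py style client (for example `redis.Redis()`). The function name, the key name and the JSON-lines output file are all hypothetical; a real setup would more likely write to whatever long-term storage you already use.

```python
import json


def archive_raw_events(r, key: str, out_path: str, batch_size: int = 1000) -> int:
    """Move up to batch_size raw events from a Redis list into a JSON-lines file.

    `r` is assumed to be a redis-py style client exposing lrange() and ltrim().
    Returns the number of events archived.
    """
    # Read the oldest events from the head of the list.
    items = r.lrange(key, 0, batch_size - 1)
    if not items:
        return 0

    with open(out_path, "a", encoding="utf-8") as f:
        for item in items:
            # redis-py returns bytes unless decode_responses=True is set.
            if isinstance(item, bytes):
                item = item.decode("utf-8")
            f.write(json.dumps({"raw": item}) + "\n")

    # Drop the archived head only after the file write succeeded.
    # Producers append at the tail, so new events are not affected.
    r.ltrim(key, len(items), -1)
    return len(items)
```

Calling something like `archive_raw_events(r, "raw:events", "raw.jsonl")` from a periodic job keeps the list short. If the process dies between the write and the trim, some events get archived twice, so the downstream storage should tolerate duplicates.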

Use all data structures efficiently. You should clearly understand when it is appropriate to use sorted sets, hashes or bitmaps. If you do everything right, Redis is able to answer queries over billions of records within milliseconds.
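To make the choice between structures more concrete, here is a small sketch that records one page view in three ways: a sorted set for time-range queries, a hash for per-page counters, and a bitmap for counting active users. It is only an illustration, not how Attractor does it. `r` is assumed to be a redis-py style client, and every function and key name is made up for the example.

```python
from datetime import datetime, timezone


def record_visit(r, user_id: int, page: str, ts: float) -> None:
    """Record one page view using three Redis structures.

    `r` is assumed to be a redis-py style client (e.g. redis.Redis()).
    All key names below are hypothetical.
    """
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    # Sorted set: members scored by timestamp, good for time-range queries.
    r.zadd(f"visits:{page}", {f"{user_id}:{ts}": ts})
    # Hash: one counter field per page, one hash per day.
    r.hincrby(f"pageviews:{day}", page, 1)
    # Bitmap: one bit per numeric user id marks "active on this day".
    r.setbit(f"active:{day}", user_id, 1)


def daily_active_users(r, day: str) -> int:
    """Count the users whose bit is set for the given day."""
    return r.bitcount(f"active:{day}")


def visits_between(r, page: str, start_ts: float, end_ts: float):
    """Return visit members for a page within a timestamp range."""
    return r.zrangebyscore(f"visits:{page}", start_ts, end_ts)
```

The bitmap works well only when user ids are reasonably dense integers, since the bitmap grows to the highest offset that has been set. For sparse or string ids a plain set is probably the safer choice.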

Don't forget about data persistence. Enable AOF with an fsync every second and keep db dumps in a safe place.
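For reference, the persistence advice roughly corresponds to a redis.conf fragment like the one below. The directives are standard Redis settings, but the snapshot thresholds are only example values and should be tuned to your own write volume.

```
# Append-only file: log every write, fsync once per second.
appendonly yes
appendfsync everysec

# RDB snapshots (the "db dumps"): example thresholds only.
# Snapshot after 900 s if at least 1 key changed,
# or after 300 s if at least 10 keys changed.
save 900 1
save 300 10
dbfilename dump.rdb
```

With `appendfsync everysec` a crash can lose about one second of writes. Copying the resulting dump file to another machine on a schedule covers the "safe place" part.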