Thursday, 11 September 2008

AHM08: Cloud data mining

Robert Grossman: The Emergence of the Data Centre as a scientific instrument

Differences between Google and e-science

.. scale to a datacentre . Google, e-science, health
.. scale across datacentres . e-science only
.. support large data flows . e-science only
.. user and file security . Google, health

For Sector

This implies that transport and routing services are needed in addition to Google's stack,
so UDT ('UDP-based Data Transfer') was developed.
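
As a rough illustration of why a transport layer has to be built on top of UDP, here is a minimal sketch of reliable delivery over an unreliable datagram channel. It is a generic, in-memory simulation using sequence numbers and receiver loss reports; it is not UDT's actual protocol or API, and the names `lossy_send` and `transfer` are made up for this example.

```python
import random


def lossy_send(packets, loss_rate, rng):
    """Simulate an unreliable datagram channel: each packet may be dropped."""
    return [p for p in packets if rng.random() >= loss_rate]


def transfer(chunks, loss_rate=0.3, seed=0):
    """Deliver every chunk over the lossy channel.

    Each chunk carries a sequence number. After every round the receiver
    reports which sequence numbers are still missing and only those are
    resent. Returns (reassembled bytes, number of rounds used).
    loss_rate must be below 1.0, otherwise nothing ever arrives.
    """
    rng = random.Random(seed)
    received = {}                       # sequence number -> payload
    pending = list(range(len(chunks)))  # sequence numbers still to deliver
    rounds = 0
    while pending:
        rounds += 1
        outgoing = [(seq, chunks[seq]) for seq in pending]
        for seq, payload in lossy_send(outgoing, loss_rate, rng):
            received[seq] = payload
        # Receiver-side loss report: everything not yet seen must be resent.
        pending = [seq for seq in range(len(chunks)) if seq not in received]
    data = b"".join(received[seq] for seq in range(len(chunks)))
    return data, rounds
```

Calling `transfer(chunks, loss_rate=0.3)` returns the original bytes in order, however many packets were dropped along the way; only the number of rounds changes.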

User-defined functions (UDFs), map/reduce style, are applied across this stack
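
To make the UDF idea concrete, here is a small sketch of applying a user-defined function to each data segment in parallel and then merging the partial results. It assumes nothing about Sphere's real API: `apply_udf`, `word_count` and `merge_counts` are hypothetical names, and a local thread pool stands in for the compute nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def apply_udf(udf, segments, workers=4):
    """Run the user-defined function on every segment, keeping segment order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(udf, segments))


def word_count(segment):
    """Example UDF (the 'map' step): count the words in one segment."""
    return Counter(segment.split())


def merge_counts(partials):
    """The 'reduce' step: combine the per-segment counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical data segments, as if the file were split across storage nodes.
segments = ["cloud data mining", "data cloud compute cloud", "data"]
totals = merge_counts(apply_udf(word_count, segments))
```

The point of the design is that the caller only supplies the function; where each segment lives and where the function runs is left to the layer underneath.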

Sector/Sphere is fast, easy to program and customisable: 2-3x, 4-6x faster than Hadoop

Sphere is the compute cloud; Sector is the data cloud

Sector's security is based on SSL, and it also provides the audit tracking that is needed.
