big data aws cloud hadoop emr apache amazon flume hive kafka spark sql architecture security performance zeppelin streaming big data demystified kinesis spark azure machine learning amazon web services automation deep learning etl sql messaging high availability nosql qa dynamo datacenter athena aws cloud kpi redis event stream qa testing sqa bigdata document store iot disaster recovery gcp website activity drilling memcache key value talend mongodb columnar store devops apache yarn hadoop 2.0 devops ansible continuous deployment testing introduction mesos os jenkins containers microsoft windows ha active passive zookeeper scale dbms innovation metodology creativity product startup change management execution lean startup management migration walla success story lesson learned demystified meetup alexa operations aerospike health care ciso anlytics strategy logz.io analytics web analytics clicktale big data architecture sparksql fake news deep fake data science python knn logisitic regression naive bayes neural network couchbase statistics tips hdfs graph store opensource imapla firehose batch processing in flight analyticts open source sqs in transit s3 at rest batch oozie cost piq data pipeline best practices impala livy tuning spectrum thrift ganglia sparkr vpc account segregation compliance encryption obfuscation
See more