BIG DATA
FIRST INTERNSHIP
Sep 2013 – Jan 2014
Organized by Contemi Vietnam
Outcomes
• Contemi
• Big Data presence
• Big Data know-how
• Big Data experience
• Interns
• Linux
• R / Python language
• Machine Learning practice
• Process
• Scrum
• Cross Industry Standard Process
for Data Mining (CRISP-DM)
• Kaggle profile
• Hadoop
Preparation
• Platform: Ubuntu 12.04 LTS
• Process:
• Scrum
• Cross Industry Standard Process for Data Mining (CRISP-DM)
• Weekly blog
• http://contanalytics.wordpress.com/
Headstart for Dung
• 16/09 – 30/09
• Learn R / Python
• Try Digit Recognizer competition on Kaggle.com
• Take the Introduction to Recommender Systems and Web Intelligence and Big
Data courses on Coursera.com
3-month plan
• 1/10 – 31/10
• Go through the typical Machine Learning algorithms; implement, demo, and present them to Contemi
• 1/11 – 15/11
• Compete in the AMS 2013-2014 Solar Energy Prediction Contest
• URL: http://www.kaggle.com/c/ams-2014-solar-energy-prediction-contest
• 16/11 – 22/11
• Compete in the Accelerometer Biometric Competition
• URL: http://www.kaggle.com/c/accelerometer-biometric-competition
• 23/11 – 31/12 (end of internship)
• Deploy Hadoop
• Learn Java
• Run word-counting and sorting experiments on large data (> 1GB)
• Compete in Facebook Recruiting III – Keyword Extraction (personally)
• Re-optimize the built model using Hadoop
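The word-counting experiment planned for the Hadoop phase can be sketched in plain Python as a map, shuffle, and reduce pipeline. This is only an illustration of the idea, assuming whitespace-separated, lower-cased words. It runs in a single process and does not use any Hadoop API, and the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical rather than taken from the internship code.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for each whitespace-separated token."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group the emitted values by word (done in memory here)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_counts(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: sum the grouped values to get one total per word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain map, shuffle, and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    # Hypothetical sample input; a real experiment would stream a large file.
    sample = ["big data big plans", "data beats opinions"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the framework would distribute the map and reduce steps across machines and perform the grouping itself. The in-memory dictionary used here would not scale to the > 1GB inputs mentioned in the plan.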
Plan for the next internships
• App using Singapore open datasets
• Stock prediction app for Vietnam market
• Visualization
• GitHub
• R-Bloggers

Big data internship plan at Contemi Vietnam
