You don’t need more projects. You need better ones.
For many aspiring and early-career data scientists, the portfolio is a source of significant anxiety. "What projects should I have?" is a question I hear frequently from those looking to break into or advance in the field, including many who inquire about the Data Veterans Career Accelerator.
It's a valid concern. Your portfolio is your primary asset for showcasing your skills and capabilities. However, the type of project you should focus on depends heavily on your current experience level. Seasoned professionals with a proven track record of impact may not need to sweat the details of their GitHub repositories, but for junior candidates a well-curated portfolio is paramount.
But what does a "good" portfolio project look like? The answer isn't a single, one-size-fits-all solution. Instead, think of your portfolio as a reflection of your journey as a data scientist, progressing through distinct levels of complexity and demonstrating a growing ability to deliver value.
Level 0: The Foundational Tutorials
If you're just starting and lack formal work experience in data science, tutorials are your first step. These are the projects you likely won't be sharing widely, so don't obsess over perfect syntax, extensive comments, or polished markdown files. The priority here is execution and learning. Completing a variety of tutorials ensures you move beyond theoretical knowledge and can genuinely claim hands-on practice with the fundamental tools and techniques of the trade.
Level 1: Consolidating Your Learning
Once you have the basics down, it's time to solidify that knowledge. At this stage, the specific topic of your project is less important than its role in building your confidence and competence. The goal is to complete several of these projects so that you become proficient in data science methodologies, in coding generally, or in specific, in-demand tools.
Find a subject that genuinely interests you – whether it's analysing sports statistics, exploring a niche hobby, or diving into a public dataset that has always captured your curiosity. Your personal investment will fuel your motivation to see the project through to completion. Once finished, you can confidently list the technologies and skills you utilised on your CV. This will not only help you pass the initial screening by Applicant Tracking Systems (ATS) but can often be enough to secure you that crucial first interview.
Level 2: From Skills to Solutions
This is where you begin to differentiate yourself. Level 2 projects demonstrate that you can do more than just execute technical tasks; you can solve real-world problems. It’s time to be more discerning with your dataset choices. Move away from the classic "Titanic," "Housing Prices," and "Diabetes" datasets that are staples of tutorials and bootcamps. Instead, seek out unique datasets that allow you to identify and tackle a meaningful problem. This shows initiative and a deeper understanding of how data science is applied in practice.
Level 3: Delivering Targeted Impact
The pinnacle of portfolio projects involves aligning your work with the specific needs of your target employer. This level builds upon the problem-solving approach of Level 2 but with a strategic focus. Research the companies or industries you want to work in. What challenges are they facing? What kind of problems are their data science teams trying to solve?
Tailor a project that addresses a similar type of problem. This demonstrates not only your technical acumen but also your business sense and proactive approach. You're no longer just a candidate who can "do data science". You are a potential team member who is already thinking about how to deliver the impact they are looking for.
The Magnifying Factor: Real-World Interaction
Beyond the code and the analysis, the ability to work with people is a critical, and often overlooked, skill. If you can, find opportunities to work on projects that involve real stakeholders. This could be through volunteering for a charity, collaborating with a startup incubator, or even taking on a small freelance project.
This experience is invaluable. It demonstrates your ability to manage changing requirements, communicate your progress effectively, and keep stakeholders engaged throughout the project lifecycle. Even better is a project where you had to make tough decisions, as this showcases your ability to navigate the complexities and trade-offs inherent in any real-world project.
In conclusion, your data science portfolio should be more than a random assortment of projects. It should tell a story of your growth, from mastering the fundamentals to solving complex problems and, ultimately, delivering targeted value. By thinking about your projects in these progressive levels, you can build a portfolio that not only showcases your skills but also strategically positions you for the roles you want.