From the course: Executive Guide to Predictive Modeling Strategy at Scale

The nine big data bottlenecks

- [Instructor] I'm going to identify nine steps in the building of supervised machine learning models where you want to pause and ask yourself, what volume of data am I processing at this step?

Store. First, you have to store the data and steward it so that the organization can access it as needed. Typically, when organizations come up with a big data strategy, they are focused almost entirely on the data storage aspect, and assume that all other phases are done on the entirety of the data. It's simply not true. The volume of data will change dramatically from step to step in the process.

Assess. The data scientist who is in charge of modeling has to have access to all of the data so that they can assess it. They can't be exactly certain of what they'll need until they take a good look. While the assessment is pretty basic, they might have to look at several years' worth of data. The most important thing that…
