The Internet is a vast space where huge quantities and varieties of data are generated regularly and flow freely.
▪ The massive volume of data generated by the Internet's huge number of users is further increased by the multiple devices that most users employ.
▪ In addition to these data-generating sources, non-human data generation sources such as sensor nodes and automated monitoring systems further add to the data load on the Internet.
▪ This huge data volume is composed of a variety of data such as e-mails, text documents (Word docs, PDFs, and others), social media posts, videos, audio files, and images, as shown in Figure 6.1.
▪ However, these data can be broadly grouped into two types based on how they can be accessed and stored: structured data and unstructured data.
▪ Structured data are typically text data that have a pre-defined structure.
▪ Structured data are associated with Relational Database Management Systems (RDBMS).
▪ These are primarily created by using length-limited data fields such as phone numbers, social security numbers, and other such information.
▪ Whether the data are human-generated or machine-generated, they are easily searchable by querying algorithms as well as by human-generated queries.
▪ Common usage of this type of data is associated with flight or train reservation systems, banking systems, inventory controls, and other similar systems.
▪ Established languages such as Structured Query Language (SQL) are used for accessing these data in RDBMS.
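The look-up of structured data described above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table name `reservations`, its columns, the sample rows, and the helper functions `build_demo_db` and `find_reservations` are all hypothetical and chosen only for illustration; a real reservation system would use a far richer schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical reservations table."""
    conn = sqlite3.connect(":memory:")
    # Length-limited field: the CHECK constraint rejects phone values
    # that are not exactly 10 characters long.
    conn.execute(
        "CREATE TABLE reservations ("
        " phone TEXT NOT NULL CHECK (length(phone) = 10),"
        " flight_no TEXT NOT NULL,"
        " seat TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO reservations VALUES (?, ?, ?)",
        [
            ("5550100001", "AI101", "12A"),
            ("5550100002", "AI101", "12B"),
            ("5550100001", "AI202", "3C"),
        ],
    )
    return conn


def find_reservations(conn, phone):
    """Return (flight_no, seat) rows for one phone number, ordered by flight."""
    cursor = conn.execute(
        "SELECT flight_no, seat FROM reservations"
        " WHERE phone = ? ORDER BY flight_no",
        (phone,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_reservations(demo, "5550100001"))
```

Because every row follows the same pre-defined structure, a single declarative query is enough to locate matching records; no per-record parsing is required.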
▪ In the context of IoT, structured data holds a minor share of the total generated data over the Internet.
▪ All the data on the Internet that are not structured are categorized as unstructured.
▪ These data types have no pre-defined structure and can vary according to applications and data-generating sources.
▪ Some of the common examples of human-generated unstructured data include text, e-mails, videos, images, phone recordings, chats, and others.
▪ Some common examples of machine-generated unstructured data include sensor data from traffic, buildings, industries, satellite imagery, surveillance videos, and others.
▪ As these examples suggest, this data type has no fixed format associated with it, which makes it very difficult for querying algorithms to perform a look-up.
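To see why look-ups on unstructured data are harder, consider the following minimal sketch. The sample strings, the regular expression `TEMP_PATTERN`, and the helper `extract_temperatures` are hypothetical and serve only as an illustration; the pattern is a heuristic, not a general method for querying unstructured data.

```python
import re

# Hypothetical unstructured items: an e-mail fragment, a chat message,
# and a raw log line. None of them shares a common schema.
UNSTRUCTURED_ITEMS = [
    "Subject: site visit\nThe boiler room felt hot, maybe 41 C near the door.",
    "chat> anyone else see temp=38.5C on node 7?",
    "2021-03-04T10:15:00 node7 humidity 61%",
]

# Heuristic: a number, optional whitespace, then a standalone capital C.
TEMP_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*C\b")


def extract_temperatures(items):
    """Scan every item in full and collect values that look like '<number> C'."""
    found = []
    for text in items:
        for match in TEMP_PATTERN.finditer(text):
            found.append(float(match.group(1)))
    return found


if __name__ == "__main__":
    print(extract_temperatures(UNSTRUCTURED_ITEMS))
```

With no fixed fields to query, the code has to read each item in its entirety and guess at what a temperature reading looks like. A differently phrased message would be missed, which is the look-up difficulty noted above.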
▪ NoSQL (not only SQL) data stores, together with their own query interfaces, are generally used for this data type.
▪ The vast amount and variety of data flowing through the Internet necessitate intelligent and resourceful processing techniques.
▪ This necessity has become even more crucial with the rapid advancements in IoT, which is placing enormous pressure on the existing network infrastructure globally.
▪ Given this urgency, it is important to decide when to process and what to process.
▪ Before deciding upon the processing to pursue, we fir
▪ The identification and intelligent selection of the processing requirements of an IoT application is one of the crucial steps in deciding the architecture of the deployment.
▪ A properly designed IoT architecture would result in massive savings in netw