TensorFlow Basics
What is TensorFlow?
TensorFlow is an open-source machine learning framework developed by Google Brain. It allows developers and researchers to build and deploy machine learning (ML) and deep learning models efficiently. TensorFlow offers a variety of tools for designing, training, and deploying models across multiple platforms (cloud, edge devices, web, and mobile).
Why is TensorFlow Used?
TensorFlow provides:
Ease of Model Building: Supports high-level APIs like Keras, making it easy to create and experiment with neural networks.
Scalability: Models can be deployed on various devices, including CPUs, GPUs, and TPUs.
Flexibility: Works with multiple languages (Python, JavaScript, C++) and supports distributed training.
Visualization: TensorBoard helps visualize metrics like loss, accuracy, and gradients during model training.
Production Deployment: It enables cross-platform deployments—from edge devices (e.g., mobile apps) to large cloud environments.
Where is TensorFlow Used in Deep Learning?
TensorFlow is heavily utilized in deep learning for creating and training various models. Some examples include:
Image Classification (CNNs - Convolutional Neural Networks)
Natural Language Processing (NLP tasks like sentiment analysis, machine translation)
Time Series Forecasting (using LSTM, RNN models)
Generative Models (like GANs and VAEs)
Reinforcement Learning (for tasks such as game AI)
What is a Tensor?
A tensor is a multi-dimensional array used to represent data in TensorFlow. It's the fundamental data structure of TensorFlow, similar to arrays or matrices in other programming languages, but with added capabilities for higher dimensions. Tensors can represent scalars, vectors, matrices, or n-dimensional arrays.
Examples of Tensors
Scalar (0-D Tensor): A single number, like the value 3.0.
Vector (1-D Tensor): A 1D array, like [1.0, 2.0, 3.0].
Matrix (2-D Tensor): A 2D grid of numbers, like [[1.0, 2.0], [3.0, 4.0]].
Higher-Dimensional Tensor (3-D, 4-D, etc.): Used for complex data like images or video sequences.
Tensor Dimensions and Rank
Rank: Number of dimensions in a tensor.
Shape: The size of each dimension.
Tensor Properties
Immutable: Tensors are immutable by default, meaning once created, they cannot be modified.
Data Type: TensorFlow supports data types such as float32, int32, bool, etc.
Device Independence: Tensors can run on different hardware devices (like CPU or GPU).
What is a Constant in TensorFlow?
In TensorFlow, a constant is a tensor whose value is fixed and does not change during execution. It is created using the tf.constant() function and is useful when you need tensors with predefined values that won't be modified throughout the computation.
Key Characteristics of TensorFlow Constants:
Immutable: Once defined, their values cannot be changed.
Predefined Values: Suitable for inputs that do not require updates.
Used in Models: Often used for things like initial weights, biases, or hyperparameters.
Syntax: tf.constant(value, dtype=None, shape=None)
value: The initial value of the constant.
dtype: (Optional) Data type of the tensor (e.g., tf.float32, tf.int32).
shape: (Optional) Specifies the shape if the value can be broadcast to it.
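A few ways to create constants, as a short sketch (values chosen for illustration):

```python
import tensorflow as tf

# Scalar constant with an explicit dtype
a = tf.constant(3.0, dtype=tf.float32)

# 1-D constant; dtype is inferred as float32
v = tf.constant([1.0, 2.0, 3.0])

# A single value broadcast to a 2x3 shape
m = tf.constant(5, shape=(2, 3))

print(a)        # tf.Tensor(3.0, shape=(), dtype=float32)
print(v.shape)  # (3,)
print(m)        # 2x3 tensor filled with 5
```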
When to Use Constants?
When the data remains unchanged throughout the computation.
For inputs such as weights or biases that you don’t want to modify.
As hyperparameters, like learning rates or fixed values, used in model design.
Install TensorFlow
Make sure you have Python installed, then run the command below:
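```
pip install tensorflow
```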
TensorFlow Data Structure
0D Tensor (Scalar)
A 0D tensor is a single value or a scalar (no dimensions).
1D Tensor (Vector)
A 1D tensor is a sequence of numbers, similar to an array or a list.
2D Tensor (Matrix)
A 2D tensor is like a table or a matrix with rows and columns.
3D Tensor (Cube or Volume)
A 3D tensor represents data with three dimensions, like a stack of matrices or multiple 2D grids (for example, RGB images).
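A sketch creating one tensor of each rank with tf.constant (values are illustrative):

```python
import tensorflow as tf

scalar = tf.constant(5)                     # 0D tensor, shape ()
vector = tf.constant([1, 2, 3])             # 1D tensor, shape (3,)
matrix = tf.constant([[1, 2], [3, 4]])      # 2D tensor, shape (2, 2)
cube = tf.constant([[[1, 2], [3, 4]],
                    [[5, 6], [7, 8]]])      # 3D tensor, shape (2, 2, 2)

print(scalar.shape, vector.shape, matrix.shape, cube.shape)
```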
Explanation of Shapes:
0D Tensor: An empty shape, written as (), since it holds a single value.
1D Tensor: List of values ((n,) where n is the number of elements).
2D Tensor: Matrix with rows and columns ((rows, columns)).
3D Tensor: A stack of matrices or volume ((depth, rows, columns)).
Tensor with Different Data Types
Tensors can store data of different types like integers, floats, or strings.
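For example (illustrative values; the dtype is inferred when not given explicitly):

```python
import tensorflow as tf

int_tensor = tf.constant([1, 2, 3], dtype=tf.int32)
float_tensor = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32)
string_tensor = tf.constant(["hello", "tensorflow"])

print(int_tensor.dtype)     # <dtype: 'int32'>
print(float_tensor.dtype)   # <dtype: 'float32'>
print(string_tensor.dtype)  # <dtype: 'string'>
```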
Basic Tensor Operations
Addition and Subtraction:
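A minimal sketch of both operations using tf.add and tf.subtract (illustrative values):

```python
import tensorflow as tf

a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])

print(tf.add(a, b))       # [4. 6.]
print(tf.subtract(a, b))  # [-2. -2.]
```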
Element-wise Multiplication of Tensors:
In element-wise multiplication, corresponding elements of two tensors are multiplied together.
Explanation:
1×5=5
2×6=12
3×7=21
4×8=32
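A sketch reproducing this example with tf.multiply:

```python
import tensorflow as tf

a = tf.constant([1, 2, 3, 4])
b = tf.constant([5, 6, 7, 8])

# Corresponding elements are multiplied pairwise
print(tf.multiply(a, b))  # [ 5 12 21 32]
```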
Matrix Multiplication of Tensors (Dot Product)
In matrix multiplication, each element of the result is the dot product of a row of the first tensor with the corresponding column of the second.
Explanation:
First row, first column: 1×5+2×7=19
First row, second column: 1×6+2×8=22
Second row, first column: 3×5+4×7=43
Second row, second column: 3×6+4×8=50
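A sketch reproducing this example with tf.matmul:

```python
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

# Row-by-column dot products
print(tf.matmul(a, b))
# [[19 22]
#  [43 50]]
```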
Subtraction of Tensors
In subtraction, corresponding elements are subtracted from each other.
Explanation:
1−5=−4
2−6=−4
3−7=−4
4−8=−4
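A sketch reproducing this example with tf.subtract:

```python
import tensorflow as tf

a = tf.constant([1, 2, 3, 4])
b = tf.constant([5, 6, 7, 8])

# Corresponding elements are subtracted pairwise
print(tf.subtract(a, b))  # [-4 -4 -4 -4]
```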
Shape, Rank, Axis, and Size of Tensor
Shape: Dimensions of the tensor (rows and columns).
Rank: The number of dimensions (e.g., 1D, 2D, 3D).
Axis: Specific dimension in a tensor (like row or column).
Size: Total number of elements.
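A sketch inspecting these properties (tf.reduce_sum is used here just to illustrate the axis argument):

```python
import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6]])

print(t.shape)     # (2, 3)
print(tf.rank(t))  # 2
print(tf.size(t))  # 6
print(tf.reduce_sum(t, axis=0))  # sum down each column: [5 7 9]
print(tf.reduce_sum(t, axis=1))  # sum across each row: [ 6 15]
```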
Tensor Indexing
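Elements are accessed with standard Python-style indices; a minimal sketch (illustrative values):

```python
import tensorflow as tf

t = tf.constant([[10, 20, 30], [40, 50, 60]])

print(t[0])     # first row: [10 20 30]
print(t[0, 1])  # element at row 0, column 1: 20
print(t[:, 2])  # last column: [30 60]
```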
Tensor Reshaping
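tf.reshape rearranges elements into a new shape, as long as the total number of elements stays the same; a minimal sketch:

```python
import tensorflow as tf

t = tf.constant([1, 2, 3, 4, 5, 6])

print(tf.reshape(t, [2, 3]))  # [[1 2 3] [4 5 6]]
print(tf.reshape(t, [3, 2]))  # [[1 2] [3 4] [5 6]]
```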
Tensor Transpose
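tf.transpose swaps a tensor's axes, turning rows into columns; a minimal sketch:

```python
import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

print(tf.transpose(t))  # shape (3, 2): [[1 4] [2 5] [3 6]]
```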
Tensor Broadcasting
Broadcasting lets TensorFlow perform operations on tensors of different shapes.
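A minimal sketch in which a (3,) row vector is broadcast across a (2, 3) matrix:

```python
import tensorflow as tf

matrix = tf.constant([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = tf.constant([10, 20, 30])               # shape (3,)

# The row vector is stretched across both rows of the matrix
print(matrix + row)  # [[11 22 33] [14 25 36]]
```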
Tensor Slicing
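Slicing extracts a sub-tensor using start:stop:step notation; a minimal sketch:

```python
import tensorflow as tf

t = tf.constant([0, 10, 20, 30, 40, 50])

print(t[1:4])  # [10 20 30]
print(t[::2])  # every second element: [ 0 20 40]
```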
Random Number Generation
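A sketch drawing random values with tf.random.normal and tf.random.uniform (shapes and ranges chosen for illustration):

```python
import tensorflow as tf

normal = tf.random.normal(shape=(2, 2), mean=0.0, stddev=1.0)
uniform = tf.random.uniform(shape=(2, 2), minval=0, maxval=10, dtype=tf.int32)

print(normal)   # values drawn from a standard normal distribution
print(uniform)  # integers in [0, 10)
```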
Ragged Tensors
Ragged tensors have rows of different lengths.
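A minimal sketch using tf.ragged.constant:

```python
import tensorflow as tf

# Each row can have a different number of elements
ragged = tf.ragged.constant([[1, 2, 3], [4], [5, 6]])

print(ragged)                # <tf.RaggedTensor [[1, 2, 3], [4], [5, 6]]>
print(ragged.row_lengths())  # [3 1 2]
```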
Tensor Concatenation
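tf.concat joins tensors along an existing axis; a minimal sketch:

```python
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

print(tf.concat([a, b], axis=0))  # stacked vertically, shape (4, 2)
print(tf.concat([a, b], axis=1))  # side by side, shape (2, 4)
```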
Variables in TensorFlow
Variables are tensors whose values can be changed.
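A minimal sketch using tf.Variable and its assign methods:

```python
import tensorflow as tf

v = tf.Variable(3.0)

v.assign(5.0)      # overwrite the value
v.assign_add(1.0)  # in-place addition
print(v.numpy())   # 6.0
```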
Creating a Simple Linear Model
Let’s build a simple linear regression model to fit y = 2x + 1.
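Below is a sketch of such a model, assuming five training points (X = 1.0 through 5.0); with different training data the exact loss values will differ slightly from the sample output discussed later:

```python
import tensorflow as tf

# Training data for y = 2x + 1 (assumed points, chosen for illustration)
X = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0], dtype=tf.float32)
Y = tf.constant([3.0, 5.0, 7.0, 9.0, 11.0], dtype=tf.float32)

# Trainable weight and bias, both initialized to 0.0
W = tf.Variable(0.0)
b = tf.Variable(0.0)

# Linear model: Y_pred = W * X + b
def linear_model(x):
    return W * x + b

# Mean Squared Error loss
def loss_fn(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

# Stochastic Gradient Descent optimizer
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# Training loop
for epoch in range(100):
    with tf.GradientTape() as tape:
        Y_pred = linear_model(X)
        loss = loss_fn(Y, Y_pred)
    # Gradients of the loss with respect to W and b
    gradients = tape.gradient(loss, [W, b])
    # Update W and b using the computed gradients
    optimizer.apply_gradients(zip(gradients, [W, b]))
    if epoch % 10 == 0:
        print(f"Epoch {epoch}: Loss = {loss.numpy()}")

# Final trained parameters (should approach W ≈ 2.0, b ≈ 1.0)
print(f"Trained W: {W.numpy()}, Trained b: {b.numpy()}")
```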
Code Explanation:
This code builds and trains a simple linear regression model to fit the equation y = 2x + 1.
Input Data
X and Y represent the input and output data, respectively.
Y = 2X + 1
When X = 1.0, Y = 3.0
When X = 2.0, Y = 5.0
When X = 3.0, Y = 7.0, and so on.
The input data (X, Y) is stored as tensors of type float32.
Initialize Weight and Bias
W (weight) and b (bias) are initialized to 0.0. These are trainable variables that the model will learn to adjust during training.
Weight and bias are used in the linear equation: y = W·X + b.
Define the Linear Model
The model computes Y_pred = W·X + b. The goal of training is to find the optimal values for W and b so that the predicted value Y_pred matches the actual value Y.
Define the Loss Function
The loss function measures how far the model’s predictions (Y_pred) are from the actual values (Y_true).
Here, we use Mean Squared Error (MSE): the mean of the squared differences (Y_pred − Y_true)².
The goal is to minimize this loss by adjusting W and b.
Define the Optimizer
Stochastic Gradient Descent (SGD) is used as the optimizer. It adjusts W and b to minimize the loss.
learning_rate = 0.01 controls how large a step the optimizer takes when adjusting the parameters.
Training Loop
GradientTape: Tracks operations to compute the gradients automatically during backpropagation.
Predictions: The linear model generates predictions (Y_pred).
Loss Calculation: The current loss between Y_pred and the actual Y is calculated.
Compute Gradients: Gradients of the loss with respect to W and b are calculated.
Update Weights and Bias: The optimizer updates W and b using the computed gradients.
Print Loss: Every 10 epochs, the loss value is printed to track the model’s performance.
Final Weight and Bias
After 100 epochs, the trained values of W (weight) and b (bias) are printed.
Ideally, the trained values should be W ≈ 2.0 and b ≈ 1.0 (since we are trying to fit the equation y=2x+1).
Output
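A representative excerpt, reconstructed from the loss values discussed below (intermediate epochs omitted; final values approximate):

```
Epoch 0: Loss = 55.0
Epoch 10: Loss = 1.0125
Epoch 20: Loss = 0.03746875
...
Epoch 90: Loss = 0.00032407963
Trained W: ~2.0, Trained b: ~1.0
```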
What Each Part of the Output Means
Epoch 0: Loss = 55.0
Initial Loss: This is the loss after the first iteration (epoch 0) before any significant updates to the parameters W (weight) and b (bias).
High Loss: A large initial loss (55.0) indicates that the model’s predictions (Y_pred) are very far from the actual values (Y) at the start, as the initial values for W and b were both set to 0.0.
Epoch 10: Loss = 1.0125
After 10 epochs, the loss has decreased to 1.0125. This shows that the model is improving by adjusting the parameters (W and b) in the right direction, bringing predictions closer to the actual values.
Gradient Descent is Working: The optimizer (SGD) is successfully minimizing the difference between the predicted and actual values.
Epoch 20: Loss = 0.03746875
As the training progresses, the loss further decreases. By epoch 20, the loss is very small (0.037), indicating that the predictions are becoming more accurate.
Epoch 90: Loss = 0.00032407963
At epoch 90, the loss is very close to 0, meaning the model's predictions are nearly perfect for the given input data. This suggests that the model has almost perfectly learned the relationship y=2x+1.
Explanation of Loss Values Across Epochs
The loss decreases over time because the optimizer (SGD) continuously adjusts the parameters W and b to minimize the difference between predicted and actual values. The key idea is to reduce the error between the predicted outputs (Y_pred) and the actual outputs (Y) with each epoch, resulting in a smaller loss value.
High Initial Loss (Epoch 0):
Since both W and b were initialized to 0.0, the initial predictions are all 0.0. This causes a large error between predicted and actual values, resulting in a high loss.
Loss Decreases Over Time:
As the model learns, the weight and bias are gradually updated, and predictions get closer to the actual outputs.
Near Zero Loss (Epoch 90):
By epoch 90, the loss becomes extremely small, meaning that the model has effectively learned the correct relationship between X and Y.
For more in-depth technical insights and articles, feel free to explore:
Technical Blog: Ebasiq Blog
GitHub Code Repository: Python Tutorials
YouTube Channel: Ebasiq YouTube Channel
Instagram: Ebasiq Instagram