This article was published as a part of the Data Science Blogathon
We're always told that "practice makes perfect," and we're made to solve tons of problems across different domains to prepare us for doomsday, i.e., our final exam. The more variety of problems we solve, the better we get at transferring that knowledge to solve a new problem. What if there were a way to apply the same idea to classification, regression, or clustering problems?
Transfer learning is a technique by which we can use the model weights trained on standard datasets such as ImageNet to improve the efficiency of our given task.
Why transfer learning?
Before we go further into how transfer learning works, let's look at the benefits we gain from transfer learning. The learning process during transfer learning is:
Fast – Normal Convolutional neural networks will take days or even weeks to train, but you can cut short the process with transfer learning.
Accurate – Generally, a transfer learning model performs noticeably better than a comparable custom-made model trained from scratch.
Needs less training data – Having been trained on a large dataset, the model can already detect generic features and needs less training data to fine-tune to the new task.
Transfer Learning on Image Data
To demonstrate transfer learning here, I've chosen a simple binary-classification dataset, which can be found here:
The data consists of two classes, cats and dogs, with 2.5k images of cats and 2.5k images of dogs.
VGG Architecture
There are two models available in VGG: VGG-16 and VGG-19. In this blog, we'll be using VGG-16 to classify our dataset. VGG-16 mainly has three parts: convolution, pooling, and fully connected layers.
Convolution layer- In this layer, filters are applied to extract features from images. The most important parameters are the size of the kernel and stride.
Pooling layer- Its function is to reduce the spatial size to reduce the number of parameters and computation in a network.
Fully connected layer- Every neuron is connected to all activations in the previous layer, as in a simple neural network (a minimal sketch of these three layer types follows below).
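To make these three building blocks concrete, here is a minimal, hypothetical PyTorch sketch (not the actual VGG-16, whose full definition ships with torchvision):

import torch.nn as nn

# one convolution + pooling stage followed by a fully connected classifier head
block = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1),  # convolution layer
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),   # pooling layer halves the spatial size
    nn.Flatten(),
    nn.Linear(64 * 112 * 112, 2),            # fully connected layer (assumes a 224x224 RGB input)
)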
The figure below shows the architecture of the model:
To perform transfer learning, import a pre-trained model using PyTorch, remove the last fully connected layer or add an extra fully connected layer at the end as per your requirement (the original model gives 1000 outputs, and we can customize it to give the required number of outputs), and run the model.
Pre-processing
Preprocessing images before training is an essential step to avoid errors. Preprocessing resizes the images to the same dimensions and transforms every image uniformly. The different transformation tools available in torchvision.transforms are used for this process.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.1, hue=0.1),
    transforms.RandomAffine(degrees=40, translate=None, scale=(1, 2), shear=15, resample=False, fillcolor=0),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

The images are loaded using ImageFolder and wrapped in a data loader. ImageFolder labels each image according to the folder it is in, and the DataLoader divides the data into batches for training. Here, a batch size of 8 is chosen.
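The loading code itself is not shown above; here is a minimal sketch of that step, assuming the images sit in a data/train folder with one subfolder per class (the folder names are an assumption):

import torch
from torchvision import datasets

# ImageFolder labels each image by the subfolder it lives in, e.g. data/train/cats, data/train/dogs
train_data = datasets.ImageFolder('data/train', transform=transform)
trainloader = torch.utils.data.DataLoader(train_data, batch_size=8, shuffle=True)
class_names = train_data.classes   # e.g. ['cats', 'dogs']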
Visualising the dataset
Visualising the dataset before training is good practice. It can be used to make sure the data is loaded properly along with its labels and that the transformations are applied successfully.
For this process, the images are saved in a tensor format in a grid, and labels are extracted from the dictionary.
import torchvision
import numpy as np
import matplotlib.pyplot as plt

def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated

# Get a batch of training data
inputs, classes = next(iter(trainloader))

# Make a grid from the batch
out = torchvision.utils.make_grid(inputs)
imshow(out, title=[class_names[x] for x in classes])

Importing and training the model
The pre-trained model can be imported using PyTorch, and the model can be moved to the GPU, which can reduce the training time.
import torch
import torchvision.models as models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_ft = models.vgg16(pretrained=True)

The dataset is further divided into training and validation sets to avoid overfitting. Some of the parameters used while training this model are as follows:
Criterion – Cross-entropy loss
Optimizer – Stochastic gradient descent, learning rate = 0.001, momentum = 0.9
Exponential learning rate scheduler – This reduces the learning rate by a factor of gamma = 0.1 every 7 epochs.
A linear fully connected layer is added at the end so that the output converges to two predicted labels.
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

# VGG-16 keeps its fully connected layers in model_ft.classifier;
# classifier[6] is the final Linear layer that originally outputs 1000 classes.
num_ftrs = model_ft.classifier[6].in_features

# Here the size of each output sample is set to 2.
# Alternatively, it can be generalized to nn.Linear(num_ftrs, len(class_names)).
model_ft.classifier[6] = nn.Linear(num_ftrs, 2)
model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

These parameters can be chosen according to your own convenience and depending on the dataset.
Initially, we pass the inputs and labels to the model, and we get a predicted value of the label as an output. This predicted value and the actual value of the label are used to compute the cross-entropy loss, which is further used in backpropagation to update the value of weights and biases.
import time
import copy

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            # Note: the original snippet reuses trainloader here; for a real validation
            # phase, swap in the DataLoader that corresponds to each phase.
            for inputs, labels in trainloader:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            if phase == 'train':
                scheduler.step()

            # dataset_sizes is assumed to be a dict with the number of images per phase,
            # e.g. {'train': ..., 'val': ...}
            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=25)

After this step, you've successfully trained the model.
Thanks for reading!
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
Automated Intent Classification Using Deep Learning In Google Sheets
In previous articles, we also learned how to automatically populate Google Sheets in Python.
Wouldn’t it be cool if we could perform our intent classification directly in Google Sheets?
That is exactly what we will do here!
Introducing Google Apps Script
One limitation of the built-in functions in Google Sheets is that they limit you to predefined behavior.
The good news is that you can define custom functions with new behavior if you can code them yourself in Google Apps Script.
Google Apps Script is based on JavaScript and adds additional functionality that helps interact with Sheets, Docs and other Google Apps.
We are going to define a new custom function named fetchPrediction that will take keywords in Google Sheet cells, and run them through a BERT-powered predictive model to get the intention of search users.
Here is our plan of action:
Learn to review and update values in Google Sheets from Apps Script.
Practice fetching results from an API and populate a sheet with the retrieved values.
Train our BERT-powered predictive model using Uber’s Ludwig.
Use Ludwig to power an API we can call from Apps Script.
Learn some new tools and concepts that help us connect both services together.
Let’s get started!
Retrieving Keyword Data From Google Sheets
This is a Google Sheet with some barcode-related keywords we pulled from SEMrush.
In our first exercise, we will read and print the first 10 keywords from column A.
This is a built-in IDE (Integrated Development Environment) for Google Sheets.
We are going to write a simple JavaScript function called logKeywords that will read all the keywords in our sheet and log them to the console.
Please refer to the official documentation here.
function logKeywords() {
  // get a reference to the active sheet (Sheet1 in this example)
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getDataRange().getValues();
  for (var i = 0; i < data.length; i++) {
    console.log('Keyword: ' + data[i][0]);
  }
}

Let's walk over the function, step by step.
We first get a reference to the active sheet, in this case, it is Sheet1.
We didn’t need to authenticate.
It is a good idea to keep this page open in another tab, as you will refer to it often as you code and want to see if the changes worked.
Now, we printed more than 100 rows, which took a bit of time. When you are writing and testing your code, it is better to work with smaller lists.
We can make a simple change in the loop to fix that.
function logKeywords() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getDataRange().getValues();
  // only log the first 10 keywords while testing
  for (var i = 0; i < 10; i++) {
    console.log('Keyword: ' + data[i][0]);
  }
}

When you run this, it not only runs faster, but checking the log is also a lot faster.
Add a Column with Keyword IDs
Next, let's learn to add data to the sheet.
We are going to write a new function named addIDtoKeywords. It creates a column with one numeric ID per keyword.
There isn’t a lot of value in doing this, but it should help you test the technique with something super simple.
Here is the code to do that.
function addIDtoKeywords() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getRange("B1");
  var values = [];
  var length = 100;
  // generate one numeric ID per keyword
  for (var i = 1; i <= length + 1; i++){
    values.push([i]);
  }
  console.log(values.length);
  var column = sheet.getRange("B2:B102");
  column.setValues(values);
}

You should get a new column in the sheet with numbers in increasing order.
We can also add a column header in bold named Keyword ID using the following code.
data.setValue("Keyword ID");
data.setFontWeight("bold");

This is what the updated output looks like.
It is a very similar code. Let’s review the changes.
I added a JavaScript array named values to hold the keyword IDs.
During the loop, I added a line to add each ID generated within the loop to the array.
values.push([i]);

I printed the length of the values array at the end of the loop to make sure the correct number of IDs was generated.
Finally, I need to get the values to the sheet.
var column = sheet.getRange("B2:B102");

This code selects the correct cells to populate, and then I can simply set their value using the list I generated.
column.setValues(values);

It can't get simpler than this!
Fetching API Results From Apps Script
In the next exercise, we will learn to perform API requests from Apps Script.
We are going to adapt code from step 11 which pulls data from a Books API.
Instead of fetching books, we will translate keywords using the Google Translate API.
Now, we are starting to write more useful code!
Here is a new function named fetchTranslation based on code adapted from step 11.
function fetchTranslation(TEXT){
  API_KEY = "INPUT YOUR API KEY";
  // encode the text and build the Google Translate API URL.
  // The exact URL was lost in this copy; this is the standard v2 endpoint,
  // with Spanish ("es") only as an example target language.
  TEXT = encodeURI(TEXT);
  var url = "https://translation.googleapis.com/language/translate/v2?key=" + API_KEY + "&q=" + TEXT + "&target=es";
  var response = UrlFetchApp.fetch(url, {'muteHttpExceptions': true});
  var json = response.getContentText();
  translation = JSON.parse(json);
  return translation["data"]["translations"][0]["translatedText"];
}

This function takes an input text, encodes it, and inserts it into an API URL to call the Google Translate service.
There is an API key we need to get, and we also need to enable the Translate service. I also recommend restricting the API key to the IP you are using during development and testing.
Once we have the API URL to call, it is as simple as calling this code.
var response = UrlFetchApp.fetch(url, {'muteHttpExceptions': true});

The next lines get us the response in JSON format, and after a bit of navigation down the JSON tree, we get the translated text.
As you can see in my code, I like to log almost every step in the code to the console to confirm it is doing what I expect.
Here is one example of how I figured out the correct JSON path sequence.
You can see the progression in the logs here, including the final output.
Translating Keywords
Now that we have tested the function and it works, we can create another function to fetch and translate the keywords from the sheet.
We will build up from what we’ve learned so far.
We will give this function the super original name TranslateKeywords!
function TranslateKeywords() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var header = sheet.getRange("B1");
  header.setValue("Translation");
  header.setFontWeight("bold");
  var keyword = sheet.getRange("A2").getValue();
  console.log(keyword);
  translated_keyword = fetchTranslation(keyword);
  console.log(translated_keyword);
  var data = sheet.getRange("B2");
  data.setValue(translated_keyword);
}

The code in this function is very similar to the one we used to set keyword IDs.
The main difference is that we pass the keyword to our new fetchTranslation function and update a single cell with the result.
Here is what it looks like for our example keyword.
As you can probably see, there is no for loop, so this will only update one single row/keyword. The first one.
Please complete the for loop to get the translation for all keywords as a homework exercise.
Building an Intent Classification Model
Let's move on to building the intent classification service that we will call to populate keyword intents.
In my previous deep learning articles, I’ve covered Ludwig, Uber’s AI toolbox.
I like it a lot because it allows you to build state-of-the-art deep learning models without writing a single line of code.
It is also very convenient to run in Google Colab.
We are going to follow the same steps I described in that article; this will give us a powerful intent prediction model powered by BERT.
Here is a quick summary of the steps you need to paste into Google Colab (make sure to select the GPU runtime!).
Please refer to my article for the context:
%tensorflow_version 1.x

import tensorflow as tf; print(tf.__version__)

!pip install ludwig

# upload Question_Classification_Dataset.csv and 'Question Report_Page 1_Table.csv'
from google.colab import files
files.upload()

import pandas as pd
df = pd.read_csv("Question_Classification_Dataset.csv", index_col=0)

# unzip the pre-trained BERT checkpoint
# (download uncased_L-12_H-768_A-12.zip from the official BERT release first)
!unzip uncased_L-12_H-768_A-12.zip

# create the ludwig configuration file for BERT-powered classification
template = """
input_features:
  - name: Questions
    type: text
    encoder: bert
    config_path: uncased_L-12_H-768_A-12/bert_config.json
    checkpoint_path: uncased_L-12_H-768_A-12/bert_model.ckpt
    preprocessing:
      word_tokenizer: bert
      word_vocab_file: uncased_L-12_H-768_A-12/vocab.txt
      padding_symbol: '[PAD]'
      unknown_symbol: '[UNK]'
output_features:
  - name: Category0
    type: category
  - name: Category2
    type: category
text:
  word_sequence_length_limit: 128
training:
  batch_size: 32
  learning_rate: 0.00002
"""

with open("model_definition.yaml", "w") as f:
    f.write(template)

!pip install bert-tensorflow

!ludwig experiment --data_csv Question_Classification_Dataset.csv --model_definition_file model_definition.yaml

After completing these steps in Google Colab, we should get a high-accuracy predictive model for search intent.
We can verify the predictions with this code.
test_df = pd.read_csv("Question Report_Page 1_Table.csv")

# we rename Query to Questions to match what the model expects
predictions = model.predict(test_df.rename(columns={'Query': 'Questions'}))

test_df.join(predictions)[["Query", "Category2_predictions"]]

We get a data frame like this one.
The intentions predicted are not the ones you typically expect: navigational, transactional, informational, but they are good enough to illustrate the concept.
Please check an awesome article by Kristin Tynski that explains how to expand this concept to get true search intents.
Turning Our Model Into an API Service
Ludwig has one super cool feature that allows you to serve models directly as an API service.
The command for this is ludwig serve.
I was trying to accomplish the same thing following a super complicated path because I didn’t check that something like this already existed. 🤦
It is not installed by default; we need to install it with this command.
!pip install ludwig[serve]

We can check the command-line options with:
!ludwig serve --help

Creating an API from our model is as simple as running this command.
!ludwig serve -m results/experiment_run/model

INFO: Started server process [5604]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Shutting down
INFO: Finished server process [5604]

As we are running this code in a notebook, we need to use a little trick to push this process to the background (a separate thread).
%%bash --bg

The magic command %%bash --bg runs the shell code in a separate thread, returning control to the notebook so we can run code that interacts with the service.
I found this to be a super cool and valuable trick. I’m also introducing more shell tricks that I learned many years ago.
The nohup command prevents the process from getting killed when the parent dies. It is optional here.
We can track the progress of the background process using this command.
!tail debug.log

After you see this message, you can proceed to the next step.
Let’s send a test API request using curl to see if the service works.
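The exact curl command did not survive in this copy, so here is a hedged Python equivalent you could run from the same notebook, assuming Ludwig's default /predict endpoint on port 8000 and the Questions input feature we defined in the configuration:

import requests

# POST the input feature as form data, mirroring what the curl call does
resp = requests.post("http://localhost:8000/predict", data={"Questions": "who is the boss?"})
print(resp.json())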
You should get this response back.
{"Category0_predictions":"HUMAN","Category0_probabilities_":0.00021219381596893072,"Category0_probabilities_ENTITY":7.17515722499229e-05,"Category0_probabilities_HUMAN":0.9988889098167419,"Category0_probabilities_DESCRIPTION":0.000423480843892321,"Category0_probabilities_NUMERIC":2.7793401386588812e-05,"Category0_probabilities_LOCATION":0.0003020864969585091,"Category0_probabilities_ABBREVIATION":7.374086999334395e-05,"Category0_probability":0.9988889098167419,"Category2_predictions":"ind","Category2_probabilities_":8.839580550557002e-05,"Category2_probabilities_ind":0.9759176969528198,"Category2_probabilities_other":0.0013697665417566895,"Category2_probabilities_def":3.929347076336853e-05,"Category2_probabilities_count":4.732362140202895e-05,"Category2_probabilities_desc":0.014149238355457783,"Category2_probabilities_manner":7.225596345961094e-05,"Category2_probabilities_date":7.537546480307356e-05,"Category2_probabilities_cremat":0.00012272763706278056,"Category2_probabilities_reason":0.00042629052768461406,"Category2_probabilities_gr":0.0025540771894156933,"Category2_probabilities_country":0.0002626778441481292,"Category2_probabilities_city":0.0004305317997932434,"Category2_probabilities_animal":0.00024954770924523473,"Category2_probabilities_food":8.139225974446163e-05,"Category2_probabilities_dismed":7.852958515286446e-05,"Category2_probabilities_termeq":0.00023714809503871948,"Category2_probabilities_period":4.197505040792748e-05,"Category2_probabilities_money":3.626687248470262e-05,"Category2_probabilities_exp":5.991378566250205e-05,"Category2_probabilities_state":0.00010361814202342297,"Category2_probabilities_sport":8.741072088014334e-05,"Category2_probabilities_event":0.00013374585250858217,"Category2_probabilities_product":5.6306344049517065e-05,"Category2_probabilities_substance":0.00016623239207547158,"Category2_probabilities_color":1.9601659005274996e-05,"Category2_probabilities_techmeth":4.74867774755694e-05,"Category2_probabilities_dist":9.92789282463491e-05,"Category2_probabilities_perc":3.87108520953916e-05,"Category2_probabilities_veh":0.00011915313370991498,"Category2_probabilities_word":0.00016430433606728911,"Category2_probabilities_title":0.0010781479068100452,"Category2_probabilities_mount":0.00024070330255199224,"Category2_probabilities_body":0.0001515906333224848,"Category2_probabilities_abb":8.521509153069928e-05,"Category2_probabilities_lang":0.00022924368386156857,"Category2_probabilities_plant":4.893113509751856e-05,"Category2_probabilities_volsize":0.0001462997024646029,"Category2_probabilities_symbol":9.98345494735986e-05,"Category2_probabilities_weight":8.899033855414018e-05,"Category2_probabilities_instru":2.636547105794307e-05,"Category2_probabilities_letter":3.7610192521242425e-05,"Category2_probabilities_speed":4.142118996242061e-05,"Category2_probabilities_code":5.926147059653886e-05,"Category2_probabilities_temp":3.687662319862284e-05,"Category2_probabilities_ord":6.72415699227713e-05,"Category2_probabilities_religion":0.00012743560364469886,"Category2_probabilities_currency":5.8569487009663135e-05,"Category2_probability":0.9759176969528198} Exposing Our Service Using NgrokSo, we have a new API that can make intent predictions, but one big problem is that it is only accessible from within our Colab notebook.
Let me introduce another cool service that I use often, Ngrok.
Ngrok helps you create publicly accessible URLs that connect to a local service like the one we just created.
I do not recommend doing this for production use, but it is very handy during development and testing.
You don’t need to create an account, but I personally do it because I get to set up a custom subdomain that I use very frequently.
Here are the steps to give our API a public URL to call from Apps Script.
We first download and uncompress ngrok.
%%bash --bg

The code above tells ngrok to connect to the local service on port 8000. That is all we need to do.
You can confirm it works by repeating the curl call, but calling the public URL. You should get the same result.
If you don’t want to set up a custom domain, you can use this code instead.
%%bash --bg

This will generate a random public URL, which you can retrieve with this code.
"import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"
Now, we get back to our final steps.
Fetching Intent Predictions
We are going to adapt the code we used to make Google Translate API requests so we can make intent prediction requests.
One big difference between the two API services is that we need to make HTTP POST requests instead of simpler HTTP GET requests.
Let’s see how that changes our code and learn a bit more about HTTP in the process.
function fetchPrediction(question = "who is the boss?"){
  // build the form-encoded payload the model expects (input feature name: Questions)
  TEXT = "Questions=" + question;
  TEXT = encodeURI(TEXT);
  console.log(TEXT);
  var options = {
    "method" : "POST",
    "contentType" : "application/x-www-form-urlencoded",
    "payload" : TEXT,
    'muteHttpExceptions': true
  };
  // url should hold the public ngrok URL of the /predict endpoint created above
  var response = UrlFetchApp.fetch(url, options);
  var json = response.getContentText();
  prediction = JSON.parse(json);
  console.log(prediction["Category0_predictions"]);
  return prediction["Category0_predictions"];
}

The function fetchPrediction calls the API service we created and returns the predicted intent. It basically reproduces the equivalent of the curl call we made in Colab, but in Apps Script.
I highlighted some key changes in the code. Let’s review them.
One key difference between GET and POST requests is that in GET requests the data is passed in the URL as parameters.
In POST requests, the data is passed inside the body of the request.
We need to format the data before we pass it in the body and we need to set the correct content type so the server knows how to decode it.
This line encodes the question we are passing.
TEXT = encodeURI(TEXT);

This is an example of what the encoded TEXT looks like.
Questions=label%20generator

The correct content type for this encoding is application/x-www-form-urlencoded. This is the recommended encoding for HTML form data.
We create an options data structure where we specify these settings and the correct request type and we are set to go.
You should see the encoded input and predicted intent in the logs.
How do we get the intentions for all the keywords in the sheet?
You might be thinking we will create another function that will read the keywords in a loop and populate the intentions. Not at all!
We can simply call this function by name directly from the sheet! How cool is that?
Resources to Learn More
Combining simple Apps Script functions with powerful API backends that you can code in any language opens the door to infinite productivity hacks.
Here are some of the resources I read while putting this together.
Finally, let me highlight a very important and valuable project that JR Oakes started.
It is an awesome repository for Python and JavaScript projects from the coders in the SEO community. I plan to find time to upload my code snippets, please make sure to contribute yours.
For some reason, this non-issue keeps popping up in my Twitter feed. I will leave this tweet here as a friendly reminder. ✌️
Machine Learning Model Deployment Using Streamlit
This article was published as a part of the Data Science Blogathon
Overview of Streamlit
If you have built ML models for real-time predictions and are wondering how to deploy them as web applications to increase their accessibility, you are in the right place. In this article, you will see how to deploy models that have already been built with machine learning or deep learning.
Article overview:
Understand the concept of Model Deployment.
Perform Model deployment using Streamlit for the dog-breed classifier.
Once you are done training the model, you have several options for deploying the project on the web, such as Flask, Django, and Streamlit.
Flask and Django are somewhat heavyweight, so understanding them well takes more than one article and more time (we will cover them as well), but for now, we will discuss Streamlit. So let's start with a question.
Why Streamlit?
Streamlit lets you create apps for your Machine Learning project using simple code.
It also supports hot-reloading that lets your app update live as you edit and save your file.
Using Streamlit, creating an app is very easy; adding a widget is as simple as declaring a variable (see the short sketch after this list).
No need to write a backend, define different routes, or handle HTTP requests.
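As a tiny, hypothetical sketch of that "widget = variable" idea (not part of the dog-breed app we build below):

import streamlit as st

st.title("Hello Streamlit")
name = st.text_input("Your name")                   # the widget's current value is just a variable
threshold = st.slider("Threshold", 0.0, 1.0, 0.5)   # same for a slider
if name:
    st.write(f"Hi {name}, the threshold is {threshold}")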
We will discuss how you can deploy a deep learning classifier using Streamlit. For this article, let's take a Dog Breed Classifier; you can check how to create a Dog Breed Classifier in the linked article.
Train your model and save feature_extractor.h5, dog_breed.h5, and dog_breeds_category.pickle.
feature_extractor.h5 is a saved model which will extract features from images,
dog_breed.h5 is another saved model which will be used for prediction.
dog_breeds_category.pickle is the file that will be used to convert class_num to class_label.
Model Deployment Using Streamlit
Once you have all the required files, let's start with the Streamlit installation procedure and build a web application.
Installing Streamlit

pip install streamlit

Setting up the Project Structure for Model Deployment using Streamlit
Creating a directory tree is not required, but it is good practice to organize your files and folders.
Start by creating a project folder. Inside the project folder, create another folder named static and put all the downloaded files inside it, and also create a folder named images inside static. Now create empty helper.py and main.py files and place them in the project directory.
Create Prediction Pipeline
We create a predictor function that will take an uploaded picture's path as input and give the different dog breed classes as output.
The predictor function will handle all the image processing and model loading required for the prediction.
The predictor function will be coded in helper.py to keep our structure ordered.
Let’s begin by loading all of the required libraries:
import cv2
import os
import numpy as np
import pickle
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import models, utils
import pandas as pd
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.python.keras import utils

Loading the saved model from the directory:
current_path = os.getcwd()    # getting the current path
dog_breeds_category_path = os.path.join(current_path, 'static/dog_breeds_category.pickle')    # path of the class_to_num_category file

# loading the prediction model
predictor_model = load_model(r'static/dog_breed.h5')

with open(dog_breeds_category_path, 'rb') as handle:
    dog_breeds = pickle.load(handle)

# loading the feature extractor model
feature_extractor = load_model(r'static/feature_extractor.h5')

In the above code chunk, we load the different categories of dog breeds using the pickle file and then load the weights files (.h5 files) that hold the trained weights. Now you will define a predictor function that takes the image's path as input and returns a prediction.
def predictor(img_path):    # here image is the file name
    img = load_img(img_path, target_size=(331, 331))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    features = feature_extractor.predict(img)
    prediction = predictor_model.predict(features) * 100
    prediction = pd.DataFrame(np.round(prediction, 1), columns=dog_breeds).transpose()
    prediction.columns = ['values']
    prediction = prediction.nlargest(5, 'values')
    prediction = prediction.reset_index()
    prediction.columns = ['name', 'values']
    return prediction

In the above block of code, we have performed the following operations:
First, we pass the image path to the predictor function.
Then it converts the image into a 4-D array (tensor) for prediction.
Then it passes the tensor to the feature_extractor model to get the extracted features; these extracted features become the input for predictor_model.
Finally, it passes the extracted features to predictor_model, gets the final prediction, and converts it into a data frame to get the prediction in the desired format.
The predictor_function returns the top 5 detected dog breeds with their prediction confidence in a data frame.
Now you have a function ready that takes image path and gives prediction which we will call from our web app.
Creating Frontend
Our goal is to create a web app where we can upload a picture and then save that picture in the static/images directory for the prediction part.
Pipeline
Create an upload button and save uploaded pics in the directory.
The function predictor will take an uploaded image’s path as input and would give the output.
Show the uploaded image.
Show the top-5 predictions with their confidence percentage in a barplot.
After prediction, delete the uploaded picture from the directory
Frontend Streamlit code will be written in main.py, and to make use of the predictor function created in helper.py, we need to import it into the main.py file. Let's check the code for the given pipeline.
from helper import *    # importing all the helper functions from helper.py which we will create later
import streamlit as st
import os
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="darkgrid")
sns.set()
from PIL import Image

st.title('Dog Breed Classifier')

In the above code, we have first imported all the dependencies and then created an app with the title "Dog Breed Classifier". It's time to define a function to save uploaded images.
def save_uploaded_file(uploaded_file):
    try:
        with open(os.path.join('static/images', uploaded_file.name), 'wb') as f:
            f.write(uploaded_file.getbuffer())
        return 1
    except:
        return 0

This function saves the uploaded pictures to the static/images folder.
Next, we create the upload button, display the uploaded image in the app, and call the predictor function we just created.
uploaded_file = st.file_uploader("Upload Image")    # text over upload button "Upload Image"

if uploaded_file is not None:
    if save_uploaded_file(uploaded_file):
        # display the image
        display_image = Image.open(uploaded_file)
        st.image(display_image)
        prediction = predictor(os.path.join('static/images', uploaded_file.name))
        os.remove('static/images/' + uploaded_file.name)    # deleting uploaded saved picture after prediction
        # drawing graphs
        st.text('Predictions :-')
        fig, ax = plt.subplots()
        ax = sns.barplot(y='name', x='values', data=prediction,
                         order=prediction.sort_values('values', ascending=False).name)
        ax.set(xlabel='Confidence %', ylabel='Breed')
        st.pyplot(fig)

Let's discuss the above code:
You can write text anywhere in the program using the st.write() method.
os.remove() removes the uploaded file after prediction.
st.file_uploader(‘Upload Image’) creates an upload button.
Whatever file is uploaded will be passed to the save_uploaded_file function in order to save it.
For plotting the prediction bar-plot we are using seaborn in Streamlit.
sns.barplot() creates barplot.
The plotted bar-plot will be sorted according to the confidence percentages.
Run Web App
Run the web app in your browser with the command:
streamlit run main.py

Here main.py is the file containing all the frontend code.
So far we have built the web app using Streamlit, and it runs as a website on your local computer.
Model Deployment using Streamlit Over the Internet
Deploying over the internet increases the accessibility of your application. After deployment, the app can be accessed from a mobile device or computer anywhere in the world.
Streamlit gives you a Streamlit Share feature to deploy your Streamlit web app for free on the internet.
You point Streamlit Share at your app's repository and it takes care of the rest, as you can see on the deployment page on streamlit.io shown below.
I personally would not suggest deploying the model on Streamlit Share, as it is not very flexible.
You have multiple choices when it comes to hosting your model on the cloud; AWS, GCP, and Azure are some popular services nowadays.
Heroku is an online model hosting service with a free tier, so you can host the model on the cloud for free. Deploying apps on Heroku is more flexible; managing your app, package versions, and storage is a lot easier with Heroku.
Conclusion
Creating a machine learning model is not enough until you make it available for general use or to a specific client. If you are working for a client, you probably need to deploy the model in the client's environment, but if you are working on a project that needs to be publicly available, you should use technology to deploy it on the web. Streamlit is the best lightweight technology for web deployment.
So with this, we come to the end of the article. I hope you enjoyed it and can now start creating beautiful apps yourself.
The whole code and architecture can be downloaded from this link.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
Building Machine Learning Model Is Fun Using Orange
Introduction
With the growing need for data science managers, we need tools that take the difficulty out of doing data science and make it fun. Not everyone is willing to learn coding, even though they would like to learn / apply data science. This is where GUI-based tools can come in handy.
Today, I will introduce you to another GUI-based tool – Orange. This tool is great for beginners who wish to visualize patterns and understand their data without really knowing how to code.
In my previous article, I presented another GUI-based tool, KNIME. If you do not want to learn to code but still want to apply data science, you can try out either of these tools.
By the end of this tutorial, you’ll be able to predict which person out of a certain set of people is eligible for a loan with Orange!
Table of Contents:
Why Orange?
Setting up your System:
Creating your first Workflow
Familiarizing yourself with the basics
Problem Statement
Importing the data files
Understanding the data
How do you clean your data?
Training your first model
1. Why Orange?
Orange is a platform built for mining and analysis on a GUI based workflow. This signifies that you do not have to know how to code to be able to work using Orange and mine data, crunch numbers and derive insights.
You can perform tasks ranging from basic visuals to data manipulations, transformations, and data mining. It consolidates all the functions of the entire process into a single workflow.
The best part and the differentiator about Orange is that it has some wonderful visuals. You can try silhouettes, heat-maps, geo-maps and all sorts of visualizations available.
2. Setting up your System
Orange comes built-in with the Anaconda tool if you've previously installed it. If not, follow these steps to download Orange.
Step 2: Install the platform and set the working directory for Orange to store its files.
This is what the start-up page of Orange looks like. You have options that allow you to create new projects, open recent ones or view examples and get started.
Before we delve into how Orange works, let’s define a few key terms to help us in our understanding:
A widget is the basic processing point of any data manipulation. It can do a number of actions based on what you choose in your widget selector on the left of the screen.
A workflow is the sequence of steps or actions that you take in your platform to accomplish a particular task.
You can also go to “Example Workflows” on your start-up screen to check out more workflows once you have created your first one.
3. Creating Your First Workflow
This is your blank workflow on Orange. Now, you're ready to explore and solve any problem by dragging any widget from the widget menu to your workflow.
4. Familiarising yourself with the basics
Orange is a platform that can help us solve most problems in data science today, with topics that range from the most basic visualizations to training models. You can even evaluate and perform unsupervised learning on datasets:
4.1 Problem
The problem we're looking to solve in this tutorial is the practice problem Loan Prediction that can be accessed via this link on Datahack.
4.2 Importing the data files
We begin with the first and necessary step to understand our data and make predictions: importing our data.
Step 3: Once you can see the structure of your dataset using the widget, go back by closing this menu.
Neat! Isn’t it?
Let’s now visualize some columns to find interesting patterns in our data.
4.3 Understanding our Data
The plot I've explored is a Gender by Income plot, with the colors set to the education levels. As we can see for males, the higher income group naturally belongs to the graduates!
Although in females, we see that a lot of the graduate females are earning low or almost nothing at all. Any specific reason? Let’s find out using the scatterplot.
One possible reason I found was marriage. A huge number of graduates who were married were found to be in lower income groups; this may be due to family responsibilities or added efforts. Makes perfect sense, right?
4.3.2 Distribution
What we see is a very interesting distribution. We have, in our dataset, more married males than females.
4.3.3 Sieve diagram
Let’s visualize using a sieve diagram.
This plot divides the sections of distribution into 4 bins. The sections can be investigated by hovering the mouse over it.
Let’s now look at how to clean our data to start building our model.
5. How do you clean your data?
Here, for cleaning purposes, we will impute missing values. Imputation is a very important step in understanding and making the best use of our data.
Here, I have selected the default method to be Average for numerical values and Most Frequent for text based values (categorical).
You can select from a variety of imputations like:
Distinct Value
Random Values
Remove the rows with missing values
Model-Based
6. Training your First Model
Beginning with the basics, we will first train a linear model encompassing all the features just to understand how to select and build models.
Step 1: First, we need to set a target variable to apply Logistic Regression on it.
Step 4: Once we have set our target variable, find the clean data from the “Impute” widget as follows and place the “Logistic Regression” widget.
Ridge Regression:
Performs L2 regularization, i.e. adds penalty equivalent to square of the magnitude of coefficients
Minimization objective = LS Obj + α * (sum of square of coefficients)
Lasso Regression:
Performs L1 regularization, i.e. adds penalty equivalent to absolute value of the magnitude of coefficients
Minimization objective = LS Obj + α * (sum of absolute value of coefficients)
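Restating the two bullet points above in symbols, with $\alpha$ as the regularization strength and the first term being the least-squares objective (LS Obj):

$$\text{Ridge:}\quad \min_{\beta}\ \sum_{i}\bigl(y_i - x_i^{\top}\beta\bigr)^2 + \alpha \sum_{j}\beta_j^{2}$$

$$\text{Lasso:}\quad \min_{\beta}\ \sum_{i}\bigl(y_i - x_i^{\top}\beta\bigr)^2 + \alpha \sum_{j}\lvert\beta_j\rvert$$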
I have chosen Ridge for my analysis, you are free to choose between the two.
Step 8: To visualize the results better, drag and drop from the "Test and Score" widget to find the "Confusion Matrix" widget.
This way, you can test out different models and see how accurately they perform.
Let’s try to evaluate, how a Random Forest would do? Change the modeling method to Random Forest and look at the confusion matrix.
Looks decent, but the Logistic Regression performed better.
We can try again with a Support Vector Machine.
Better than the Random Forest, but still not as good as the Logistic Regression model.
Sometimes the simpler methods are the better ones, isn’t it?
This is how your final workflow would look after you are done with the complete process.
For people who wish to work in groups, you can also export your workflows and send it to friends who can work alongside you!
The resulting file is of the (.ows) extension and can be opened in any other Orange setup.
End Notes
Orange is a platform that can be used for almost any kind of analysis but most importantly, for beautiful and easy visuals. In this article, we explored how to visualize a dataset. Predictive modeling was undertaken as well, using a logistic regression predictor, SVM, and a random forest predictor to find loan statuses for each person accordingly.
Hope this tutorial has helped you figure out aspects of the problem that you might not have understood or missed out on before. It is very important to understand the data science pipeline and the steps we take to train a model, and this should surely help you build better predictive models soon!
How To Apply A 2D Average Pooling In Pytorch?
We can apply a 2D Average Pooling over an input image composed of several input planes using the torch.nn.AvgPool2d() module. The input to a 2D Average Pooling layer must be of size [N,C,H,W] where N is the batch size, C is the number of channels, H and W are the height and width of the input image.
The main feature of an Average Pooling operation is the filter or kernel size and stride. This module supports TensorFloat32.
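For reference, the output spatial size follows the standard formula from the AvgPool2d documentation (stride defaults to kernel_size when not given):

$$H_{out} = \left\lfloor \frac{H_{in} + 2\cdot\text{padding} - \text{kernel\_size}}{\text{stride}} + 1 \right\rfloor,\qquad W_{out}\ \text{analogously}.$$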
Syntax

torch.nn.AvgPool2d(kernel_size)

Parameters
kernel_size – The size of the window to take an average over.
Along with this parameter, there are some optional parameters also such as stride, padding, dilation, etc. We will take examples of these parameters in detail in the following Python examples.
Steps
You could use the following steps to apply a 2D Average Pooling −
Import the required library. In all the following examples, the required Python library is torch. Make sure you have already installed it. To apply 2D Average Pooling on images we need torchvision and Pillow as well.
import torch import torchvision from PIL import Image
Define input tensor or read the input image. If an input is an image, then we first convert it into a torch tensor.
Define kernel_size, stride and other parameters.
Next, define an Average Pooling layer by passing the above defined parameters to torch.nn.AvgPool2d().
pooling = nn.AvgPool2d(kernel_size)
Apply the Average Pooling to the input tensor or image tensor.
output = pooling(input)
Next, print the tensor after Average Pooling. If the input was an image tensor, then to visualize the image, we first convert the tensor obtained after Average Pooling back to a PIL image.

Example 1

# Import the required libraries
import torch
import torch.nn as nn

'''input of size = [N,C,H,W] or [C,H,W]'''
input = torch.empty(3, 4, 4).random_(256)
print("Input Tensor:", input)
print("Input Size:", input.size())

# pool of square window of size=3, stride=1
pooling1 = nn.AvgPool2d(3, stride=1)

# Perform Average Pooling
output = pooling1(input)
print("Output Tensor:", output)
print("Output Size:", output.size())

# pool of non-square window
pooling2 = nn.AvgPool2d((2, 1), stride=(1, 2))

# Perform Average Pooling
output = pooling2(input)
print("Output Tensor:", output)
print("Output Size:", output.size())
Output

Input Tensor: tensor([[[194., 159., 7., 90.],
         [128., 173., 28., 211.],
         [252., 123., 248., 147.],
         [144., 107., 28., 17.]],
        [[122., 140., 117., 52.],
         [252., 118., 216., 101.],
         [ 88., 121., 25., 210.],
         [223., 162., 39., 125.]],
        [[168., 113., 53., 246.],
         [199., 23., 54., 74.],
         [ 95., 246., 245., 48.],
         [222., 175., 144., 127.]]])
Input Size: torch.Size([3, 4, 4])
Output Tensor: tensor([[[145.7778, 131.7778],
         [136.7778, 120.2222]],
        [[133.2222, 122.2222],
         [138.2222, 124.1111]],
        [[132.8889, 122.4444],
         [155.8889, 126.2222]]])
Output Size: torch.Size([3, 2, 2])
Output Tensor: tensor([[[161.0000, 17.5000],
         [190.0000, 138.0000],
         [198.0000, 138.0000]],
        [[187.0000, 166.5000],
         [170.0000, 120.5000],
         [155.5000, 32.0000]],
        [[183.5000, 53.5000],
         [147.0000, 149.5000],
         [158.5000, 194.5000]]])
Output Size: torch.Size([3, 3, 2])

Example 2

In the following Python example, we perform 2D Average Pooling on an input image. To apply 2D Average Pooling, we first convert the image to a torch tensor, and after Average Pooling we convert it back to a PIL image for visualization.
# Python 3 program to perform 2D Average Pooling on an image
# Import the required libraries
import torch
import torchvision
from PIL import Image
import torchvision.transforms as T
import torch.nn.functional as F

# read the input image
img = Image.open('panda.jpg')

# convert the image to torch tensor
img = T.ToTensor()(img)
print("Original size of Image:", img.size())   # Size([3, 466, 700])

# unsqueeze to make 4D
img = img.unsqueeze(0)

# define avg pool with square window of size=4, stride=1
pool = torch.nn.AvgPool2d(4, 1)
img = pool(img)
img = img.squeeze(0)
print("Size after AvgPool:", img.size())
img = T.ToPILImage()(img)
img.show()

Output

Original size of Image: torch.Size([3, 466, 700])

Learn Latest Versions Of Pytorch
Introduction to PyTorch Versions
Different Versions of PyTorch
Here we discuss the different versions of PyTorch that have been released, along with the required system configuration, and mainly focus on the current stable release v1.3, as this is the one currently used in the market and the research community:
1. Old Version – PyTorch Versions < 1.0.0
In the very first release of PyTorch, Facebook combined Python and Torch libraries to create an open-source framework that can also be operated on CUDA and Nvidia GPUs. PyTorch mainly uses Tensors (torch.Tensor) to store and operate on multi-dimensional arrays. PyTorch released its first public version as 0.1.12. The 0.4 version was one of the most significant releases, with core changes.
The PyTorch v0.4 version added support for Windows and added features to support the use of RNNs in ONNX (Open Neural Network Exchange). It has C++/CUDA extensions for users' use. The 0.4 version also provides support for writing device-agnostic code. Tensors and variables were merged in the 0.4 release, and operations can return 0-dimensional tensors. To install the old versions through conda or miniconda, use the below commands:
In the below command, the user can replace '0.2.0' with the desired version, such as '0.4.0' or '0.4.1', and replace cuda90 with cuda80, cuda75, etc.
conda install pytorch=0.2.0 cuda90 -c pytorch

PyTorch libraries are also available on GitHub, and users can check out an older version of PyTorch and build it, replacing '0.2.0' with the desired version: git checkout v0.2.0. Users can also download the required libraries for macOS or Windows from the official website.
2. PyTorch Version 1.0 to 1.2
Before version 1.0, code written in PyTorch needed the Python VM environment to run the application. In the 1.0 version, Python functions and classes are provided with torch.jit, and to separate them from plain Python code, these functions/classes can be compiled into a high-level representation. The main goal of the releases from 1.0 to 1.2 was to combine the features of PyTorch, ONNX and the Caffe2 framework into a single framework for seamless integration from research to production deployment. Some of the features added in version 1.0 are as below:
Easy to integrate C++ function with Python.
It separates the AI model from code by providing two modes:
Eager Mode: Mostly used for research as it is simple, debuggable and can use any python library. It needs a Python environment to run.
Script Mode: The model can run without a Python interpreter. This is a production deployment mode; it has no Python dependency and the code is an optimizable subset of Python (see the sketch after this list).
A model can run on servers, GPU or TPUs.
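To make the two modes concrete, here is a minimal, hypothetical sketch (not from the original text) that builds a tiny model in eager mode and compiles it to script mode with torch.jit.script:

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyNet()                    # eager mode: plain Python, easy to debug
scripted = torch.jit.script(model)   # script mode: compiled representation with no Python dependency
scripted.save("tiny_net.pt")         # can later be loaded from C++ or mobile runtimes
print(scripted(torch.randn(1, 4)))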
conda install pytorch==1.2.0 torchvision==0.4.0 -c pytorch

3. Latest PyTorch Version
Facebook released the latest version of PyTorch, 1.3, in 2019. This new version is packed with new changes and bug fixes. Some of the new exciting features are support for mobile, transparency, named tensors and quantization, to meet the needs of researchers. I will explain these new features briefly, along with some other information.
PyTorch Named Tensors
Prior to the 1.3 release, PyTorch did not support naming dimensions or broadcasting based on anything other than position, and there was no dimension-name information in the documentation. PyTorch has overcome this debacle: it has added named tensors as a feature so that users can access tensor dimensions using explicit names. Previously, while performing even a simple task, users had to know the general structure of the tensor; now, by broadcasting on the names of the dimensions, users can rearrange the dimensions as required.
Named tensors also support error checking on dimension names, verifying that a given dimension name matches the parameter or not.
Example:
import torch
data_sample = torch.randn(100, 3, 250, 600, names=('N', 'C', 'H', 'W'))

Here, N is the batch size, C is the number of channels, H is the height of the image, and W is the width of the image.
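As a small illustrative sketch (the operations below are assumptions about typical usage, not code from the original), the names can then be used directly in reductions and reordering:

# reduce over a dimension by its name instead of its position
per_pixel_mean = data_sample.mean('C')            # result keeps names ('N', 'H', 'W')
print(per_pixel_mean.names)

# reorder dimensions by name, e.g. to a channels-last layout
channels_last = data_sample.align_to('N', 'H', 'W', 'C')
print(channels_last.names)                        # ('N', 'H', 'W', 'C')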
PyTorch Quantization
To run quantized operations, PyTorch uses x86 CPUs with AVX2 support and ARM CPUs.
import torch
import torch.nn as nn

m = nn.quantized.ReLU()
input = torch.randn(2)
input = torch.quantize_per_tensor(input, 1.0, 0, dtype=torch.qint32)

PyTorch Mobile Support
Quantization is used while developing ML applications so that PyTorch models can be deployed to mobile or other devices. In PyTorch 1.3, the developers added end-to-end workflow APIs for Android and iOS. This was done to reduce latency and provide security on the edge node. This is still an early-stage feature, with ongoing work on optimized computation, performance, and coverage on mobile CPUs and GPUs.
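As a hedged illustration of this kind of workflow (not code from the original article), post-training dynamic quantization can shrink a float model before deploying it to a device:

import torch
import torch.nn as nn

# a hypothetical float model; only the nn.Linear layers get quantized here
float_model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

quantized_model = torch.quantization.quantize_dynamic(
    float_model,
    {nn.Linear},          # layer types to replace with dynamically quantized versions
    dtype=torch.qint8     # store weights as 8-bit integers
)
print(quantized_model)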
Apart from the above three features, there are some other additions, like support for PyTorch on Google Colab, support for TensorBoard, and performance improvements in the autograd engine. There are also new tools for model privacy and interpretability, and tools to support multi-modal AI systems.
Conclusion
In conclusion, PyTorch is one of the most used deep learning frameworks, with support for all state-of-the-art technology. As the developers are continuously working on improving PyTorch, you can assume that there will be many more releases with exciting new features. So learning PyTorch to create machine learning or deep learning applications will be beneficial for aspiring AI enthusiasts, as this is one of the well-documented and well-supported frameworks.