SQL Interview Questions and Answers Updated for 2023 (updated December 2023)


Introduction to Oracle PL/SQL Interview Questions

Oracle PL/SQL interview questions are designed to familiarize you with the kinds of questions you may encounter during a PL/SQL interview. PL/SQL is a procedural language designed specifically to embrace SQL statements within its syntax. PL/SQL program units are compiled by the Oracle Database server and stored inside the database. In addition, both PL/SQL and SQL run within the same server process at run time, which brings optimal efficiency. PL/SQL automatically inherits the robustness, security, and portability of Oracle Database.


An application that uses Oracle Database is worthless unless only correct and complete data is persisted. One clear way to ensure this is to expose the database only through an interface that hides the implementation details: the tables and the SQL statements that operate on them. This approach is often called the thick database paradigm, because PL/SQL subprograms inside the database issue the SQL statements from code that implements the surrounding business logic, and because the data can be changed and viewed only through a PL/SQL interface.

Top 10 Essential Oracle PL/SQL Interview Questions and Answers.

These questions are divided into two parts as follows:

Part 1 – Oracle PL/SQL Interview Questions (Basic)

This first part covers basic Interview Questions and Answers:

Q1. List the features of PL/SQL.

PL/SQL enables access to and sharing of the same subprograms by multiple applications.

PL/SQL is known for code portability, as the code can be executed on any operating system on which Oracle is loaded.

With PL/SQL, users can write their own customized error-handling routines.

Improved transaction performance through integration with the Oracle data dictionary.

Q2. What data types are available in PL/SQL?

Data types specify how to recognize the type of data and its associated operations. There are four kinds of predefined data types, described as follows.

Scalar Data Types: A scalar data type is an atomic data type that does not have any internal components.

For Example:

CHAR (fixed-length character value of 1 to 32,767 bytes)

VARCHAR2 (variable-length character value of 1 to 32,767 bytes)

NUMBER (fixed-decimal, floating-decimal, or integer values)

BOOLEAN (logical data type for TRUE, FALSE, or NULL values)

DATE (stores date and time information)

LONG (variable-length character data)

Composite Data Types: A composite data type is built from other data types and has internal components that can be readily used and manipulated. For example, RECORD, VARRAY, and TABLE.

Reference Data Types: A reference data type holds values, called pointers, that point to separate program items or data items. For example, REF CURSOR.

Large Object Data Types: A large object (LOB) data type holds values, called locators, that specify the location of large objects (such as graphics, images, and video clips) stored out of line.

For Example:

BFILE (binary file)

BLOB (binary large object)

NCLOB (NCHAR-type large object)

CLOB (character large object)

Q3. What do you understand by packages in PL/SQL?

A PL/SQL package consists of two parts:

Package specification

Package body



Q4. What are COMMIT, ROLLBACK, and SAVEPOINT?

COMMIT, SAVEPOINT, and ROLLBACK are the three transaction control statements available in PL/SQL.

COMMIT statement: When a DML operation is performed, it changes only the data in the database buffer, and the database itself remains unaffected by these modifications. To save the transaction's changes to the database, we need to COMMIT the transaction. COMMIT saves every outstanding change since the last COMMIT, and the following happens:

Locks on the affected rows are released.

The transaction is marked as complete.

Transaction details are saved in the data dictionary.

Syntax: COMMIT;

ROLLBACK statement: When we need to undo or erase all the changes that have occurred in the current transaction so far, we roll back the transaction. In other words, ROLLBACK erases every outstanding change since the last COMMIT or ROLLBACK.

Syntax to roll back a transaction: ROLLBACK;

SAVEPOINT statement: The SAVEPOINT statement names and marks a point in the processing of the current transaction. If the transaction is later rolled back to that savepoint, the changes and locks that occurred before the SAVEPOINT are preserved, while those that occurred after it are released.

Syntax: SAVEPOINT <savepoint_name>;
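The three transaction statements can be tried out without an Oracle installation. The following is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in for Oracle; SQLite's transaction syntax is close to, but not identical with, PL/SQL, and the table name `accounts` and savepoint name `before_bonus` are made up for illustration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are issued explicitly by us.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")


def balances(connection):
    """Return all rows so we can inspect the state after each step."""
    return connection.execute(
        "SELECT id, balance FROM accounts ORDER BY id"
    ).fetchall()


# COMMIT: the insert becomes permanent.
conn.execute("BEGIN")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("COMMIT")
after_commit = balances(conn)

# ROLLBACK: the update is undone and the transaction ends.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")
after_rollback = balances(conn)

# SAVEPOINT + ROLLBACK TO: only the work done after the savepoint is undone,
# and the transaction stays open until the final COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO accounts VALUES (2, 50)")
conn.execute("SAVEPOINT before_bonus")
conn.execute("UPDATE accounts SET balance = balance + 999 WHERE id = 2")
conn.execute("ROLLBACK TO before_bonus")
conn.execute("COMMIT")
after_savepoint = balances(conn)
```

After the last step, account 2 exists with its original balance: the insert made before the savepoint survived, while the update made after it was undone.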

Q5. What are a mutating table and a constraining table?


A table that is currently being modified by a DML statement, such as a table on which triggers are defined, is known as a mutating table. A table that may need to be read from for a referential integrity constraint is known as a constraining table.

Part 2 – Oracle PL/SQL Interview Questions (Advanced)

Q6. What is the difference between the ROLLBACK TO and ROLLBACK statements?


The transaction is completely ended after the ROLLBACK statement. That is, the ROLLBACK command undoes the entire transaction and releases every lock.

After a ROLLBACK TO command, on the other hand, the transaction is still active and running, because the command undoes only the part of the transaction back to the given SAVEPOINT.

Q7. Explain the difference between a cursor declared in a procedure and a cursor declared in a package specification.


A cursor declared in a procedure is treated as local and cannot be accessed by other procedures. A cursor declared in a package specification, on the other hand, is treated as global and can therefore be accessed by other procedures.

Q8. What do you mean by PL/SQL records?


A PL/SQL record can be viewed as a collection of values: a group of several pieces of information, each of which is of a simpler type and can be related to the others as a field.

There are three types of records supported in PL/SQL:

Table-based records

Programmer-defined records

Cursor-based records

Q9. What are INSTEAD OF triggers?


INSTEAD OF triggers are triggers written mainly to modify views that cannot be modified directly through SQL DML statements.
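To make the idea concrete, here is a small sketch using Python's built-in sqlite3 module, which also supports INSTEAD OF triggers on views. It is an illustration under assumptions rather than Oracle code: the syntax differs in places from PL/SQL, and the table, view, and trigger names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales');

    -- A join view; in SQLite a view cannot be the direct target of an INSERT.
    CREATE VIEW employee_view AS
        SELECT e.id AS emp_id, e.name AS emp_name, d.name AS dept_name
        FROM employees e JOIN departments d ON d.id = e.dept_id;

    -- The trigger body runs in place of the INSERT issued against the view.
    CREATE TRIGGER employee_view_insert
    INSTEAD OF INSERT ON employee_view
    BEGIN
        INSERT INTO employees (id, name, dept_id)
        VALUES (NEW.emp_id, NEW.emp_name,
                (SELECT id FROM departments WHERE name = NEW.dept_name));
    END;
""")

# The INSERT targets the view; the trigger redirects it to the base table.
conn.execute(
    "INSERT INTO employee_view (emp_id, emp_name, dept_name) "
    "VALUES (1, 'Asha', 'Sales')"
)
rows = conn.execute("SELECT id, name, dept_id FROM employees").fetchall()
```

The row ends up in the `employees` base table with the department name translated to its id, which is the kind of work an INSTEAD OF trigger is typically written to do.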

Q10. What do you understand by exception handling in PL/SQL?


When an error occurs in PL/SQL, an exception is raised. In other words, to handle undesired situations in which a PL/SQL script would otherwise terminate unexpectedly, error-handling code is included in the program. In PL/SQL, all exception-handling code is placed in the EXCEPTION section.

There are three types of exceptions:

Predefined exceptions: Common errors with predefined names.

Undefined exceptions: Less common errors with no predefined names.

User-defined exceptions: Do not cause runtime errors but violate business rules.
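The distinction between predefined and user-defined exceptions exists in most languages. As a rough analogy only (this is Python, not PL/SQL, and the function and class names are hypothetical), a built-in error such as `ZeroDivisionError` plays the role of a predefined exception, while a custom exception class signals a broken business rule:

```python
class InsufficientFundsError(Exception):
    """User-defined exception: nothing crashed, but a business rule was broken."""


def withdraw(balance, amount):
    # Raise our own exception when the business rule is violated.
    if amount > balance:
        raise InsufficientFundsError(f"cannot withdraw {amount} from {balance}")
    return balance - amount


def safe_ratio(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Built-in error with a well-known name, comparable to a predefined exception.
        return None


def try_withdraw(balance, amount):
    try:
        return withdraw(balance, amount)
    except InsufficientFundsError:
        # Handle the business-rule violation by leaving the balance unchanged.
        return balance
```

In PL/SQL the handlers would sit in the EXCEPTION section of a block; here the `except` clauses serve the same purpose.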


Top 13 AWS Interview Questions and Answers Updated for 2023

Introduction to AWS Interview Questions and Answers

The following article provides an outline for AWS Interview Questions. Amazon Web Services (AWS) is a comprehensive, on-demand cloud computing platform provided by Amazon to individuals, companies, and government organizations on a paid subscription basis. The technology provides a virtual cluster of computers that is available all the time over the web. It offers a mix of infrastructure as a service (IaaS), platform as a service (PaaS), and packaged software as a service (SaaS) offerings, as well as Recovery as a Service (RaaS).


What is Cloud Computing? Types of Clouds

Public Cloud: A cloud in which third-party service providers make resources and services available to their customers over the internet. The related data and security are handled on infrastructure owned by the service provider.

Private Cloud: This offers almost the same features as the public cloud, but the data and services are managed by the organization itself or by a third party solely for the customer's organization. In this type of cloud, the organization has major control over the infrastructure, so security-related issues are minimized, which distinguishes it from a public cloud.

Hybrid Cloud: As the name suggests, a hybrid cloud is a combination of private and public clouds. The decision of which type of cloud to use, private or public, usually depends on parameters such as the sensitivity of data and applications, industry certifications and required standards, regulations, and many more.

Understanding Different Types of Cloud Computing

IaaS stands for Infrastructure as a Service. It is the lowest level of a cloud solution and refers to cloud-based computing infrastructure as a fully outsourced service. It provides processing, storage, and network connectivity on demand. With this service model, customers can develop their own applications on these resources.

Software as a Service (SaaS): As the name suggests, third-party providers offer end-user applications to their customers, with some administrative capability at the application level, such as the ability to create and manage users. Basic customization is possible; for example, users can apply their own corporate logos and colors.

Security issues: Security is among the biggest concerns in today's world. The infrastructure provided by the AWS cloud is designed to ensure a flexible and secure cloud network. It is a scalable and highly reliable platform that enables users to deploy applications and data quickly and securely, which is why it is gaining popularity in today's market. Even so, security remains the major issue in cloud computing. Cloud service providers implement the best security standards and industry certifications; however, storing data and important files with external service providers is always a risk.

Technical issues: A very common technical issue is that if the internet connection is down, we cannot access any of the applications, servers, or data in the cloud.

Not easy to switch service providers: Cloud services are usually promised to be flexible to use and integrate; however, switching between cloud providers is not easy.

Now, if you are looking for a job related to AWS, you need to prepare for the 2023 AWS Interview Questions. Every interview is different, depending on the job profile. Here, we have prepared the important AWS interview questions and answers that will help you succeed in your interview.

In this 2023 AWS Interview Questions article, we present the 13 most important and frequently asked AWS interview questions.

These questions are divided into two parts:

Part 1 – AWS Interview Questions (Basic)

This first part covers basic AWS Interview Questions and Answers:

Q1. What are the main components of AWS?


The following are the basic elements of AWS:

Route 53: It is a DNS web service.

E-mail Service: Provides an email service that can be used through a RESTful API or through normal SMTP.

Identity and Access Management (IAM): It provides strong protection and identity control services for an AWS account.

S3 (Simple Storage Service): It is a storage service and one of the most widely used AWS services.

Elastic Block Store (EBS): It provides persistent storage volumes that attach to EC2 instances so that data persists beyond the lifespan of a particular EC2 instance.

CloudWatch: It is used to monitor AWS resources; it also lets you raise notification alarms in the event of a crisis.

Q2. What is S3?


S3 stands for Simple Storage Service. You can use S3 to store and retrieve any amount of data, at any time and from anywhere on the web.

Q3. By default how many buckets can be created in AWS?


By default, 100 buckets can be created in each AWS account.

Q4. In a VPC with private and public subnets, in which subnet should the database servers ideally be launched?


This is a basic AWS interview question. Of the private and public subnets in a VPC, database servers should ideally be launched into private subnets.

Q5. List out the components required for Amazon VPC?

Following are the components required for Amazon VPC:

Peering Connection, IG (Internet Gateway), HW VPN Connection, Subnet, Customer Gateway, Router, VPC Endpoint for S3, Virtual Private Gateway, Egress-only Internet Gateway, NAT Gateway.

Q6. How do you safeguard EC2 instances running in VPC?


EC2 instances can be protected by using security groups in a VPC. Both inbound and outbound rules can be configured in a security group, which enables secured access to the EC2 instances. Any access that is not explicitly allowed is automatically denied.

Q7. In a VPC how many EC2 instances can be used?


By default, you are limited to launching 20 EC2 instances at once. However, a VPC can accommodate a maximum of 65,536 instances.
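The 65,536 figure matches the number of addresses in a /16 IPv4 block, which is the largest CIDR block a VPC is usually described as supporting. The sketch below checks that arithmetic with Python's standard ipaddress module; the 10.0.0.0/16 range and the /24 subnet size are arbitrary choices for illustration, and in practice some addresses in every subnet are reserved, so the usable count is lower.

```python
import ipaddress

# A /16 block leaves 32 - 16 = 16 bits for host addresses: 2 ** 16 addresses.
vpc_block = ipaddress.ip_network("10.0.0.0/16")
total_addresses = vpc_block.num_addresses

# Carving the block into /24 subnets yields 256 subnets of 256 addresses each.
subnets = list(vpc_block.subnets(new_prefix=24))
addresses_per_subnet = subnets[0].num_addresses
```

This only models address space; account-level instance limits are a separate quota.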

Part 2 – AWS Interview Questions (Advanced)

This part covers Advanced AWS Interview Questions and Answers:

Q8. What are the different connectivity options present for VPC?


NAT, Internet Gateway (IG), Peering Connections, VPG (Virtual Private Gateway), End Points.

Q9. What are the different types of available Cloud Computing services?


IAAS (Infrastructure as a Service), PAAS (Platform as a Service), SAAS (Software as a Service) etc.

Q10. When a standby Relational Database Service instance is launched, will it be in the same Availability Zone?


Q11. What are lifecycle hooks?


This is a frequently asked AWS interview question. Lifecycle hooks are used in Auto Scaling. They let you perform custom actions by pausing instances as an Auto Scaling group launches or terminates them. Each Auto Scaling group can have multiple lifecycle hooks.

Q12. Name two types of Load Balancer?


Application Load Balancer & Classic Load Balancer.

Q13. What is Hypervisor?



Core Php Interview Questions And Answers In 2023

Introduction to Core PHP Interview Questions and Answers

Core PHP refers to very basic PHP. It is normally used to create dynamic web pages that are displayed to the end client in their browser. The core programming logic runs on the server side, and the result is displayed on the client side according to that logic.

If you are looking for a job related to Core PHP, you need to prepare for the 2023 Core PHP Interview Questions. Every interview is different, depending on the job profile. Here, we have prepared the important Core PHP interview questions and answers, which will help you succeed in your interview.


In this 2023 Core PHP Interview Questions article, we shall present the 10 most important and frequently asked Core PHP interview questions. These interview questions are divided into two parts as follows:

Part 1 – Core PHP Interview Questions (Basic)

This first part covers basic Core PHP Interview Questions and Answers.

Q1. Two functions have been in common use in core PHP for a long time: include() and require(). Please explain the differences between them.

include() and require() are both used to include a specific file in the requesting page.

The main difference between them is:

If the developer uses require() to include a file and the file is unavailable, the process throws a fatal error and execution stops entirely. If the developer uses include() instead, the process does not stop; it emits a warning and continues with the next step.

Q2. Suppose we want to get the IP address of a client who is using a PHP web application. Please explain how we can get that IP information in PHP.

There are several options for fetching the IP address of the client machine in PHP. The developer can also write more elaborate scripts that fetch this data externally.

But one of the most popular and basic approaches is to read the $_SERVER['REMOTE_ADDR'] variable.


Q3. Explain in detail the difference between the two popular functions of PHP, unset() and unlink().

The main difference between them is:

If the developer calls unset() on a variable that refers to a file, that reference becomes undefined for the rest of the application, whereas if the developer calls unlink() on a file, the file is removed from the directory and is no longer available to the application at all.

Q4. There are several error types available in PHP. Explain some of the major error types which are very frequently used for PHP applications and give the proper difference of them.

This is a common Core PHP interview question.

Normally in PHP, we handle three kinds of errors:

Notices: A notice simply flags questionable code or execution. It is a simple, mostly non-critical error that normally occurs at script execution time. For example, if an application tries to access an undefined variable, a notice is raised.

Warnings: A warning is still not a critical error; it is reported without stopping the normal execution of the process. For example, if include() is used but the file is missing from the directory, a warning is given, but the script continues to execute.

Fatal: This is the most severe kind of error in PHP script execution. It terminates the process and gives an explanation of the cause. Examples are trying to access a nonexistent object, or calling require() on a file that is missing.

Q5. Explain in detail the difference between GET and POST in a PHP application.

Some of the key differences between GET and POST in PHP are given below:

GET data is always passed through the URL, so it is visible to everyone, whereas POST data is embedded in the request body, sometimes in an encoded format, so it is not visible to or readily understood by an ordinary user.

GET is restricted in how much data it can carry; the limit is commonly given as 2,048 characters. POST has no such restriction.

GET allows only ASCII data, whereas POST has no such restriction.

Developers commonly use GET for fetching data, whereas POST is used for inserting or updating.
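Where the data travels is easy to see by building both kinds of request without sending them. The following sketch uses Python's standard urllib rather than PHP, purely to illustrate the HTTP side of the comparison; the URL https://example.com/search and the parameter names are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "php traits", "page": 2}
encoded = urlencode(params)  # form-encoded string, spaces become '+'

# GET: the parameters are appended to the URL, so they are visible in it.
get_request = Request("https://example.com/search?" + encoded)

# POST: the same parameters travel in the request body; the URL stays clean.
post_request = Request("https://example.com/search", data=encoded.encode("ascii"))

get_method = get_request.get_method()
post_method = post_request.get_method()
```

Neither request is sent here; constructing a `Request` object performs no network activity. On the PHP side, the same values would be read from `$_GET` and `$_POST` respectively.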

Part 2 – Core PHP Interview Questions (Advanced)

Q6. Suppose the developer needs to enable some of the error-reporting utilities in PHP. How can this be done? Please explain in detail.

Displaying error messages is a key requirement, especially when debugging; a message normally shows the line number of the script where a fatal error was generated. The developer can display such errors on the page with the command below:


But to enable error display in a PHP application, the developer needs to follow either of the approaches below:

display_errors = On in php.ini

ini_set('display_errors', 1) in the specific script file

Q7. Explain in detail about Traits in the PHP application.

Traits are a popular mechanism for PHP developers. They let the developer create reusable code in a language like PHP, where multiple inheritance is not supported. A trait cannot be instantiated on its own. PHP developers should know the key, powerful features of the language, such as this one, before starting development in PHP.

Q8. Suppose a constant has been defined in a PHP script, and the developer needs to change its value during execution. Is this possible? Explain.

If a value is declared as a constant in PHP, it can never be changed by any process during execution. A constant's value therefore has to be assigned when it is defined.

Q9. Is it possible to extend one class that is defined as final? Explain?

This is among the most popular Core PHP interview questions. If the developer defines a class as final, extending that class is not possible. If a class is declared final, no child class can be created from it, and if a method is declared final, it cannot be overridden.

Q10. Explain in detail the __destruct() and __construct() methods available in PHP classes.

A PHP class can have two special methods called the constructor and the destructor. Both are built into the language. The constructor is called immediately after a new instance of the class is created and is normally used to initialize the properties of the object. The destructor is mainly used to release the object from application memory. The destructor does not take any parameters.


Cloud Computing Interview Questions And Answers

The scope of Cloud computing is huge. If you are looking for a cloud-related job, consider learning these cloud computing skills. Cloud computing interview questions will also be based on one or more of those skills.

In this article, I have compiled the most asked Cloud Computing interview questions and answers involving Microsoft Azure. Though AWS is the most used cloud service as of now, Microsoft Azure is catching up and is already the backbone of many organizations. Check out the interview questions on Microsoft Azure among the most asked cloud computing interview questions below. Note that the wording of these questions may vary so you can tweak answers to suit the tone of questions.

Cloud Computing interview questions and answers

This section includes cloud computing interview questions that are generic and apply to all platforms like AWS, Microsoft Azure, or Google Apps, etc.

Q1: How do you explain cloud to a layperson? Or What is cloud computing?

A1: Cloud is the extension of local or on-premise computing. When we say we use cloud computing, we are using someone else’s (generally a cloud service provider’s) resources. These resources can be anything from just external storage space to remote infrastructure. The service provider charges users based on the usage of resources.

Q2: What are the basic traits of cloud computing? -OR- When do you call a service, cloud computing?

A2: The cloud computing vendor should provide the following basic features, which are essential for the service to be called a cloud computing service. The service should be scalable. That is, when required, the cloud service provider should be able to increase the resources, and when demand falls, the provider should be able to release the resources for other customers so that the user is not overcharged. Other features are real-time backup, high uptime, and security. Logs are also essential, but they are provided on demand only. These logs record information such as who accessed which service and at what time.

Q3: What is grid computing? Is it the same as cloud computing? What are the differences between grid computing and cloud computing?

A3: For a better understanding of the difference between cloud computing and grid computing, please read this article – Grid vs Cloud.

Q4: How many types of clouds are there in practice? -OR- Explain cloud deployment models in use today.

A4: There are three cloud deployment types. First is the public cloud that hosts several tenants’ data. An example of a public cloud is OneDrive as the same servers host many accounts on each. The second deployment model is a private cloud. In this, the resources are hosted on a dedicated cloud. An example of a private cloud could be website hosting with a particular hosting provider. The third and last deployment model is the hybrid cloud. In this, parts of the resources are hosted on the public cloud, and some of them are used exclusively from a private cloud. An example of a hybrid network can be an online store. Part of the website is hosted on the public cloud, and other important artifacts are hosted locally so that they are not compromised. Read the details on cloud computing deployment.

Q5: What are the three service models of cloud computing?

A5: SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service). Please read this article on cloud service models for more details on each type of service model.

Q6: What do you mean by the term “Eucalyptus” in cloud computing?

A6: Eucalyptus stands for “Elastic Utility Computing Architecture for Linking your Programs to useful Systems”. It is basically for AWS (Amazon Web Services).

Q7: What is OpenStack? OR What is the use of OpenStack?

A7: OpenStack is an open-source cloud computing platform that provides IaaS (Infrastructure as a Service).

Q8: What are the benefits of cloud computing over on-premise computing?

A8: On-premise computing requires a lot of preparation, in terms of both money and time. If an organization chooses to go for the cloud, it saves much of the initial setup cost. In cloud computing, maintenance is taken care of by the service provider. In on-premise computing, we need at least one dedicated IT technician to take care of troubleshooting. The cloud provides upgrades and scalability as and when required. One can increase or reduce the number of resources according to usage. On-premise computing, on the other hand, requires procuring more hardware and software, and these purchases are permanent. In this way, the cloud saves money while also providing features such as backups.

Q9: What is IaaS? What does it do? Give some examples of IaaS

A9: IaaS stands for Infrastructure as a Service. When a cloud offers infrastructure for hire or rental, it is called IaaS. Examples of IaaS are AWS (Amazon Web Services), Microsoft Azure, Google Compute Engine, and Cisco Metapod.

Q10: Explain AWS and its components

A10: AWS stands for Amazon Web Services. It is basically infrastructure as a service. The main components of AWS are as follows:

DNS – It offers a service platform based on the Domain Name System, called Route 53.

Simple Email Service: Besides SMTP (Simple Mail Transfer Protocol), email can also be sent using API calls native to AWS.

Azure cloud computing interview questions

This section covers basic but frequently asked cloud computing interview questions related to Microsoft Azure, which is an Infrastructure as a Service platform.

Question 11: What is Microsoft Azure -OR- What do you know about Microsoft Azure?

Answer 11: Microsoft Azure is a cloud offering from Microsoft. It offers services such as content delivery networks (CDNs), Virtual Machines (VM), and some really good proprietary software that makes it perfect as an IaaS. RemoteApp, for example, helps in using virtual machines to deploy Windows programs. Then there is Active Directory service and SQL server. It also supports open technologies such as Linux distributions that can be contained in virtual machines.

Q12: What is the name of the service in Azure that helps you manage resources?

A12: Azure Resource Manager

Q13: Name some web applications that can be deployed with Azure

A13: Many web applications including open source can be deployed on Azure. Some examples are PHP, WCF, and ASP.NET.

Q14: What are the three types of roles in Microsoft Azure? -OR- What are Roles in Microsoft Azure?

A14:  There are three types of roles in Microsoft Azure. These roles are Web Role, Worker Role, and VM Role. Web Roles help in deploying websites. It is good for running web applications. Worker Role assists Web Role. It runs background processes to support Web Role. The VM Role lets the users customize the servers on which the Web Role and Worker Roles are running.

Q15: What is Azure Active Directory service?

A15: Azure Active Directory Service is a Multi-Tenant Cloud-based directory and identity management service that combines core directory services, application access management, and identity protection.  In other words, it is an identity and access management system. It helps in granting access privileges to users to different resources on the network. It is also used for maintaining information about the network and related resources.

Q16: Are AD and Azure AD same?

A16: No. Active Directory in Windows is an on-premise directory that stores information about the network. Most people mistake Azure AD for an online version of Windows AD, but that is not the case. Azure AD is a cloud configuration helper, while AD is for local networks.

Q17: What do AD and Azure AD do?

A17: Windows AD is a system created for local networks, whereas Azure AD is a separate system created only for the cloud. Both keep information about networks and network resources, and both help in granting or restricting access privileges for different users to different resources on the network. Azure AD is scalable and has been built to support global-scale resource allotments. Azure AD also helps when you move your on-premise computing to the cloud.

Q18: Is Azure IaaS or PaaS?

A18: Azure offers all three types of services – SaaS, PaaS, and IaaS. But it is mostly used as a PaaS. While many developers prefer to deploy their apps on Azure (PaaS model), some are keen on both developing the whole app and hosting it on Azure instead of using local computers (IaaS model). Thus, it serves both as IaaS and PaaS.

Q19: What are Azure Storage Queues?

A19: Azure Queue storage is an Azure service that allows messages to be retrieved and accessed from anywhere on the planet. The service uses simple Hyper Text Transfer Protocol (HTTP or HTTPS).

Q20: What is Poison in Azure Storage Queues?

A20: Messages that have exceeded the maximum number of delivery attempts to the application are called poison messages in Microsoft Azure terminology. There can be many reasons why this happens.
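The idea can be illustrated with a small in-memory simulation. This is a sketch only: it does not use the Azure SDK, and `process_queue`, `handler`, and the threshold of three attempts are hypothetical names and values chosen for the example.

```python
from collections import deque

MAX_DEQUEUE_COUNT = 3  # hypothetical threshold, not an Azure default


def process_queue(messages, handler, max_dequeue_count=MAX_DEQUEUE_COUNT):
    """Process messages, retrying failures until they become poison."""
    queue = deque((message, 0) for message in messages)
    processed, poison = [], []
    while queue:
        message, attempts = queue.popleft()
        attempts += 1
        try:
            handler(message)
            processed.append(message)
        except Exception:
            if attempts >= max_dequeue_count:
                # Too many failed deliveries: set the message aside as poison.
                poison.append(message)
            else:
                # Put the message back for another delivery attempt.
                queue.append((message, attempts))
    return processed, poison


def handler(message):
    # A stand-in consumer that can never handle one particular message.
    if message == "corrupt":
        raise ValueError("cannot parse message")


processed, poison = process_queue(["a", "corrupt", "b"], handler)
```

A real consumer would typically move poison messages to a separate queue for later inspection instead of a Python list.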

The above are some most asked cloud computing interview questions and answers. I wrote the answers with my limited knowledge. Since you may have taken a proper course to learn cloud computing, you can always answer better. I’ve simply given pointers. It is up to the readers to improve upon the pointers using whatever resources they have.

TIP: This Microsoft Azure Interview Questions & Answers PDF released by Microsoft MVPs will interest you.

All the best!

60+ Data Engineer Interview Questions And Answers In 2023

Here are Data Engineering interview questions and answers for freshers as well as experienced data engineer candidates to get their dream job.

1) Explain Data Engineering.

Data engineering is a term used in big data. It focuses on the application of data collection and research. Data generated from various sources is just raw data. Data engineering helps to convert this raw data into useful information.

2) What is Data Modelling?

Data modeling is the method of documenting a complex software design as a diagram so that anyone can easily understand it. It is a conceptual representation of data objects, the associations between different data objects, and the rules.

3) List various types of design schemas in Data Modelling

There are mainly two types of schemas in data modeling: 1) Star schema and 2) Snowflake schema.

4) Distinguish between structured and unstructured data

The following table shows the differences between structured and unstructured data:

Parameter | Structured Data | Unstructured Data
Storage | DBMS | Unmanaged file structures
Standard | ODBC and SQL | SMTP, XML, CSV, and SMS
Integration tool | ETL (Extract, Transform, Load) |
Scaling | Schema scaling is difficult | Scaling is very easy

5) Explain all components of a Hadoop application

Following are the components of Hadoop application:

Hadoop Common: It is a common set of utilities and libraries that are utilized by Hadoop.

HDFS: This Hadoop component is the file system in which Hadoop data is stored. It is a distributed file system that provides high bandwidth.

Hadoop MapReduce: It is based on the MapReduce algorithm and provides large-scale data processing.

Hadoop YARN: It is used for resource management within the Hadoop cluster. It can also be used for task scheduling for users.

6) What is NameNode?

It is the centerpiece of HDFS. It stores the metadata of HDFS and tracks the various files across the cluster. The actual data is not stored here; it is stored in DataNodes.

7) Define Hadoop streaming

It is a utility that allows you to create Map and Reduce jobs and submit them to a specific cluster.

8) What is the full form of HDFS?

HDFS stands for Hadoop Distributed File System.

9) Define Block and Block Scanner in HDFS

Blocks are the smallest unit of a data file. Hadoop automatically splits huge files into small pieces.

Block Scanner verifies the list of blocks that are present on a DataNode.

10) What are the steps that occur when Block Scanner detects a corrupted data block?

Following are the steps that occur when Block Scanner finds a corrupted data block:

1) First, when Block Scanner finds a corrupted data block, the DataNode reports it to the NameNode.

2) The NameNode starts the process of creating a new replica from an intact replica of the corrupted block.

11) Name two messages that NameNode gets from DataNode?

There are two messages which NameNode gets from DataNode. They are 1) Block report and 2) Heartbeat.

12) List out various XML configuration files in Hadoop?

There are five XML configuration files in Hadoop:





13) What are four V’s of big data?

The four V’s of big data are:

Velocity

Variety

Volume

Veracity

14) Explain the features of Hadoop

Important features of Hadoop are:

It is an open-source framework that is freely available.

Hadoop is compatible with many types of hardware, and it is easy to add new hardware to a specific node.

Hadoop supports faster distributed processing of data.

It stores the data in the cluster, which is independent of the rest of the operations.

Hadoop allows creating 3 replicas of each block on different nodes.

15) Explain the main methods of Reducer

setup(): It is used for configuring parameters like the size of input data and distributed cache.

cleanup(): This method is used to clean temporary files.

reduce(): It is the heart of the reducer and is called once per key with the values associated with that key.

16) What is the full form of COSHH?

The full form of COSHH is Classification and Optimization based Scheduler for Heterogeneous Hadoop systems.

17) Explain Star Schema

Star Schema or Star Join Schema is the simplest type of Data Warehouse schema. It is known as a star schema because its structure resembles a star. In the star schema, the center of the star has one fact table, which is associated with multiple dimension tables. This schema is used for querying large data sets.
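A minimal sketch of a star schema, using Python's built-in `sqlite3` module so that it runs without a data warehouse. The table and column names (`fact_sales`, `dim_product`, `dim_store`) and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
-- The fact table sits at the center and points at each dimension table.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'pen'), (2, 'book');
INSERT INTO dim_store   VALUES (10, 'Pune'), (20, 'Oslo');
INSERT INTO fact_sales  VALUES (1, 10, 5.0), (2, 10, 20.0), (1, 20, 7.5);
""")

# A typical star-schema query: join the fact table to one dimension and aggregate.
rows = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)
```

Each query needs only one join per dimension, which is why the design is considered simple to query.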

18) How to deploy a big data solution?

Follow the following steps in order to deploy a big data solution.

3) Deploy big data solution using processing frameworks like Pig, Spark, and MapReduce.

19) Explain FSCK

File System Check, or FSCK, is a command used by HDFS. The FSCK command is used to check for inconsistencies and problems in files.

20) Explain Snowflake Schema

A Snowflake Schema is an extension of a Star Schema that adds additional dimensions. It is called a snowflake schema because its diagram looks like a snowflake. The dimension tables are normalized, which splits the data into additional tables.

21) Distinguish between Star and Snowflake Schema

Star Schema | Snowflake Schema
Dimension hierarchies are stored in the dimension table. | Each hierarchy is stored in separate tables.
Chances of data redundancy are high. | Chances of data redundancy are low.
It has a very simple DB design. | It has a complex DB design.
Provides a faster way for cube processing. | Cube processing is slow due to the complex joins.

22) Explain Hadoop distributed file system

Hadoop works with scalable distributed file systems like S3, HFTP FS, FS, and HDFS. The Hadoop Distributed File System is based on the Google File System. It is designed so that it can easily run on a large cluster of computers.

23) Explain the main responsibilities of a data engineer

Data engineers have many responsibilities. They manage the source systems of data. Data engineers simplify complex data structures and prevent the duplication of data. They often also provide ELT and data transformation.

24) What is the full form of YARN?

The full form of YARN is Yet Another Resource Negotiator.

25) List various modes in Hadoop

Modes in Hadoop are 1) Standalone mode 2) Pseudo distributed mode 3) Fully distributed mode.

26) How to achieve security in Hadoop?

Perform the following steps to achieve security in Hadoop:

3) In the last step, the client uses the service ticket to authenticate itself to a specific server.

27) What is Heartbeat in Hadoop?

In Hadoop, NameNode and DataNode communicate with each other. Heartbeat is the signal sent by DataNode to NameNode on a regular basis to show its presence.

28) Distinguish between NAS and DAS in Hadoop


NAS | DAS
Storage capacity is 10^9 to 10^12 bytes. | Storage capacity is 10^9 bytes.
Management cost per GB is moderate. | Management cost per GB is high.
Transmits data using Ethernet or TCP/IP. | Transmits data using IDE/SCSI.

29) List important fields or languages used by data engineer

Here are a few fields or languages used by data engineer:

Probability as well as linear algebra

Machine learning

Trend analysis and regression

Hive QL and SQL databases

30) What is Big Data?

It is a large amount of structured and unstructured data, that cannot be easily processed by traditional data storage methods. Data engineers are using Hadoop to manage big data.

31) What is FIFO scheduling?

It is a Hadoop job scheduling algorithm. In FIFO scheduling, jobs are selected from a work queue, the oldest job first.

32) Mention default port numbers on which task tracker, NameNode, and job tracker run in Hadoop

Default port numbers on which task tracker, NameNode, and job tracker run in Hadoop are as follows:

Task tracker runs on 50060 port

NameNode runs on 50070 port

Job Tracker runs on 50030 port

33) How to disable Block Scanner on HDFS Data Node

In order to disable Block Scanner on HDFS Data Node, set dfs.datanode.scan.period.hours to 0.

34) How to define the distance between two nodes in Hadoop?

The distance is equal to the sum of the distances from each node to their closest common ancestor. The method getDistance() is used to calculate the distance between two nodes.
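As a rough sketch of the calculation, assume each node is identified by a path of the form `/datacenter/rack/node`. The function below is a hypothetical Python re-implementation written for illustration, not Hadoop's own `getDistance()`; it counts the hops from each node up to the closest common ancestor and adds them.

```python
def get_distance(path_a, path_b):
    """Hops from each node up to the closest common ancestor, summed."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


print(get_distance("/d1/r1/n1", "/d1/r1/n1"))  # same node
print(get_distance("/d1/r1/n1", "/d1/r1/n2"))  # same rack
print(get_distance("/d1/r1/n1", "/d1/r2/n3"))  # same data center, different rack
print(get_distance("/d1/r1/n1", "/d2/r3/n4"))  # different data centers
```

Under these assumptions the four calls give 0, 2, 4, and 6, so nodes that are "closer" in the topology get a smaller distance.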

35) Why use commodity hardware in Hadoop?

Commodity hardware is easy to obtain and affordable. It is a system that is compatible with Windows, MS-DOS, or Linux.

36) Define replication factor in HDFS

The replication factor is the total number of replicas of a file in the system.

37) What data is stored in NameNode?

The NameNode stores the metadata for HDFS, such as block information and namespace information.

38) What do you mean by Rack Awareness?

In a Hadoop cluster, the NameNode chooses DataNodes on the same or a nearby rack to serve a read or write request, which reduces network traffic. The NameNode maintains the rack ID of each DataNode to obtain this rack information. This concept is called Rack Awareness in Hadoop.

39) What are the functions of Secondary NameNode?

Following are the functions of Secondary NameNode:

FsImage: It stores a copy of the EditLog and FsImage file.

NameNode crash: If the NameNode crashes, the Secondary NameNode’s FsImage can be used to recreate the NameNode.

Checkpoint: It is used by the Secondary NameNode to confirm that data is not corrupted in HDFS.

Update: It automatically updates the EditLog and FsImage file, which keeps the FsImage file on the Secondary NameNode up to date.

40) What happens when NameNode is down, and the user submits a new job?

41) What are the basic phases of reducer in Hadoop?

There are three basic phases of a reducer in Hadoop:

1. Shuffle: Here, Reducer copies the output from Mapper.

2. Sort: Hadoop sorts the input to the Reducer by key, so that values with the same key are grouped together.

3. Reduce: In this phase, output values associated with a key are reduced to consolidate the data into the final output.
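The three phases can be imitated in plain Python on a word-count job. This is a single-process sketch and not Hadoop code; `mapper` and `run_job` are hypothetical names.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit (word, 1) for every word in a line."""
    return [(word, 1) for word in line.split()]


def run_job(lines):
    # Map output that the reducer would copy over during the shuffle.
    mapped = [pair for line in lines for pair in mapper(line)]
    # Sort by key so that equal keys end up next to each other.
    shuffled = sorted(mapped, key=itemgetter(0))
    # Reduce: called once per key with all values for that key.
    result = {}
    for key, group in groupby(shuffled, key=itemgetter(0)):
        result[key] = sum(count for _, count in group)
    return result


counts = run_job(["big data", "big cluster"])
print(counts)
```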

42) Why does Hadoop use the Context object?

Hadoop framework uses Context object with the Mapper class in order to interact with the remaining system. Context object gets the system configuration details and job in its constructor.

We use Context object in order to pass the information in setup(), cleanup() and map() methods. This object makes vital information available during the map operations.

43) Define Combiner in Hadoop

It is an optional step between Map and Reduce. The Combiner takes the output from the Map function, creates key-value pairs, and submits them to the Hadoop Reducer. The Combiner’s task is to summarize the output from Map into summary records with an identical key.
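A small sketch of what a combiner achieves, again in plain Python rather than Hadoop code. `map_words` and `combine` are hypothetical names: the combiner sums values per key on the mapper side, so fewer records have to travel to the reducer.

```python
from collections import Counter


def map_words(line):
    """Emit (word, 1) for every word in a line."""
    return [(w, 1) for w in line.split()]


def combine(pairs):
    """Sum the values per key before anything is sent to the reducer."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


map_output = map_words("to be or not to be")
combined = combine(map_output)
print(len(map_output), len(combined))
```

Here six map records shrink to four summary records, one per distinct key.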

44) What is the default replication factor in HDFS, and what does it indicate?

The default replication factor in HDFS is three. It indicates that there will be three replicas of each data block.

45) What do you mean by Data Locality in Hadoop?

In a Big Data system, the size of data is huge, and that is why it does not make sense to move data across the network. Now, Hadoop tries to move computation closer to data. This way, the data remains local to the stored location.

46) Define Balancer in HDFS

In HDFS, the balancer is an administrative tool used by admin staff to rebalance data across DataNodes; it moves blocks from overutilized to underutilized nodes.

47) Explain Safe mode in HDFS

It is a read-only mode of NameNode in a cluster. Initially, NameNode is in Safemode. It prevents writing to file-system in Safemode. At this time, it collects data and statistics from all the DataNodes.

48) What is the importance of Distributed Cache in Apache Hadoop?

Hadoop has a useful utility feature called Distributed Cache, which improves the performance of jobs by caching the files utilized by applications. An application can specify a file for the cache using the JobConf configuration.

The Hadoop framework copies these files to the nodes on which a task has to be executed. This is done before the execution of the task starts. Distributed Cache supports the distribution of read-only files as well as zip and jar files.

49) What is Metastore in Hive?

It stores schema as well as the Hive table location.

Hive table definitions, mappings, and metadata are stored in the Metastore. This can be stored in an RDBMS supported by JPOX.

50) What do you mean by SerDe in Hive?

SerDe is a short name for Serializer/Deserializer. In Hive, a SerDe allows you to read data from a table and write it back to a specific field in any format you want.

51) List components available in Hive data model

There are the following components in the Hive data model:




52) Explain the use of Hive in Hadoop eco-system.

Hive provides an interface to manage data stored in Hadoop eco-system. Hive is used for mapping and working with HBase tables. Hive queries are converted into MapReduce jobs in order to hide the complexity associated with creating and running MapReduce jobs.

53) List the various complex data types/collections supported by Hive

Hive supports the following complex data types:

Map

Struct

Array

Union

54) Explain how .hiverc file in Hive is used?

In Hive, .hiverc is the initialization file. This file is initially loaded when we start Command Line Interface (CLI) for Hive. We can set the initial values of parameters in .hiverc file.

55) Is it possible to create more than one table in Hive for a single data file?

Yes, we can create more than one table schema for a data file. Hive saves the schema in the Hive Metastore. Based on this schema, we can retrieve different results from the same data.

56) Explain different SerDe implementations available in Hive

There are many SerDe implementations available in Hive. You can also write your own custom SerDe implementation. Following are some famous SerDe implementations:





57) List table generating functions available in Hive

Following is a list of table generating functions:





58) What is a Skewed table in Hive?

A skewed table is a table in which some column values appear much more often than others. In Hive, when we specify a table as SKEWED during creation, the skewed values are written into separate files, and the remaining values go to another file.

59) List out objects created by create statement in MySQL.

Objects created by create statement in MySQL are as follows:










60) How to see the database structure in MySQL?

In order to see the database structure in MySQL, you can use the DESCRIBE command. The syntax of this command is DESCRIBE table_name;.

61) How to search for a specific String in MySQL table column?

Use the REGEXP operator to search for a string in a MySQL column. You can define various types of regular expressions and search with them.
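In MySQL the query takes the form `SELECT ... WHERE column REGEXP 'pattern'`. The sketch below runs the same kind of query against SQLite through Python's `sqlite3` module so that it is self-contained. SQLite is only a stand-in here: it has no built-in regular expression function, so the sketch registers one, whereas MySQL provides the operator itself. The `users` table and `email` column are made up.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs SQLite's REGEXP operator; MySQL ships this operator built in."""
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("ann@example.com",), ("bob@test.org",), ("cy@example.org",)],
)

# Find every address whose domain starts with "example."
matches = [
    row[0]
    for row in conn.execute(
        "SELECT email FROM users WHERE email REGEXP ? ORDER BY email",
        (r"@example\.",),
    )
]
print(matches)
```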

62) Explain how data analytics and big data can increase company revenue?

Following are the ways in which data analytics and big data can increase company revenue:

Use data efficiently to ensure business growth.

Increase customer value.

Use analytics to improve staffing-level forecasts.

Cut down the production costs of the organization.

These interview questions will also help in your viva(orals)

30+ Most Important Data Science Interview Questions (Updated 2023)

X | 1 | 20 | 30 | 40
Y | 1 | 400 | 800 | 1300

(A)  27.876

(B) 32.650

(C) 40.541

(D) 28.956

Answer: (D)

Explanation: Hint: Use the ordinary least square method.
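The wording of this question is not available here, so the following is only a plausible reading of the hint. Assuming the question asks for the least-squares slope of a line through the origin, y = b·x, the estimate is b = Σxy / Σx². The sketch also computes the usual slope with an intercept for comparison.

```python
xs = [1, 20, 30, 40]
ys = [1, 400, 800, 1300]

# Assumption: the fitted line is y = b * x with no intercept term.
slope_through_origin = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# For comparison: ordinary least squares with an intercept.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope_with_intercept = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)

print(round(slope_through_origin, 3), round(slope_with_intercept, 3))
```

Under the no-intercept assumption the slope rounds to 28.956, which matches option (D); with an intercept it is about 32.625.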

Q5. The robotic arm will be able to paint every corner of the automotive parts while minimizing the quantity of paint wasted in the process. Which learning technique is used in this problem?

(A) Supervised Learning.

(B) Unsupervised Learning.

(C) Reinforcement Learning.

(D) Both (A) and (B).

Answer: (C)

Explanation: Here robot is learning from the environment by taking the rewards for positive actions and penalties for negative actions.

Q6. Which one of the following statements is TRUE for a Decision Tree?

(A) Decision tree is only suitable for the classification problem statement.

(B) In a decision tree, the entropy of a node decreases as we go down the decision tree.

(C) In a decision tree, entropy determines purity.

(D) Decision tree can only be used for only numeric valued and continuous attributes.

Answer: (B)

Explanation: Entropy helps to determine the impurity of a node, and as we go down the decision tree, entropy decreases.

Q7. How do you choose the right node while constructing a decision tree?

(A) An attribute having high entropy

(B) An attribute having high entropy and information gain

(C) An attribute having the lowest information gain.

(D) An attribute having the highest information gain.

Answer: (D)

Explanation: We first select the attribute that has the maximum information gain.
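A short sketch of how entropy and information gain drive the choice of split. The four-row dataset and the attribute names are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Parent entropy minus the weighted entropy of the child nodes."""
    parent = entropy([r[target] for r in rows])
    children = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        children += len(subset) / len(rows) * entropy(subset)
    return parent - children


rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
gains = {a: information_gain(rows, a, "play") for a in ("outlook", "windy")}
best = max(gains, key=gains.get)
print(gains, best)
```

In this toy data `outlook` separates the classes perfectly (gain of 1 bit) while `windy` tells us nothing (gain of 0), so `outlook` would be chosen as the split.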

Q8. What kind of distance metric(s) are suitable for categorical variables to find the closest neighbors?

(A) Euclidean distance.

(B) Manhattan distance.

(C) Minkowski distance.

(D) Hamming distance.

Answer: (D)

Explanation: Hamming distance counts the positions at which two equal-length strings differ, which makes it suitable for categorical variables.
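A minimal sketch of a nearest-neighbor lookup on categorical records using Hamming distance. The records are made up, and `hamming_distance` is a hand-written helper rather than a library call.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


query = ("red", "suv", "petrol")
candidates = [("red", "suv", "diesel"), ("blue", "sedan", "diesel")]

# The closest neighbor is the candidate with the fewest mismatching attributes.
nearest = min(candidates, key=lambda c: hamming_distance(query, c))
print(nearest)
```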

Q9. In the Naive Bayes algorithm, suppose that the prior for class w1 is greater than class w2, would the decision boundary shift towards the region R1(region for deciding w1) or towards region R2 (region for deciding w2)?

(A) towards region R1.

(B) towards region R2.

(C) No shift in decision boundary.

(D) It depends on the exact value of priors.

Answer: (B)

Explanation: Because the prior for w1 is greater than that for w2, the region in which w1 is chosen grows, which shifts the decision boundary towards region R2.

Q10. Which of the following statements is FALSE about Ridge and Lasso Regression?

(A) These are types of regularization methods to solve the overfitting problem.

(B) Lasso Regression is a type of regularization method.

(C) Ridge regression shrinks the coefficient to a lower value.

(D) Ridge regression lowers some coefficients to a zero value.

Answer: (D)

Explanation: Ridge regression never drops any feature; instead, it shrinks the coefficients. However, Lasso regression drops some features by making the coefficient of that feature zero. Therefore, the latter is used as a Feature Selection Technique.
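The difference can be seen in the special case of orthonormal features, where both penalties have a closed form. This is a sketch under that assumption only: with the ridge objective ‖y − Xb‖² + λ‖b‖² each least-squares coefficient is divided by (1 + λ), and with the lasso objective ½‖y − Xb‖² + λ‖b‖₁ each coefficient is soft-thresholded by λ. The coefficient values and λ = 0.5 are made up.

```python
def ridge_shrink(b, lam):
    """Ridge under orthonormal features: scale the coefficient towards zero."""
    return b / (1.0 + lam)


def lasso_shrink(b, lam):
    """Lasso under orthonormal features: soft-threshold the coefficient."""
    magnitude = max(abs(b) - lam, 0.0)
    return magnitude if b >= 0 else -magnitude


ols = [3.0, -0.4, 0.2]  # hypothetical least-squares coefficients
lam = 0.5
ridge = [ridge_shrink(b, lam) for b in ols]
lasso = [lasso_shrink(b, lam) for b in ols]
print(ridge, lasso)
```

Ridge makes every coefficient smaller but none of them zero, while lasso sets the two small coefficients to exactly zero, which is the feature-selection behavior described above.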

Q11. Which of the following is FALSE about Correlation and Covariance?

(A) A zero correlation does not necessarily imply independence between variables.

(B) Correlation and covariance values are the same.

(C) The covariance and correlation are always the same sign.

(D) Correlation is the standardized version of Covariance.

Answer: (B)

Explanation: Correlation is defined as covariance divided by the product of the standard deviations and is therefore the standardized version of covariance, so the two values are generally not the same.
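A quick numeric check, with made-up height and weight values: changing the unit of one variable changes the covariance but leaves the correlation untouched.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


height_m = [1.5, 1.6, 1.7, 1.8]
weight_kg = [50.0, 58.0, 66.0, 80.0]
height_cm = [h * 100 for h in height_m]  # same data, different unit

cov_m, cov_cm = covariance(height_m, weight_kg), covariance(height_cm, weight_kg)
cor_m, cor_cm = correlation(height_m, weight_kg), correlation(height_cm, weight_kg)
print(cov_m, cov_cm, cor_m, cor_cm)
```

The covariance grows by a factor of 100 when heights are expressed in centimeters, the correlation stays the same, and both keep the same sign.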

Q12. In Regression modeling, we develop a mathematical equation that describes how, (Predictor-Independent variable, Response-Dependent variable)

(A) one predictor and one or more response variables are related.

(B) several predictors and several response variables response are related.

(C) one response and one or more predictors are related.

(D) All of these are correct.

Answer: (C)

Explanation: In the regression problem statement, we have several independent variables but only one dependent variable.

Q13. True or False: In a naive Bayes algorithm, the entire posterior probability will be zero when an attribute value in the testing record has no example in the training set.

(A) True

(B) False

(C) Can’t be determined

(D) None of these

Answer: (A)

Q14. Which of the following is NOT true about Ensemble Learning Techniques?

(A) Bagging decreases the variance of the classifier.

(B) Boosting helps to decrease the bias of the classifier.

(C) Bagging combines the predictions from different models and then finally gives the results.

(D) Bagging and Boosting are the only available ensemble techniques.

Answer: (D)

Explanation: Apart from bagging and boosting, there are other various types of ensemble techniques such as Stacking, Extra trees classifier, Voting classifier, etc.

Q15. Which of the following statement is TRUE about the Bayes classifier?

(A) Bayes classifier works on the Bayes theorem of probability.

(B) Bayes classifier is an unsupervised learning algorithm.

(C) Bayes classifier is also known as maximum apriori classifier.

(D) It assumes the independence between the independent variables or features.

Answer: (A)

Explanation: Bayes classifier internally uses the concept of the Bayes theorem for doing the predictions for unseen data points.

Q16. How will you define precision in a confusion matrix?

(A) It is the ratio of true positive to false negative predictions.

(B) It is the measure of how accurately a model can identify positive classes out of all the positive classes present in the dataset.

(C) It is the measure of how accurately a model can identify true positives from all the positive predictions that it has made

(D) It is the measure of how accurately a model can identify true negatives from all the positive predictions that it has made

Answer: (C)

Explanation: Precision is the ratio of true positive and (true positive + false positive), which means that it measures, out of all the positive predicted values by a model, how precisely a model predicted the truly positive values.
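The formula can be checked on a small set of made-up labels. `precision_recall` is a hand-written helper, and it assumes at least one positive prediction and at least one actual positive so that neither denominator is zero.

```python
def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here the model makes three positive predictions, two of which are correct, so precision is 2/3; it finds two of the four actual positives, so recall is 0.5.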

Q17. What is True about bias and variance?

(A) High bias means that the model is underfitting.

(B) High variance means that the model is overfitting

(C) Bias and variance are inversely proportional to each other.

(D) All of the above

Answer: (D)

Explanation: A model with high bias is unable to capture the underlying patterns in the data and consistently underestimates or overestimates the true values, which means that the model is underfitting. A model with high variance is overly sensitive to the noise in the data and may produce vastly different results for different samples of the same data. Therefore it is important to maintain the balance of both variance and bias. As they are inversely proportional to each other, this relationship between bias and variance is often referred to as the bias-variance trade-off.

Q18. Which of these machine learning models is used for classification as well as regression tasks?

(A) Random forest

(B) SVM(support vector machine)

(C) Logistic regression

(D) Both A and B

Answer: (D)

Explanation: Support Vector Machines (SVMs) and Random Forests are two popular machine-learning algorithms that can be used for both classification and regression tasks.

A. It is computationally expensive

B. It can get stuck in local minima

C. It requires a large amount of labeled data

D. It can only handle numerical data

Answer: (B)

Explanation: It can get stuck in local minima

Data Science Interview Questions on Deep Learning

Q19. Which of the following SGD variants is based on both momentum and adaptive learning?

(A) RMSprop.

(B) Adagrad.

(C) Adam.

(D) Nesterov.

Answer: (C)

Explanation: Adam, being a popular deep learning optimizer, is based on both momentum and adaptive learning.

Q20. Which of the following activation function output is zero-centered?

(A) Hyperbolic Tangent.

(B) Sigmoid.

(C) Softmax.

(D) Rectified Linear unit(ReLU).

Answer: (A)

Explanation: The hyperbolic tangent activation function gives output in the range (-1, 1), which is symmetric about zero.
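A small check of what zero-centered means, using inputs that are symmetric around zero (the input values are arbitrary):

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


inputs = [-3.0, -1.0, 0.0, 1.0, 3.0]
mean_tanh = sum(math.tanh(x) for x in inputs) / len(inputs)
mean_sigmoid = sum(sigmoid(x) for x in inputs) / len(inputs)
print(mean_tanh, mean_sigmoid)
```

The tanh outputs average to zero, while the sigmoid outputs average to 0.5 because every sigmoid output is positive.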

Q21. Which of the following is FALSE about Radial Basis Function Neural Network?

(A) It resembles Recurrent Neural Networks(RNNs) which have feedback loops.

(B) It uses the radial basis function as an activation function.

(C) While outputting, it considers the distance of a point with respect to the center.

(D) The output given by the Radial basis function is always an absolute value.

Answer: (A)

Explanation: Radial basis functions do not resemble RNN but are used as an artificial neural network, which takes a distance of all the points from the center rather than the weighted sum.

Q22. In which of the following situations should you NOT prefer Keras over TensorFlow?

(A) When you want to quickly build a prototype using neural networks.

(B) When you want to implement simple neural networks in your initial learning phase.

(C) When doing critical and intensive research in any field.

(D) When you want to create simple tutorials for your students and friends.

Answer: (C)

Explanation: Keras is not preferred since it is built on top of Tensorflow, which provides both high-level and low-level APIs.

Q23. Which of the following is FALSE about Deep Learning and Machine Learning?

(A) Deep Learning algorithms work efficiently on a high amount of data and require high computational power.

(B) Feature Extraction needs to be done manually in both ML and DL algorithms.

(C) Deep Learning algorithms are best suited for an unstructured set of data.

(D) Deep Learning is a subset of machine learning

Answer: (B)

Explanation: Usually, in deep learning algorithms, feature extraction happens automatically in hidden layers.

Q24. What can you do to reduce underfitting in a deep-learning model?

(A) Increase the number of iterations

(B) Use dimensionality reduction techniques

(C) Use cross-validation technique to reduce underfitting

(D) Use data augmentation techniques to increase the amount of data used.

Answer: (D)

Explanation: Options A and B can be used to reduce overfitting in a model. Option C is just used to check if there is underfitting or overfitting in a model but cannot be used to treat the issue. Data augmentation techniques can help reduce underfitting as it produces more data, and the noise in the data can help in generalizing the model.

Q25. Which of the following is FALSE for neural networks?

(A) Artificial neurons are similar in operation to biological neurons.

(B) Training time for a neural network depends on network size.

(C) Neural networks can be simulated on conventional computers.

(D) The basic units of neural networks are neurons.

Answer: (A)

Explanation: Artificial neuron is not similar in working as compared to biological neuron since artificial neuron first takes a weighted sum of all inputs along with bias followed by applying an activation function to give the final result, whereas the working of biological neuron involves axon, synapses, etc.

Q26. Which of the following logic function cannot be implemented by a perceptron having 2 inputs?


(B) OR

(D) XOR

Answer: (D)

Explanation: Perceptron always gives a linear decision boundary. However, for the Implementation of the XOR function, we need a non-linear decision boundary.
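This can be tried out with a tiny perceptron trained by the classic perceptron learning rule. It is a sketch with assumed settings (zero initial weights, step activation, 50 epochs); `train_perceptron` and `accuracy` are hypothetical helper names.

```python
def train_perceptron(samples, epochs=50):
    """Classic perceptron rule with a step activation and integer weights."""
    w1 = w2 = bias = 0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            predicted = 1 if w1 * x1 + w2 * x2 + bias > 0 else 0
            error = target - predicted
            w1 += error * x1
            w2 += error * x2
            bias += error
    return w1, w2, bias


def accuracy(samples, weights):
    """Fraction of samples that the trained weights classify correctly."""
    w1, w2, bias = weights
    hits = sum(
        (1 if w1 * x1 + w2 * x2 + bias > 0 else 0) == target
        for (x1, x2), target in samples
    )
    return hits / len(samples)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
gates = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}
scores = {}
for name, targets in gates.items():
    samples = list(zip(inputs, targets))
    scores[name] = accuracy(samples, train_perceptron(samples))
print(scores)
```

AND and OR are learned perfectly, whereas XOR never reaches full accuracy no matter how long the perceptron trains, because no single straight line separates its two classes.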

Q27. Inappropriate selection of learning rate value in gradient descent gives rise to:

(A) Local Minima.

(B) Oscillations.

(C) Slow convergence.

(D) All of the above.

Answer: (D)

Explanation: The learning rate decides how fast or slow our optimizer is able to achieve the global minimum. So by choosing an inappropriate value of learning rate, we may not reach the global minimum; instead, we get stuck at a local minimum and oscillate around the minimum, because of which the convergence time increases.
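The effect of the learning rate can be seen on the simple function f(x) = x², chosen only for illustration. It has a single minimum at 0, so this sketch shows slow convergence, oscillation, and divergence, but not local minima. The learning rates, starting point, and step count are arbitrary.

```python
def gradient_descent(lr, start=10.0, steps=20):
    """Minimise f(x) = x**2, whose gradient is 2 * x."""
    x = start
    path = [x]
    for _ in range(steps):
        x -= lr * 2 * x
        path.append(x)
    return path


slow = gradient_descent(lr=0.01)         # tiny steps: still far from 0
oscillating = gradient_descent(lr=0.9)   # overshoots, sign flips every step
diverging = gradient_descent(lr=1.1)     # overshoots more each step
print(slow[-1], oscillating[-1], diverging[-1])
```

With a rate of 0.01 the iterate has barely moved after 20 steps, with 0.9 it jumps back and forth across the minimum while slowly closing in, and with 1.1 each jump is larger than the last.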

Data Science Interview Questions on Coding

Q28. What will be the output of the following Python code?
