1. Which of the following is a part of Data Science?
a. Data Collection
b. Data Analysis
c. Data Visualization
d. Data Cleaning
Answer : b. Data Analysis
2. Which step does a data scientist carry out after collecting the data?
a. Data Storage
b. Data Cleaning
c. Data Visualization
d. Data Preprocessing
Answer : d. Data Preprocessing
3. Which of the following is NOT a data science application?
a. Predicting Stock Prices
b. Image Recognition
c. Generating Random Numbers
d. Fraud Detection
Answer : c. Generating Random Numbers
4. Which model is frequently used as the benchmark for data analysis?
a. Support Vector Machine
b. Decision Tree
c. Linear Regression
d. Random Forest
Answer : c. Linear Regression
5. Which language is most commonly used in data science?
a. Java
b. C++
c. R
d. Python
Answer : d. Python
6. Which of the following actions may a data scientist carry out after collecting the data?
a. Data Cleaning
b. Data Integration
c. Data Replication
d. All of the above
Answer : d. All of the above
7. Which one of the following focuses on the identification of properties in the data?
a. Data mining
b. Big Data
c. Data wrangling
d. Machine Learning
Answer : a. Data mining
8. Data can be categorized into _______ groups.
a. 1
b. 2
c. 3
d. 4
Answer : b. 2
9. A structured data representation is known as __________.
a. Database table
b. Functions
c. Data preparation
d. Data frame
Answer : d. Data frame
10. To tell Python that we want to call the mean function from the NumPy package (imported as np), we write __ in front of mean.
a. npm.
b. np.
c. ng.
d. ngm.
Answer : b. np.
11. Which of the following machine learning algorithms depends on the concept of bagging?
a. K-means
b. Naive Bayes
c. Random Forest
d. Support Vector Machine
Answer : c. Random Forest
12. Which of the following are essential components of data science?
a. Data Collection, Data Cleaning, Data Analysis
b. Data Visualization, Data Modeling, Data Deployment
c. Data Storage, Data Retrieval, Data Deletion
d. Data Mining, Data Entry, Data Replication
Answer : a. Data Collection, Data Cleaning, Data Analysis
13. Which of the following is NOT a step in the data science process?
a. Data Collection
b. Data Analysis
c. Quantum Computing
d. Data Visualization
Answer : c. Quantum Computing
14. How many groups can data be categorized into?
a. One
b. Two
c. Three
d. Four
Answer : b. Two
15. Unstructured data is not organized.
a. True
b. False
Answer : a. True
16. Column representation of data is known as __________.
a. Horizontal
b. Diagonal
c. Vertical
d. Top
Answer : c. Vertical
17. Raw data can be processed only once.
a. True
b. False
Answer : b. False
18. What is the common goal of statistical modeling?
a. Inference
b. Summarizing
c. Subsetting
d. None of the above
Answer : a. Inference
19. Causal analysis is commonly applied to census data.
a. True
b. False
Answer : b. False
20. Which of the following types of analysis is usually regarded as the gold standard for data analysis?
a. Inferential
b. Descriptive
c. Causal
d. All of the above
Answer : c. Causal
21. Which of the following is a revision control system?
a. Git
b. Numpy
c. Scipy
d. Slidify
Answer : a. Git
22. Which of the following is a disadvantage of decision trees?
a. They can easily overfit the data.
b. They are not suitable for classification.
c. They are computationally expensive.
d. They have high bias and low variance.
Answer : a. They can easily overfit the data.
23. Which of the following is not a part of supervised learning?
a. Linear Regression
b. K-means Clustering
c. Decision Tree Classification
d. Support Vector Machine
Answer : b. K-means Clustering
24. Which clustering technique works by minimizing the variance within each cluster?
a. Hierarchical Clustering
b. K-means Clustering
c. DBSCAN
d. Agglomerative Clustering
Answer : b. K-means Clustering
25. Which of the following options focuses on the discovery of unknown properties in the data?
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. Deep Learning
Answer : b. Unsupervised Learning
26. Inference engines work on the ____________ principle.
a. Inductive Reasoning
b. Deductive Reasoning
c. Abductive Reasoning
d. Bayesian Reasoning
Answer : b. Deductive Reasoning
27. What are the components of an expert system?
a. Knowledge Base, Inference Engine, User Interface
b. Data Storage, Data Processing, Data Visualization
c. Sensors, Actuators, Logic Gates
d. Data Mining, Machine Learning, Data Cleaning
Answer : a. Knowledge Base, Inference Engine, User Interface
28. How many kinds of observable environments (fully observable and partially observable) are distinguished in AI?
a. One
b. Two
c. Three
d. Four
Answer : b. Two
29. What is another term for data dredging?
a. Data Snooping
b. Data Mining
c. Data Analysis
d. Data Cleansing
Answer : a. Data Snooping
30. Which of the following algorithms uses the least memory out of the options provided?
a. Random Forest
b. Decision Tree
c. k-Nearest Neighbors (k-NN)
d. Naive Bayes
Answer : d. Naive Bayes
31. What are the different machine learning methods?
a. Supervised Learning, Unsupervised Learning, Reinforcement Learning
b. Data Cleaning, Data Analysis, Data Visualization
c. Neural Networks, Decision Trees, Regression
d. Linear Algebra, Calculus, Statistics
Answer : a. Supervised Learning, Unsupervised Learning, Reinforcement Learning
32. Which of the following lists the different types of machine learning?
a. Regression, Classification, Clustering
b. Data Cleaning, Data Analysis, Data Visualization
c. Neural Networks, Decision Trees, Random Forest
d. Supervised Learning, Unsupervised Learning, Reinforcement Learning
Answer : d. Supervised Learning, Unsupervised Learning, Reinforcement Learning
33. Which generation of computers is associated with artificial intelligence?
a. First Generation
b. Second Generation
c. Third Generation
d. Fifth Generation
Answer : d. Fifth Generation
34. Which of the following is an essential data science skill?
a. Data Collection
b. Data Analysis
c. Data Visualization
d. Data Cleaning
Answer : b. Data Analysis
35. Which action does a data scientist carry out after the data has been collected?
a. Data Storage
b. Data Cleaning
c. Data Visualization
d. Data Preprocessing
Answer : d. Data Preprocessing
36. Which of the following is NOT a data science application?
a. Predicting Stock Prices
b. Image Recognition
c. Generating Random Numbers
d. Fraud Detection
Answer : c. Generating Random Numbers
37. What is the main objective of data preprocessing in data science?
a. To make the data fit on a single computer
b. To remove outliers from the data
c. To transform raw data into a usable format
d. To create visualizations of the data
Answer : c. To transform raw data into a usable format
38. Which of the following Python libraries is most frequently used for data analysis and manipulation?
a. TensorFlow
b. Keras
c. Pandas
d. Matplotlib
Answer : c. Pandas
39. What does the acronym PEAS stand for?
a. Programming, Engineering, Algorithms, Systems
b. Performance measure, Environment, Actuators, Sensors
c. Processing, Evaluation, Analysis, Synthesis
d. Programming, Evaluation, Algorithms, Synthesis
Answer : b. Performance measure, Environment, Actuators, Sensors
40. Which of the following models is usually considered a gold standard for data analysis?
a. Logistic Regression
b. Decision Tree
c. Linear Regression
d. Naive Bayes
Answer : c. Linear Regression
41. Data fishing is also known as ____________.
a. Data Snooping
b. Data Mining
c. Data Analysis
d. Data Cleansing
Answer : a. Data Snooping
42. CLI stands for ____________.
a. Command Line Instruction
b. Command Line Integration
c. Command Line Interface
d. Command Line Interpretation
Answer : c. Command Line Interface
43. Time differences represented in various units are referred to as time deltas.
a. True
b. False
Answer : a. True
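As an illustration of question 43, the sketch below uses Python's standard datetime module, where subtracting two datetimes yields a timedelta that can be read in different units. The dates and variable names are made up for the example.

```python
from datetime import datetime, timedelta

# Two arbitrary example timestamps (hypothetical values).
start = datetime(2024, 1, 1, 9, 0)
end = datetime(2024, 1, 3, 15, 30)

# Subtracting two datetimes produces a timedelta.
delta = end - start

# The same difference expressed in different units.
whole_days = delta.days                     # whole days only
total_hours = delta.total_seconds() / 3600  # full difference in hours

# A timedelta can also be added to a datetime to shift it.
one_week_later = start + timedelta(weeks=1)

print(delta, whole_days, total_hours, one_week_later)
```

pandas offers a similar Timedelta type, but the standard library is enough to show the idea.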
44. Which of the following DOES NOT constitute an appropriate data science application in the healthcare industry?
a. Predicting Disease Outcomes
b. Drug Discovery
c. Image-Based Diagnosis
d. Stock Market Prediction
Answer : d. Stock Market Prediction
45. Identify which CLI command is incorrect.
a. cd myfolder
b. ls -l
c. RUN app.py
d. mkdir newfolder
Answer : c. RUN app.py
46. The total number of principles of analytical graphics is ______________.
a. Five
b. Seven
c. The number may vary
d. Ten
Answer : c. The number may vary
47. Knowledge in AI is represented as ____________.
a. Rules
b. Equations
c. Images
d. Colors
Answer : a. Rules
48. Which of the optimizers below combines momentum with adaptive learning rates?
a. Stochastic Gradient Descent (SGD)
b. AdaGrad
c. Adam (Adaptive Moment Estimation)
d. RMSprop
Answer : c. Adam (Adaptive Moment Estimation)
49. Which activation function has a zero-centered output?
a. Sigmoid
b. ReLU (Rectified Linear Unit)
c. Tanh (Hyperbolic Tangent)
d. Leaky ReLU
Answer : c. Tanh (Hyperbolic Tangent)
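A quick numerical check of question 49, assuming a small symmetric set of inputs chosen only for illustration: averaged over inputs symmetric around zero, tanh outputs centre on 0, sigmoid outputs centre on 0.5, and ReLU outputs are never negative.

```python
import math

def sigmoid(x):
    # Logistic sigmoid: outputs lie in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # Rectified linear unit: outputs lie in [0, inf).
    return max(0.0, x)

# Hypothetical inputs, symmetric around zero.
inputs = [-3.0, -1.0, 0.0, 1.0, 3.0]

mean_tanh = sum(math.tanh(x) for x in inputs) / len(inputs)
mean_sigmoid = sum(sigmoid(x) for x in inputs) / len(inputs)
mean_relu = sum(relu(x) for x in inputs) / len(inputs)

print(mean_tanh, mean_sigmoid, mean_relu)
```

The tanh mean is zero up to floating-point rounding because tanh is an odd function.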
50. Which of the following logic operations cannot be carried out by a two-input perceptron?
a. AND
b. OR
c. NOT
d. XOR
Answer : d. XOR
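The answer to question 50 can be explored with a brute-force sketch: search a small grid of weights and biases for a single threshold unit that reproduces each truth table. The grid, the helper names, and the step activation are assumptions made for this example. A failed grid search is not a proof, but it agrees with the known result that XOR is not linearly separable.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
GATES = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}

def perceptron(w1, w2, b, x1, x2):
    # Single unit with a step activation.
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def find_weights(targets, grid):
    # Return the first (w1, w2, b) on the grid that matches the truth table.
    for w1, w2, b in product(grid, repeat=3):
        if all(perceptron(w1, w2, b, x1, x2) == t
               for (x1, x2), t in zip(INPUTS, targets)):
            return (w1, w2, b)
    return None

# Candidate values from -4.0 to 4.0 in steps of 0.5 (arbitrary choice).
GRID = [i / 2 for i in range(-8, 9)]

results = {name: find_weights(targets, GRID) for name, targets in GATES.items()}
for name, weights in results.items():
    print(name, weights)
```

AND and OR each have a solution on the grid, while XOR has none.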
51. Which of the following datasets is used to fit (train) a machine learning model?
a. Validation data
b. Test data
c. Training data
d. Unlabeled data
Answer : c. Training data
52. Which of the following represents a machine learning classification problem?
a. Predicting stock prices
b. Image recognition
c. Sentiment analysis
d. Regression analysis
Answer : c. Sentiment analysis
53. What does “overfitting” mean in machine learning?
a. The model performs well on the training data but poorly on new or unseen data.
b. The model has few parameters.
c. The model cannot fit the training data.
d. The model performs equally well on training and test data.
Answer : a. The model performs well on the training data but poorly on new or unseen data.
54. Which of the following machine learning algorithms applies Bayes' theorem to make predictions?
a. k-Nearest Neighbors (k-NN)
b. Decision Trees
c. Naive Bayes
d. Support Vector Machines (SVM)
Answer : c. Naive Bayes
55. What is the main objective of machine learning dimensionality reduction techniques?
a. To increase the number of features in the data
b. To reduce the number of features in the data while preserving important information
c. To make the data more complex
d. To create new features from existing ones
Answer : b. To reduce the number of features in the data while preserving important information
56. What does “SQL” stand for when referring to databases and data science?
a. Structured Query Language
b. Sequential Query Logic
c. Simple Query Layer
d. Standardized Query Line
Answer : a. Structured Query Language
57. Which type of data has a fixed structure with rows and columns?
a. Unstructured data
b. Semi-structured data
c. Structured data
d. NoSQL data
Answer : c. Structured data
58. Which of the following Python libraries is primarily a plotting library rather than a numerical or machine learning library?
a. NumPy
b. Scikit-learn
c. TensorFlow
d. Matplotlib
Answer : d. Matplotlib
59. What is the main objective of data sampling in data analysis?
a. To increase the size of the dataset
b. To reduce the dimensionality of the dataset
c. To select a representative subset of the data
d. To remove missing values from the dataset
Answer : c. To select a representative subset of the data
60. In data science, what is the main objective of data preprocessing?
a. To collect more data
b. To visualize data
c. To prepare and clean data for analysis
d. To build machine learning models
Answer : c. To prepare and clean data for analysis
61. Which programming language is used in data science for data analysis and data manipulation?
a. Java
b. Python
c. C++
d. Ruby
Answer : b. Python
62. What is the main purpose of exploratory data analysis (EDA) in data science?
a. To build predictive models
b. To visualize data
c. To clean data
d. To deploy machine learning algorithms
Answer : b. To visualize data
63. Which of the following is not a data type in data science?
a. Integer
b. Float
c. String
d. Loop
Answer : d. Loop
64. Which data science step converts categorical data into numerical values?
a. Data visualization
b. Data preprocessing
c. Data transformation
d. Data exploration
Answer : c. Data transformation
65. Which statistical metric in data science best captures the central tendency of a dataset?
a. Standard deviation
b. Range
c. Mean
d. Variance
Answer : c. Mean
66. What is the function of feature engineering in data science?
a. To design new machine learning algorithms
b. To create visualizations
c. To transform raw data into informative features for modeling
d. To build data pipelines
Answer : c. To transform raw data into informative features for modeling
67. What is the most popular data visualization tool in data science for producing interactive and dynamic visualizations?
a. Matplotlib
b. Seaborn
c. Tableau
d. Pandas
Answer : c. Tableau
68. What is machine learning’s main objective in data science?
a. To explore data
b. To build predictive models and make predictions
c. To clean and preprocess data
d. To visualize data
Answer : b. To build predictive models and make predictions
69. Which of the following supervised learning algorithms is utilized in data science for classification tasks?
a. k-Means
b. Principal Component Analysis (PCA)
c. Random Forest
d. Hierarchical Clustering
Answer : c. Random Forest
70. What is the main goal of clustering algorithms in data science?
a. To classify data into predefined categories
b. To reduce the dimensionality of data
c. To group similar data points based on their characteristics
d. To perform regression analysis
Answer : c. To group similar data points based on their characteristics
71. Which Python data structure is frequently used in data science to store and manipulate tabular data?
a. List
b. Dictionary
c. DataFrame (from pandas)
d. Array
Answer : c. DataFrame (from pandas)
72. What is the main objective of hypothesis testing in data science?
a. To make predictions
b. To explore data
c. To test if a hypothesis about a population is supported by sample data
d. To perform clustering
Answer : c. To test if a hypothesis about a population is supported by sample data
73. Which data science method involves developing a model on one set of data and evaluating its performance on another, separate set of data?
a. Cross check validation
b. Feature validation
c. Hypothesis validation
d. Holdout validation
Answer : d. Holdout validation
74. Which of the following is a typical algorithm used for data science regression tasks?
a. k-Means
b. Decision Tree
c. Naive Bayes
d. Logistic Regression
Answer : b. Decision Tree
75. Which data science method uses existing data patterns to fill missing values in a dataset?
a. Feature selection
b. Data visualization
c. Data cleaning
d. Missing data imputation
Answer : d. Missing data imputation
76. Which of the following algorithms is commonly used for text categorization and sentiment analysis in natural language processing?
a. k-Means
b. Linear Regression
c. Support Vector Machine (SVM)
d. Naive Bayes
Answer : d. Naive Bayes
77. What is the main objective of dimensionality reduction methods such as Principal Component Analysis (PCA) in data science?
a. To increase the number of features
b. To add noise to the data
c. To reduce the dimensionality of data while preserving important information
d. To overfit the data
Answer : c. To reduce the dimensionality of data while preserving important information
78. Which data science procedure involves converting data into a format appropriate for modeling or analysis?
a. Feature engineering
b. Data preprocessing
c. Data visualization
d. Hypothesis testing
Answer : b. Data preprocessing
79. What is the main objective of time series analysis in data science?
a. To classify data
b. To predict future values based on past observations
c. To perform clustering
d. To visualize data
Answer : b. To predict future values based on past observations
80. Which data science method divides a dataset into training and testing sets to assess the performance of a model?
a. Feature engineering
b. Cross-validation
c. Hypothesis testing
d. Train-test split
Answer : d. Train-test split
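A minimal sketch of a train-test split using only the standard library. The function name holdout_split, the 80/20 ratio, and the seed are hypothetical choices for illustration. Libraries such as scikit-learn provide their own, more complete utilities.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=0):
    # Shuffle a copy so the caller's data is left untouched.
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    # The first n_test shuffled rows form the test set, the rest the training set.
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))  # stand-in for ten labelled examples
train_rows, test_rows = holdout_split(data, test_fraction=0.2, seed=42)

print(len(train_rows), len(test_rows))
```

The model would then be fitted on train_rows only and evaluated on test_rows, which it has never seen.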
81. What is the objective of cross-validation in data science?
a. To preprocess data
b. To perform clustering
c. To evaluate the performance of a machine learning model on multiple subsets of the data
d. To visualize data
Answer : c. To evaluate the performance of a machine learning model on multiple subsets of the data
82. What is a data scientist’s main objective when performing A/B testing?
a. To visualize data
b. To explore data
c. To test the impact of a change or treatment on a specific metric
d. To perform clustering
Answer : c. To test the impact of a change or treatment on a specific metric
83. Which data science method evaluates the importance of features in a machine learning model?
a. Hypothesis testing
b. Feature selection
c. Data cleaning
d. Cross-validation
Answer : b. Feature selection
84. What is the main objective of anomaly detection in data science?
a. To identify unusual or suspicious patterns in data
b. To clean and preprocess data
c. To perform regression analysis
d. To visualize data
Answer : a. To identify unusual or suspicious patterns in data
85. What data science method reduces the influence of outliers in a dataset?
a. Data visualization
b. Data cleaning
c. Data transformation
d. Robust scaling
Answer : d. Robust scaling
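One common form of robust scaling subtracts the median and divides by the interquartile range (IQR). The sketch below assumes that definition and uses the default quartile method of Python's statistics.quantiles. Other tools may compute quartiles slightly differently, and the sample values are invented.

```python
import statistics

def robust_scale(values):
    # Centre on the median and scale by the interquartile range.
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; robust scaling is undefined")
    return [(v - median) / iqr for v in values]

# Hypothetical sample with one extreme outlier (500).
sample = [10, 12, 14, 16, 18, 20, 22, 24, 500]
scaled = robust_scale(sample)

print(scaled)
```

The outlier is still present after scaling, but because the median and IQR barely react to it, the remaining values keep a sensible spread instead of being squeezed together.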
86. In data science, what is the main objective of data transformation?
a. To increase the dimensionality of data
b. To add noise to the data
c. To convert data into a more suitable format for analysis or modeling
d. To perform feature engineering
Answer : c. To convert data into a more suitable format for analysis or modeling
87. What role does a histogram play in data science?
a. To visualize data
b. To evaluate model performance
c. To preprocess data
d. To perform clustering
Answer : a. To visualize data
88. Which data science method includes identifying relationships or trends in massive datasets?
a. Clustering
b. Association rule mining
c. Time series analysis
d. Data cleaning
Answer : b. Association rule mining
89. In data science, what is the main objective of data imputation?
a. To introduce noise to the data
b. To visualize data
c. To replace missing values in a dataset
d. To perform clustering
Answer : c. To replace missing values in a dataset
90. What is the main objective of data integration in data science?
a. To divide a dataset into training and testing sets
b. To preprocess data
c. To combine data from multiple sources into a unified dataset
d. To perform feature engineering
Answer : c. To combine data from multiple sources into a unified dataset
91. Which of the following is a standard R library for data visualization?
a. Pandas
b. Scikit-Learn
c. ggplot2
d. Keras
Answer : c. ggplot2
92. What is the main reason that data augmentation is used in data science, particularly in computer vision tasks?
a. To increase the size of the dataset
b. To reduce model complexity
c. To perform feature engineering
d. To remove outliers from the data
Answer : a. To increase the size of the dataset
93. What is the main objective of time complexity analysis in data science?
a. To explore data
b. To evaluate model performance
c. To analyze the efficiency of algorithms in terms of their running time
d. To visualize data
Answer : c. To analyze the efficiency of algorithms in terms of their running time
94. Which of the following approaches is typically used to handle class imbalance in data science classification tasks?
a. Oversampling the minority class
b. Undersampling the majority class
c. Both A and B
d. Neither A nor B
Answer : c. Both A and B
95. In data science, what is the main objective of data munging (data wrangling)?
a. To create data visualizations
b. To clean and prepare raw data for analysis
c. To perform feature selection
d. To evaluate model performance
Answer : b. To clean and prepare raw data for analysis
96. What is the main objective of k-Means clustering in data science?
a. To perform regression analysis
b. To classify data into predefined categories
c. To group similar data points based on their characteristics
d. To visualize data
Answer : c. To group similar data points based on their characteristics
97. Which of the following is a standard Python library for data science and machine learning?
a. NumPy
b. TensorFlow
c. Matplotlib
d. All of the above
Answer : d. All of the above
98. What is the main objective of time series forecasting in data science?
a. To explore data
b. To visualize data
c. To predict future values based on past observations
d. To perform clustering
Answer : c. To predict future values based on past observations
a. Data Collection
b. Data Analysis
c. Data Visualization
d. Data Cleaning
Answer : b. Data Analysis
2. Which action is followed by a data scientist after collecting the data?
a. Data Storage
b. Data Cleaning
c. Data Visualization
d. Data Preprocessing
Answer : d. Data Preprocessing
3. Which of the following is NOT a data science application?
a. Predicting Stock Prices
b. Image Recognition
c. Generating Random Numbers
d. Fraud Detection
Answer : c. Generating Random Numbers
4. Which model is frequently used as the benchmark for data analysis?
a. Support Vector Machine
b. Decision Tree
c. Linear Regression
d. Random Forest
Answer : c. Linear Regression
5. Which language is commonly used in data science?
a. Java
b. C++
c. R
d. Python
Answer : d. Python
6. Which action follows the collection of the data is carried out by a data scientist?
a. Data Cleaning
b. Data Integration
c. Data Replication
d. All of the above
Answer : d. All of the above
7. Which one of the following focuses the identification of properties in the data?
a. Data mining
b. Big Data
c. Data wrangling
d. Machine Learning
Answer : a. Data mining
8. Data can be categorized into _______ groups.
a. 1
b. 2
c. 3
d. 4
Answer : b. 2
9. A structured data representation is known as __________.
a. Database table
b. Functions
c. Data preparation
d. Data frame
Answer : d. Data frame
10. To tell Python that we want to activate the mean function from the Numpy package, we write __ in front of the mean.
a. npm.
b. np.
c. ng.
d. ngm.
Answer : b. np.
11. Which of the following machine learning algorithms depends on the concept of bagging?
a. K-means
b. Naive Bayes
c. Random Forest
d. Support Vector Machine
Answer : c. Random Forest
12. Which of the following is essential components of data science?
a. Data Collection, Data Cleaning, Data Analysis
b. Data Visualization, Data Modeling, Data Deployment
c. Data Storage, Data Retrieval, Data Deletion
d. Data Mining, Data Entry, Data Replication
Answer : a. Data Collection, Data Cleaning, Data Analysis
13. What step in the data science process are NOT included?
a. Data Collection
b. Data Analysis
c. Quantum Computing
d. Data Visualization
Answer : c. Quantum Computing
14. How many groups can data be categorized into?
a. One
b. Two
c. Three
d. Four
Answer : b. Two
15. Unstructured data is not organized.
a. True
b. False
Answer : a. True
16. Column representation of data is know as __________.
a. Horizontal
b. Diagonal
c. Vertical
d. Top
Answer : c. Vertical
17. Only one time raw data can be processed.
a. True
b. False
Answer : b. False
18. What is the common goal of statistical modeling?
a. Inference
b. Summarizing
c. Subsetting
d. None of the above
Answer : a. Inference
19. Census data is analysis when the causal data is accured.
a. True
b. False
Answer : b. False
20. Which of the following models serves as the industry standard when it comes to data analysis?
a. Inferential
b. Descriptive
c. Causal
d. All of the above
Answer : a. Inferential
21. Which of the following is a revision control system?
a. Git
b. Numpy
c. Scipy
d. Slidify
Answer : a. Git
22. Which of the following is disadvantage of decision trees.
a. They can easily overfit the data.
b. They are not suitable for classification.
c. They are computationally expensive.
d. They have high bias and low variance.
Answer : a. They can easily overfit the data.
23. Which of the following is not a part of supervised learning?
a. Linear Regression
b. K-means Clustering
c. Decision Tree Classification
d. Support Vector Machine
Answer : b. K-means Clustering
24. Determine the clustering technique that handles data variance.
a. Hierarchical Clustering
b. K-means Clustering
c. DBSCAN
d. Agglomerative Clustering
Answer : b. K-means Clustering
25. Which of the following options focuses on the discovery of unknown properties in the data.
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. Deep Learning
Answer : b. Unsupervised Learning
26. Inference engines work on the ____________ principle.
a. Inductive Reasoning
b. Deductive Reasoning
c. Abductive Reasoning
d. Bayesian Reasoning
Answer : b. Deductive Reasoning
27. Components of an expert system are?
a. Knowledge Base, Inference Engine, User Interface
b. Data Storage, Data Processing, Data Visualization
c. Sensors, Actuators, Logic Gates
d. Data Mining, Machine Learning, Data Cleaning
Answer : a. Knowledge Base, Inference Engine, User Interface
28. How many different kinds of observing environments exist?
a. One
b. Two
c. Three
d. Four
Answer : d. Four
29. What is another term for data dredging?
a. Data Snooping
b. Data Mining
c. Data Analysis
d. Data Cleansing
Answer : b. Data Mining
30. Which of the following algorithms uses the least memory out of the options provided?
a. Random Forest
b. Decision Tree
c. k-Nearest Neighbors (k-NN)
d. Naive Bayes
Answer : c. k-Nearest Neighbors (k-NN)
31. What are different machine learning methods?
a. Supervised Learning, Unsupervised Learning, Reinforcement Learning
b. Data Cleaning, Data Analysis, Data Visualization
c. Neural Networks, Decision Trees, Regression
d. Linear Algebra, Calculus, Statistics
Answer : a. Supervised Learning, Unsupervised Learning, Reinforcement Learning
32. The different types of machine learning are?
a. Regression, Classification, Clustering
b. Data Cleaning, Data Analysis, Data Visualization
c. Neural Networks, Decision Trees, Random Forest
d. Supervised Learning, Unsupervised Learning, Reinforcement Learning
Answer : d. Supervised Learning, Unsupervised Learning, Reinforcement Learning
33. Which generation of computers are related with artificial intelligence?
a. First Generation
b. Second Generation
c. Third Generation
d. Fifth Generation
Answer : d. Fifth Generation
34. Which of the following is essential data science skill?
a. Data Collection
b. Data Analysis
c. Data Visualization
d. Data Cleaning
Answer : b. Data Analysis
35. Which action follows the collection of the data is carried out by a data scientist?
a. Data Storage
b. Data Cleaning
c. Data Visualization
d. Data Preprocessing
Answer : d. Data Preprocessing
36. Which of the following is NOT a data science application?
a. Predicting Stock Prices
b. Image Recognition
c. Generating Random Numbers
d. Fraud Detection
Answer : c. Generating Random Numbers
37. What is the main objective of data preprocessing in data science?
a. To make the data fit on a single computer
b. To remove outliers from the data
c. To transform raw data into a usable format
d. To create visualizations of the data
Answer : c. To transform raw data into a usable format
38. Which of the following Python libraries is most frequently used for data analysis and manipulation?
a. TensorFlow
b. Keras
c. Pandas
d. Matplotlib
Answer : c. Pandas
39. What is the acronym for PEAS?
a. Programming, Engineering, Algorithms, Systems
b. Performance measure, Environment, Actuators, Sensors
c. Processing, Evaluation, Analysis, Synthesis
d. Programming, Evaluation, Algorithms, Synthesis
Answer : b. Performance measure, Environment, Actuators, Sensors
40. Which of the foloowing model usally a gold standard for data analysis.
a. Logistic Regression
b. Decision Tree
c. Linear Regression
d. Naive Bayes
Answer : c. Linear Regression
41. Data fishing is also known as ____________.
a. Data Snooping
b. Data Mining
c. Data Analysis
d. Data Cleansing
Answer : a. Data Snooping
42. CLI stands for____________.
a. Command Line Instruction
b. Command Line Integration
c. Command Line Interface
d. Command Line Interpretation
Answer : c. Command Line Interface
43. Time differences represented in various units are referred to as time deltas.
a. True
b. False
Answer : a. True
44. Which of the following DOES NOT constitute an appropriate data science application in the healthcare industry?
a. Predicting Disease Outcomes
b. Drug Discovery
c. Image-Based Diagnosis
d. Stock Market Prediction
Answer : d. Stock Market Prediction
45. Identify which CLI command is incorrect.
a. cd myfolder
b. ls -l
c. RUN app.py
d. mkdir newfolder
Answer : c. RUN app.py
46. Total principles of analytical graphs that exist are ______________.
a. Five
b. Seven
c. The number may vary
d. Ten
Answer : c. The number may vary
47. Knowledge in AI represented as ____________.
a. Rules
b. Equations
c. Images
d. Colors
Answer : a. Rules
48. Which of the SGD variations below depends on both momentum and adaptive learning?
a. Stochastic Gradient Descent (SGD)
b. AdaGrad
c. Adam (Adaptive Moment Estimation)
d. RMSprop
Answer : c. Adam (Adaptive Moment Estimation)
49. Which output of an activation function is zero-centered?
a. Sigmoid
b. ReLU (Rectified Linear Unit)
c. Tanh (Hyperbolic Tangent)
d. Leaky ReLU
Answer : c. Tanh (Hyperbolic Tangent)
50. Which of the following logic operations cannot be carried out by a two-input perceptron?
a. AND
b. OR
c. NOT
d. XOR
Answer : d. XOR
51. Which of the following method used to train and test the model based on data point in ML.
a. Validation data
b. Test data
c. Training data
d. Unlabeled data
Answer : c. Training data
52. Which of the following represents a machine learning classification problem?
a. Predicting stock prices
b. Image recognition
c. Sentiment analysis
d. Regression analysis
Answer : c. Sentiment analysis
53. What does “overfitting” mean in machine learning?
a. The model performs on the training data but poorly on new or unseen data.
b. The model has few parameters.
c. The model cannot fit on the training data.
d. The model performs equally well on training and test data.
Answer : a. The model performs on the training data but poorly on new or unseen data.
54. For classification and regression tasks which of the following Bayes theorem is used in Machine Learning algorithm?
a. k-Nearest Neighbors (k-NN)
b. Decision Trees
c. Naive Bayes
d. Support Vector Machines (SVM)
Answer : c. Naive Bayes
55. What is the main objective of machine learning dimensionality reduction techniques?
a. To increase the number of features in the data
b. To reduce the number of features in the data while preserving important information
c. To make the data more complex
d. To create new features from existing ones
Answer : b. To reduce the number of features in the data while preserving important information
56. What does “SQL” stand for when referring to databases and data science?
a. Structured Query Language
b. Sequential Query Logic
c. Simple Query Layer
d. Standardized Query Line
Answer : a. Structured Query Language
57. Which type of data having a fixed data structure with rows and columns?
a. Unstructured data
b. Semi-structured data
c. Structured data
d. NoSQL data
Answer : c. Structured data
58. Which Machine Learning Library is not a part of python.
a. NumPy
b. Scikit-learn
c. TensorFlow
d. Matplotlib
Answer : d. Matplotlib
59. What is the main objective for collecting data for data analysis?
a. To increase the size of the dataset
b. To reduce the dimensionality of the dataset
c. To select a representative subset of the data
d. To remove missing values from the dataset
Answer : c. To select a representative subset of the data
60. In data science, what is the main objective of data preprocessing?
a. To collect more data
b. To visualize data
c. To prepare and clean data for analysis
d. To build machine learning models
Answer : c. To prepare and clean data for analysis
61. Which programming language is used in data science for data analysis and data manipulation?
a. Java
b. Python
c. C++
d. Ruby
Answer : b. Python
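62. What is the main purpose of exploratory data analysis (EDA) in data science?
a. To build predictive models
b. To summarize and visualize data to understand its main characteristics
c. To clean data
d. To deploy machine learning algorithms
Answer : b. To summarize and visualize data to understand its main characteristics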
63. Which of the following is not a data type in data science?
a. Integer
b. Float
c. String
d. Loop
Answer : d. Loop
64. Which data science step converts categorical data into numerical values?
a. Data visualization
b. Data preprocessing
c. Data transformation
d. Data exploration
Answer : c. Data transformation
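Illustration: one common way to turn categories into numbers is one-hot encoding. The sketch below is a minimal, standard-library version; the function name one_hot_encode and the sample data are assumptions made for this example, not part of any library.

```python
def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector (one column per category)."""
    categories = sorted(set(values))  # fixed, reproducible column order
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


colors = ["red", "green", "blue", "green"]  # made-up sample data
categories, encoded = one_hot_encode(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```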
65. Which statistical metric in data science best captures the central tendency of a dataset?
a. Standard deviation
b. Range
c. Mean
d. Variance
Answer : c. Mean
66. What is the function of feature engineering in data science?
a. To design new machine learning algorithms
b. To create visualizations
c. To transform raw data into informative features for modeling
d. To build data pipelines
Answer : c. To transform raw data into informative features for modeling
67. Which of the following data visualization tools is widely used in data science for producing interactive and dynamic dashboards?
a. Matplotlib
b. Seaborn
c. Tableau
d. Pandas
Answer : c. Tableau
68. What is machine learning’s main objective in data science?
a. To explore data
b. To build predictive models and make predictions
c. To clean and preprocess data
d. To visualize data
Answer : b. To build predictive models and make predictions
69. Which of the following supervised learning algorithms is utilized in data science for classification tasks?
a. k-Means
b. Principal Component Analysis (PCA)
c. Random Forest
d. Hierarchical Clustering
Answer : c. Random Forest
70. What is the main goal of data science clustering algorithms?
a. To classify data into predefined categories
b. To reduce the dimensionality of data
c. To group similar data points based on their characteristics
d. To perform regression analysis
Answer : c. To group similar data points based on their characteristics
71. Which Python data structure is frequently used in data science to store and manipulate tabular data?
a. List
b. Dictionary
c. DataFrame (from pandas)
d. Array
Answer : c. DataFrame (from pandas)
72. What is the main objective of hypothesis testing in data science?
a. To make predictions
b. To explore data
c. To test if a hypothesis about a population is supported by sample data
d. To perform clustering
Answer : c. To test if a hypothesis about a population is supported by sample data
73. Which data science method involves training a model on one set of data and evaluating its performance on another, separate set of data?
a. Cross check validation
b. Feature validation
c. Hypothesis validation
d. Holdout validation
Answer : d. Holdout validation
74. Which of the following is a typical algorithm used for data science regression tasks?
a. k-Means
b. Decision Tree
c. Naive Bayes
d. Logistic Regression
Answer : b. Decision Tree
75. Which data science method uses existing data patterns to fill missing values in a dataset?
a. Feature selection
b. Data visualization
c. Data cleaning
d. Missing data imputation
Answer : d. Missing data imputation
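Illustration: the simplest form of imputation replaces each missing value with the mean of the observed values. The sketch below assumes missing entries are represented as None; the helper name impute_mean and the sample column are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


ages = [20.0, None, 30.0, 40.0, None]  # hypothetical column with gaps
imputed = impute_mean(ages)
print(imputed)  # [20.0, 30.0, 30.0, 40.0, 30.0]
```

Mean imputation is only one option; median or model-based imputation may suit skewed data better.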
76. Which of the following algorithms is commonly used for text categorization and sentiment analysis in natural language processing applications?
a. k-Means
b. Linear Regression
c. Support Vector Machine (SVM)
d. Naive Bayes
Answer : d. Naive Bayes
77. What is the main objective of dimensionality reduction methods in data science, such as Principal Component Analysis (PCA)?
a. To increase the number of features
b. To add noise to the data
c. To reduce the dimensionality of data while preserving important information
d. To overfit the data
Answer : c. To reduce the dimensionality of data while preserving important information
78. Which data science procedure involves converting data into a format appropriate for modeling or analysis?
a. Feature engineering
b. Data preprocessing
c. Data visualization
d. Hypothesis testing
Answer : b. Data preprocessing
79. What is the main objective of time series analysis in data science?
a. To classify data
b. To predict future values based on past observations
c. To perform clustering
d. To visualize data
Answer : b. To predict future values based on past observations
80. Which data science method divides a dataset into training and testing sets to assess the performance of a model?
a. Feature engineering
b. Cross-validation
c. Hypothesis testing
d. Train-test split
Answer : d. Train-test split
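Illustration: a train-test split can be sketched with the standard library alone. The helper holdout_split below is a hand-written example (it is not scikit-learn's function), and the 25% test fraction and the seed are arbitrary assumptions.

```python
import random


def holdout_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)                  # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(8))  # stand-in for 8 data rows
train, test = holdout_split(data)
print(len(train), len(test))  # 6 2
```

The model would be fitted on train only and scored on test, so the score reflects data the model has not seen.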
81. What is the objective of cross-validation in data science?
a. To preprocess data
b. To perform clustering
c. To evaluate the performance of a machine learning model on multiple subsets of the data
d. To visualize data
Answer : c. To evaluate the performance of a machine learning model on multiple subsets of the data
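Illustration: k-fold cross-validation rotates the test set across k subsets so that every sample is used for testing exactly once. The generator below is a minimal sketch that uses contiguous folds without shuffling; the name k_fold_indices is an assumption for this example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


folds = list(k_fold_indices(6, 3))
for train_idx, test_idx in folds:
    print(train_idx, test_idx)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

In practice a model is trained and scored once per fold, and the k scores are averaged.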
82. What is a data scientist’s main objective when performing A/B testing?
a. To visualize data
b. To explore data
c. To test the impact of a change or treatment on a specific metric
d. To perform clustering
Answer : c. To test the impact of a change or treatment on a specific metric
83. Which data science method evaluates the importance of features in a machine learning model?
a. Hypothesis testing
b. Feature selection
c. Data cleaning
d. Cross-validation
Answer : b. Feature selection
84. What is the main objective of anomaly detection in data science?
a. To identify unusual or suspicious patterns in data
b. To clean and preprocess data
c. To perform regression analysis
d. To visualize data
Answer : a. To identify unusual or suspicious patterns in data
85. What data science method reduces the influence of outliers in a dataset?
a. Data visualization
b. Data cleaning
c. Data transformation
d. Robust scaling
Answer : d. Robust scaling
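Illustration: robust scaling centers values on the median and divides by the interquartile range (IQR), so a single extreme value does not distort the scale the way it would with the mean and standard deviation. The sketch below is a minimal version; the function name robust_scale, the choice of the "inclusive" quartile method, and the sample data are assumptions.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Scale values as (x - median) / IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    center = median(values)
    return [(v - center) / iqr for v in values]


data = [1, 2, 3, 4, 100]  # 100 is an obvious outlier
scaled = robust_scale(data)
print(scaled)  # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

The outlier is still visible after scaling, but the ordinary values keep a sensible spread because the median and IQR are barely affected by it.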
86. In data science, what is the main objective of data transformation?
a. To increase the dimensionality of data
b. To add noise to the data
c. To convert data into a more suitable format for analysis or modeling
d. To perform feature engineering
Answer : c. To convert data into a more suitable format for analysis or modeling
87. What role does a histogram play in data science?
a. To visualize data
b. To evaluate model performance
c. To preprocess data
d. To perform clustering
Answer : a. To visualize data
88. Which data science method involves identifying relationships or trends in massive datasets?
a. Clustering
b. Association rule mining
c. Time series analysis
d. Data cleaning
Answer : b. Association rule mining
89. In data science, what is the main objective of data imputation?
a. To introduce noise to the data
b. To visualize data
c. To replace missing values in a dataset
d. To perform clustering
Answer : c. To replace missing values in a dataset
90. What is the main objective of data integration in data science?
a. To divide a dataset into training and testing sets
b. To preprocess data
c. To combine data from multiple sources into a unified dataset
d. To perform feature engineering
Answer : c. To combine data from multiple sources into a unified dataset
91. Which of the following is a standard R library for data visualization?
a. Pandas
b. Scikit-Learn
c. ggplot2
d. Keras
Answer : c. ggplot2
92. What is the main reason that data augmentation is used in data science, particularly in computer vision tasks?
a. To increase the size of the dataset
b. To reduce model complexity
c. To perform feature engineering
d. To remove outliers from the data
Answer : a. To increase the size of the dataset
93. What is the main objective of time complexity analysis in data science?
a. To explore data
b. To evaluate model performance
c. To analyze the efficiency of algorithms in terms of their running time
d. To visualize data
Answer : c. To analyze the efficiency of algorithms in terms of their running time
94. Which of the following approaches is typically used to handle class imbalance in data science classification tasks?
a. Oversampling the minority class
b. Undersampling the majority class
c. Both A and B
d. Neither A nor B
Answer : c. Both A and B
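Illustration: random oversampling balances classes by duplicating rows of the minority class until it matches the largest class. The sketch below is a minimal standard-library version; the helper name oversample_minority, the seed, and the toy data are assumptions.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes are equal in size."""
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    rng = random.Random(seed)
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))  # sample with replacement
            out_labels.append(label)
    return out_rows, out_labels


rows = ["a", "b", "c", "d", "e"]  # toy feature rows
labels = [0, 0, 0, 0, 1]          # class 1 is the minority
balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(Counter(balanced_labels))  # Counter({0: 4, 1: 4})
```

Undersampling works the other way round: rows of the larger class are dropped at random until the classes match.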
95. In data science, what is the main objective of data munging (data wrangling)?
a. To create data visualizations
b. To clean and prepare raw data for analysis
c. To perform feature selection
d. To evaluate model performance
Answer : b. To clean and prepare raw data for analysis
96. What is the main objective of k-Means clustering in data science?
a. To perform regression analysis
b. To classify data into predefined categories
c. To group similar data points based on their characteristics
d. To visualize data
Answer : c. To group similar data points based on their characteristics
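Illustration: k-Means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below works on one-dimensional data with hand-picked starting centroids; the function name k_means_1d, the fixed iteration count, and the sample points are assumptions chosen to keep the example short.

```python
def k_means_1d(points, centroids, iterations=10):
    """Run a fixed number of k-Means iterations on 1-D data."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```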
97. Which of the following is a standard Python library for data science and machine learning?
a. NumPy
b. TensorFlow
c. Matplotlib
d. All of the above
Answer : d. All of the above
98. What is the main objective of time series forecasting in data science?
a. To explore data
b. To visualize data
c. To predict future values based on past observations
d. To perform clustering
Answer : c. To predict future values based on past observations