Download Certified Artificial Intelligence Practitioner.AIP-210.CertDumps.2023-10-23.36q.vcex

Vendor: CertNexus
Exam Code: AIP-210
Exam Name: Certified Artificial Intelligence Practitioner
Date: Oct 23, 2023
File Size: 70 KB

How to open VCEX files?

Files with VCEX extension can be opened by ProfExam Simulator.

Demo Questions

Question 1
An HR solutions firm is developing software for staffing agencies that uses machine learning.
The team uses training data to teach the algorithm and discovers that it generates lower employability scores for women. Also, it predicts that women, especially with children, are less likely to get a high-paying job.
Which type of bias has been discovered?
  1. Automation
  2. Emergent
  3. Preexisting
  4. Technical
Correct answer: C
Explanation:
Preexisting bias is a type of bias that originates from historical or social contexts, such as stereotypes, prejudices, or discrimination. Preexisting bias can affect the data or the algorithm used for machine learning, as well as the outcomes or decisions the system produces, and can cause unfair or harmful impacts on certain groups or individuals based on attributes such as gender, race, age, or disability. In this case, the software generates lower employability scores for women and predicts that women, especially those with children, are less likely to get a high-paying job. This indicates that the software has preexisting bias against women, likely reflecting historical or social inequalities and expectations in the labor market.
Question 2
Which two encoders can be used to transform categorical data into numerical features? (Select two.)
  1. Count Encoder
  2. Log Encoder
  3. Mean Encoder
  4. Median Encoder
  5. One-Hot Encoder
Correct answer: CE
Explanation:
Encoding is a technique that transforms categorical data into numerical features that can be used by machine learning models. Categorical data are data that have a finite number of possible values or categories, such as gender, color, or country. Encoding can help convert categorical data into a format that is suitable and understandable for machine learning models. Some of the encoding methods that can be used to transform categorical data into numerical features are:
Mean Encoder: Mean encoder is a method that replaces each category with the mean value of the target variable for that category. Mean encoder can capture the relationship between the category and the target variable, but it may cause overfitting or multicollinearity problems.
One-Hot Encoder: One-hot encoder is a method that creates a binary vector for each category, where only one element has a value of 1 (the hot bit) and the rest have a value of 0. One-hot encoder can create distinct and orthogonal vectors for each category, but it may increase the dimensionality and sparsity of the data.
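As a brief illustration, here is a minimal sketch of both encoders using pandas; the toy dataset and its column names are made up for the example:

```python
import pandas as pd

# Hypothetical toy dataset: one categorical feature and a binary target
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "blue"],
    "target": [1, 0, 1, 0, 0],
})

# One-hot encoding: one binary column per category
one_hot = pd.get_dummies(df["color"], prefix="color")
print(one_hot)

# Mean (target) encoding: replace each category with the mean of the
# target variable for that category
means = df.groupby("color")["target"].mean()
df["color_mean_encoded"] = df["color"].map(means)
print(df)
```

Note how one-hot encoding adds one column per category (raising dimensionality), while mean encoding keeps a single column but leaks information about the target, which is why it can overfit without cross-validated computation.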
Question 3
Which of the following is the primary purpose of hyperparameter optimization?
  1. Controls the learning process of a given algorithm
  2. Makes models easier to explain to business stakeholders
  3. Improves model interpretability
  4. Increases recall over precision
Correct answer: A
Explanation:
Hyperparameter optimization is the process of finding the optimal values for hyperparameters that control the learning process of a given algorithm. Hyperparameters are parameters that are not learned by the algorithm but are set by the user before training. Hyperparameters can affect the performance and behavior of the algorithm, such as its speed, accuracy, complexity, or generalization. Hyperparameter optimization can help improve the efficiency and effectiveness of the algorithm by tuning its hyperparameters to achieve the best results.
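As a sketch of what this tuning looks like in practice, here is a minimal grid search with scikit-learn on its bundled iris dataset; the parameter grid is illustrative, not prescribed by the exam:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Hyperparameters are set before training; grid search tries each
# combination with cross-validation and keeps the best-scoring one
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```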
Question 4
In which of the following scenarios is lasso regression preferable over ridge regression?
  1. The number of features is much larger than the sample size.
  2. There are many features with no association with the dependent variable.
  3. There is high collinearity among some of the features associated with the dependent variable.
  4. The sample size is much larger than the number of features.
Correct answer: B
Explanation:
Lasso regression is a type of linear regression that adds a regularization term to the loss function to reduce overfitting and improve generalization. Lasso regression uses an L1 norm as the regularization term, which is the sum of the absolute values of the coefficients. Lasso regression can shrink some of the coefficients to zero, which effectively eliminates some of the features from the model. Lasso regression is preferable over ridge regression when there are many features with no association with the dependent variable, as it can perform feature selection and reduce the complexity and noise of the model.
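A minimal scikit-learn sketch on synthetic data, where only 3 of 20 features actually drive the target (the coefficients and noise level are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
# 100 samples, 20 features; only the first 3 influence the target
X = rng.normal(size=(100, 20))
y = X[:, 0] + 2 * X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)

# Lasso (L1) drives irrelevant coefficients exactly to zero,
# performing feature selection; ridge (L2) only shrinks them
print("lasso zero coefficients:", np.sum(lasso.coef_ == 0))
print("ridge zero coefficients:", np.sum(ridge.coef_ == 0))
```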
Question 5
Which of the following is the correct definition of the quality criteria that describes completeness?
  1. The degree to which all required measures are known.
  2. The degree to which a set of measures are equivalent across systems.
  3. The degree to which a set of measures are specified using the same units of measure in all systems.
  4. The degree to which the measures conform to defined business rules or constraints.
Correct answer: A
Explanation:
Completeness is a quality criterion that describes the degree to which all required measures are known. Completeness can help assess the coverage and availability of data for a given purpose or analysis.
Completeness can be measured by comparing the actual number of measures with the expected number of measures, or by identifying and counting any missing, null, or unknown values in the data.
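For instance, a minimal pandas sketch that measures completeness as the fraction of known (non-null) values per column; the table and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical dataset with some missing measures
df = pd.DataFrame({
    "age": [34, None, 29, 41],
    "salary": [50000, 62000, None, None],
})

# Completeness per column: known values divided by expected values
completeness = df.notna().mean()
print(completeness)

# Count of missing (unknown) values per column
print(df.isna().sum())
```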
Question 6
You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?
  1. Decision tree
  2. Logistic regression
  3. Random forest
  4. XGBoost
Correct answer: C
Explanation:
Random forest is an algorithm that is ideal to prevent overfitting when using a dataset with many features and a small sample size. Random forest is an ensemble learning method that combines multiple decision trees to create a more robust and accurate model. Random forest can prevent overfitting by introducing randomness and diversity into the model, such as by using bootstrap sampling (sampling with replacement) to create different subsets of data for each tree, or by using feature selection (choosing a random subset of features) to split each node in a tree.
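A minimal scikit-learn sketch on synthetic data (small sample, many features, chosen to make overfitting likely) comparing a single decision tree against a random forest under cross-validation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Small sample, many features: a setting prone to overfitting
X, y = make_classification(n_samples=80, n_features=50,
                           n_informative=5, random_state=0)

tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

# Cross-validated accuracy estimates generalization to unseen data
print("decision tree CV accuracy:", cross_val_score(tree, X, y, cv=5).mean())
print("random forest CV accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```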
Question 7
You and your team need to process large datasets of images as fast as possible for a machine learning task. The project will also use a modular framework with extensible code and an active developer community.
Which of the following would BEST meet your needs?
  1. Caffe
  2. Keras
  3. Microsoft Cognitive Services
  4. TensorBoard
Correct answer: A
Explanation:
Caffe is a deep learning framework that is designed for speed and modularity. It can process large datasets of images efficiently and supports various types of neural networks. It also has a large and active developer community that contributes to its code base and documentation. Caffe is suitable for image processing tasks such as classification, segmentation, detection, and recognition.
Question 8
Which of the following principles supports building an ML system with a Privacy by Design methodology?
  1. Avoiding mechanisms to explain and justify automated decisions.
  2. Collecting and processing the largest amount of data possible.
  3. Understanding, documenting, and displaying data lineage.
  4. Utilizing quasi-identifiers and non-unique identifiers, alone or in combination.
Correct answer: C
Explanation:
Data lineage is the process of tracking the origin, transformation, and usage of data throughout its lifecycle. It helps to ensure data quality, integrity, and provenance. Data lineage also supports the Privacy by Design methodology, which is a framework that aims to embed privacy principles into the design and operation of systems, processes, and products that involve personal data. By understanding, documenting, and displaying data lineage, an ML system can demonstrate how it collects, processes, stores, and deletes personal data in a transparent and accountable manner.
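The exam material does not prescribe any tooling for this; as an illustration only, here is a minimal Python sketch of recording lineage metadata by hand. The class, fields, and dataset names are hypothetical, and production systems typically rely on dedicated metadata or lineage tools:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageRecord:
    """One step in a dataset's history: where it came from, what was done."""
    dataset: str
    source: str
    transformation: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

# Hypothetical lineage trail for a training table
lineage = [
    LineageRecord("applicants_raw", "public_survey_2023.csv", "ingested"),
    LineageRecord("applicants_clean", "applicants_raw",
                  "dropped direct identifiers; normalized salary"),
]
for record in lineage:
    print(record)
```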
Question 9
A data scientist is tasked to extract business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist cannot forget to include?
  1. Cyberprotection
  2. Cybersecurity
  3. Data privacy
  4. Data security
Correct answer: C
Explanation:
Data privacy is the right of individuals to control how their personal data is collected, used, shared, and protected. It also involves complying with relevant laws and regulations that govern the handling of personal data. Data privacy is especially important when extracting business intelligence from primary data captured from the public, as it may contain sensitive or confidential information that could harm the individuals if misused or breached.
Question 10
For a particular classification problem, you are tasked with determining the best algorithm among SVM, random forest, K-nearest neighbors, and a deep neural network. Each of the algorithms has similar accuracy on your data. The stakeholders indicate that they need a model that can convey each feature's relative contribution to the model's accuracy. Which is the best algorithm for this use case?
  1. Deep neural network
  2. K-nearest neighbors
  3. Random forest
  4. SVM
Correct answer: C
Explanation:
Random forest is an ensemble learning method that combines multiple decision trees to create a more accurate and robust classifier or regressor. Random forest can convey each feature's relative contribution to the model's accuracy by measuring how much the prediction error increases when a feature is randomly permuted. This metric is called feature importance or Gini importance. Random forest can also provide insights into the interactions and dependencies among features by visualizing the decision trees.
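A minimal scikit-learn sketch, using its bundled breast-cancer dataset, showing both the built-in (Gini/impurity) importances and permutation importance:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Impurity-based (Gini) importances, computed during training
print(dict(zip(X.columns[:3], forest.feature_importances_[:3])))

# Permutation importance: accuracy drop when each feature is shuffled
result = permutation_importance(forest, X_test, y_test,
                                n_repeats=10, random_state=0)
print(result.importances_mean[:3])
```

Permutation importance on held-out data is generally the more reliable of the two, since impurity-based importances can be biased toward high-cardinality features.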