CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers
When should the model be retrained in the ML pipeline?
You create a prediction model with 96% accuracy. Although the model's true positive rate (TPR) is high at 99%, its true negative rate (TNR) is only 50%. Your supervisor tells you that the TNR needs to be higher, even if that lowers the TPR. Upon further inspection, you notice that the vast majority of your data belongs to the positive class.
What method could help address your issue?
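The figures in this scenario can be reproduced from a confusion matrix. The sketch below uses hypothetical counts (4,600 positive and 300 negative examples, chosen only so that the rates match the question) to show how accuracy can stay high while TNR is poor when one class dominates. It illustrates the metrics involved and is not meant as the exam's answer.

```python
# Hypothetical confusion-matrix counts, chosen to match the scenario's rates.
tp, fn = 4554, 46    # 4,600 actual positives
tn, fp = 150, 150    # 300 actual negatives


def rates(tp, fn, tn, fp):
    """Return (accuracy, TPR, TNR) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total   # share of all predictions that are correct
    tpr = tp / (tp + fn)           # true positive rate (sensitivity, recall)
    tnr = tn / (tn + fp)           # true negative rate (specificity)
    return accuracy, tpr, tnr


accuracy, tpr, tnr = rates(tp, fn, tn, fp)
positive_share = (tp + fn) / (tp + fn + tn + fp)

print(f"accuracy={accuracy:.2f} TPR={tpr:.2f} TNR={tnr:.2f} "
      f"positive share={positive_share:.1%}")
```

Because about 94% of these hypothetical examples are positive, the 150 misclassified negatives barely move the accuracy figure, which is why accuracy alone hides the weak TNR.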
Which of the following is the definition of accuracy?
A big data architect needs to be cautious about personally identifiable information (PII) that may be captured with their new IoT system. What is the final stage of the Data Management Life Cycle, which the architect must complete in order to implement data privacy and security appropriately?
Which of the following is a common negative side effect of not using regularization?
An AI system recommends New Year's resolutions. It has an ML pipeline without monitoring components. What retraining strategy would be BEST for this pipeline?
When should you use semi-supervised learning? (Select two.)
Given a feature set with rows that contain missing continuous values, and assuming the data is normally distributed, what is the best way to fill in these missing features?
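For roughly symmetric data such as a normal distribution, the mean and median nearly coincide, so filling gaps with the column mean is one commonly used option. The following is a minimal sketch of mean imputation in plain Python; the column name `heights_cm`, its values, and the helper `impute_mean` are hypothetical and used only for illustration.

```python
from statistics import mean

# Hypothetical continuous feature column; None marks a missing value.
heights_cm = [170.2, None, 165.8, 180.1, None, 174.3]


def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


filled = impute_mean(heights_cm)
print(filled)
```

Mean imputation keeps the column mean unchanged but shrinks its variance, so it is worth checking how many values were filled before relying on it.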
For a particular classification problem, you are tasked with determining the best algorithm among SVM, random forest, K-nearest neighbors, and a deep neural network. Each of the algorithms has similar accuracy on your data. The stakeholders indicate that they need a model that can convey each feature's relative contribution to the model's accuracy. Which is the best algorithm for this use case?
Which two techniques are used to build personas in the ML development lifecycle? (Select two.)
Which of the following best describes distributed artificial intelligence?
Which of the following pieces of AI technology provides the ability to create fake videos?
Which database is designed to better anticipate and avoid risks of AI systems causing safety, fairness, or other ethical problems?
You and your team need to process large datasets of images as fast as possible for a machine learning task. The project will also use a modular framework with extensible code and an active developer community. Which of the following would BEST meet your needs?
In which of the following scenarios is lasso regression preferable over ridge regression?
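The practical difference between the two penalties can be seen in a special case. The sketch below assumes an orthonormal design matrix (X^T X = I), where both estimators have closed forms in terms of the ordinary least squares coefficients: lasso, minimizing (1/2)||y - Xb||^2 + lam * ||b||_1, soft-thresholds each coefficient, while ridge, minimizing (1/2)||y - Xb||^2 + (lam/2) * ||b||^2, rescales each one by 1 / (1 + lam). The coefficient values and function names are hypothetical.

```python
import math


def lasso_orthonormal(b_ols, lam):
    """Soft-threshold each OLS coefficient: sign(b) * max(|b| - lam, 0)."""
    result = []
    for b in b_ols:
        shrunk = max(abs(b) - lam, 0.0)
        # Small coefficients are set exactly to zero.
        result.append(0.0 if shrunk == 0.0 else math.copysign(shrunk, b))
    return result


def ridge_orthonormal(b_ols, lam):
    """Shrink each OLS coefficient proportionally; none becomes exactly zero."""
    return [b / (1.0 + lam) for b in b_ols]


# Hypothetical OLS coefficients: two strong features, two weak ones.
b_ols = [3.0, -0.4, 0.1, 1.5]
lam = 0.5

lasso = lasso_orthonormal(b_ols, lam)
ridge = ridge_orthonormal(b_ols, lam)

print("lasso:", lasso)
print("ridge:", ridge)
```

In this sketch lasso removes the two weak coefficients entirely, whereas ridge keeps all four at reduced magnitude, which is the behavior usually cited when lasso is chosen for feature selection.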
Which of the following is the primary purpose of hyperparameter optimization?
A data scientist is tasked with extracting business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist must not forget to include?
Which of the following are true about the transform design pattern for a machine learning pipeline? (Select three.)
It aims to separate inputs from features.
You are developing a prediction model. Your team indicates they need an algorithm that is fast and requires low memory and low processing power. Assuming the following algorithms have similar accuracy on your data, which is most likely to be an ideal choice for the job?
In general, models that perform their tasks:
Which of the following approaches is best if a limited portion of your training data is labeled?
Which of the following sentences is true about model evaluation and model validation in ML pipelines?
When working with textual data and trying to classify text into different languages, which approach to representing features makes the most sense?
Which of the following scenarios is an example of entanglement in ML pipelines?
Which of the following text vectorization methods is appropriate and correctly defined for an English-to-Spanish translation machine?
Which two of the following statements about the beta value in an A/B test are accurate? (Select two.)
In addition to understanding model performance, what does continuous monitoring of bias and variance help ML engineers to do?