Oracle Cloud Infrastructure 2025 Data Science Professional Questions and Answers
You are preparing a configuration object necessary to create a Data Flow application. Which THREE parameter values should you provide?
You have received machine learning model training code, without clear information about the optimal shape to run the training. How would you proceed to identify the optimal compute shape for your model training that provides a balanced cost and processing time?
You are a data scientist working for a utilities company. You have developed an algorithm that detects anomalies from a utility reader in the grid. The size of the model artifact is about 2 GB, and you are trying to store it in the model catalog. Which THREE interfaces could you use to save the model artifact into the model catalog?
Which components are a part of the OCI Identity and Access Management service?
True or false? Data scientists typically need a combination of technical skills, nontechnical ones, and suitable personality traits to be successful.
You have been given a collection of digital files required for a business audit. They consist of several different formats that you would like to annotate using Oracle Cloud Infrastructure (OCI) Data Labeling. Which THREE types of files could this tool annotate?
Triggering a PagerDuty notification as part of Monitoring is an example of what in the OCI Console?
What is the correct definition of Git?
While working with Git on Oracle Cloud Infrastructure (OCI) Data Science, you notice that two of the operations are taking more time than the others due to your slow internet speed. Which TWO operations would experience the delay?
You have just completed analyzing a set of images by using Oracle Cloud Infrastructure (OCI) Data Labeling, and you want to export the annotated data. Which TWO formats are supported?
Where are OCI secrets stored?
You have a complex Python code project that could benefit from using Data Science Jobs as it is a repeatable machine learning model training task. The project contains many sub-folders and classes. What is the best way to run this project as a Job?
You are asked to prepare data for a custom-built model that requires transcribing Spanish video recordings into a readable text format with profane words identified. Which Oracle Cloud Service would you use?
Which of the following best describes the principal goal of data science?
A bike sharing platform has collected user commute data for the past 3 years. For increasing profitability and making useful inferences, a machine learning model needs to be built from the accumulated data. Which of the following options has the correct order of the required machine learning tasks for building a model?
After you have created and opened a notebook session, you want to use the Accelerated Data Science (ADS) SDK to access your data and get started with exploratory data analysis. From which TWO places can you access the ADS SDK?
Which statement best describes Oracle Cloud Infrastructure Data Science Jobs?
Which of these options allow the sharing and loading back of ML models into a notebook session?
Which architecture is based on the principle of “never trust, always verify”?
You have created a model and want to use Accelerated Data Science (ADS) SDK to deploy the model. Where are the artifacts for deploying this model with ADS located?
You want to write a program that performs document analysis tasks such as extracting text and tables from a document. Which Oracle AI service would you use?
You want to make your model more frugal to reduce the cost of collecting and processing data. You plan to do this by removing features that are highly correlated. You would like to create a heatmap that displays the correlation so that you can identify candidate features to remove. Which Accelerated Data Science (ADS) SDK method is appropriate to display the comparability between Continuous and Categorical features?
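For background on the question above: one statistic commonly used to relate a categorical feature to a continuous one is the correlation ratio (eta). The sketch below computes it in plain Python. It does not use ADS, and it is an assumption that any ADS plot is built on exactly this formula. The feature names and values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def correlation_ratio(categories, values):
    """Return eta, the correlation ratio between a categorical and a continuous feature.

    eta = sqrt(between-group sum of squares / total sum of squares), in [0, 1].
    """
    if len(categories) != len(values) or not values:
        raise ValueError("categories and values must be non-empty and equally long")

    # Group the continuous values by category label.
    groups = defaultdict(list)
    for label, value in zip(categories, values):
        groups[label].append(value)

    overall_mean = sum(values) / len(values)
    ss_total = sum((v - overall_mean) ** 2 for v in values)
    if ss_total == 0:
        # A constant feature carries no variance to explain.
        return 0.0

    ss_between = sum(
        len(group) * (sum(group) / len(group) - overall_mean) ** 2
        for group in groups.values()
    )
    return sqrt(ss_between / ss_total)


# Hypothetical data: contract type (categorical) vs. monthly usage (continuous).
contract_type = ["a", "a", "a", "b", "b", "b"]
monthly_usage = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(f"eta = {correlation_ratio(contract_type, monthly_usage):.3f}")
```

A value near 1 means the category almost fully determines the continuous value, which would make one of the two features a candidate for removal. A value near 0 means the two are unrelated.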
As a data scientist, you use the Oracle Cloud Infrastructure (OCI) Language service to train custom models. Which types of custom models can be trained?
Which statement accurately describes an aspect of machine learning models?
You have configured the Management Agent on an Oracle Cloud Infrastructure (OCI) Linux instance for log ingestion purposes. Which is a required configuration for OCI Logging Analytics service to collect data from multiple logs of this instance?
You want to create a user group for a team of external data science consultants. The consultants should only have the ability to see Data Science resource details but not have the ability to create, delete, or update Data Science resources. What verb should you write in the policy?
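As context for the question above, OCI IAM policy verbs are ordered from least to most privileged: inspect, read, use, manage. The sketch below encodes that ordering and builds a policy statement in the usual "allow group ... to ... in compartment ..." form. The helper functions, the group name and the compartment name are hypothetical and exist only for illustration. `data-science-family` is the aggregate resource type for Data Science.

```python
# OCI IAM verbs, ordered from least to most privileged.
VERBS = ("inspect", "read", "use", "manage")


def includes(granted: str, required: str) -> bool:
    """Return True if the granted verb covers everything the required verb allows."""
    return VERBS.index(granted) >= VERBS.index(required)


def policy_statement(group: str, verb: str, resource_type: str, compartment: str) -> str:
    """Build a policy statement string (hypothetical helper, not part of any OCI SDK)."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb!r}")
    return f"allow group {group} to {verb} {resource_type} in compartment {compartment}"


# Hypothetical group and compartment names.
statement = policy_statement("ds-consultants", "read", "data-science-family", "ds-project")
print(statement)
print("read covers inspect:", includes("read", "inspect"))
print("read covers use:", includes("read", "use"))
```

Because each verb includes the permissions of those before it, choosing the lowest verb that meets the requirement keeps the consultants' access as narrow as possible.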
Which statement about Oracle Cloud Infrastructure Anomaly Detection is true?
Arrange the following in the correct Git Repository workflow order:
Install, configure, and authenticate Git.
Configure SSH keys for the Git repository.
Create a local and remote Git repository.
Commit files to the local Git repository.
Push the commit to the remote Git repository.
You are using Oracle Cloud Infrastructure (OCI) Anomaly Detection to train a model to detect anomalies in pump sensor data. How does the required False Alarm Probability setting affect an anomaly detection model?
Which two statements are true about published conda environments?
You have trained three different models on your dataset using Oracle AutoML. You want to visualize the behavior of each of the models, including the baseline model, on the test set. Which class should be used from the Accelerated Data Science (ADS) SDK to visually compare the models?
How can you convert a fixed load balancer to a flexible load balancer?
Which TWO statements about Oracle Cloud Infrastructure (OCI) Open Data service are true?
Which statement about Oracle Cloud Infrastructure Data Science Jobs is true?
You are a data scientist working inside a notebook session and you attempt to pip install a package from a public repository that is not included in your conda environment. After running this command, you get a network timeout error. What might be missing from your network configuration?
Six months ago you created and deployed a model that predicts customer churn for a call center. Initially, it was yielding quality predictions. However, over the last two months, users have been questioning the credibility of the predictions. Which TWO methods would you employ to verify accuracy and lower customer churn?
You have trained a binary classifier for a loan application and saved this model into the model catalog. A colleague wants to examine the model, and you need to share the model with your colleague. From the model catalog, which model artifacts can be shared?
Which statement about resource principals is true?
Which of the following analytical and statistical techniques do data scientists commonly use?
As a data scientist, you create models for cancer prediction based on mammographic images. Correct identification is crucial in this case. After evaluating two models, you arrive at the following metrics. Which model would you prefer and why?
Model 1: test accuracy is 80% and recall is 70%.
Model 2: test accuracy is 75% and recall is 85%.
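The accuracy and recall figures quoted for the two models can be reproduced from confusion-matrix counts. The following is a minimal sketch that assumes hypothetical counts (100 positive and 100 negative test cases per model), since the question gives only the percentages. The class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # true positives: cancer present, predicted present
    fn: int  # false negatives: cancer present, predicted absent
    fp: int  # false positives: cancer absent, predicted present
    tn: int  # true negatives: cancer absent, predicted absent

    def accuracy(self) -> float:
        """Fraction of all cases classified correctly."""
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total

    def recall(self) -> float:
        """Fraction of actual positive cases that the model finds."""
        return self.tp / (self.tp + self.fn)


# Hypothetical counts chosen to match the percentages in the question.
model_1 = ConfusionMatrix(tp=70, fn=30, fp=10, tn=90)
model_2 = ConfusionMatrix(tp=85, fn=15, fp=35, tn=65)

for name, cm in (("Model 1", model_1), ("Model 2", model_2)):
    print(
        f"{name}: accuracy={cm.accuracy():.0%}, "
        f"recall={cm.recall():.0%}, missed positives={cm.fn}"
    )
```

With these assumed counts, Model 2 misses 15 positive cases against 30 for Model 1, while Model 1 raises fewer false alarms. That is the trade-off the question asks you to weigh.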
You are working as a data scientist for a healthcare company. They decided to analyze the data to find patterns in a large volume of electronic medical records. You are asked to build a PySpark solution to analyze these records in a JupyterLab notebook. What is the order of recommended steps to develop a PySpark application in OCI Data Science?
You have a dataset with fewer than 1000 observations, and you are using Oracle AutoML to build a classifier. While visualizing the results of each stage of the Oracle AutoML pipeline, you notice that no visualization has been generated for one of the stages. Which stage is not visualized?
The Oracle AutoML pipeline automates hyperparameter tuning by training the model with different parameters in parallel. You have created an instance of Oracle AutoML as oracle_automl and now you want an output with all the different trials performed by Oracle AutoML. Which of the following commands gives you the results of all trials?
You are running a pipeline in the OCI Data Science service and want to override some of the pipeline's default settings. Which of the following statements about overriding pipeline defaults is true?
Which statement is true about standards?
You are a data scientist designing an air traffic control model, and you choose to leverage Oracle AutoML. You understand that the Oracle AutoML pipeline consists of multiple stages and automatically operates in a certain sequence. What is the correct sequence for the Oracle AutoML pipeline?
Which Oracle Cloud Infrastructure (OCI) Data Science policy is invalid?
For your next data science project, you need access to public geospatial images. Which Oracle Cloud service provides free access to those images?
In which two ways can you improve data durability in Oracle Cloud Infrastructure Object Storage?
Which type of firewall is designed to protect against web application attacks, such as SQL injection and cross-site scripting?
You are a data scientist leveraging Oracle Cloud Infrastructure (OCI) to create a model and need some additional Python libraries for processing genome sequencing data. Which of the following THREE statements are correct with respect to installing additional Python libraries to process the data?
What is the minimum active storage duration for logs used by Logging Analytics to be archived?
Which Web Application Firewall (WAF) service component must be configured to allow, block, or log network requests when they meet specified criteria?
What is feature engineering in machine learning used for?
Which OCI service provides a managed Kubernetes service for deploying, scaling, and managing containerized applications?
How are datasets exported in the OCI Data Labeling service?
As a data scientist, you require a pipeline to train ML models. When can a pipeline run be initiated?
In machine learning, what is the primary difference between supervised and unsupervised learning?
You are attempting to save a model from a notebook session to the model catalog by using the Accelerated Data Science (ADS) SDK, with resource principal as the authentication signer, and you get a 404 authentication error. Which two things should you check to ensure permissions are set up correctly?
You want to make API calls against other OCI services from your instance without configuring user credentials. How would you achieve this?
You are creating an Oracle Cloud Infrastructure (OCI) Data Science job that will run on a recurring basis in a production environment. This job will pick up sensitive data from an Object Storage Bucket, train a model, and save it to the model catalog. How would you design the authentication mechanism for the job?
True or false? Bias is a common problem in data science applications.
Why is data sampling useful for data scientists?
Which OCI service enables you to build, train, and deploy machine learning models in the cloud?
As you are working in your notebook session, you find that your notebook session does not have enough compute CPU and memory for your workload. How would you scale up your notebook session without losing your work?
You are using a custom application with third-party APIs to manage applications and data hosted in an Oracle Cloud Infrastructure (OCI) tenancy. Although your third-party APIs don’t support OCI’s signature-based authentication, you want them to communicate with OCI resources. Which authentication option must you use to ensure this?
You have just started as a data scientist at a healthcare company. You have been asked to analyze and improve a deep neural network model, which was built based on the electrocardiogram records of patients. There are no details about the model framework that was built. What would be the best way to find more details about the machine learning models inside the model catalog?
Which cache rules criterion matches if the concatenation of the requested URL path and query are identical to the contents of the value field?
You are given a task of writing a program that sorts document images by language. Which Oracle AI Service would you use?
What is a common maxim about data scientists?
Using Oracle AutoML, you are tuning hyperparameters on a supported model class and have specified a time budget. AutoML terminates computation once the time budget is exhausted. What would you expect AutoML to return in case the time budget is exhausted before hyperparameter tuning is completed?
You have an embarrassingly parallel or distributed batch job on a large amount of data that you consider running using Data Science Jobs. What would be the best approach to run the workload?
You want to build a multistep machine learning workflow by using the Oracle Cloud Infrastructure (OCI) Data Science Pipeline feature. How would you configure the conda environment to run a pipeline step?