
DSA-C02 exam questions for practice in 2024 Updated 67 Questions
Updated May-2024 Premium DSA-C02 Exam Engine pdf - Download Free Updated 67 Questions
NEW QUESTION # 24
What can a Snowflake data scientist do in the Snowflake Marketplace as a provider?
- A. Share live datasets securely and in real-time without creating copies of the data or imposing data integration tasks on the consumer.
- B. Publish listings for free-to-use datasets to generate interest and new opportunities among the Snowflake customer base.
- C. Eliminate the costs of building and maintaining APIs and data pipelines to deliver data to customers.
- D. Publish listings for datasets that can be customized for the consumer.
Answer: A,B,C,D
Explanation:
All are correct!
About the Snowflake Marketplace
You can use the Snowflake Marketplace to discover and access third-party data and services, as well as market your own data products across the Snowflake Data Cloud.
As a data provider, you can use listings on the Snowflake Marketplace to share curated data offerings with many consumers simultaneously, rather than maintain sharing relationships with each individual consumer.
With Paid Listings, you can also charge for your data products.
As a consumer, you might use the data provided on the Snowflake Marketplace to explore and access the following:
Historical data for research, forecasting, and machine learning.
Up-to-date streaming data, such as current weather and traffic conditions.
Specialized identity data for understanding subscribers and audience targets.
New insights from unexpected sources of data.
The Snowflake Marketplace is available globally to all non-VPS Snowflake accounts hosted on Amazon Web Services, Google Cloud Platform, and Microsoft Azure, with the exception of Microsoft Azure Government.
Support for Microsoft Azure Government is planned.
NEW QUESTION # 25
Which type of machine learning do data scientists generally use to solve classification and regression problems?
- A. Supervised
- B. Regression Learning
- C. Instructor Learning
- D. Unsupervised
- E. Reinforcement Learning
Answer: A
Explanation:
Supervised Learning
Overview:
Supervised learning is a type of machine learning that uses labeled data to train machine learning models. In labeled data, the output is already known. The model just needs to map the inputs to the respective outputs.
Algorithms:
Some of the most popularly used supervised learning algorithms are:
Linear Regression
Logistic Regression
Support Vector Machine
K Nearest Neighbor
Decision Tree
Random Forest
Naive Bayes
Working:
Supervised learning algorithms take labelled inputs and map them to the known outputs, which means you already know the target variable.
Supervised Learning methods need external supervision to train machine learning models. Hence, the name supervised. They need guidance and additional information to return the desired result.
Applications:
Supervised learning algorithms are generally used for solving classification and regression problems.
A few of the top supervised learning applications are weather prediction, sales forecasting, and stock price analysis.
NEW QUESTION # 26
There are several types of classification tasks in machine learning. Choose the classification type that best categorizes the application tasks below.
To detect whether email is spam or not
To determine whether or not a patient has a certain disease in medicine.
To determine whether or not quality specifications were met when it comes to QA (Quality Assurance).
- A. Multi-Class Classification
- B. Binary Classification
- C. Multi-Label Classification
- D. Logistic Regression
Answer: B
Explanation:
The Supervised Machine Learning algorithm can be broadly classified into Regression and Classification Algorithms. Regression algorithms predict continuous values, whereas Classification algorithms are needed to predict categorical values.
What is the Classification Algorithm?
The Classification algorithm is a Supervised Learning technique that is used to identify the category of new observations on the basis of training data. In Classification, a program learns from the given dataset or observations and then classifies new observations into a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog, etc. Classes can also be called targets, labels, or categories.
Unlike regression, the output variable of Classification is a category, not a value, such as "Green or Blue" or "fruit or animal". Since the Classification algorithm is a Supervised learning technique, it takes labeled input data, which means it contains input with the corresponding output.
In a classification algorithm, a discrete output y is modeled as a function of the input variable x:
y = f(x), where y is a categorical output
A well-known example of an ML classification task is an email spam detector.
The main goal of the Classification algorithm is to identify the category of a given dataset, and these algorithms are mainly used to predict the output for the categorical data.
The algorithm which implements the classification on a dataset is known as a classifier. There are two types of Classifications:
Binary Classifier: If the classification problem has only two possible outcomes, then it is called a Binary Classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
Multi-class Classifier: If a classification problem has more than two outcomes, then it is called a Multi-class Classifier.
Example: Classifications of types of crops, Classification of types of music.
Binary classification in deep learning refers to the type of classification where we have two class labels - one normal and one abnormal. Some examples of binary classification use:
To detect whether email is spam or not
To determine whether or not a patient has a certain disease in medicine.
To determine whether or not quality specifications were met when it comes to QA (Quality Assurance).
For example, the normal class label could be that a patient does not have the disease, and the abnormal class label that they do, or vice versa.
As with every other type of classification, a binary classifier is only as good as the dataset it is trained on; in other words, the more and better training data it has, the better it performs.
NEW QUESTION # 27
Which command is used to install Jupyter Notebook?
- A. pip install jupyter
- B. pip install nbconvert
- C. pip install jupyter-notebook
- D. pip install notebook
Answer: A
Explanation:
Jupyter Notebook is a web-based interactive computational environment.
The command used to install Jupyter Notebook is pip install jupyter.
The command used to start Jupyter Notebook is jupyter notebook.
NEW QUESTION # 28
All Snowpark ML modeling and preprocessing classes are in the ________ namespace?
- A. snowpark.ml.modeling
- B. snowflake.scikit.modeling
- C. snowflake.sklearn.modeling
- D. snowflake.ml.modeling
Answer: D
Explanation:
All Snowpark ML modeling and preprocessing classes are in the snowflake.ml.modeling namespace. The Snowpark ML modules have the same name as the corresponding module from the sklearn namespace. For example, the Snowpark ML module corresponding to sklearn.calibration is snowflake.ml.modeling.calibration.
The xgboost and lightgbm modules correspond to snowflake.ml.modeling.xgboost and snowflake.ml.modeling.lightgbm, respectively.
Not all of the classes from scikit-learn are supported in Snowpark ML.
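The naming rule described above can be sketched as a small string mapping. The helper name snowpark_ml_module is hypothetical and is not part of any Snowflake library; it only illustrates the convention and does not check whether a given module is actually supported.

```python
def snowpark_ml_module(name: str) -> str:
    """Map an sklearn, xgboost, or lightgbm module name to the
    snowflake.ml.modeling name it would have under the naming rule.

    Hypothetical helper for illustration only: it does not verify
    that Snowpark ML actually ships the resulting module.
    """
    prefix = "sklearn."
    if name.startswith(prefix):
        # sklearn.calibration -> calibration
        name = name[len(prefix):]
    return f"snowflake.ml.modeling.{name}"


calibration = snowpark_ml_module("sklearn.calibration")
xgb = snowpark_ml_module("xgboost")
```

Because not every scikit-learn class is supported, a name produced this way is only a candidate and may not exist in an installed package.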
NEW QUESTION # 29
Which one of the following is not a key component when designing external functions in Snowflake?
- A. UDF Service
- B. Proxy Service
- C. Remote Service
- D. API Integration
Answer: A
Explanation:
What is an External Function?
An external function calls code that is executed outside Snowflake.
The remotely executed code is known as a remote service.
Information sent to a remote service is usually relayed through a proxy service.
Snowflake stores security-related external function information in an API integration.
External Function:
An external function is a type of UDF. Unlike other UDFs, an external function does not contain its own code; instead, the external function calls code that is stored and executed outside Snowflake.
Inside Snowflake, the external function is stored as a database object that contains information that Snowflake uses to call the remote service. This stored information includes the URL of the proxy service that relays information to and from the remote service.
Remote Service:
The remotely executed code is known as a remote service.
The remote service must act like a function. For example, it must return a value.
Snowflake supports scalar external functions; the remote service must return exactly one row for each row received.
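The one-row-in, one-row-out requirement can be illustrated with a minimal sketch of a remote service handler. It assumes a JSON batch layout in which the request body has a "data" key holding a list of rows and each row starts with its row number; the handler name handle_request and the upper-casing logic are made up for illustration.

```python
import json


def handle_request(body: str) -> str:
    """Hypothetical remote service: return exactly one output row per input row.

    Assumes each input row is [row_number, value] and echoes the row
    number back so the caller can match results to inputs.
    """
    rows = json.loads(body)["data"]
    out = []
    for row in rows:
        row_number, value = row[0], row[1]
        # The "function" applied here is just upper-casing the value.
        out.append([row_number, str(value).upper()])
    return json.dumps({"data": out})


request = json.dumps({"data": [[0, "alpha"], [1, "beta"]]})
response = json.loads(handle_request(request))
```

In a real deployment this logic would sit behind an HTTP endpoint that the proxy service forwards to, for example a serverless function.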
Proxy Service:
Snowflake does not call a remote service directly. Instead, Snowflake calls a proxy service, which relays the data to the remote service.
The proxy service can increase security by authenticating requests to the remote service.
The proxy service can support subscription-based billing for a remote service. For example, the proxy service can verify that a caller to the remote service is a paid subscriber.
The proxy service also relays the response from the remote service back to Snowflake.
Examples of proxy services include:
Amazon API Gateway.
Microsoft Azure API Management service.
API Integration:
An integration is a Snowflake object that provides an interface between Snowflake and third-party services.
An API integration stores information, such as security information, that is needed to work with a proxy service or remote service.
An API integration is created with the CREATE API INTEGRATION command.
Users can write and call their own remote services, or call remote services written by third parties. These remote services can be written using any HTTP server stack, including cloud serverless compute services such as AWS Lambda.
NEW QUESTION # 30
Mark the incorrect statement regarding the use of Snowflake streams and tasks.
- A. Snowflake automatically resizes and scales the compute resources for serverless tasks.
- B. Streams support repeatable read isolation.
- C. Snowflake ensures only one instance of a task with a schedule (i.e. a standalone task or the root task in a DAG) is executed at a given time. If a task is still running when the next scheduled execution time occurs, then that scheduled time is skipped.
- D. A standard stream tracks row inserts only.
Answer: D
Explanation:
All statements are correct except D.
A standard (i.e. delta) stream tracks all DML changes to the source object, including inserts, updates, and deletes (including table truncates). It is an append-only stream that tracks row inserts only.
NEW QUESTION # 31
Consider a data frame df with columns ['A', 'B', 'C', 'D'] and rows ['r1', 'r2', 'r3']. What does the expression df[lambda x : x.index.str.endswith('3')] do?
- A. Filters the row labelled r3
- B. Returns the row name r3
- C. Returns the third column
- D. Results in Error
Answer: A
Explanation:
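It filters the row labelled r3. The callable receives the data frame and returns a boolean mask over the index, and df[...] keeps only the rows where the mask is True.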
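The behaviour can be checked with a small pandas sketch. The cell values below are made up; only the row and column labels come from the question.

```python
import pandas as pd

# Data frame with the labels from the question; the numbers are arbitrary.
df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    index=["r1", "r2", "r3"],
    columns=["A", "B", "C", "D"],
)

# The callable is evaluated against df and yields a boolean mask per row.
mask = df.index.str.endswith("3")

# Indexing with the callable keeps only the rows where the mask is True.
filtered = df[lambda x: x.index.str.endswith("3")]
```

The result is still a data frame with all four columns, so the expression filters rows rather than returning a row name or a column.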
NEW QUESTION # 32
Performance metrics are a part of every machine learning pipeline. Which of the following is not a performance metric used in machine learning?
- A. AUM
- B. R2 (R-Squared)
- C. AU-ROC
- D. Root Mean Squared Error (RMSE)
Answer: A
Explanation:
Most supervised machine learning tasks can be broken down into either Regression or Classification, and so can the performance metrics.
Metrics are used to monitor and measure the performance of a model (during training and testing), and do not need to be differentiable.
Regression metrics
Regression models have continuous output. So, we need a metric based on calculating some sort of distance between predicted and ground truth.
In order to evaluate Regression models, we'll discuss these metrics in detail:
Mean Absolute Error (MAE),
Mean Squared Error (MSE),
Root Mean Squared Error (RMSE),
R2 (R-Squared).
Mean Squared Error (MSE)
Mean squared error is perhaps the most popular metric used for regression problems. It essentially finds the average of the squared difference between the target value and the value predicted by the regression model.
A few key points related to MSE:
It's differentiable, so it can be optimized better.
It penalizes large errors disproportionately by squaring them, which can lead to an overestimation of how bad the model is.
Error interpretation has to be done with the squaring factor (scale) in mind. For example, in a Boston Housing regression problem with MSE=21.89, the value is expressed in squared price units, (Prices)^2.
Due to the squaring factor, it's fundamentally more sensitive to outliers than other metrics.
Mean Absolute Error (MAE)
Mean Absolute Error is the average of the absolute difference between the ground truth and the predicted values.
A few key points for MAE:
It's more robust towards outliers than MSE, since it doesn't exaggerate errors.
It gives us a measure of how far the predictions were from the actual output. However, since MAE uses the absolute value of the residual, it doesn't give us an idea of the direction of the error, i.e. whether we're under-predicting or over-predicting the data.
Error interpretation needs no second thoughts, as the error is expressed in the same units as the original variable.
MAE is not differentiable at zero error, as opposed to MSE, which is differentiable everywhere.
Root Mean Squared Error (RMSE)
Root Mean Squared Error corresponds to the square root of the average of the squared difference between the target value and the value predicted by the regression model.
Few key points related to RMSE:
It retains the differentiable property of MSE.
It handles the penalization of smaller errors done by MSE by square rooting it.
Error interpretation can be done smoothly, since the scale is now the same as the random variable.
Since scale factors are essentially normalized, it's less prone to struggle in the case of outliers.
R2 Coefficient of determination
The R2 coefficient of determination works as a post metric, meaning it's a metric that's calculated using other metrics.
The point of calculating this coefficient is to answer the question "How much (what %) of the total variation in Y (target) is explained by the variation in X (regression line)?"
A few intuitions related to R2 results:
If the sum of squared errors of the regression line is small, R2 will be close to 1 (ideal), meaning the regression was able to capture nearly all of the variance in the target variable.
Conversely, if the sum of squared errors of the regression line is nearly as high as that of simply predicting the mean, R2 will be close to 0, meaning the regression wasn't able to capture any variance in the target variable.
You might think that the range of R2 is [0, 1], but it's actually (-∞, 1], because the ratio of the squared errors of the regression line to those of the mean can exceed 1 if the squared error of the regression line is too high (greater than the squared error of the mean).
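The regression metrics above can be worked through in a short standard-library sketch. The function name regression_metrics and the sample values are illustrative assumptions, and R2 is computed as 1 minus the ratio of the model's squared error to the squared error of predicting the mean.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = sqrt(mse)
    # R2 compares the model's squared error with that of predicting the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


y_true = [3.0, 5.0, 7.0, 9.0]
good = regression_metrics(y_true, [2.0, 5.0, 8.0, 9.0])
# Predictions that are worse than the mean push R2 below zero.
bad = regression_metrics(y_true, [9.0, 7.0, 5.0, 3.0])
```

The second call shows why the lower bound of R2 is not 0: a model that does worse than predicting the mean gets a negative score.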
Classification metrics
Classification problems are one of the world's most widely researched areas. Use cases are present in almost all production and industrial environments. Speech recognition, face recognition, text classification - the list is endless.
Classification models have discrete output, so we need a metric that compares discrete classes in some form.
Classification Metrics evaluate a model's performance and tell you how good or bad the classification is, but each of them evaluates it in a different way.
So in order to evaluate Classification models, we'll discuss these metrics in detail:
Accuracy
Confusion Matrix (not a metric but fundamental to others)
Precision and Recall
F1-score
AU-ROC
Accuracy
Classification accuracy is perhaps the simplest metric to use and implement and is defined as the number of correct predictions divided by the total number of predictions, multiplied by 100.
We can implement this by comparing ground truth and predicted values in a loop or simply utilizing the scikit-learn module to do the heavy lifting for us (not so heavy in this case).
Confusion Matrix
Confusion Matrix is a tabular visualization of the ground-truth labels versus model predictions. In the convention used here, each row of the confusion matrix represents the instances in a predicted class and each column represents the instances in an actual class; some libraries, such as scikit-learn, use the transposed layout. The confusion matrix is not exactly a performance metric but rather a basis on which other metrics evaluate the results.
Each cell in the confusion matrix represents an evaluation factor. Let's understand these factors one by one:
True Positive(TP) signifies how many positive class samples your model predicted correctly.
True Negative(TN) signifies how many negative class samples your model predicted correctly.
False Positive(FP) signifies how many negative class samples your model predicted incorrectly. This factor represents Type-I error in statistical nomenclature. This error positioning in the confusion matrix depends on the choice of the null hypothesis.
False Negative(FN) signifies how many positive class samples your model predicted incorrectly. This factor represents Type-II error in statistical nomenclature. This error positioning in the confusion matrix also depends on the choice of the null hypothesis.
Precision
Precision is the ratio of true positives to all predicted positives: TP / (TP + FP).
Recall/Sensitivity/Hit-Rate
Recall is the ratio of true positives to all the positives in the ground truth: TP / (TP + FN).
Precision-Recall tradeoff
Improving precision often comes at the expense of recall, and vice versa, so it is hard to improve both at once. If you try to reduce cases of non-cancerous patients being labeled as cancerous (FP/type-I), no direct effect will take place on cancerous patients being labeled as non-cancerous (FN/type-II).
F1-score
The F1-score metric uses a combination of precision and recall. In fact, the F1 score is the harmonic mean of the two.
AUROC (Area under Receiver operating characteristics curve)
Better known as AUC-ROC score/curves. It makes use of true positive rates (TPR) and false positive rates (FPR).
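The confusion-matrix counts and the metrics derived from them can be sketched in plain Python. The function names and sample labels are illustrative assumptions, accuracy is returned as a fraction rather than a percentage, and AU-ROC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # Type-I error
        elif truth == positive:
            fn += 1  # Type-II error
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall and F1 from the four counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
```

Returning 0.0 when a denominator is zero is a design choice made here to keep the sketch total; other conventions exist.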
NEW QUESTION # 33
Which of the following are key actions in the data collection phase of machine learning?
- A. Label
- B. Ingest and Aggregate
- C. Measure
- D. Probability
Answer: A,B
Explanation:
The key actions in the data collection phase include:
Label: Labeled data is the raw data that was processed by adding one or more meaningful tags so that a model can learn from it. It will take some work to label it if such information is missing (manually or automatically).
Ingest and Aggregate: Incorporating and combining data from many data sources is part of data collection in AI.
Data collection
Collecting data for training the ML model is the basic step in the machine learning pipeline. The predictions made by ML systems can only be as good as the data on which they have been trained. Following are some of the problems that can arise in data collection:
Inaccurate data. The collected data could be unrelated to the problem statement.
Missing data. Sub-data could be missing. That could take the form of empty values in columns or missing images for some class of prediction.
Data imbalance. Some classes or categories in the data may have a disproportionately high or low number of corresponding samples. As a result, they risk being under-represented in the model.
Data bias. Depending on how the data, subjects and labels themselves are chosen, the model could propagate inherent biases on gender, politics, age or region, for example. Data bias is difficult to detect and remove.
Several techniques can be applied to address those problems:
Pre-cleaned, freely available datasets. If the problem statement (for example, image classification, object recognition) aligns with a clean, pre-existing, properly formulated dataset, then take advantage of existing, open-source expertise.
Web crawling and scraping. Automated tools, bots and headless browsers can crawl and scrape websites for data.
Private data. ML engineers can create their own data. This is helpful when the amount of data required to train the model is small and the problem statement is too specific to generalize over an open-source dataset.
Custom data. Agencies can create or crowdsource the data for a fee.
NEW QUESTION # 34
Which is the visual depiction of data through the use of graphs, plots, and informational graphics?
- A. Data visualization
- B. Data Virtualization
- C. Data Interpretation
- D. Data Mining
Answer: A
Explanation:
Data visualization is the visual depiction of data through the use of graphs, plots, and informational graphics.
Its practitioners use statistics and data science to convey the meaning behind data in ethical and accurate ways.
NEW QUESTION # 35
As a data scientist looking to use a reader account, which of the following are correct considerations about reader accounts for third-party access?
- A. Reader accounts (formerly known as "read-only accounts") provide a quick, easy, and cost-effective way to share data without requiring the consumer to become a Snowflake customer.
- B. Users in a reader account can query data that has been shared with the reader account, but cannot perform any of the DML tasks that are allowed in a full account, such as data loading, insert, update, and similar data manipulation operations.
- C. Each reader account belongs to the provider account that created it.
- D. Data sharing is only possible between Snowflake accounts.
Answer: D
Explanation:
Data sharing is only supported between Snowflake accounts. As a data provider, you might want to share data with a consumer who does not already have a Snowflake account or is not ready to become a licensed Snowflake customer.
To facilitate sharing data with these consumers, you can create reader accounts. Reader accounts (formerly known as "read-only accounts") provide a quick, easy, and cost-effective way to share data without requiring the consumer to become a Snowflake customer.
Each reader account belongs to the provider account that created it. As a provider, you use shares to share databases with reader accounts; however, a reader account can only consume data from the provider account that created it.
So a provider can share data with consumers who are not Snowflake customers by creating a reader account for them.
NEW QUESTION # 36
Which one is not a feature engineering technique used in the ML and data science world?
- A. Statistical
- B. Imputation
- C. Binning
- D. One hot encoding
Answer: A
Explanation:
Feature engineering is the pre-processing step of machine learning, which is used to transform raw data into features that can be used for creating a predictive model using Machine learning or statistical Modelling.
What is a feature?
Generally, all machine learning algorithms take input data to generate the output. The input data remains in a tabular form consisting of rows (instances or observations) and columns (variables or attributes), and these attributes are often known as features. For example, an image is an instance in computer vision, but a line in the image could be the feature. Similarly, in NLP, a document can be an observation, and the word count could be the feature. So, we can say a feature is an attribute that impacts a problem or is useful for the problem.
What is Feature Engineering?
Feature engineering is the pre-processing step of machine learning, which extracts features from raw data. It helps to represent an underlying problem to predictive models in a better way, which as a result improves the accuracy of the model on unseen data. The predictive model contains predictor variables and an outcome variable, and the feature engineering process selects the most useful predictor variables for the model.
Some of the popular feature engineering techniques include:
1. Imputation
Feature engineering deals with inappropriate data, missing values, human interruption, general errors, insufficient data sources, etc. Missing values within the dataset strongly affect the performance of the algorithm, and the "imputation" technique is used to deal with them. Imputation is responsible for handling irregularities within the dataset.
For example, a row or column with a large percentage of missing values can be removed entirely. But to maintain the data size, it is often preferable to impute the missing data, which can be done as follows:
For numerical data imputation, a default value can be imputed in a column, or missing values can be filled with the mean or median of the column.
For categorical data imputation, missing values can be replaced with the most frequent value in the column.
2. Handling Outliers
Outliers are data points that lie so far from the other data points that they badly affect the performance of the model. This feature engineering technique first identifies the outliers and then removes them.
Standard deviation can be used to identify outliers. For example, each value lies at some distance from the mean, and a value whose distance exceeds a chosen number of standard deviations can be considered an outlier.
Z-score can also be used to detect outliers.
3. Log transform
Logarithm transformation or log transform is one of the commonly used mathematical techniques in machine learning. Log transform helps in handling skewed data, and it makes the distribution closer to normal after transformation. It also reduces the effect of outliers on the data: because it normalizes magnitude differences, the model becomes much more robust.
4. Binning
In machine learning, overfitting is one of the main issues that degrade the performance of the model; it occurs due to a large number of parameters and noisy data. One of the popular techniques of feature engineering, "binning", can be used to smooth noisy data. This process involves segmenting the values of a feature into bins.
5. Feature Split
As the name suggests, feature split is the process of splitting a feature into two or more parts to make new features. This technique helps the algorithms to better understand and learn the patterns in the dataset.
The feature splitting process enables the new features to be clustered and binned, which results in extracting useful information and improving the performance of the data models.
6. One hot encoding
One hot encoding is a popular encoding technique in machine learning. It converts categorical data into a form that machine learning algorithms can easily understand and hence use to make good predictions. It enables the grouping of categorical data without losing any information.
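Several of the techniques listed above can be sketched with the Python standard library alone. All function names, thresholds, and sample values here are illustrative assumptions, and a real pipeline would more likely use a data frame library.

```python
from bisect import bisect_right
from math import log1p
from statistics import mean, median, mode, pstdev


def impute_numeric(values):
    """Imputation: fill missing (None) numbers with the column median."""
    fill = median([v for v in values if v is not None])
    return [fill if v is None else v for v in values]


def impute_categorical(values):
    """Imputation: fill missing (None) categories with the most frequent one."""
    fill = mode([v for v in values if v is not None])
    return [fill if v is None else v for v in values]


def zscore_outliers(values, threshold=3.0):
    """Handling outliers: flag values whose z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]


def log_transform(values):
    """Log transform: log(1 + v), assuming non-negative inputs."""
    return [log1p(v) for v in values]


def bin_index(value, edges):
    """Binning: index of the bin that value falls into, given sorted edges."""
    return bisect_right(edges, value)


def one_hot(values):
    """One hot encoding: one 0/1 column per distinct category."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]


numeric = impute_numeric([1.0, None, 3.0])
categorical = impute_categorical(["a", None, "a", "b"])
readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 100.0]
outliers = zscore_outliers(readings, threshold=2.0)
age_bins = [bin_index(age, [18, 65]) for age in (10, 30, 70)]
categories, encoded = one_hot(["red", "green", "red"])
```

A threshold of 2.0 is used in the example because a single extreme value in a small sample inflates the standard deviation and keeps its own z-score just under 3.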
NEW QUESTION # 37
Which one is not a type of feature scaling?
- A. Standard Scaling
- B. Robust Scaling
- C. Economy Scaling
- D. Min-Max Scaling
Answer: C
Explanation:
Feature Scaling
Feature Scaling is the process of transforming the features so that they have a similar scale. This is important in machine learning because the scale of the features can affect the performance of the model.
Types of Feature Scaling:
Min-Max Scaling: Rescaling the features to a specific range, such as between 0 and 1, by subtracting the minimum value and dividing by the range.
Standard Scaling: Rescaling the features to have a mean of 0 and a standard deviation of 1 by subtracting the mean and dividing by the standard deviation.
Robust Scaling: Rescaling the features to be robust to outliers by subtracting the median and dividing by the interquartile range.
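The three scaling types can be compared on one small column using only the standard library. The function names and sample data are illustrative assumptions; the robust variant here subtracts the median before dividing by the interquartile range, which is a common formulation but an assumption beyond the one-line description above.

```python
from statistics import mean, median, pstdev, quantiles


def min_max_scale(values):
    """Rescale to [0, 1] by subtracting the minimum and dividing by the range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Rescale to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 acts as an outlier
scaled_min_max = min_max_scale(data)
scaled_standard = standard_scale(data)
scaled_robust = robust_scale(data)
```

With min-max scaling the outlier squeezes the other four values close to 0, whereas robust scaling keeps them evenly spread because the median and interquartile range ignore the extreme value.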
Benefits of Feature Scaling:
Improves Model Performance: By transforming the features to have a similar scale, the model can learn from all features equally and avoid being dominated by a few large features.
Increases Model Robustness: By transforming the features to be robust to outliers, the model can become more robust to anomalies.
Improves Computational Efficiency: Many machine learning algorithms, such as k-nearest neighbors, are sensitive to the scale of the features and perform better with scaled features.
Improves Model Interpretability: By transforming the features to have a similar scale, it can be easier to understand the model's predictions.
NEW QUESTION # 38
Which one is an incorrect understanding about providers of a direct share?
- A. You can create as many shares as you want, and add as many accounts to a share as you want.
- B. A data provider is any Snowflake account that creates shares and makes them available to other Snowflake accounts to consume.
- C. If you want to provide a share to many accounts, you can do the same via Direct Share.
- D. As a data provider, you share a database with one or more Snowflake accounts.
Answer: C
Explanation:
If you want to provide a share to many accounts, you might want to use a listing or a data exchange instead of a direct share.
NEW QUESTION # 39
Select the Data Science Tools which are known to provide native connectivity to Snowflake?
- A. HEX
- B. Denodo
- C. DvSUM
- D. DiYotta
Answer: A
Explanation:
Hex - collaborative data science and analytics platform
Denodo - data virtualization and federation platform
DvSum - data catalog and data intelligence platform
Diyotta - data integration and migration
NEW QUESTION # 40
You are training a binary classification model to support admission approval decisions for a college degree program.
How can you evaluate if the model is fair, and doesn't discriminate based on ethnicity?
- A. None of the above.
- B. Compare disparity between selection rates and performance metrics across ethnicities.
- C. Evaluate each trained model with a validation dataset and use the model with the highest accuracy score.
- D. Remove the ethnicity feature from the training dataset.
Answer: B
Explanation:
By using ethnicity as a sensitive field, and comparing disparity between selection rates and performance metrics for each ethnicity value, you can evaluate the fairness of the model.
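A minimal sketch of that comparison is shown below, using made-up group labels and predictions. The function names are hypothetical, and disparity is taken here as the gap between the highest and lowest per-group value, which is one of several possible definitions.

```python
from collections import defaultdict


def selection_rates(groups, y_pred):
    """Share of positive predictions (admissions) per sensitive group."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, pred in zip(groups, y_pred):
        totals[group] += 1
        selected[group] += int(pred == 1)
    return {g: selected[g] / totals[g] for g in totals}


def accuracy_by_group(groups, y_true, y_pred):
    """Share of correct predictions per sensitive group."""
    totals, correct = defaultdict(int), defaultdict(int)
    for group, truth, pred in zip(groups, y_true, y_pred):
        totals[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / totals[g] for g in totals}


def disparity(per_group):
    """Gap between the best and worst group for a given metric."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical data: two groups of four applicants each.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, y_pred)
accuracies = accuracy_by_group(groups, y_true, y_pred)
rate_gap = disparity(rates)
```

In this example both groups have the same accuracy but very different selection rates, which is why accuracy alone cannot show whether a model is fair.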
NEW QUESTION # 41
Mark the incorrect understanding of a data scientist about streams.
- A. Streams itself does not contain any table data.
- B. Streams on views support both local views and views shared using Snowflake Secure Data Sharing, including secure views.
- C. Streams can track changes in materialized views.
- D. Streams do not support repeatable read isolation.
Answer: C,D
Explanation:
Streams on views support both local views and views shared using Snowflake Secure Data Sharing, including secure views. Currently, streams cannot track changes in materialized views.
A stream itself does not contain any table data. A stream only stores an offset for the source object and returns CDC records by leveraging the versioning history for the source object. When the first stream for a table is created, several hidden columns are added to the source table and begin storing change tracking metadata.
These columns consume a small amount of storage. The CDC records returned when querying a stream rely on a combination of the offset stored in the stream and the change tracking metadata stored in the table. Note that for streams on views, change tracking must be enabled explicitly for the view and underlying tables to add the hidden columns to these tables.
Streams support repeatable read isolation. In repeatable read mode, multiple SQL statements within a transaction see the same set of records in a stream. This differs from the read committed mode supported for tables, in which statements see any changes made by previous statements executed within the same transaction, even though those changes are not yet committed.
The delta records returned by streams in a transaction are the range from the current position of the stream until the transaction start time. The stream position advances to the transaction start time if the transaction commits; otherwise it stays at the same position.
NEW QUESTION # 42
What is the formula for measuring skewness in a dataset?
- A. (3(MEAN - MEDIAN))/ STANDARD DEVIATION
- B. (MEAN - MODE)/ STANDARD DEVIATION
- C. MODE - MEDIAN
- D. MEAN - MEDIAN
Answer: A
Explanation:
The formula 3(MEAN - MEDIAN) / STANDARD DEVIATION is Pearson's second coefficient of skewness. Since the normal curve is symmetric about its mean, its skewness is zero. This is a theoretical explanation; for mathematical proofs, you can refer to books or websites that cover the topic in detail.
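The formula from option A can be tried on two small samples. The function name is illustrative, and the population standard deviation is used here as an assumption, since the question does not say which variant is meant.

```python
from statistics import mean, median, pstdev


def median_skewness(values):
    """Skewness as 3 * (mean - median) / standard deviation."""
    return 3 * (mean(values) - median(values)) / pstdev(values)


symmetric = median_skewness([1.0, 2.0, 3.0, 4.0, 5.0])
right_skewed = median_skewness([1.0, 2.0, 3.0, 4.0, 20.0])
```

A symmetric sample has equal mean and median, so the coefficient is zero, while a long right tail pulls the mean above the median and makes it positive.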

