# SOLVED CS614 Quiz No. 2 Solution and Discussion

• Quiz Start Time: 04:09 PM Time Left 49
sec(s)

Question # 1 of 10 ( Start time: 04:09:48 PM ) Total Marks: 1
In nested-loop join case, if there are ‘M’ rows in outer table and ‘N’ rows in inner table, time complexity is
Select correct option:

O(MN) (Correct)
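
To see where O(MN) comes from, here is a minimal Python sketch of a naive nested-loop join. The table contents and the names `nested_loop_join`, `customers` and `orders` are made up for illustration; a real optimizer works on disk blocks, not Python lists.

```python
def nested_loop_join(outer, inner, key):
    """Naive nested-loop join: returns (matching pairs, number of comparisons)."""
    matches = []
    comparisons = 0
    for o in outer:            # M iterations over the outer table
        for i in inner:        # inner table scanned once per outer row: N iterations
            comparisons += 1
            if o[key] == i[key]:
                matches.append((o, i))
    return matches, comparisons

# Hypothetical sample tables: M = 3 outer rows, N = 4 inner rows
customers = [{"cid": 1}, {"cid": 2}, {"cid": 3}]
orders = [{"cid": 1, "amt": 10}, {"cid": 3, "amt": 5},
          {"cid": 3, "amt": 7}, {"cid": 9, "amt": 1}]

rows_a, cmp_a = nested_loop_join(customers, orders, "cid")
rows_b, cmp_b = nested_loop_join(orders, customers, "cid")
print(cmp_a, cmp_b)              # M*N comparisons either way
print(len(rows_a), len(rows_b))  # same number of matching rows either way
```

Swapping the outer and inner tables changes neither the comparison count (M times N) nor the number of matching rows in this sketch.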

Question # 2 of 10 ( Start time: 04:11:20 PM ) Total Marks: 1
In context of data mining definition, the term “value” means:
Select correct option:

Importance of hidden patterns discovered (Correct)

Question # 3 of 10 ( Start time: 04:12:51 PM ) Total Marks: 1
Classification consists of examining the properties of a newly presented observation and assigning it to a predefined ____________.
Select correct option:

Class (Correct)

Question # 4 of 10 ( Start time: 04:13:44 PM ) Total Marks: 1
The optimizer uses a hash join to join two tables if they are joined using an equijoin and
Select correct option:

A large amount of data needs to be joined (Correct)
The other condition under which a hash join is chosen: a large portion of the table needs to be joined.
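
For comparison with the nested-loop case, here is a rough Python sketch of how a hash join handles an equijoin: build a hash table on one input, then probe it with the other. The function name `hash_join` and the sample `dept`/`emp` rows are assumptions for illustration, not taken from the course material.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """In-memory hash join on an equality condition (equijoin)."""
    # Build phase: one pass over the build input
    table = defaultdict(list)
    for b in build_rows:
        table[b[key]].append(b)
    # Probe phase: one pass over the probe input, constant-time lookups
    result = []
    for p in probe_rows:
        for b in table.get(p[key], []):
            result.append((b, p))
    return result

# Hypothetical sample data
dept = [{"dept_id": 10, "name": "Sales"}, {"dept_id": 20, "name": "HR"}]
emp = [{"emp": "A", "dept_id": 10}, {"emp": "B", "dept_id": 20},
       {"emp": "C", "dept_id": 10}, {"emp": "D", "dept_id": 30}]

joined = hash_join(dept, emp, "dept_id")
print(len(joined))
```

Each input is read once, which is why this approach pays off when a large amount of data has to be joined. It only works for equality conditions, because a hash lookup cannot answer range comparisons.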

Question # 5 of 10 ( Start time: 04:15:07 PM ) Total Marks: 1
In context of nested-loop join, actual number of matching rows returned as a result of the join would be ______ of the order of tables
Select correct option:

Independent (Correct)

Question # 6 of 10 ( Start time: 04:16:36 PM ) Total Marks: 1
Normally the input data structure (a database table) for a data mining algorithm:
Select correct option:

Question # 7 of 10 ( Start time: 04:18:07 PM ) Total Marks: 1
Mining multi-dimensional databases allows users to:
Select correct option:

Analyze Data (Correct)

Question # 8 of 10 ( Start time: 04:19:38 PM ) Total Marks: 1
________ refers to the overall process of discovering useful knowledge from data and data mining refers to a particular step in this process.
Select correct option:

Knowledge discovery in database (Correct)

Question # 9 of 10 ( Start time: 04:21:01 PM ) Total Marks: 1
In case of nested-loop join, the inner table is accessed _____ for each qualifying row (or tuple) in the outer table
Select correct option:

One Time (Correct)

Question # 10 of 10 ( Start time: 04:22:32 PM ) Total Marks: 1
In contrast to data mining, statistics is ______ driven.
Assumption (Correct)

Quiz Start Time: 08:50 PM Time Left 55
sec(s)

Question # 1 of 10 ( Start time: 08:50:34 PM ) Total Marks: 1
In context of data mining definition, the term “value” means:
Select correct option:

The primary key of table
The index location of the record
Importance of hidden patterns discovered (Answer)
Numerical or string measure assigned to an attribute

Question # 2 of 10 ( Start time: 08:51:23 PM ) Total Marks: 1
Data mining is all about:
Select correct option:

Knowledge discovery in database
Finding hidden patterns in data
Finding relationships in data
All of the given options (maybe Answer)

Question # 3 of 10 ( Start time: 08:52:41 PM ) Total Marks: 1
In contrast to statistics, data mining is ______ driven.
Select correct option:

Knowledge (Answer)
Human
Database

Question # 4 of 10 ( Start time: 08:53:56 PM ) Total Marks: 1
Mining multi-dimensional databases allows users to:
Select correct option:

Categorize the data
Analyze the data (Answer)
Summarize the data
All of the given options

Question # 5 of 10 ( Start time: 08:54:40 PM ) Total Marks: 1
In context of data mining definition, the term “nontrivial” means:
Select correct option:

Discovering information is a simple task
Discovering information is a complex task (Answer)
We can not discover information
We simply find things rather than discovery

Question # 6 of 10 ( Start time: 08:55:56 PM ) Total Marks: 1
Identify the TRUE statement:
Select correct option:

Clustering is unsupervised learning and classification is supervised learning (Answer)
Clustering is supervised learning and classification is unsupervised learning
Both clustering and classification are unsupervised learning
Both clustering and classification are supervised learning

Question # 7 of 10 ( Start time: 08:57:26 PM ) Total Marks: 1
In ________ learning you don’t know the number of clusters and have no idea about their attributes.
Select correct option:

Supervised learning
Multi Dimension modeling
None of the given options

Question # 8 of 10 ( Start time: 08:58:39 PM ) Total Marks: 1
In context of clustering, the term “distance” means:
Select correct option:

Similarity/dissimilarity of records (Answer)
The difference between the primary keys of two records
The relation of a record with corresponding record in child table
None of the given options
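
As a small illustration of "distance" as a measure of similarity/dissimilarity between records, here is a Python sketch using Euclidean distance. Euclidean distance is only one possible choice, and the records `r1`, `r2`, `r3` (age, some score) are invented for this example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two numeric records of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical records: (age, score)
r1 = (25, 3.0)
r2 = (28, 7.0)
r3 = (60, 3.0)

d12 = euclidean(r1, r2)
d13 = euclidean(r1, r3)
print(d12, d13)  # the smaller distance marks the more similar pair
```

A clustering algorithm would tend to put `r1` and `r2` in the same cluster because they are closer to each other than `r1` and `r3`. In practice the attributes are usually scaled first so that one attribute does not dominate the distance.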

Question # 9 of 10 ( Start time: 08:59:22 PM ) Total Marks: 1
In data mining, initially you _____ what you are looking for.
Select correct option:

Know
May or may not know
None of the given options

Question # 10 of 10 ( Start time: 09:00:41 PM ) Total Marks: 1
The optimizer uses a hash join to join two tables if they are joined using an equijoin and
Select correct option:

Outer table has fewer rows
Inner table has fewer rows
Cardinality of tables is equal
Large amount of data needs to be joined (Answer)

• In contrast to data mining, statistics is ______ driven.

• Which of the following is NOT one of the methodologies for Data Warehouse project development?

System Driven

• Which of the following is NOT one of the three parallel tracks in Kimball’s approach?

• Implementation of a data warehouse requires ________ activities

Highly integrated

Loosely integrated

Tightly decoupled

None of the given

• An effective user education program includes, among others, the following guideline(s):

• “If resources increase in proportion to increase in data size, time is constant”. The statement refers to:

• Data mining is all about:

• In context of data parallelism, to get a speed-up of N with N partitions, it must be ensured that:

• Vertically wide data means:

• In context of the most fundamental data warehouse life cycle model, which of the following is NOT one of the data warehouse design activities?
Select correct option:
End-user interviews and re-interviews
Source system cataloguing
Definition of key performance indicators
System vision development

• 1_ In context of data parallelism, the work done by query processor should be:

Maximum

2_ _______ do not (typically) keep the index values in sorted order

Hash based index

3_ If every key in the data file is represented in the index file, then it is called

Dense index

4_ In context of bitmap index, the length of the bit vector is:

the number of records in the base table

5_ In context of joining tables, the join condition is specified in _____ clause.

WHERE

6_ A join is identified by multiple tables in the _____ clause.

From

7_ Parallelism can be exploited, if there is:

All of the given options

8_ In ____ index, the ith bit is set to “1” if the ith row of the base table has the value for the indexed column.

Bitmap index

9_ As the number of processors increases, the speedup should also increase; thus we should have linear speedup. Which of the following is NOT one of the barriers to achieving this linear speed-up?

Amdahl’s Law (not sure)

10_ Bitmap index is appropriate for:

Low cardinality data
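
The bitmap index answers above (one bit per row, bit i set when row i holds the value, best for low cardinality columns) can be pictured with a short Python sketch. The function name `build_bitmap_index` and the `gender` column are hypothetical, and real systems store compressed bit vectors rather than Python lists.

```python
def build_bitmap_index(column):
    """Build one bit vector per distinct value; each vector has one bit per row."""
    index = {}
    for row, value in enumerate(column):
        bits = index.setdefault(value, [0] * len(column))
        bits[row] = 1  # the ith bit is 1 if the ith row has this value
    return index

# Hypothetical low-cardinality column from a 5-row base table
gender = ["M", "F", "F", "M", "F"]
idx = build_bitmap_index(gender)

print(idx["M"])  # bit vector for value "M"
print(idx["F"])  # bit vector for value "F"
```

Every bit vector has the same length as the base table, so a column with only a few distinct values needs only a few vectors. A high cardinality column would need one vector per distinct value, which is why bitmap indexes suit low cardinality data.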

Q1: In context of nested-loop join, the actual number of matching rows returned as a result of the join would be ________ of the order of tables.
Independent.

Q2: Which of the following is NOT one of the parallel hardware architectures?
Shared Memory

Q3: “If resources increase in proportion to increase in data size, time is constant”. The statement refers to:
Scale-Up

Q4: If every key in the data file is represented in the index file, then it is called:
Dense Index

Q5: In context of data parallelism, to get a speed-up of N with N partitions, it must be ensured that:
All of the given options

Q6: In nested-loop join case, if there are ‘M’ rows in outer table and ‘N’ rows in inner table, time complexity is:
O(MN)

Q7: The goal of __________ is to look at as few blocks as possible to find the matching record(s).
Indexing

Q8: Parallelism can be exploited, if there is.
All of the given options

Q9: If we apply Run Length Encoding on the input “11001100”, the output will be.
21#20#21#20
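
The encoding in this answer can be reproduced with a few lines of Python. The output format (run length, then the symbol, with runs separated by "#") is inferred from the quiz answer itself; the function name `rle` is made up for this sketch.

```python
from itertools import groupby

def rle(bits):
    """Run-length encode a string as <count><symbol> pairs joined by '#'."""
    return "#".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(bits))

encoded = rle("11001100")
print(encoded)  # 21#20#21#20
```

Reading the result: two 1s, two 0s, two 1s, two 0s. Run-length encoding pays off on long runs of the same bit, which is why it is commonly applied to sparse bitmap vectors.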

Q10: Which of the following is NOT one of the variants of Nested-loop join?
Binary index nested-loop join.

Q11: In context of data parallelism, the work done by query processor should be:
Maximum.

Q12: ___________ do not (typically) keep the index values in sorted order
Hash based Index

Q13: If every key in the data file is represented in the index file, then it is called:
Dense Index

Q14: In context of bitmap index, the length of the bit vector is:
The number of records in the base table

Q15: In context of joining tables, the join condition is specified in ______ clause:
Where

Q16: A join is identified by multiple tables in the ________ clause.
From

Q17: Parallelism can be exploited, if there is:
All of the given options

Q18: In ________ index, the ith bit is set to “1” if the ith row of the base table has the value for the indexed column.
Bitmap index

Q19: As the number of processors increases, the speedup should also increase. Thus we should have linear speedup. Which of the following is NOT one of the barriers to achieving this linear speed-up?
Amdahl’s Law

Q20: Bitmap index is appropriate for:
Low cardinality data

Q21: If a task takes “T” time units to execute on a single data item, then execution of the task on “N” data items will take______ time units?
N*T

Q22: _________ lists each term in the collection only once and then shows a list of all the documents that contain the given term.
Inverted index
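
A tiny inverted index can be built in Python as follows. This is only a sketch: the document collection `docs` and the function name `build_inverted_index` are invented here, and splitting on whitespace stands in for real tokenization.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical document collection
docs = {
    1: "data warehouse design",
    2: "data mining",
    3: "warehouse indexing",
}

inverted = build_inverted_index(docs)
print(inverted["data"])       # documents containing "data"
print(inverted["warehouse"])  # documents containing "warehouse"
```

Each term appears once as a key, and its value is the list of documents containing it, which matches the definition in the question.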

Q23: “More resources means proportionally less time for given amount of data”. The statement refers to:
Speed-Up
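
The N*T, speed-up and scale-up answers can be tied together with a small Python calculation. It assumes the ideal case: equal-sized partitions and no startup, interference or skew overhead. The function names and the numbers are made up for illustration.

```python
def serial_time(n_items, t_per_item):
    """One processor handles every item: N * T time units."""
    return n_items * t_per_item

def ideal_parallel_time(n_items, t_per_item, n_partitions):
    """Ideal data parallelism: elapsed time is set by the largest partition."""
    largest_partition = -(-n_items // n_partitions)  # ceiling division
    return largest_partition * t_per_item

t1 = serial_time(1000, 2)                    # 1000 items, 2 time units each
t4 = ideal_parallel_time(1000, 2, 4)         # same data, 4 partitions
speedup = t1 / t4                            # more resources, same data

t_scaled = ideal_parallel_time(4000, 2, 4)   # 4x data on 4x resources
scaleup = t1 / t_scaled                      # time stays constant

print(t1, t4, speedup, scaleup)
```

In this idealized setting 4 partitions give a speed-up of 4, and growing the data and the resources together keeps the elapsed time unchanged (scale-up of 1). Real systems fall short of this whenever partitions are uneven or coordination costs appear.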

Q24: In context of data parallelism, to get a speed-up of N with N partitions, it must be ensured that:
All of the given options

Q25: In context of bitmap index, the length of the bit vector is
the number of records in the base table.

Q26: One of the preconditions to decide about operations to be parallelized is that:
Operations can be implemented independently of each other

Q27: A _________ index, if it fits in memory, costs only one disk I/O access to locate a record given a key.
Dense Index

Q28: In context of nested-loop join, actual number of matching rows returned as a result of the join would be ___ of the order of tables
Independent

Q29: __________ refers to “Parallel execution of a single data operation across multiple partitions of data”
Data Parallelism.

A join is identified by multiple tables in the _ FROM ___ clause

In context of joining tables, the join condition is specified in _ WHERE ___ clause

The goal of _ Indexing _ is to look at as few blocks as possible to find the matching record(s).

__ Sparse Index _____ index uses even less space than __ dense ____ index, but the block has to be searched, even for unsuccessful searches.

In context of data parallelism, to get a speed-up of N with N partitions, it must be ensured that:

If we apply Run Length Encoding on the input “11001100”, the output will be:

In B-tree index, the lowest level index blocks are called leaf blocks, and these blocks contain:

every indexed data value and a corresponding ROWID

___ Sparse Index ___ index stores first value in each block in the sequential file and a pointer to the block
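
The difference between the dense and sparse index answers can be sketched in Python. The sorted `blocks` below stand in for a sequential file, and all names (`blocks`, `dense`, `sparse_keys`, `sparse_lookup`) are hypothetical.

```python
import bisect

# Hypothetical sequential file: three blocks of sorted keys
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry per key in the data file
dense = {key: block_no for block_no, block in enumerate(blocks) for key in block}

# Sparse index: only the first key of each block, plus the block position
sparse_keys = [block[0] for block in blocks]

def sparse_lookup(key):
    """Find the candidate block via the sparse index, then search inside it."""
    pos = bisect.bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None                 # key is smaller than every indexed key
    # The block must be searched even if the key turns out to be absent
    return pos if key in blocks[pos] else None

print(len(dense), len(sparse_keys))         # index sizes
print(sparse_lookup(50), sparse_lookup(55))
```

The sparse index holds 3 entries against 9 for the dense one, but it can only say which block might hold a key. The dense index can reject a missing key such as 55 without touching any data block, while the sparse index has to read block 1 to find out.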

• Question # 1 of 10 ( Start time: 10:37:17 PM ) Total Marks: 1
In context of data mining definition, the term “nontrivial” means:
Select correct option:
Discovering information is a simple task
Discovering information is a complex task (Correct)
We can not discover information
We simply find things rather than discovery

Question # 2 of 10 ( Start time: 10:37:38 PM ) Total Marks: 1
Identify the TRUE statement:
Select correct option:
Clustering is unsupervised learning and classification is supervised learning (Correct)
Clustering is supervised learning and classification is unsupervised learning
Both clustering and classification are unsupervised learning
Both clustering and classification are supervised learning

Question # 3 of 10 ( Start time: 10:38:00 PM ) Total Marks: 1
Waterfall model is appropriate when
Select correct option:
When the budget is low
When the deadline is strict
When resources are limited
Requirements are clearly defined (Correct)

Question # 4 of 10 ( Start time: 10:38:22 PM ) Total Marks: 1
Which of the following is the most ignored step during data warehouse development
Select correct option:
The requirement verification
The vision definition
Schema validation
Success criteria development (Correct)

Question # 5 of 10 ( Start time: 10:38:58 PM ) Total Marks: 1
In ______ phase of a fundamental data warehouse life cycle model, a working model of data warehouse is deployed for a selective set of users
Select correct option:
Design
Prototype (Correct)
Deployment
Operation

Question # 6 of 10 ( Start time: 10:39:19 PM ) Total Marks: 1
One of the drawbacks of waterfall model is that:
Select correct option:
Customers can not review the product during development
It does not work when the resources are limited
It does not define the project timeline/schedule
All of the given options (Correct)

Question # 7 of 10 ( Start time: 10:39:38 PM ) Total Marks: 1
Identify the TRUE statement:
Select correct option:
The data value increases as volume decreases (Correct)
The data value decreases as the volume decreases
The data value is independent of data volume
All of the given options

Question # 8 of 10 ( Start time: 10:39:56 PM ) Total Marks: 1
In contrast to statistics, data mining is ______ driven.
Select correct option:
Assumption
Knowledge (Correct)
Human
Database

Question # 9 of 10 ( Start time: 10:40:48 PM ) Total Marks: 1
In context of data mining definition, the term “value” means:
Select correct option:
The primary key of table
The index location of the record
Importance of hidden patterns discovered (Correct)
Numerical or string measure assigned to an attribute

Question # 10 of 10 ( Start time: 10:41:24 PM ) Total Marks: 1
In context of clustering, the term “distance” means:
Select correct option:
Similarity/dissimilarity of records (Correct)
The difference between the primary keys of two records
The relation of a record with corresponding record in child table
None of the given options
