CS441 Assignment 3 Solution and Discussion
Assignment No. 03 (Graded)
SEMESTER Fall 2019
CS441- Big Data Concepts Total Marks: 20
Due Date: 20-01-2020
Please read the following instructions carefully before submitting the assignment:
It should be clear that your assignment will not get any credit if:
o Assignment is submitted after the due date.
o Submitted assignment does not open or file is corrupt.
o Assignment is copied (From internet/students).
To enable students to write and execute different HiveQL queries, such as:
• Create database
• Create table
• Load data in a table
• Select query
Lectures Covered: This assignment covers the topics of Week 10.
Assignment Submission Instructions
Submit only a .doc file through the Assignments interface of CS441 on VULMS. An assignment submitted in any other format will not be accepted and will be awarded zero marks.
You can visit the following link to write the HiveQL queries using an online editor:
Enter “demo” as both the user name and the password to access the editor and to write and run HiveQL queries.
For any query about the assignment, contact at [email protected]
You are required to write HiveQL queries for the following tasks:
- Create a database named “VU”.
- Create the following table named “Student” in the “VU” database:
  Field Name       Data type
  Std-ID           int
  Std-Name         String
  Std-Fname        String
  CGPA             Float
  Cell No          String
  Study Program    String
- Write a Hive query that adds the following rows to the “Student” table. Assume that the data is stored in a text file named “Std-Data.txt” in the /home/user directory.
  101 Kamran Usman 3.0 0300-0000000 BCS
  102 Arshad Anwaar 2.75 0321-1111111 MCS
  103 Waqar Jehanzeb 3.5 0345-2222222 MBA
  104 Saad Ameen 2.25 0312-3333333 MCS
  105 Pervez Khalid 3.75 0333-4444444 BCS
- Write a Hive query that displays all the information of those students whose CGPA is greater than or equal to 3.0.
- Write a Hive query that finds the total number of students in each study program.
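
One possible set of HiveQL statements for the five tasks above is sketched below. It is not an official solution and rests on a few assumptions that the assignment does not state: hyphens and spaces are not valid in unquoted Hive identifiers, so the field names are written with underscores (Std_ID, Cell_No, and so on); the fields in “Std-Data.txt” are assumed to be separated by a single space; and /home/user is assumed to be a directory on the local file system, hence the LOCAL keyword. The alias total_students is also just an illustrative name.

```sql
-- Task 1: create the database and switch to it.
-- IF NOT EXISTS lets the script be re-run without an error.
CREATE DATABASE IF NOT EXISTS VU;
USE VU;

-- Task 2: create the Student table.
-- Assumption: underscores replace the hyphens/spaces in the field names,
-- because names such as Std-ID would otherwise need backtick quoting.
CREATE TABLE IF NOT EXISTS Student (
  Std_ID        INT,
  Std_Name      STRING,
  Std_Fname     STRING,
  CGPA          FLOAT,
  Cell_No       STRING,
  Study_Program STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '   -- assumption: fields are space-separated
STORED AS TEXTFILE;

-- Task 3: load the rows from the text file into the table.
-- Assumption: the file is on the local file system; drop LOCAL if it is in HDFS.
LOAD DATA LOCAL INPATH '/home/user/Std-Data.txt' INTO TABLE Student;

-- Task 4: all information for students with CGPA >= 3.0.
SELECT *
FROM Student
WHERE CGPA >= 3.0;

-- Task 5: total number of students in each study program.
SELECT Study_Program, COUNT(*) AS total_students
FROM Student
GROUP BY Study_Program;
```

With the five sample rows, the Task 4 query should return students 101, 103 and 105, and the Task 5 query should report 2 students for BCS, 1 for MBA and 2 for MCS. If the data file uses tabs or commas instead of spaces, only the FIELDS TERMINATED BY clause needs to change.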