CS441 Assignment 3 Solution and Discussion

Assignment No. 03 (Graded)
SEMESTER Fall 2019
CS441- Big Data Concepts Total Marks: 20
Due Date: 20-01-2020
Instructions
Please read the following instructions carefully before submitting the assignment:
It should be clear that your assignment will not get any credit if:
o Assignment is submitted after the due date.
o Submitted assignment does not open or file is corrupt.
o Assignment is copied (From internet/students).
Objectives:
To enable students to write and execute different HiveQL queries, such as:
• Create database
• Create table
• Load data in a table
• Select query

Lectures Covered: This assignment covers the topics of Week 10.
Assignment Submission Instructions
Submit only a .doc file through the Assignments interface of CS441 on VULMS. An assignment submitted in any other format will not be accepted and will be awarded zero marks.
You can use the online editor at the following link to write the HiveQL queries:
https://demo.gethue.com/hue/accounts/login?next=/hue
Enter demo as both the user name and the password to access the editor, where you can write and run HiveQL queries.
For any query about the assignment, contact at [email protected]
GOOD LUCK
Marks: 20
Problem Statement:
You are required to write HiveQL queries for the following tasks:

  1. Create a database named “VU”.
  2. Create the following table, named “Student”, in the “VU” database:
| Field Name | Data Type |
|---|---|
| Std-ID | int |
| Std-Name | String |
| Std-Fname | String |
| CGPA | Float |
| Cell No | String |
| Study Program | String |
  3. Write a Hive query that adds the following rows to the “Student” table. Assume that the data is stored in a text file named “Std-Data.txt” in the /home/user directory.
101 Kamran Usman 3.0 0300-0000000 BCS
102 Arshad Anwaar 2.75 0321-1111111 MCS
103 Waqar Jehanzeb 3.5 0345-2222222 MBA
104 Saad Ameen 2.25 0312-3333333 MCS
105 Pervez Khalid 3.75 0333-4444444 BCS
  4. Write a Hive query that displays all the information of those students whose CGPA is greater than or equal to 3.0.
  5. Write a Hive query that finds the total number of students in each study program.
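
For discussion, the five tasks above could be approached roughly as follows. This is only a sketch under a few assumptions, not an official solution. The column names use underscores (Std_ID, Cell_No, Study_Program) because hyphens and spaces are not allowed in unquoted Hive identifiers. The fields in Std-Data.txt are assumed to be separated by a single space, as the rows appear in the problem statement; if the real file uses tabs or commas, the FIELDS TERMINATED BY clause would need to change. LOCAL is used on the assumption that /home/user is a path on the local file system rather than on HDFS. The alias total_students is an illustrative name only.

```sql
-- Task 1: create the database and switch to it.
CREATE DATABASE IF NOT EXISTS VU;
USE VU;

-- Task 2: create the Student table.
-- Assumption: underscores replace the hyphens and spaces in the field names.
-- Assumption: fields in the data file are separated by a single space.
CREATE TABLE IF NOT EXISTS Student (
  Std_ID        INT,
  Std_Name      STRING,
  Std_Fname     STRING,
  CGPA          FLOAT,
  Cell_No       STRING,
  Study_Program STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

-- Task 3: load the rows from the text file into the table.
-- Assumption: the file is on the local file system, hence LOCAL.
LOAD DATA LOCAL INPATH '/home/user/Std-Data.txt' INTO TABLE Student;

-- Task 4: all information for students with CGPA >= 3.0.
SELECT *
FROM Student
WHERE CGPA >= 3.0;

-- Task 5: total number of students in each study program.
SELECT Study_Program, COUNT(*) AS total_students
FROM Student
GROUP BY Study_Program;
```

If the file loads as assumed, the Task 4 query should return the rows for students 101, 103 and 105, and the Task 5 query should report 2 students for BCS, 2 for MCS and 1 for MBA.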