Data Mining and Business Intelligence Lab Manual
ISBN 9788119221509

Highlights

Notes

  

Chapter 1: OLAP with OraclePractical No 1 AIM: OLAP with Oracle.

Description:

    A) ANALYTICAL Functions

    B) ROLLUP AND CUBES

    C) WINDOWING FUNCTIONS

Analytical Functions 

It computes on aggregate value based on a group of rows.

Analytical Functions / Clauses are:-

    a) RANK():- Rank function calculate the rank of a value in a group of values.

    b) ROW_NUMBER():- It is used to give unique number to rows.

    c) DENSE_RANK():- It computes the rank of a row in an ordered group of rows.

    d) NTILE():- It divides an ordered data set into number of buckets indicated by expression.

    e) FIRST:- It can be used to return the first value FROM an ordered sequence.

    f) LAST:- It can be used to return the last value FROM an ordered sequence.

    g) LEAD ():- It returns offset(incrementally increased) value of an argument column.

    h) LAG ():- This function is used to access data FROM a previous row.

EXAMPLES:-

Create following table.

1. EMPLOYEES(eid,ename,salary,dname)

SQL> CREATE TABLE EMPLOYEES (

eno NUMERIC(3), ename VARCHAR(10), esal NUMERIC(6),

dname varchar(10)

);

Table created.

  • SQL>INSERT INTO EMPLOYEES VALUES(200,’Vinay’,20000,’IT’);
    • 1 row created.
    • SQL>INSERT INTO EMPLOYEES VALUES(201,’Anand’,30000,’IT’);
    • 1 row created.
    • SQL>INSERT INTO EMPLOYEES VALUES(202,’Anil’,25000,’Marketing’);
    • 1 row created.
    • SQL>INSERT INTO EMPLOYEES
    • VALUES(203,’Yogita’,20000,’Marketing’);
    • 1 row created.
    • SQL>INSERT INTO EMPLOYEES
    • VALUES(204,’Priyanka’,20000,’Management’);
    • 1 row created.
    • SQL>INSERT INTO EMPLOYEES
    • VALUES(205,’Likhita’,25000,’Management’);
    • 1 row created.
    • SQL>INSERT INTO EMPLOYEES
    • VALUES(206,’Ankit’,30000,’Management’);
    • 1 row created.
    • SQL>INSERT INTO EMPLOYEES
    • VALUES(207,’Aniket’,35000,’Accountant’);
    • 1 row created.
    • SQL> SELECT * FROM EMPLOYEES;
table-wrap

ENO

ENAME

ESAL

DNAME

200

Vinay

20000

IT

201

Anand

30000

IT

202

Anil

25000

Marketing

203

Yogita

20000

Marketing

204

Priyanka

20000

Management

205

Likhita

25000

Management

206

Ankit

30000

Management

207

Aniket

35000

Accountant

Perform the following queries on above table.

1) Display employees information FROM Employee table and give unique number to each row.

SQL> SELECT eno,ename,esal,dname,ROW_NUMBER() OVER(ORDER BY ENO)”ID”

FROM Employees;

table-wrap

ENO

ENAME

ESAL

DNAME

ID

200

Vinay

20000

IT

1

201

Anand

30000

IT

2

202

Anil

25000

Marketing

3

203

Yogita

20000

Marketing

4

204

Priyanka

20000

Management

5

205

Likhita

25000

Management

6

206

Ankit

30000

Management

7

207

Aniket

35000

Accountant

8

2) Display employees information and give unique number to each row in ascending order of salary.

SQL> SELECT eno,ename,esal,dname,ROW_NUMBER() OVER(ORDER BY ESAL)”ID”

FROM Employees;

table-wrap

ENO

ENAME

ESAL

DNAME

ID

200

Vinay

20000

IT

1

203

Yogita

20000

Marketing

2

204

Priyanka

20000

Management

3

205

Likhita

25000

Management

4

202

Anil

25000

Marketing

5

206

Ankit

30000

Management

6

201

Anand

30000

IT

7

207

Aniket

35000

Accountant

8

3) Assign the ranks to employees in ascending order of salary.

SQL> SELECT eno,ename,esal,dname,RANK() OVER(ORDER BY ESAL)”Rank”

FROM Employees;

table-wrap

ENO

ENAME

ESAL

DNAME

Rank

200

Vinay

20000

IT

1

203

Yogita

20000

Marketing

1

204

Priyanka

20000

Management

1

205

Likhita

25000

Management

4

202

Anil

25000

Marketing

4

206

Ankit

30000

Management

6

201

Anand

30000

IT

6

207

Aniket

35000

Accountant

8

4) Assign the ranks to employees in descending order of salary.

SQL> SELECT eno,ename,esal,dname,RANK() OVER(ORDER BY ESAL DESC)”Rank”

FROM Employees;

table-wrap

ENO

ENAME

ESAL

DNAME

Rank

207

Aniket

35000

Accountant

1

206

Ankit

30000

Management

2

201

Anand

30000

IT

2

205

Likhita

25000

Management

4

202

Anil

25000

Marketing

4

204

Priyanka

20000

Management

6

200

Vinay

20000

IT

6

203

Yogita

20000

Marketing

6

5) Assign the ranks to employees in ascending order of salary using dense rank and point out the difference and display record in descending order of salary.

SQL> SELECT eno,ename,esal,dname,DENSE_RANK() OVER(ORDER BY ESAL)”Dense Rank”

FROM Employees ORDER BY ESAL DESC;

table-wrap

ENO

ENAME

ESAL

DNAME

Dense Rank

207

Aniket

35000

Accountant

4

201

Anand

30000

IT

3

206

Ankit

30000

Management

3

205

Likhita

25000

Management

2

202

Anil

25000

Marketing

2

203

Yogita

20000

Marketing

1

200

Vinay

20000

IT

1

204

Priyanka

20000

Management

1

6) Write a Query for finding highest and lowest salary of each department.

SQL> SELECT deptid,MIN(salary) KEEP(DENSE_RANK FIRST ORDER BY Salary)”Lowest Sal”, MAX(salary) KEEP(DENSE_RANK LAST ORDER BY Salary)”Highest Sal”

FROM Employees GROUP BY deptid;

table-wrap

DEPTID

Lowest Sal

Highest Sal

105

25000

50000

106

60000

60000

107

20000

23000

114

45000

45000

115

65000

65000

117

40000

40000

7) Write a Query to find information of employee who were joined first in each department.

SQL> SELECT deptid,MIN(DOJ) KEEP(DENSE_RANK FIRST ORDER BY Salary)”Hired First”

FROM Employees GROUP BY deptid;

table-wrap

DEPTID

Hired Fir

105

30-MAR-17

106

26-OCT-16

107

17-DEC-18

114

14-AUG-17

115

14-NOV-17

117

02-SEP-18

8) Write a Query which returns a salary FROM previous row name it as Sal_Prev. Calculate the difference between salary of current row and previous row.

SQL> SELECT empid,fname,salary,LAG(Salary,1,0) OVER(ORDER BY Salary)”SAL_PREV”, Salary-LAG(Salary,1,0) OVER(ORDER BY Salary)”SAL_DIFF”

FROM Employees;

table-wrap

EMPID

FNAME

SALARY

SAL_PREV

SAL_DIFF

11

Ross

20000

0

20000

14

Joey

23000

20000

3000

6

Poonam

25000

23000

2000

17

Phoebe

40000

25000

15000

15

Chandler

45000

40000

5000

7

Nisha

50000

45000

5000

9

Monika

60000

50000

10000

10

Rachel

60000

60000

0

19

Tushar

65000

60000

5000

9) Write a Query which returns a salary FROM next row name it as Sal_Next. Calculate the difference between salary of current row and following row.

SQL> SELECT empid,fname,salary,LEAD(Salary,1,0) OVER(ORDER BY Salary)”SAL_NEXT”, Salary-LEAD(Salary,1,0) OVER(ORDER BY Salary)”SAL_DIFF”

FROM Employees;

table-wrap

EMPID

FNAME

SALARY

SAL_NEXT

SAL_DIFF

11

Ross

20000

23000

-3000

14

Joey

23000

25000

-2000

6

Poonam

25000

40000

-15000

17

Phoebe

40000

45000

-5000

15

Chandler

45000

50000

-5000

7

Nisha

50000

60000

-10000

9

Monika

60000

60000

0

10

Rachel

60000

65000

-5000

19

Tushar

65000

0

65000

ROLLUP AND CUBES 

Description:

ROLLUP is an extension of GROUP BY Clause. This option allows us to include extra rows that represent the subtotal and grand-total which are known as super aggregate row.

CUBE returns row containing a subtotal for all combination of columns.

ROLLUP EXAMPLES:-

Create following table.

CREATE TABLE sales(

time number(4), region varchar2(15), dept varchar2(20), profit number(10,2)

);

INSERT INTO sales values(1996,’central’,’pen-sales’,75000); INSERT INTO sales values(1996,’central’,’book_sales’,74000); INSERT INTO sales values(1996,’east’,’pen_sales’,89000); INSERT INTO sales values(1997,’central’,’pen_sales’,82000); INSERT INTO sales values(1997,’central’,’pen-sales’,75000); INSERT INTO sales values(1997,’east’,’book_sales’,74000); INSERT INTO sales values(1997,’east’,’pen_sales’,89000); INSERT INTO sales values(1997,’west’,’book_sales’,115000);

1) Find total profit department wise.

SQL> SELECT dept,SUM(profit)”Profit” FROM sales

GROUP BY dept;

table-wrap

DEPT

Profit

pen_sales

260000

book_sales

263000

pen-sales

150000

2) Find total profit department wise along with Grand Total.

SQL> SELECT dept,SUM(profit)”profit” FROM sales

GROUP BY ROLLUP(dept);

table-wrap

DEPT

profit

book_sales

263000

pen-sales

150000

pen_sales

260000

673000

3) Find the total profit time and region wise.

SQL> SELECT time,region,SUM(profit)”profit” FROM sales

GROUP BY time,region;

table-wrap

TIME

REGION

profit

1997

central

157000

1997

west

115000

1997

east

163000

1996

central

149000

1996

east

89000

4) Find the total profit time and region wise along with Time wise total and Grand Total.

SQL> SELECT time,region,SUM(profit)”profit” FROM sales

GROUP BY ROLLUP(time,region);

table-wrap

TIME

REGION

profit

1996

east

89000

1996

central

149000

1996

238000

1997

east

163000

1997

west

115000

1997

central

157000

1997

435000

673000

5) Find the total profit, time,region wise along with region wise total and grand total.

SQL> SELECT time,region,SUM(profit)”profit” FROM sales

GROUP BY ROLLUP(region,time);

table-wrap

TIME

REGION

profit

1996

east

89000

1997

east

163000

east

252000

1997

west

115000

west

115000

1996

central

149000

1997

central

157000

central

306000

673000

6) Find the total profit, time,region wise along with time wise only.

SQL> SELECT time,region,SUM(profit)”profit” FROM sales

GROUP BY time, ROLLUP(region);

table-wrap

TIME

REGION

profit

1996

east

89000

1996

central

149000

1996

238000

1997

east

163000

1997

west

115000

1997

central

157000

1997

435000

7) Find the total profit, region, time wise along with region wise only.

SQL> SELECT time,region,SUM(profit)”profit” FROM sales

GROUP BY region, ROLLUP(time);

table-wrap

TIME

REGION

profit

1996

east

89000

1997

east

163000

east

252000

1997

west

115000

west

115000

1996

central

149000

1997

central

157000

central

306000

8) Find the total profit, time, region, dept wise along with time wise total, time & dept wise total and grand total.

SQL> SELECT dept,time,region,SUM(profit)”profit” FROM sales

GROUP BY ROLLUP(time,dept,region);

table-wrap

DEPT

TIME

REGION

profit

pen-sales

1996

central

75000

pen-sales

1996

75000

pen_sales

1996

east

89000

pen_sales

1996

89000

book_sales

1996

central

74000

book_sales

1996

74000

1996

238000

pen-sales

1997

central

75000

pen-sales

1997

75000

pen_sales

1997

east

89000

pen_sales

1997

central

82000

pen_sales

1997

171000

book_sales

1997

east

74000

book_sales

1997

west

115000

book_sales

1997

189000

1997

435000

673000

9) Find the total profit, time, region, department wise along with department wise total, department & region wise total and grand total.

SQL> SELECT time,dept,region,SUM(profit)”profit” FROM sales

GROUP BY ROLLUP(dept,region,time);

table-wrap

TIME

DEPT

REGION

profit

1996

pen-sales

central

75000

1997

pen-sales

central

75000

pen-sales

central

150000

pen-sales

150000

1996

pen_sales

east

89000

1997

pen_sales

east

89000

pen_sales

east

178000

1997

pen_sales

central

82000

pen_sales

central

82000

pen_sales

260000

1997

book_sales

east

74000

book_sales

east

74000

1997

book_sales

west

115000

book_sales

west

115000

1996

book_sales

central

74000

book_sales

central

74000

book_sales

263000

673000

10) Find the total profit, time, region, department wise along with department wise total, department & time wise total and grand total.

SQL> SELECT time,dept,region,SUM(profit)”profit” FROM sales

GROUP BY ROLLUP(dept,time,region);

table-wrap

TIME

DEPT

REGION

profit

1996

pen-sales

central

75000

1996

pen-sales

75000

1997

pen-sales

central

75000

1997

pen-sales

75000

pen-sales

150000

1996

pen_sales

east

89000

1996

pen_sales

89000

1997

pen_sales

east

89000

1997

pen_sales

central

82000

1997

pen_sales

171000

pen_sales

260000

1996

book_sales

central

74000

1996

book_sales

74000

1997

book_sales

east

74000

1997

book_sales

west

115000

1997

book_sales

189000

book_sales

263000

673000

CUBES EXAMPLES

1) Find out total sales {time,region,dept}wise,{time,region}wise, {time,dept}wise, {region,dept}wise,

{time}wise, {dept}wise,{region}wise total sales irrespective of time, region and department.

SQL> SELECT time,dept,region, SUM(profit)as profit FROM sales

GROUP BY CUBE(time,region,dept);

table-wrap

TIME

DEPT

REGION

PROFIT

673000

pen-sales

150000

pen_sales

260000

book_sales

263000

east

252000

pen_sales

east

178000

book_sales

east

74000

west

115000

book_sales

west

115000

central

306000

pen-sales

central

150000

pen_sales

central

82000

book_sales

central

74000

1996

238000

1996

pen-sales

75000

1996

pen_sales

89000

1996

book_sales

74000

1996

east

89000

1996

pen_sales

east

89000

1996

central

149000

1996

pen-sales

central

75000

1996

book_sales

central

74000

1997

435000

1997

pen-sales

75000

1997

pen_sales

171000

1997

book_sales

189000

1997

east

163000

1997

pen_sales

east

89000

1997

book_sales

east

74000

1997

west

115000

1997

book_sales

west

115000

1997

central

157000

1997

pen-sales

central

75000

1997

pen_sales

central

82000

2) Find out total sales {time,region,dept}wise,{time,region}wise, {time,dept}wise total sales irrespective of region and department.

SQL> SELECT time,dept,region, SUM(profit)as profit FROM sales

GROUP BY time, CUBE(region,dept);

table-wrap

TIME

DEPT

REGION

PROFIT

1996

238000

1996

pen-sales

75000

1996

pen_sales

89000

1996

book_sales

74000

1996

east

89000

1996

pen_sales

east

89000

1996

central

149000

1996

pen-sales

central

75000

1996

book_sales

central

74000

1997

435000

1997

pen-sales

75000

1997

pen_sales

171000

1997

book_sales

189000

1997

east

163000

1997

pen_sales

east

89000

1997

book_sales

east

74000

1997

west

115000

1997

book_sales

west

115000

1997

central

157000

1997

pen-sales

central

75000

1997

pen_sales

central

82000

WINDOWING FUNCTIONS 

Description:

Windowing Function operates on a set of rows and return a single value for each row FROM the underline query.

The ‘OVER’ Clause defines the partitioning and ordering of row for the function.

The ‘OVER’ accepts the following 3 arguments to define a window for the function to operate on:

- Order By

- Partition By

- Rows OR Range Clause

Examples:

Create following table CREATE TABLE Employee1(e_id number(4),

first_name varchar2(15), last_name varchar2(15), dept_id number(3), hire_date date,

salary number(10,2)

);

INSERT INTO Employee1 values(101,’Steven’,’Patil’,10,’29-Jan-2001’,20000); INSERT INTO Employee1 values(107,’Michael’,’Singh’,10,’19-Jan-2001’,30000); INSERT INTO Employee1 values(106,’Peter’,’Patil’,10,’16-Feb-2001’,50000); INSERT INTO Employee1 values(102,’James’,’Patil’,10,’10-Mar-2001’,200000); INSERT INTO Employee1 values(103,’Alexis’,’Patil’,10,’03-Mar-2001’,70000); INSERT INTO Employee1 values(104,’David’,’Patil’,10,’25-Jun-2001’,40000); INSERT INTO Employee1 values(115,’Prashanti’,’Patil’,20,’16-Jul-2001’,6000); INSERT INTO Employee1 values(120,’Nagraju’,’Patil’,40,’14-Jun-2001’,10000); INSERT INTO Employee1 values(116,’Nagesh’,’Patil’,20,’29-Jan-2001’,2500); INSERT INTO Employee1 values(118,’Sreenivas’,’Patil’,20,’29-Jan-2001’,20000); INSERT INTO Employee1 values(119,’Ashok’,’Singh’,30,’19-Jan-2001’,300000); INSERT INTO Employee1 values(114,’Raghav’,’Patil’,80,’16-Feb-2001’,5000); INSERT INTO Employee1 values(113,’Chandana’,’Patil’,50,’10-Mar-2001’,20000);

1) Display empno,ename,dept,sal,cumulative SUM of salary.

SQL> SELECT e_id,first_name,dept_id,salary,SUM(salary) OVER (order by e_id)”Cumulative_Sal”

FROM Employee1;

table-wrap

E_ID

FIRST_NAME

DEPT_ID

SALARY

Cumulative_Sal

101

Steven

10

20000

20000

102

James

10

200000

220000

103

Alexis

10

70000

290000

104

David

10

40000

330000

106

Peter

10

50000

380000

107

Michael

10

30000

410000

113

Chandana

50

20000

430000

114

Raghav

80

5000

435000

115

Prashanti

20

6000

441000

116

Nagesh

20

2500

443500

118

Sreenivas

20

20000

463500

119

Ashok

30

300000

763500

120

Nagraju

40

10000

773500

2) Display empno,ename,dept,sal,cumulative SUM of salaries of all preceding rows dept wise.

SQL> SELECT e_id,first_name,dept_id,salary,SUM(salary)

OVER (partition by dept_id order by salary Rows unbounded preceding)”Cumulative_Sal” FROM Employee1;

table-wrap

E_ID

FIRST_NAME

DEPT_ID

SALARY

Cumulative_Sal

101

Steven

10

20000

20000

107

Michael

10

30000

50000

104

David

10

40000

90000

106

Peter

10

50000

140000

103

Alexis

10

70000

210000

102

James

10

200000

410000

116

Nagesh

20

2500

2500

115

Prashanti

20

6000

8500

118

Sreenivas

20

20000

28500

119

Ashok

30

300000

300000

120

Nagraju

40

10000

10000

113

Chandana

50

20000

20000

114

Raghav

80

5000

5000

3) Display empno,ename,dept,sal,cumulative SUM of salaries of current and previous 2 rows.

SQL> SELECT e_id,first_name,dept_id,salary,SUM(salary)

OVER (order by e_id Rows between 2 preceding and current row)”Cumulative_Sal” FROM Employee1;

table-wrap

E_ID

FIRST_NAME

DEPT_ID

SALARY

Cumulative_Sal

101

Steven

10

20000

20000

102

James

10

200000

220000

103

Alexis

10

70000

290000

104

David

10

40000

310000

106

Peter

10

50000

160000

107

Michael

10

30000

120000

113

Chandana

50

20000

100000

114

Raghav

80

5000

55000

115

Prashanti

20

6000

31000

116

Nagesh

20

2500

13500

118

Sreenivas

20

20000

28500

119

Ashok

30

300000

322500

120

Nagraju

40

10000

330000

4) Display empno,ename,dept,sal,cumulative SUM of salaries of previous 3 rows and 1 next row dept wise.

SQL> SELECT e_id,first_name,dept_id,salary,SUM(salary)

OVER (order by dept_id Rows between 3 preceding and 1 following)”Cumulative_Sal” FROM Employee1;

table-wrap

E_ID

FIRST_NAME

DEPT_ID

SALARY

Cumulative_Sal

101

Steven

10

20000

50000

107

Michael

10

30000

100000

106

Peter

10

50000

300000

102

James

10

200000

370000

103

Alexis

10

70000

390000

104

David

10

40000

366000

115

Prashanti

20

6000

336000

118

Sreenivas

20

20000

138500

116

Nagesh

20

2500

368500

119

Ashok

30

300000

338500

120

Nagraju

40

10000

352500

113

Chandana

50

20000

337500

114

Raghav

80

5000

335000

5) Display empno,ename,dept,sal,cumulative SUM of salaries of previous 2 rows and 2 next rows.

SQL> SELECT e_id,first_name,dept_id,salary,SUM(salary)

OVER (order by e_id Rows between 2 preceding and 2 following)”Cumulative_Sal” FROM Employee1;

table-wrap

E_ID

FIRST_NAME

DEPT_ID

SALARY

Cumulative_Sal

101

Steven

10

20000

290000

102

James

10

200000

330000

103

Alexis

10

70000

380000

104

David

10

40000

390000

106

Peter

10

50000

210000

107

Michael

10

30000

145000

113

Chandana

50

20000

111000

114

Raghav

80

5000

63500

115

Prashanti

20

6000

53500

116

Nagesh

20

2500

333500

118

Sreenivas

20

20000

338500

119

Ashok

30

300000

332500

120

Nagraju

40

10000

330000

6) Display last 7 days hire_date employee.

SQL> SELECT e_id,first_name,hire_date,count(e_id)

OVER(order by hire_date range interval ‘15’ day preceding)”Days” FROM Employee1;

table-wrap

E_ID

FIRST_NAME

HIRE_DATE

Days

119

Ashok

19-JAN-01

2

107

Michael

19-JAN-01

2

101

Steven

29-JAN-01

5

118

Sreenivas

29-JAN-01

5

116

Nagesh

29-JAN-01

5

106

Peter

16-FEB-01

2

114

Raghav

16-FEB-01

2

103

Alexis

03-MAR-01

3

102

James

10-MAR-01

3

113

Chandana

10-MAR-01

3

120

Nagraju

14-JUN-01

1

104

David

25-JUN-01

2

115

Prashanti

16-JUL-01

1

7) write a query for finding average of sal, if sal is less than 20000, then consider it as 20000 for finding average.

SQL> SELECT AVG(case when salary<20000 then 20000 else salary end)”Average” FROM Employee1;

table-wrap

Average

63846.1538

8) write a query for finding average of sal, if sal is greater than 20000, then consider it as 20000 for finding average.

SQL> SELECT first_name,dept_id,AVG(case when salary>20000 then 20000 else 0 end) OVER(partition by dept_id)”Average” FROM Employee1;

table-wrap

FIRST_NAME

DEPT_ID

Average

Steven

10

16666.6667

Michael

10

16666.6667

Peter

10

16666.6667

James

10

16666.6667

Alexis

10

16666.6667

David

10

16666.6667

Prashanti

20

0

Sreenivas

20

0

Nagesh

20

0

Ashok

30

20000

Nagraju

40

0

Chandana

50

0

Raghav

80

0