Chapter 2: Study of Python Libraries
2.1: NumPy (Numerical Python):
NumPy is a Python package for computing with and processing single-dimensional and multidimensional array elements.
2.1.1: list: A list stores an ordered, changeable collection of items in a single variable.
import numpy as np
l = ['dog', 'cat', 'horse']
l
Output: ['dog', 'cat', 'horse']
type(l)
Output: list
2.1.2: sort(): The sort() method sorts the list in place; for strings the order is alphabetical.
l.sort()
l
Output: ['cat', 'dog', 'horse']
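A minimal sketch of the difference between the sort() method and the built-in sorted() function. The variable names (animals, ordered_copy, result) are chosen for illustration only and do not appear in the notes.

```python
# Sample data with the same values as the list used above.
animals = ['dog', 'cat', 'horse']

# sorted() builds a new list and leaves the original untouched.
ordered_copy = sorted(animals)

# list.sort() reorders the list in place and returns None.
result = animals.sort()

print(ordered_copy)  # ['cat', 'dog', 'horse']
print(animals)       # ['cat', 'dog', 'horse']
print(result)        # None
```

Because sort() returns None, writing l = l.sort() would lose the list; call the method on its own line instead.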
2.1.3: list(): The list() function creates a list object.
li = list(range(6))
li
Output: [0, 1, 2, 3, 4, 5]
while li:
    p = li.pop()
    print('p:', p)
    print('li:', li)
Output:
p: 5
li: [0, 1, 2, 3, 4]
p: 4
li: [0, 1, 2, 3]
p: 3
li: [0, 1, 2]
p: 2
li: [0, 1]
p: 1
li: [0]
p: 0
li: []
2.1.3: tuple: Tuples are used to store multiple items in a single variable.
a = ('Ryan', 33, True)
b = 'Takaya', 25, False
type(b)
Output: tuple
type(a)
type(b)
Output: tuple
print(a[1])
Output: 33
print(b[0])
Output: Takaya
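A short sketch of two tuple properties that the indexing examples above do not show: unpacking and immutability. The names person, name, age, active and error_message are illustrative assumptions.

```python
person = ('Ryan', 33, True)   # same values as the tuple `a` above

# Unpacking assigns each item to its own variable.
name, age, active = person

try:
    person[1] = 34            # tuples do not support item assignment
except TypeError as exc:
    error_message = str(exc)

print(name, age, active)
print(error_message)
```

A list would accept the assignment; a tuple raises TypeError, which is why tuples suit records that should not change.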
2.1.4: 1-dimensional array: A one-dimensional array contains elements along a single axis. In other words, the shape tuple of the NumPy array contains only one value.
a = np.array([2,4,6,8])
a
Output: array([2, 4, 6, 8])
a.dtype
Output: dtype('int32')
a = np.array([2,4,6,8], np.int64)
a
Output: array([2, 4, 6, 8], dtype=int64)
a = np.array([[2,4,6,8]])
a
Output: array([[2, 4, 6, 8]])
a[0][3]
Output: 8
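shape: It gives the shape of the array as a tuple; for a two-dimensional array this is (rows, columns). It is an attribute, so it is read without parentheses. The double brackets used above make a a two-dimensional array with 1 row and 4 columns.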
a.shape
Output: (1, 4)
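The following sketch contrasts the two ways the same four numbers were wrapped above. The names flat and nested are assumptions made for this example.

```python
import numpy as np

flat = np.array([2, 4, 6, 8])      # one pair of brackets -> 1-D
nested = np.array([[2, 4, 6, 8]])  # two pairs of brackets -> 2-D

print(flat.ndim, flat.shape)      # 1 (4,)
print(nested.ndim, nested.shape)  # 2 (1, 4)

# A 1-D array needs one index, a 2-D array needs two.
print(flat[3], nested[0][3])      # 8 8
```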
2.1.5: 2-dimensional array: A two-dimensional array is an array of arrays. In this type of array the position of a data element is referred to by two indices instead of one, so it represents a table with rows and columns of data.
listarr = np.array([[1,1,1],[2,2,2],[3,3,3]])
listarr
Output:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
2.1.5.1: size: It gives the total number of elements in the array, which is the product of the values in shape.
listarr.shape
Output: (3, 3)
listarr.size
Output: 9
2.1.5.2: zeros(): It returns an array of the given shape filled with zeros.
z = np.zeros((2,4))
z
Output:
array([[0., 0., 0., 0.],
[0., 0., 0., 0.]])
z.shape
Output: (2, 4)
2.1.5.3: ones(): It returns an array of the given shape filled with ones.
y = np.ones((3,4))
y
Output:
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
y = np.ones((2,3,4))
y
Output:
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
2.1.5.4: arange(): It returns evenly spaced values within a given interval; with a single argument n it returns the integers 0 to n-1.
x = np.arange(10)
x
Output: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
2.1.5.5: linspace(): The numpy.linspace() function returns a given number of evenly spaced values over an interval.
m = np.linspace(1,5,4)
m
Output: array([1., 2.33333333, 3.66666667, 5.])
m = np.linspace(1,7,3)
m
Output: array([1., 4., 7.])
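arange() and linspace() are easy to confuse, so here is a small side-by-side sketch. The variable names by_step and by_count are illustrative, not from the notes.

```python
import numpy as np

# arange(start, stop, step): the third argument is the step, stop is excluded.
by_step = np.arange(1, 7, 2)

# linspace(start, stop, num): the third argument is the number of points,
# and stop is included by default.
by_count = np.linspace(1, 7, 4)

print(by_step)   # [1 3 5]
print(by_count)  # [1. 3. 5. 7.]
```

Note also that arange() with integer arguments returns integers, while linspace() returns floats unless a dtype is given.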
y
Output:
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
2.1.5.6: ones_like(): The numpy.ones_like() function returns an array of ones with the same shape and type as a given array.
c = np.ones_like(y)
c
Output:
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
g = np.ones((2,3,4))
g
Output:
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
2.1.5.7: reshape(): The numpy.reshape() function gives a new shape to an array without changing its data. The new shape must hold the same number of elements as the original.
g.reshape
Output: <function ndarray.reshape>
g
Output:
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
h = np.arange(50)
h
Output:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
h.reshape(2,25)
Output:
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24],
[25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49]])
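A brief sketch of three reshape() behaviours that follow from the example above: the original array keeps its shape, one dimension can be inferred with -1, and an incompatible shape raises an error. The names grid, auto and reshape_error are assumptions for illustration.

```python
import numpy as np

h = np.arange(50)

grid = h.reshape(2, 25)      # returns a reshaped array; h itself stays 1-D
auto = h.reshape(5, -1)      # -1 asks NumPy to infer that dimension (10 here)

try:
    h.reshape(7, 7)          # 7 * 7 = 49 elements, but h has 50
except ValueError as exc:
    reshape_error = str(exc)

print(h.shape, grid.shape, auto.shape)  # (50,) (2, 25) (5, 10)
print(reshape_error)
```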
2.1.5.8: ravel(): The ravel() function returns a contiguous flattened (one-dimensional) array.
h.ravel()
Output:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
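Since h above is already one-dimensional, ravel() returns it unchanged. The effect is easier to see on a two-dimensional array, as in this small sketch (table and flat are illustrative names).

```python
import numpy as np

table = np.arange(6).reshape(2, 3)   # 2 rows, 3 columns
flat = table.ravel()                 # back to one dimension, row by row

print(table.shape, flat.shape)  # (2, 3) (6,)
print(flat)                     # [0 1 2 3 4 5]
```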
Practice 2.1:
b = np.arange(3,10,2, dtype=np.int32)
b.itemsize
b
Output: 4
b = np.arange(3.4,10,2)
b.itemsize
Output: 8
b.shape
Output: (4,)
b.itemsize
Output: 8
t = np.linspace(3,10,5)
t
Output: array([3., 4.75, 6.5, 8.25, 10.])
t = np.linspace(3,10,5, dtype=np.int32)
t
Output: array([3, 4, 6, 8, 10])
m = np.arange(6)
m
Output: array([0, 1, 2, 3, 4, 5])
m.reshape(2,3)
Output:
array([[0, 1, 2],
[3, 4, 5]])
m.reshape(3,2)
Output:
array([[0, 1],
[2, 3],
[4, 5]])
2.2: Pandas Library:
Pandas is an open-source Python library used for working with data sets.
It provides various data structures and operations for manipulating numerical data and time series.
This library is built on top of the NumPy library.
Pandas is fast and offers high performance and productivity for users.
2.2.1: Basic example of importing the Pandas library.
import numpy as np
import pandas as pd
data = {"name": ['aa', 'bb', 'cc'],
        "class": ['fy', 'sy', 'ty'],
        "roll": [11, 22, 33]}
data
Output: {'name': ['aa', 'bb', 'cc'], 'class': ['fy', 'sy', 'ty'], 'roll': [11, 22, 33]}
2.2.2: Data Frame: A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table with rows and columns.
df = pd.DataFrame(data)
df
Output:
  name class  roll
0   aa    fy    11
1   bb    sy    22
2   cc    ty    33
df.to_csv('student.csv')
df.to_csv('index_false_student.csv', index=False)
2.2.3: head(): The head() method returns the specified number of rows, starting from the top (5 rows if no number is given).
df.head()
Output:
  name class  roll
0   aa    fy    11
1   bb    sy    22
2   cc    ty    33
2.2.4: tail(): The tail() method returns the specified number of rows, counted from the bottom (5 rows if no number is given).
df.tail()
Output:
  name class  roll
0   aa    fy    11
1   bb    sy    22
2   cc    ty    33
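Because the frame above has only three rows, head() and tail() both return all of it. The sketch below passes an explicit row count so the difference becomes visible; first_two and last_one are illustrative names.

```python
import pandas as pd

# Same three-row frame as in the notes above, rebuilt so the sketch runs alone.
df = pd.DataFrame({"name": ['aa', 'bb', 'cc'],
                   "class": ['fy', 'sy', 'ty'],
                   "roll": [11, 22, 33]})

first_two = df.head(2)   # rows 0 and 1
last_one = df.tail(1)    # row 2 only

print(first_two)
print(last_one)
```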
2.2.5: describe(): The describe() method returns summary statistics for the numeric columns of the DataFrame.
df.describe()
Output:
       roll
count   3.0
mean   22.0
std    11.0
min    11.0
25%    16.5
50%    22.0
75%    27.5
max    33.0
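The figures reported by describe() can be checked by hand. This sketch recomputes the standard deviation for the roll values, assuming (as the output above suggests) that pandas reports the sample standard deviation. The names roll, summary and sample_std are chosen for illustration.

```python
import numpy as np
import pandas as pd

roll = pd.Series([11, 22, 33], name="roll")
summary = roll.describe()

# Sample standard deviation (ddof=1); np.std defaults to ddof=0 instead.
sample_std = np.std(roll.to_numpy(), ddof=1)

print(summary["mean"], summary["std"], sample_std)
print(summary["25%"], summary["75%"])
```

The quartiles 16.5 and 27.5 come from linear interpolation between neighbouring values.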
df.head(3)
Output:
  name class  roll
0   aa    fy    11
1   bb    sy    22
2   cc    ty    33
2.2.6: to_csv(): The to_csv() method exports the DataFrame to a CSV file. By default the row index is written as the first column and a comma is used as the delimiter; passing index=False leaves the index out.
df.to_csv('index.csv', index=False)
df
Output:
  name class  roll
0   aa    fy    11
1   bb    sy    22
2   cc    ty    33
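To see what index=False changes without creating files, to_csv() can be called without a path, in which case it returns the CSV text. This is a sketch under that assumption; the small two-column frame and the names with_index, without_index and restored are hypothetical.

```python
import io
import pandas as pd

df = pd.DataFrame({"name": ['aa', 'bb'], "roll": [11, 22]})

with_index = df.to_csv()                # no path given: the CSV text is returned
without_index = df.to_csv(index=False)

print(with_index)      # header starts with an empty field for the index column
print(without_index)   # header is just the column names

# Reading the index-free text back gives the original columns.
restored = pd.read_csv(io.StringIO(without_index))
print(restored)
```

Reading a file that was written with the index included produces an extra unnamed column, which is why index=False is often preferred for round trips.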
Practice 2.2:
df.to_csv('index1.csv', index=False)
demo = pd.read_csv('index2.csv')
demo
Output:
   prod_id    name       area
0     2200   apple    andheri
1     3300   mango      parle
2     4400  orange  santacruz
demo['name']
Output:
0 apple
1 mango
2 orange
Name: name, dtype: object
demo['name'][1]
Output: 'mango'
demo['prod_id']
Output:
0 2200
1 3300
2 4400
Name: prod_id, dtype: int64
demo['prod_id'][2] = 4004
demo['prod_id']
Output:
0 2200
1 3300
2 4004
Name: prod_id, dtype: int64
demo.to_csv('new.csv')
demo
Output:
   prod_id    name       area
0     2200   apple    andheri
1     3300   mango      parle
2     4004  orange  santacruz
demo.index = ['one', 'two', 'three']
demo
Output:
       prod_id    name       area
one       2200   apple    andheri
two       3300   mango      parle
three     4004  orange  santacruz
2.2.7: Series: A Pandas Series is like a column in a table. It is a one-dimensional array holding data of any type.
s = pd.Series([2,3,4,5,6,7,8,9,10])
s
Output:
0 2
1 3
2 4
3 5
4 6
5 7
6 8
7 9
8 10
dtype: int64
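The left-hand column in the output above is the index, which defaults to 0, 1, 2 and so on. As a small sketch, the index can also be given explicitly; the labels 'a', 'b', 'c' and the name marks are hypothetical.

```python
import pandas as pd

marks = pd.Series([2, 3, 4], index=['a', 'b', 'c'])  # hypothetical labels

print(marks['b'])            # 3, looked up by label
print(marks.index.tolist())  # ['a', 'b', 'c']
print(marks.to_list())       # [2, 3, 4]
```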
2.2.8: random.rand(): The numpy.random.rand() function creates an array of the specified shape and fills it with random values drawn uniformly from the interval [0, 1).
s1 = pd.Series(np.random.rand(20))
s1
Output:
0 0.476242
1 0.332118
2 0.265113
3 0.722535
4 0.210917
5 0.204344
6 0.557794
7 0.585600
8 0.775989
9 0.555856
10 0.669544
11 0.874442
12 0.534156
13 0.260446
14 0.519634
15 0.776713
16 0.660476
17 0.748030
18 0.814161
19 0.366974
dtype: float64
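The values above differ on every run. A minimal sketch of how to make them repeatable, assuming the legacy np.random.seed() interface that goes with np.random.rand(); the seed 42 and the names first and second are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(42)                       # arbitrary seed
first = pd.Series(np.random.rand(5))

np.random.seed(42)                       # same seed -> same values
second = pd.Series(np.random.rand(5))

print(first.equals(second))                      # True
print(bool(((first >= 0) & (first < 1)).all()))  # True: all values lie in [0, 1)
```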
Practice 2.3:
df1 = pd.DataFrame(np.random.rand(20,10))
df1
Output:
           0         1         2         3         4         5         6         7         8         9
0   0.889829  0.217723  0.950464  0.114454  0.175260  0.171785  0.502882  0.431306  0.585802  0.824907
1   0.815695  0.961605  0.734357  0.617062  0.778672  0.737305  0.224034  0.792681  0.043488  0.755798
2   0.300321  0.297326  0.667170  0.810632  0.954124  0.527148  0.697780  0.679426  0.251948  0.124489
3   0.648760  0.770672  0.254008  0.025945  0.110265  0.602699  0.498752  0.413338  0.312994  0.293970
4   0.538527  0.630472  0.851454  0.061778  0.659211  0.565140  0.876626  0.598274  0.997209  0.087594
5   0.541544  0.934696  0.424254  0.602228  0.491561  0.614428  0.120711  0.491124  0.204725  0.973860
6   0.628961  0.302158  0.846598  0.068880  0.285089  0.233620  0.408571  0.277139  0.119807  0.524263
7   0.120473  0.407693  0.207758  0.042455  0.203260  0.605364  0.230598  0.450066  0.450713  0.003687
8   0.558722  0.927035  0.777533  0.483478  0.847846  0.096667  0.910407  0.327488  0.254891  0.337679
9   0.427066  0.629416  0.845941  0.008152  0.927802  0.945599  0.783255  0.626967  0.922936  0.155402
10  0.748707  0.909395  0.492470  0.046778  0.203244  0.102367  0.242721  0.370299  0.525937  0.410644
11  0.190404  0.602494  0.196155  0.650595  0.986109  0.680599  0.886406  0.262964  0.956797  0.719145
12  0.240944  0.520401  0.174845  0.756972  0.198388  0.355310  0.419668  0.514867  0.761939  0.560055
13  0.627101  0.535762  0.842373  0.963862  0.816623  0.052924  0.211294  0.368572  0.167157  0.388588
14  0.978139  0.237486  0.077492  0.209904  0.650783  0.663827  0.352613  0.130673  0.536371  0.074908
15  0.488940  0.336477  0.495782  0.341456  0.425742  0.461244  0.142852  0.294217  0.499867  0.226806
16  0.024142  0.726993  0.602587  0.815984  0.753234  0.515214  0.982483  0.124366  0.452646  0.757576
17  0.428680  0.481441  0.671396  0.437300  0.565147  0.387528  0.174145  0.295377  0.683534  0.326617
18  0.529209  0.236979  0.605650  0.002481  0.898732  0.043005  0.464004  0.849748  0.056447  0.424221
19  0.884170  0.725553  0.001559  0.273916  0.643806  0.102261  0.280440  0.360105  0.760108  0.674790
type(df1)
df1.describe()
Output:
               0          1          2          3          4          5          6          7          8          9
count  20.000000  20.000000  20.000000  20.000000  20.000000  20.000000  20.000000  20.000000  20.000000  20.000000
mean    0.530517   0.569589   0.535992   0.366716   0.578745   0.423202   0.470512   0.432950   0.477266   0.432250
std     0.263312   0.252397   0.294265   0.323563   0.296716   0.265350   0.286197   0.197626   0.300232   0.281850
min     0.024142   0.217723   0.001559   0.002481   0.110265   0.043005   0.120711   0.124366   0.043488   0.003687
25%     0.395380   0.327897   0.242445   0.058028   0.264632   0.154430   0.228957   0.295087   0.240142   0.208955
50%     0.540035   0.569128   0.604118   0.307686   0.647295   0.488229   0.414119   0.391818   0.476257   0.399616
75%     0.673747   0.737913   0.793743   0.625445   0.824429   0.607630   0.719149   0.535718   0.702677   0.685879
max     0.978139   0.961605   0.950464   0.963862   0.986109   0.945599   0.982483   0.849748   0.997209   0.973860
df1[0][1] = "abc"
df1.head(10)
Output:
          0         1         2         3         4         5         6         7         8         9
0  0.889829  0.217723  0.950464  0.114454  0.175260  0.171785  0.502882  0.431306  0.585802  0.824907
1       abc  0.961605  0.734357  0.617062  0.778672  0.737305  0.224034  0.792681  0.043488  0.755798
2  0.300321  0.297326  0.667170  0.810632  0.954124  0.527148  0.697780  0.679426  0.251948  0.124489
3   0.64876  0.770672  0.254008  0.025945  0.110265  0.602699  0.498752  0.413338  0.312994  0.293970
4  0.538527  0.630472  0.851454  0.061778  0.659211  0.565140  0.876626  0.598274  0.997209  0.087594
5  0.541544  0.934696  0.424254  0.602228  0.491561  0.614428  0.120711  0.491124  0.204725  0.973860
6  0.628961  0.302158  0.846598  0.068880  0.285089  0.233620  0.408571  0.277139  0.119807  0.524263
7  0.120473  0.407693  0.207758  0.042455  0.203260  0.605364  0.230598  0.450066  0.450713  0.003687
8  0.558722  0.927035  0.777533  0.483478  0.847846  0.096667  0.910407  0.327488  0.254891  0.337679
9  0.427066  0.629416  0.845941  0.008152  0.927802  0.945599  0.783255  0.626967  0.922936  0.155402
df1.head(4)
Output:
          0         1         2         3         4         5         6         7         8         9
0  0.889829       efg  0.950464  0.114454  0.175260  0.171785  0.502882  0.431306  0.585802  0.824907
1       abc  0.961605  0.734357  0.617062  0.778672  0.737305  0.224034  0.792681  0.043488  0.755798
2       pqr  0.297326  0.667170  0.810632  0.954124  0.527148  0.697780  0.679426  0.251948  0.124489
3   0.64876  0.770672  0.254008  0.025945  0.110265  0.602699  0.498752  0.413338  0.312994  0.293970
df1[2][1] = "aaa"
df1
Output:
           0         1         2         3         4         5         6         7         8         9
0   0.889829       efg  0.950464  0.114454  0.175260  0.171785  0.502882  0.431306  0.585802  0.824907
1        abc  0.961605       aaa  0.617062  0.778672  0.737305  0.224034  0.792681  0.043488  0.755798
2        pqr  0.297326   0.66717  0.810632  0.954124  0.527148  0.697780  0.679426  0.251948  0.124489
3    0.64876  0.770672  0.254008  0.025945  0.110265  0.602699  0.498752  0.413338  0.312994  0.293970
4   0.538527  0.630472  0.851454  0.061778  0.659211  0.565140  0.876626  0.598274  0.997209  0.087594
5   0.541544  0.934696  0.424254  0.602228  0.491561  0.614428  0.120711  0.491124  0.204725  0.973860
6   0.628961  0.302158  0.846598  0.068880  0.285089  0.233620  0.408571  0.277139  0.119807  0.524263
7   0.120473  0.407693  0.207758  0.042455  0.203260  0.605364  0.230598  0.450066  0.450713  0.003687
8   0.558722  0.927035  0.777533  0.483478  0.847846  0.096667  0.910407  0.327488  0.254891  0.337679
9   0.427066  0.629416  0.845941  0.008152  0.927802  0.945599  0.783255  0.626967  0.922936  0.155402
10  0.748707  0.909395   0.49247  0.046778  0.203244  0.102367  0.242721  0.370299  0.525937  0.410644
11  0.190404  0.602494  0.196155  0.650595  0.986109  0.680599  0.886406  0.262964  0.956797  0.719145
12  0.240944  0.520401  0.174845  0.756972  0.198388  0.355310  0.419668  0.514867  0.761939  0.560055
13  0.627101  0.535762  0.842373  0.963862  0.816623  0.052924  0.211294  0.368572  0.167157  0.388588
14  0.978139  0.237486  0.077492  0.209904  0.650783  0.663827  0.352613  0.130673  0.536371  0.074908
15   0.48894  0.336477  0.495782  0.341456  0.425742  0.461244  0.142852  0.294217  0.499867  0.226806
16  0.024142  0.726993  0.602587  0.815984  0.753234  0.515214  0.982483  0.124366  0.452646  0.757576
17   0.42868  0.481441  0.671396  0.437300  0.565147  0.387528  0.174145  0.295377  0.683534  0.326617
18  0.529209  0.236979   0.60565  0.002481  0.898732  0.043005  0.464004  0.849748  0.056447  0.424221
19   0.88417  0.725553  0.001559  0.273916  0.643806  0.102261  0.280440  0.360105  0.760108  0.674790
demo
Output:
       prod_id    name       area
one       2200   apple    andheri
two       3300   mango      parle
three     4004  grapes  santacruz
demo['prod_id'][1] = 5005
demo
Output:
       prod_id    name       area
one       2200   apple    andheri
two       5005   mango      parle
three     4004  grapes  santacruz
demo.dtypes
Output:
prod_id int64
name object
area object
dtype: object
df1.dtypes
Output:
0 object
1 object
2 object
3 float64
4 float64
5 float64
6 float64
7 float64
8 float64
9 float64
dtype: object
2.2.9: DataFrame.to_numpy(): A DataFrame can be converted to a NumPy ndarray with the DataFrame.to_numpy() method.
df1.to_numpy()
Output:
array([[0.8898290587574625, 'efg', 0.9504637009415536,
0.11445350753707939, 0.1752600347199531, 0.17178486545231497,
0.5028824898103187, 0.431305684264646, 0.5858017094187148,
0.8249071762869403],
['abc', 0.9616049518794304, 'aaa', 0.6170616309044902,
0.7786715778059061, 0.7373050333549065, 0.22403425159455848,
0.7926807836787797, 0.04348811153552434, 0.7557984441046045],
['pqr', 0.29732601721264285, 0.6671699248305508,
0.8106324766594586, 0.9541240643323579, 0.5271480185664397,
0.6977800757136665, 0.6794261624792959, 0.25194753570017625,
0.12448932434428084],
[0.6487599631982358, 0.7706715150467816, 0.2540076701928634,
0.02594477872312806, 0.11026541677027257, 0.6026986958103315,
0.4987517228774331, 0.4133382664044277, 0.31299444322854375,
0.2939702728123903],
[0.5385269288296717, 0.6304722606997493, 0.8514542001716628,
0.06177752739395137, 0.6592108981215252, 0.5651400220152262,
0.8766258842206889, 0.5982735952347097, 0.9972089870133152,
0.08759388187127126],
[0.5415438818693964, 0.9346959871836261, 0.4242535605595833,
0.6022280894199854, 0.49156096274871275, 0.6144283361246711,
0.1207113831461939, 0.49112392727947407, 0.20472539299943238,
0.9738604915353611],
[0.6289611558850483, 0.3021577039431248, 0.846597984603601,
0.06887979170151126, 0.2850892718126471, 0.23362048790526357,
0.4085709972198358, 0.27713931102605127, 0.11980742869681049,
0.5242625516236767],
[0.12047323514029218, 0.4076931276538779, 0.20775814224756806,
0.04245494922213655, 0.20325987717441063, 0.6053639969934034,
0.23059818844544866, 0.4500660481280979, 0.4507134591810671,
0.003687096791495703],
[0.558721730405733, 0.9270350875298212, 0.7775332056913566,
0.48347777029749706, 0.8478464104949196, 0.09666694338157167,
0.9104070401598463, 0.32748771998561244, 0.2548911720594357,
0.3376787878401263],
[0.42706601825584, 0.6294157375842707, 0.8459413328009987,
0.008151933416917778, 0.9278019823366778, 0.9455985340636448,
0.7832549150376663, 0.6269671160495615, 0.9229361644222673,
0.15540235785663703],
[0.7487069527367622, 0.9093947236386258, 0.49246977663430314,
0.04677816297810866, 0.20324374999174266, 0.1023672526008913,
0.24272067170750022, 0.3702986708319129, 0.5259374157967004,
0.41064368110825533],
[0.19040400501064036, 0.6024937826358194, 0.19615487722677982,
0.6505953528159473, 0.9861090888652927, 0.6805992380978194,
0.8864063677267591, 0.26296403260958523, 0.9567974966351677,
0.7191453275424041],
[0.24094430235912145, 0.5204007187411817, 0.1748447311932637,
0.7569724353480395, 0.19838767252402834, 0.3553101063124129,
0.4196679434769559, 0.5148666686957333, 0.7619392858693081,
0.560054898230945],
[0.6271005616304949, 0.5357621588665652, 0.8423727130622828,
0.9638621473539152, 0.8166226983755571, 0.05292449501022456,
0.21129437931396955, 0.36857150469345834, 0.1671565144741678,
0.38858763486196646],
[0.9781386628079035, 0.23748583454810612, 0.0774922086014398,
0.20990350008401104, 0.650783131011194, 0.6638271605151387,
0.35261309636887295, 0.1306729819085617, 0.5363706849354714,
0.07490764205042],
[0.48894020892946477, 0.33647710082718496, 0.4957819333797754,
0.3414556185184121, 0.4257420648833655, 0.46124407258671696,
0.1428515761588003, 0.294216975284216, 0.49986677703088356,
0.2268061817643775],
[0.02414249471968255, 0.7269933849502771, 0.6025866134022627,
0.8159838725163027, 0.7532337808624099, 0.5152143189684777,
0.9824832756390718, 0.12436628539572403, 0.4526464668408925,
0.7575758731905909],
[0.4286800832564034, 0.48144143422024277, 0.6713958370200016,
0.43730039897594153, 0.5651468661339273, 0.38752775998992894,
0.1741454366748063, 0.2953770888681998, 0.6835335205019395,
0.32661688397358923],
[0.5292085630387563, 0.23697946207749765, 0.6056501407655644,
0.0024813922441588865, 0.8987318909075309, 0.04300455769905975,
0.46400399742872456, 0.8497477915770419, 0.05644742404326131,
0.4242205865280846],
[0.8841697571458046, 0.7255526457480526, 0.0015591409819755153,
0.2739157678767543, 0.6438064650372433, 0.10226113788566826,
0.2804404425187115, 0.36010547725733033, 0.7601084640428719,
0.6747895650580462]], dtype=object)
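The dtype=object at the end of the output above is a consequence of the strings placed in df1 earlier. The sketch below compares an all-numeric frame with a mixed one; the two small frames and the names numeric, mixed, numeric_array and mixed_array are hypothetical.

```python
import numpy as np
import pandas as pd

numeric = pd.DataFrame({"x": [1.5, 2.5], "y": [3.0, 4.0]})
mixed = pd.DataFrame({"x": [1.5, "abc"], "y": [3.0, 4.0]})

numeric_array = numeric.to_numpy()
mixed_array = mixed.to_numpy()

print(numeric_array.dtype)  # float64
print(mixed_array.dtype)    # object
```

An object array stores generic Python objects, so the usual numeric operations on it are slower or unavailable; keeping columns purely numeric avoids this.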
Practice 2.4:
demo
Output:
       prod_id    name       area
one       2200   apple    andheri
two       5005   mango      parle
three     4004  grapes  santacruz
• Display Transpose of the above demo.
demo.T
Output:
             one    two      three
prod_id     2200   5005       4004
name       apple  mango     grapes
area     andheri  parle  santacruz
df2 = pd.DataFrame(np.random.rand(10,5))
df2
Output:
          0         1         2         3         4
0  0.988782  0.155982  0.163659  0.216378  0.338656
1  0.922171  0.810851  0.249822  0.283435  0.181059
2  0.069235  0.844811  0.165427  0.086819  0.301486
3  0.789741  0.358560  0.738854  0.373372  0.934196
4  0.405396  0.146483  0.516349  0.259770  0.846987
5  0.929204  0.212274  0.604740  0.422453  0.722843
6  0.247970  0.452907  0.853457  0.639186  0.590882
7  0.672903  0.397623  0.773096  0.071042  0.135975
8  0.139015  0.843306  0.936715  0.941274  0.551718
9  0.052673  0.486642  0.234463  0.257344  0.981282
df2.sort_index(axis=1, ascending=False)
Output:
          4         3         2         1         0
0  0.338656  0.216378  0.163659  0.155982  0.988782
1  0.181059  0.283435  0.249822  0.810851  0.922171
2  0.301486  0.086819  0.165427  0.844811  0.069235
3  0.934196  0.373372  0.738854  0.358560  0.789741
4  0.846987  0.259770  0.516349  0.146483  0.405396
5  0.722843  0.422453  0.604740  0.212274  0.929204
6  0.590882  0.639186  0.853457  0.452907  0.247970
7  0.135975  0.071042  0.773096  0.397623  0.672903
8  0.551718  0.941274  0.936715  0.843306  0.139015
9  0.981282  0.257344  0.234463  0.486642  0.052673
demo
Output:
       prod_id    name       area
one       2200   apple    andheri
two       5005   mango      parle
three     4004  grapes  santacruz
p = demo.sort_values('name')
p
Output:
       prod_id    name       area
one       2200   apple    andheri
three     4004  grapes  santacruz
two       5005   mango      parle
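sort_values() sorts in ascending order by default and returns a new DataFrame. The following sketch sorts the same data in descending order by prod_id; the frame is rebuilt inline so the example runs on its own, and by_id_desc is an illustrative name.

```python
import pandas as pd

# Same values as the demo frame above.
demo = pd.DataFrame({"prod_id": [2200, 5005, 4004],
                     "name": ['apple', 'mango', 'grapes'],
                     "area": ['andheri', 'parle', 'santacruz']},
                    index=['one', 'two', 'three'])

by_id_desc = demo.sort_values('prod_id', ascending=False)

print(by_id_desc)
print(demo.index.tolist())  # the original frame keeps its row order
```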
2.2.10: loc: The .loc property of the DataFrame object selects or assigns specified rows and/or columns of that DataFrame by label. It is used with square brackets, in the form df.loc[row_label, column_label].
df1.loc[0,0]=990
df1
Output:
           0         1         2         3         4         5         6         7         8         9
0        990       efg  0.950464  0.114454  0.175260  0.171785  0.502882  0.431306  0.585802  0.824907
1        abc  0.961605       aaa  0.617062  0.778672  0.737305  0.224034  0.792681  0.043488  0.755798
2        pqr  0.297326   0.66717  0.810632  0.954124  0.527148  0.697780  0.679426  0.251948  0.124489
3    0.64876  0.770672  0.254008  0.025945  0.110265  0.602699  0.498752  0.413338  0.312994  0.293970
4   0.538527  0.630472  0.851454  0.061778  0.659211  0.565140  0.876626  0.598274  0.997209  0.087594
5   0.541544  0.934696  0.424254  0.602228  0.491561  0.614428  0.120711  0.491124  0.204725  0.973860
6   0.628961  0.302158  0.846598  0.068880  0.285089  0.233620  0.408571  0.277139  0.119807  0.524263
7   0.120473  0.407693  0.207758  0.042455  0.203260  0.605364  0.230598  0.450066  0.450713  0.003687
8   0.558722  0.927035  0.777533  0.483478  0.847846  0.096667  0.910407  0.327488  0.254891  0.337679
9   0.427066  0.629416  0.845941  0.008152  0.927802  0.945599  0.783255  0.626967  0.922936  0.155402
10  0.748707  0.909395   0.49247  0.046778  0.203244  0.102367  0.242721  0.370299  0.525937  0.410644
11  0.190404  0.602494  0.196155  0.650595  0.986109  0.680599  0.886406  0.262964  0.956797  0.719145
12  0.240944  0.520401  0.174845  0.756972  0.198388  0.355310  0.419668  0.514867  0.761939  0.560055
13  0.627101  0.535762  0.842373  0.963862  0.816623  0.052924  0.211294  0.368572  0.167157  0.388588
14  0.978139  0.237486  0.077492  0.209904  0.650783  0.663827  0.352613  0.130673  0.536371  0.074908
15   0.48894  0.336477  0.495782  0.341456  0.425742  0.461244  0.142852  0.294217  0.499867  0.226806
16  0.024142  0.726993  0.602587  0.815984  0.753234  0.515214  0.982483  0.124366  0.452646  0.757576
17   0.42868  0.481441  0.671396  0.437300  0.565147  0.387528  0.174145  0.295377  0.683534  0.326617
18  0.529209  0.236979   0.60565  0.002481  0.898732  0.043005  0.464004  0.849748  0.056447  0.424221
19   0.88417  0.725553  0.001559  0.273916  0.643806  0.102261  0.280440  0.360105  0.760108  0.674790
df2
Output:
          0         1         2         3         4
0  0.988782  0.155982  0.163659  0.216378  0.338656
1  0.922171  0.810851  0.249822  0.283435  0.181059
2  0.069235  0.844811  0.165427  0.086819  0.301486
3  0.789741  0.358560  0.738854  0.373372  0.934196
4  0.405396  0.146483  0.516349  0.259770  0.846987
5  0.929204  0.212274  0.604740  0.422453  0.722843
6  0.247970  0.452907  0.853457  0.639186  0.590882
7  0.672903  0.397623  0.773096  0.071042  0.135975
8  0.139015  0.843306  0.936715  0.941274  0.551718
9  0.052673  0.486642  0.234463  0.257344  0.981282
df2.loc[0,0]=990
df2
Output:
            0         1         2         3         4
0  990.000000  0.155982  0.163659  0.216378  0.338656
1    0.922171  0.810851  0.249822  0.283435  0.181059
2    0.069235  0.844811  0.165427  0.086819  0.301486
3    0.789741  0.358560  0.738854  0.373372  0.934196
4    0.405396  0.146483  0.516349  0.259770  0.846987
5    0.929204  0.212274  0.604740  0.422453  0.722843
6    0.247970  0.452907  0.853457  0.639186  0.590882
7    0.672903  0.397623  0.773096  0.071042  0.135975
8    0.139015  0.843306  0.936715  0.941274  0.551718
9    0.052673  0.486642  0.234463  0.257344  0.981282
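The examples above use .loc only for assignment. As a hedged sketch, the same property also reads single cells and label-based slices; the small frame below reuses values from demo, and the names cell and rows are assumptions for illustration.

```python
import pandas as pd

demo = pd.DataFrame({"prod_id": [2200, 5005, 4004],
                     "name": ['apple', 'mango', 'grapes']},
                    index=['one', 'two', 'three'])

cell = demo.loc['two', 'name']        # single value by row label and column label
rows = demo.loc['one':'two']          # label slices include both end points
demo.loc['three', 'prod_id'] = 4400   # assignment through .loc

print(cell)
print(rows.index.tolist())
print(demo.loc['three', 'prod_id'])
```

Assigning through a single .loc call addresses the row and column together, unlike the chained form demo['prod_id'][2] = ... used earlier, which first selects a column and then indexes into it.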
Practice:
df2.columns = list("ABCDE")
df2
Output:
            A         B         C         D         E
0  990.000000  0.155982  0.163659  0.216378  0.338656
1    0.922171  0.810851  0.249822  0.283435  0.181059
2    0.069235  0.844811  0.165427  0.086819  0.301486
3    0.789741  0.358560  0.738854  0.373372  0.934196
4    0.405396  0.146483  0.516349  0.259770  0.846987
5    0.929204  0.212274  0.604740  0.422453  0.722843
6    0.247970  0.452907  0.853457  0.639186  0.590882
7    0.672903  0.397623  0.773096  0.071042  0.135975
8    0.139015  0.843306  0.936715  0.941274  0.551718
9    0.052673  0.486642  0.234463  0.257344  0.981282
df2.loc[0,'A']=89
df2
Output:
           A           B         C         D         E
0  89.000000  899.000000  0.163659  0.216378  0.338656
1   0.922171    0.810851  0.249822  0.283435  0.181059
2   0.069235    0.844811  0.165427  0.086819  0.301486
3   0.789741    0.358560  0.738854  0.373372  0.934196
4   0.405396    0.146483  0.516349  0.259770  0.846987
5   0.929204    0.212274  0.604740  0.422453  0.722843
6   0.247970    0.452907  0.853457  0.639186  0.590882
7   0.672903    0.397623  0.773096  0.071042  0.135975
8   0.139015    0.843306  0.936715  0.941274  0.551718
9   0.052673    0.486642  0.234463  0.257344  0.981282
dt = pd.DataFrame(np.random.rand(10,5))
dt
Output:
          0         1         2         3         4
0  0.514973  0.132473  0.662300  0.870011  0.099254
1  0.505812  0.655760  0.709748  0.459002  0.258930
2  0.446541  0.850593  0.959236  0.653753  0.742279
3  0.364539  0.001264  0.233297  0.904143  0.396865
4  0.214473  0.344468  0.010521  0.403364  0.834405
5  0.543493  0.511075  0.517688  0.971037  0.386030
6  0.757976  0.310684  0.385691  0.767525  0.537692
7  0.532578  0.294248  0.438818  0.581528  0.483544
8  0.383618  0.366597  0.258645  0.600649  0.044865
9  0.649240  0.894046  0.534226  0.551215  0.025614
dt[0][0]=88
dt
Output:
           0         1         2         3         4
0  88.000000  0.132473  0.662300  0.870011  0.099254
1   0.505812  0.655760  0.709748  0.459002  0.258930
2   0.446541  0.850593  0.959236  0.653753  0.742279
3   0.364539  0.001264  0.233297  0.904143  0.396865
4   0.214473  0.344468  0.010521  0.403364  0.834405
5   0.543493  0.511075  0.517688  0.971037  0.386030
6   0.757976  0.310684  0.385691  0.767525  0.537692
7   0.532578  0.294248  0.438818  0.581528  0.483544
8   0.383618  0.366597  0.258645  0.600649  0.044865
9   0.649240  0.894046  0.534226  0.551215  0.025614
2.2.11: sort_index(): Pandas dataframe.sort_index() function sorts objects by labels along the given axis.
dt.sort_index(axis=1, ascending=False)
Output:
          4         3         2         1          0
0  0.099254  0.870011  0.662300  0.132473  88.000000
1  0.258930  0.459002  0.709748  0.655760   0.505812
2  0.742279  0.653753  0.959236  0.850593   0.446541
3  0.396865  0.904143  0.233297  0.001264   0.364539
4  0.834405  0.403364  0.010521  0.344468   0.214473
5  0.386030  0.971037  0.517688  0.511075   0.543493
6  0.537692  0.767525  0.385691  0.310684   0.757976
7  0.483544  0.581528  0.438818  0.294248   0.532578
8  0.044865  0.600649  0.258645  0.366597   0.383618
9  0.025614  0.551215  0.534226  0.894046   0.649240
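The example above sorts the column labels (axis=1) in descending order. For comparison, this sketch sorts a small hypothetical frame along both axes; frame, by_rows and by_cols are illustrative names.

```python
import pandas as pd

# Hypothetical frame with unsorted row and column labels.
frame = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=[10, 5])

by_rows = frame.sort_index()          # axis=0 (default): sort by row labels
by_cols = frame.sort_index(axis=1)    # axis=1: sort by column labels

print(by_rows)
print(by_cols)
```

In both cases only the order of the labels changes; each value stays attached to its own row and column label.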
dt[0][0]=0.9
dt
Output:
          0         1         2         3         4
0  0.900000  0.132473  0.662300  0.870011  0.099254
1  0.505812  0.655760  0.709748  0.459002  0.258930
2  0.446541  0.850593  0.959236  0.653753  0.742279
3  0.364539  0.001264  0.233297  0.904143  0.396865
4  0.214473  0.344468  0.010521  0.403364  0.834405
5  0.543493  0.511075  0.517688  0.971037  0.386030
6  0.757976  0.310684  0.385691  0.767525  0.537692
7  0.532578  0.294248  0.438818  0.581528  0.483544
8  0.383618  0.366597  0.258645  0.600649  0.044865
9  0.649240  0.894046  0.534226  0.551215  0.025614
Practice 2.5:
dt.columns = list("abcde")
dt
Output:
          a         b         c         d         e
0  0.900000  0.132473  0.662300  0.870011  0.099254
1  0.505812  0.655760  0.709748  0.459002  0.258930
2  0.446541  0.850593  0.959236  0.653753  0.742279
3  0.364539  0.001264  0.233297  0.904143  0.396865
4  0.214473  0.344468  0.010521  0.403364  0.834405
5  0.543493  0.511075  0.517688  0.971037  0.386030
6  0.757976  0.310684  0.385691  0.767525  0.537692
7  0.532578  0.294248  0.438818  0.581528  0.483544
8  0.383618  0.366597  0.258645  0.600649  0.044865
9  0.649240  0.894046  0.534226  0.551215  0.025614
dt.loc[0,'b']=68
dt
Output:
          a          b         c         d         e
0  0.900000  68.000000  0.662300  0.870011  0.099254
1  0.505812   0.655760  0.709748  0.459002  0.258930
2  0.446541   0.850593  0.959236  0.653753  0.742279
3  0.364539   0.001264  0.233297  0.904143  0.396865
4  0.214473   0.344468  0.010521  0.403364  0.834405
5  0.543493   0.511075  0.517688  0.971037  0.386030
6  0.757976   0.310684  0.385691  0.767525  0.537692
7  0.532578   0.294248  0.438818  0.581528  0.483544
8  0.383618   0.366597  0.258645  0.600649  0.044865
9  0.649240   0.894046  0.534226  0.551215  0.025614
dt.loc[0,0]=98
dt
Output:
          a          b         c         d         e     0
0  0.900000  68.000000  0.662300  0.870011  0.099254  98.0
1  0.505812   0.655760  0.709748  0.459002  0.258930   NaN
2  0.446541   0.850593  0.959236  0.653753  0.742279   NaN
3  0.364539   0.001264  0.233297  0.904143  0.396865   NaN
4  0.214473   0.344468  0.010521  0.403364  0.834405   NaN
5  0.543493   0.511075  0.517688  0.971037  0.386030   NaN
6  0.757976   0.310684  0.385691  0.767525  0.537692   NaN
7  0.532578   0.294248  0.438818  0.581528  0.483544   NaN
8  0.383618   0.366597  0.258645  0.600649  0.044865   NaN
9  0.649240   0.894046  0.534226  0.551215  0.025614   NaN
dt.drop(0,axis=1)
Output:
          a          b         c         d         e
0  0.900000  68.000000  0.662300  0.870011  0.099254
1  0.505812   0.655760  0.709748  0.459002  0.258930
2  0.446541   0.850593  0.959236  0.653753  0.742279
3  0.364539   0.001264  0.233297  0.904143  0.396865
4  0.214473   0.344468  0.010521  0.403364  0.834405
5  0.543493   0.511075  0.517688  0.971037  0.386030
6  0.757976   0.310684  0.385691  0.767525  0.537692
7  0.532578   0.294248  0.438818  0.581528  0.483544
8  0.383618   0.366597  0.258645  0.600649  0.044865
9  0.649240   0.894046  0.534226  0.551215  0.025614
newdt = dt.drop(0,axis=1)
newdt
Output:
          a          b         c         d         e
0  0.900000  68.000000  0.662300  0.870011  0.099254
1  0.505812   0.655760  0.709748  0.459002  0.258930
2  0.446541   0.850593  0.959236  0.653753  0.742279
3  0.364539   0.001264  0.233297  0.904143  0.396865
4  0.214473   0.344468  0.010521  0.403364  0.834405
5  0.543493   0.511075  0.517688  0.971037  0.386030
6  0.757976   0.310684  0.385691  0.767525  0.537692
7  0.532578   0.294248  0.438818  0.581528  0.483544
8  0.383618   0.366597  0.258645  0.600649  0.044865
9  0.649240   0.894046  0.534226  0.551215  0.025614
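The practice above relies on two points: dt.loc[0,0]=98 created a new column labelled 0 because no such column existed, and drop() returns a new DataFrame instead of changing dt. A small sketch of the second point, using a hypothetical two-row frame with the same column labels (no_first_row is an illustrative name):

```python
import pandas as pd

# Hypothetical frame with string column labels and one integer label 0.
dt = pd.DataFrame({"a": [0.9, 0.5], "b": [68.0, 0.6], 0: [98.0, None]})

newdt = dt.drop(0, axis=1)         # axis=1: drop the column labelled 0
no_first_row = dt.drop(0, axis=0)  # axis=0: drop the row labelled 0

print(newdt.columns.tolist())        # ['a', 'b']
print(dt.columns.tolist())           # ['a', 'b', 0], dt itself is unchanged
print(no_first_row.index.tolist())   # [1]
```

This is why the result has to be assigned to a name such as newdt if it is needed later.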