Pandas¶

Pandas Logo
by Akashdip Mahapatra
🍭 ML 🍭 🐥 Pandas 🐥 ❌ numPy ❌
In [ ]:
import numpy as np
import matplotlib.pyplot as plt

# Data for plotting
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)

fig, ax = plt.subplots()
ax.plot(t, s)

ax.set(xlabel='time (s)', ylabel='voltage (mV)',
       title='About as simple as it gets, folks')
ax.grid()

plt.show()
No description has been provided for this image
In [ ]:
import numpy as np # type: ignore
import pandas as pd # type: ignore
In [ ]:
dict1 = {
    "name":['Akash', 'Arka', 'Megha', 'Sahil'],
    "marks":[98, 87, 38, 27],
    "fav_sub":['math', 'phy', 'chemistory', 'History']
}
In [ ]:
df = pd.DataFrame(dict1)

DataFrame => like a Excel seat

In [ ]:
df
Out[ ]:
name marks fav_sub
0 Akash 98 math
1 Arka 87 phy
2 Megha 38 chemistory
3 Sahil 27 History
In [ ]:
df.head(2) #.head() => use for to see only 2 Rows
Out[ ]:
name marks fav_sub
0 Akash 98 math
1 Arka 87 phy
In [ ]:
df.tail(2)
Out[ ]:
name marks fav_sub
2 Megha 38 chemistory
3 Sahil 27 History
In [ ]:
df.describe()
Out[ ]:
marks
count 4.000000
mean 62.500000
std 35.218366
min 27.000000
25% 35.250000
50% 62.500000
75% 89.750000
max 98.000000

TO export as a Excel seat => to_CSV¶

In [ ]:
df.to_csv('exam.csv')
In [ ]:
df.to_csv('exam_index_false.csv', index=False)

To see Any Excel file exist on the folder¶

[variable] = pd.read_CSV('fileName')¶

In [ ]:
new01 = pd.read_csv('new.csv')
In [ ]:
new01
Out[ ]:
s_name marks fav_sub
0 Ak 78 math
1 Ar 87 phy
2 Me 38 chem
3 Sa 27 History
In [ ]:
new01['fav_sub']
Out[ ]:
0       math
1        phy
2       chem
3    History
Name: fav_sub, dtype: object
In [ ]:
new01['fav_sub'][3]
Out[ ]:
'History'
In [ ]:
new01['fav_sub'][3] = 'His'
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\3023452953.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  new01['fav_sub'][3] = 'His'
In [ ]:
new01
Out[ ]:
s_name marks fav_sub
0 Ak 78 math
1 Ar 87 phy
2 Me 38 chem
3 Sa 27 His
In [ ]:
new01.loc[3, 'marks'] = 7
In [ ]:
new01
Out[ ]:
s_name marks fav_sub
0 Ak 78 math
1 Ar 87 phy
2 Me 38 chem
3 Sa 7 His
In [ ]:
new01.index = ['one', 'two', 'three', 'four']
In [ ]:
new01
Out[ ]:
s_name marks fav_sub
one Ak 78 math
two Ar 87 phy
three Me 38 chem
four Sa 7 His

Theoretical Consepts¶

No description has been provided for this image No description has been provided for this image No description has been provided for this image

create a randome searies¶

In [ ]:
ser = pd.Series(np.random.rand(14))
In [ ]:
ser
Out[ ]:
0     0.727978
1     0.425523
2     0.539653
3     0.383829
4     0.446115
5     0.478824
6     0.573915
7     0.788179
8     0.098496
9     0.526088
10    0.697754
11    0.525282
12    0.108756
13    0.167373
dtype: float64
In [ ]:
type(ser)
Out[ ]:
pandas.core.series.Series
it says its a series, Series is a dataSteacture in Pandas(basic index).¶

create a randome DataFrame¶

In [ ]:
newdf = pd.DataFrame(np.random.rand(334,5))
In [ ]:
newdf
Out[ ]:
0 1 2 3 4
0 0.094915 0.642031 0.709760 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
... ... ... ... ... ...
329 0.718109 0.364571 0.523523 0.641379 0.822717
330 0.542813 0.583704 0.321597 0.381247 0.083288
331 0.294751 0.913405 0.858894 0.597241 0.223458
332 0.767770 0.254591 0.745771 0.949127 0.370670
333 0.410857 0.550785 0.608418 0.096930 0.635482

334 rows × 5 columns

In [ ]:
type(newdf)
Out[ ]:
pandas.core.frame.DataFrame
In [ ]:
newdf.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 0.094915 0.642031 0.709760 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf.describe()
Out[ ]:
0 1 2 3 4
count 334.000000 334.000000 334.000000 334.000000 334.000000
mean 0.513768 0.468546 0.487145 0.488857 0.506310
std 0.275904 0.292611 0.294151 0.295852 0.281235
min 0.001253 0.000532 0.000390 0.000398 0.000331
25% 0.288274 0.213710 0.219287 0.228300 0.269539
50% 0.540356 0.468607 0.475395 0.478126 0.502517
75% 0.754102 0.717075 0.747913 0.767165 0.743202
max 0.998423 0.992619 0.997047 0.997625 0.996701
In [ ]:
newdf.dtypes
Out[ ]:
0    float64
1    float64
2    float64
3    float64
4    float64
dtype: object

Now i can change the 1st Matrix =>> float => Obj¶

In [ ]:
newdf[0][0] = "Akashdip"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\3796500939.py:1: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  newdf[0][0] = "Akashdip"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\3796500939.py:1: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'Akashdip' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  newdf[0][0] = "Akashdip"
In [ ]:
newdf.dtypes
Out[ ]:
0     object
1    float64
2    float64
3    float64
4    float64
dtype: object
In [ ]:
newdf.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 Akashdip 0.642031 0.709760 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf.index
Out[ ]:
RangeIndex(start=0, stop=334, step=1)
In [ ]:
newdf.columns
Out[ ]:
RangeIndex(start=0, stop=5, step=1)

Change the Floating no. => NumPy Array

In [ ]:
newdf.to_numpy()
Out[ ]:
array([['Akashdip', 0.6420314732051644, 0.7097602652225738,
        0.22136650780922862, 0.710656025456036],
       [0.4031559705999236, 0.1337533469665244, 0.34517669895959047,
        0.7637337782709527, 0.29940016314694406],
       [0.46718680306580895, 0.09424837999065083, 0.6897528475532556,
        0.2501084113231712, 0.8630476945059673],
       ...,
       [0.29475112674657233, 0.9134054266812583, 0.858893929449332,
        0.5972412770375036, 0.223457893816558],
       [0.7677704033620348, 0.25459117995897806, 0.7457713075951636,
        0.9491268772733774, 0.37066994919516383],
       [0.4108572942041072, 0.5507852776157258, 0.6084184825250398,
        0.0969295350529964, 0.6354818036266742]], dtype=object)
In [ ]:
newdf[1][0] = "Megha"
newdf[2][0] = "Vii"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1671826079.py:1: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  newdf[1][0] = "Megha"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1671826079.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  newdf[1][0] = "Megha"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1671826079.py:1: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'Megha' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  newdf[1][0] = "Megha"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1671826079.py:2: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  newdf[2][0] = "Vii"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1671826079.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  newdf[2][0] = "Vii"
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1671826079.py:2: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'Vii' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  newdf[2][0] = "Vii"
In [ ]:
newdf.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 Akashdip Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824

Transpose the Matrix => [name].T¶

In [ ]:
newdfT = newdf.T
In [ ]:
newdfT
Out[ ]:
0 1 2 3 4 5 6 7 8 9 ... 324 325 326 327 328 329 330 331 332 333
0 Akashdip 0.403156 0.467187 0.198747 0.674123 0.313904 0.501042 0.085362 0.642413 0.809237 ... 0.143556 0.460996 0.293234 0.037602 0.100818 0.718109 0.542813 0.294751 0.76777 0.410857
1 Megha 0.133753 0.094248 0.221682 0.938022 0.384164 0.105078 0.734469 0.474613 0.916137 ... 0.19494 0.609987 0.693375 0.953451 0.365788 0.364571 0.583704 0.913405 0.254591 0.550785
2 Vii 0.345177 0.689753 0.267885 0.496734 0.928849 0.850782 0.456449 0.363067 0.050569 ... 0.45478 0.556856 0.443906 0.348995 0.148145 0.523523 0.321597 0.858894 0.745771 0.608418
3 0.221367 0.763734 0.250108 0.655873 0.522539 0.165165 0.616262 0.805086 0.481891 0.6074 ... 0.073695 0.806612 0.463195 0.007405 0.42351 0.641379 0.381247 0.597241 0.949127 0.09693
4 0.710656 0.2994 0.863048 0.973556 0.360824 0.105793 0.449298 0.3563 0.022658 0.611752 ... 0.457065 0.558622 0.423029 0.449113 0.236166 0.822717 0.083288 0.223458 0.37067 0.635482

5 rows × 334 columns

In [ ]:
newdfT.sort_index(axis=0, ascending=False)
Out[ ]:
0 1 2 3 4 5 6 7 8 9 ... 324 325 326 327 328 329 330 331 332 333
4 0.710656 0.2994 0.863048 0.973556 0.360824 0.105793 0.449298 0.3563 0.022658 0.611752 ... 0.457065 0.558622 0.423029 0.449113 0.236166 0.822717 0.083288 0.223458 0.37067 0.635482
3 0.221367 0.763734 0.250108 0.655873 0.522539 0.165165 0.616262 0.805086 0.481891 0.6074 ... 0.073695 0.806612 0.463195 0.007405 0.42351 0.641379 0.381247 0.597241 0.949127 0.09693
2 Vii 0.345177 0.689753 0.267885 0.496734 0.928849 0.850782 0.456449 0.363067 0.050569 ... 0.45478 0.556856 0.443906 0.348995 0.148145 0.523523 0.321597 0.858894 0.745771 0.608418
1 Megha 0.133753 0.094248 0.221682 0.938022 0.384164 0.105078 0.734469 0.474613 0.916137 ... 0.19494 0.609987 0.693375 0.953451 0.365788 0.364571 0.583704 0.913405 0.254591 0.550785
0 Akashdip 0.403156 0.467187 0.198747 0.674123 0.313904 0.501042 0.085362 0.642413 0.809237 ... 0.143556 0.460996 0.293234 0.037602 0.100818 0.718109 0.542813 0.294751 0.76777 0.410857

5 rows × 334 columns

axis=0 => Row¶

axis=1 => Column¶

In [ ]:
newdf.sort_index(axis=0, ascending=False)
Out[ ]:
0 1 2 3 4
333 0.410857 0.550785 0.608418 0.096930 0.635482
332 0.76777 0.254591 0.745771 0.949127 0.370670
331 0.294751 0.913405 0.858894 0.597241 0.223458
330 0.542813 0.583704 0.321597 0.381247 0.083288
329 0.718109 0.364571 0.523523 0.641379 0.822717
... ... ... ... ... ...
4 0.674123 0.938022 0.496734 0.522539 0.360824
3 0.198747 0.221682 0.267885 0.655873 0.973556
2 0.467187 0.094248 0.689753 0.250108 0.863048
1 0.403156 0.133753 0.345177 0.763734 0.299400
0 Akashdip Megha Vii 0.221367 0.710656

334 rows × 5 columns

In [ ]:
newdf.sort_index(axis=1, ascending=False)
Out[ ]:
4 3 2 1 0
0 0.710656 0.221367 Vii Megha Akashdip
1 0.299400 0.763734 0.345177 0.133753 0.403156
2 0.863048 0.250108 0.689753 0.094248 0.467187
3 0.973556 0.655873 0.267885 0.221682 0.198747
4 0.360824 0.522539 0.496734 0.938022 0.674123
... ... ... ... ... ...
329 0.822717 0.641379 0.523523 0.364571 0.718109
330 0.083288 0.381247 0.321597 0.583704 0.542813
331 0.223458 0.597241 0.858894 0.913405 0.294751
332 0.370670 0.949127 0.745771 0.254591 0.76777
333 0.635482 0.096930 0.608418 0.550785 0.410857

334 rows × 5 columns

In [ ]:
newdf[0]
Out[ ]:
0      Akashdip
1      0.403156
2      0.467187
3      0.198747
4      0.674123
         ...   
329    0.718109
330    0.542813
331    0.294751
332     0.76777
333    0.410857
Name: 0, Length: 334, dtype: object
In [ ]:
type(newdf[0])
Out[ ]:
pandas.core.series.Series

it's a Series ⬆️¶

every Row is a Series => 1D array of index

Copy¶

newdf_2 = newdf.copy()¶

In [ ]:
newdf.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 Akashdip Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf_2 = newdf.copy()
In [ ]:
newdf_2.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 Akashdip Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf_2[0][0] = 0.0001
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1230259220.py:1: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  newdf_2[0][0] = 0.0001
C:\Users\akash\AppData\Local\Temp\ipykernel_12420\1230259220.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  newdf_2[0][0] = 0.0001
In [ ]:
newdf_2.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 0.0001 Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 Akashdip Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824

so, 1st I COPY newdf => newdf_2 , Then I change a value , But it's not effect on Iriginal file.

Copy => [name_01] = [name].copy()¶

View => [name_01] = [name]¶

[name][0][0] = 00000¶

it's depends on Py internal Memory management => to apply Copy // View => So, we use

[name].loc[0,0] = 00000¶

In [ ]:
newdf_2.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 0.0001 Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf_2.loc[0,0] = 999
In [ ]:
newdf_2.head() #.head() => use for to see only some 1st Rows
Out[ ]:
0 1 2 3 4
0 999 Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf_2.columns = list("ABCDE")
In [ ]:
newdf_2.head() #.head() => use for to see only some 1st Rows
Out[ ]:
A B C D E
0 999 Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
In [ ]:
newdf_2.loc[0,'A'] = 888
In [ ]:
newdf_2.head() #.head() => use for to see only some 1st Rows
Out[ ]:
A B C D E
0 888 Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824

But¶

In [ ]:
newdf_2.loc[0,0] = 888
In [ ]:
newdf_2.head() #.head() => use for to see only some 1st Rows
Out[ ]:
A B C D E 0
0 888 Megha Vii 0.221367 0.710656 888.0
1 0.403156 0.133753 0.345177 0.763734 0.299400 NaN
2 0.467187 0.094248 0.689753 0.250108 0.863048 NaN
3 0.198747 0.221682 0.267885 0.655873 0.973556 NaN
4 0.674123 0.938022 0.496734 0.522539 0.360824 NaN

And, Remove Column¶

Using Name '0'¶

[NAME-of-df].drop(0, axis=1)

Using Location¶

[NAME-of-df].drop([0]) => Remove 1st Row

NB, df = dataFrame

In [ ]:
newdf_2.drop(0, axis=1)
Out[ ]:
A B C D E
0 888 Megha Vii 0.221367 0.710656
1 0.403156 0.133753 0.345177 0.763734 0.299400
2 0.467187 0.094248 0.689753 0.250108 0.863048
3 0.198747 0.221682 0.267885 0.655873 0.973556
4 0.674123 0.938022 0.496734 0.522539 0.360824
... ... ... ... ... ...
329 0.718109 0.364571 0.523523 0.641379 0.822717
330 0.542813 0.583704 0.321597 0.381247 0.083288
331 0.294751 0.913405 0.858894 0.597241 0.223458
332 0.76777 0.254591 0.745771 0.949127 0.370670
333 0.410857 0.550785 0.608418 0.096930 0.635482

334 rows × 5 columns

In [ ]:
newdf_2.loc[[1,2], ['C', 'D']]
Out[ ]:
C D
1 0.345177 0.763734
2 0.689753 0.250108

Not change the value (dataFrame) => Py. Just return a View (temporary)¶

View => [name_01] = [name] ⬆️¶

For All Row¶

In [ ]:
newdf_2.loc[:, ['C', 'D']]
Out[ ]:
C D
0 Vii 0.221367
1 0.345177 0.763734
2 0.689753 0.250108
3 0.267885 0.655873
4 0.496734 0.522539
... ... ...
329 0.523523 0.641379
330 0.321597 0.381247
331 0.858894 0.597241
332 0.745771 0.949127
333 0.608418 0.096930

334 rows × 2 columns

All Columns¶

In [ ]:
newdf_2.loc[[1,2], :]
Out[ ]:
A B C D E 0
1 0.403156 0.133753 0.345177 0.763734 0.299400 NaN
2 0.467187 0.094248 0.689753 0.250108 0.863048 NaN

Find Row, which have < 0.3 values¶

In [ ]:
newdf88 = pd.DataFrame(np.random.rand(30,5))
newdf88.columns = list("ABCDE")
newdf88
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
1 0.747436 0.248261 0.149104 0.326140 0.638349
2 0.711883 0.817792 0.893393 0.131012 0.011234
3 0.690050 0.217207 0.072389 0.007611 0.873336
4 0.881278 0.522446 0.145804 0.110494 0.284493
5 0.237359 0.725402 0.877323 0.895825 0.919854
6 0.545712 0.715389 0.213039 0.904846 0.935825
7 0.331696 0.045595 0.494815 0.394165 0.891205
8 0.342224 0.473011 0.576432 0.326216 0.390410
9 0.640362 0.289781 0.341323 0.891514 0.717012
10 0.552238 0.728340 0.286339 0.012671 0.464955
11 0.426008 0.759918 0.678907 0.580389 0.737262
12 0.221629 0.678070 0.814349 0.748124 0.333247
13 0.180518 0.801238 0.508825 0.255169 0.266111
14 0.590870 0.182144 0.248982 0.985712 0.619627
15 0.605472 0.922776 0.446200 0.794544 0.555656
16 0.196671 0.957782 0.193746 0.310691 0.867964
17 0.625365 0.754752 0.202185 0.677774 0.768727
18 0.933003 0.299684 0.867389 0.960339 0.410006
19 0.041841 0.248839 0.569793 0.879107 0.836096
20 0.134533 0.467836 0.733566 0.987148 0.326727
21 0.360502 0.290221 0.304575 0.246893 0.938708
22 0.079849 0.039028 0.363027 0.762271 0.264923
23 0.964659 0.529230 0.930264 0.580089 0.549645
24 0.318614 0.767586 0.390680 0.812308 0.923355
25 0.928536 0.571438 0.075288 0.969695 0.084284
26 0.301159 0.029311 0.908624 0.521737 0.008922
27 0.756558 0.046806 0.870727 0.587439 0.665298
28 0.674713 0.055849 0.672791 0.986618 0.442269
29 0.078427 0.168581 0.431960 0.566728 0.279052
In [ ]:
newdf88.loc[(newdf88['A']<0.3)]
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
5 0.237359 0.725402 0.877323 0.895825 0.919854
12 0.221629 0.678070 0.814349 0.748124 0.333247
13 0.180518 0.801238 0.508825 0.255169 0.266111
16 0.196671 0.957782 0.193746 0.310691 0.867964
19 0.041841 0.248839 0.569793 0.879107 0.836096
20 0.134533 0.467836 0.733566 0.987148 0.326727
22 0.079849 0.039028 0.363027 0.762271 0.264923
29 0.078427 0.168581 0.431960 0.566728 0.279052
In [ ]:
newdf88.loc[(newdf88['A']<0.3) & (newdf88['C']>0.5)]
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
5 0.237359 0.725402 0.877323 0.895825 0.919854
12 0.221629 0.678070 0.814349 0.748124 0.333247
13 0.180518 0.801238 0.508825 0.255169 0.266111
19 0.041841 0.248839 0.569793 0.879107 0.836096
20 0.134533 0.467836 0.733566 0.987148 0.326727
In [ ]:
newdf88.head(2) #.head() => use for to see only 2 Rows
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.56137 0.565630
1 0.747436 0.248261 0.149104 0.32614 0.638349

Find Value¶

In [ ]:
newdf88.iloc[0,4]
Out[ ]:
np.float64(0.5656303065830017)
In [ ]:
newdf88.iloc[[0,5], [1,2]]
Out[ ]:
B C
0 0.381453 0.682943
5 0.725402 0.877323
In [ ]:
newdf88.head() #.head() => use for to see only some 1st Rows
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
1 0.747436 0.248261 0.149104 0.326140 0.638349
2 0.711883 0.817792 0.893393 0.131012 0.011234
3 0.690050 0.217207 0.072389 0.007611 0.873336
4 0.881278 0.522446 0.145804 0.110494 0.284493

Remove 1st one, using []¶

In [ ]:
newdf88.drop([0])
Out[ ]:
A B C D E
1 0.747436 0.248261 0.149104 0.326140 0.638349
2 0.711883 0.817792 0.893393 0.131012 0.011234
3 0.690050 0.217207 0.072389 0.007611 0.873336
4 0.881278 0.522446 0.145804 0.110494 0.284493
5 0.237359 0.725402 0.877323 0.895825 0.919854
6 0.545712 0.715389 0.213039 0.904846 0.935825
7 0.331696 0.045595 0.494815 0.394165 0.891205
8 0.342224 0.473011 0.576432 0.326216 0.390410
9 0.640362 0.289781 0.341323 0.891514 0.717012
10 0.552238 0.728340 0.286339 0.012671 0.464955
11 0.426008 0.759918 0.678907 0.580389 0.737262
12 0.221629 0.678070 0.814349 0.748124 0.333247
13 0.180518 0.801238 0.508825 0.255169 0.266111
14 0.590870 0.182144 0.248982 0.985712 0.619627
15 0.605472 0.922776 0.446200 0.794544 0.555656
16 0.196671 0.957782 0.193746 0.310691 0.867964
17 0.625365 0.754752 0.202185 0.677774 0.768727
18 0.933003 0.299684 0.867389 0.960339 0.410006
19 0.041841 0.248839 0.569793 0.879107 0.836096
20 0.134533 0.467836 0.733566 0.987148 0.326727
21 0.360502 0.290221 0.304575 0.246893 0.938708
22 0.079849 0.039028 0.363027 0.762271 0.264923
23 0.964659 0.529230 0.930264 0.580089 0.549645
24 0.318614 0.767586 0.390680 0.812308 0.923355
25 0.928536 0.571438 0.075288 0.969695 0.084284
26 0.301159 0.029311 0.908624 0.521737 0.008922
27 0.756558 0.046806 0.870727 0.587439 0.665298
28 0.674713 0.055849 0.672791 0.986618 0.442269
29 0.078427 0.168581 0.431960 0.566728 0.279052
In [ ]:
newdf88.drop(['A', 'C'], axis=1)
Out[ ]:
B D E
0 0.381453 0.561370 0.565630
1 0.248261 0.326140 0.638349
2 0.817792 0.131012 0.011234
3 0.217207 0.007611 0.873336
4 0.522446 0.110494 0.284493
5 0.725402 0.895825 0.919854
6 0.715389 0.904846 0.935825
7 0.045595 0.394165 0.891205
8 0.473011 0.326216 0.390410
9 0.289781 0.891514 0.717012
10 0.728340 0.012671 0.464955
11 0.759918 0.580389 0.737262
12 0.678070 0.748124 0.333247
13 0.801238 0.255169 0.266111
14 0.182144 0.985712 0.619627
15 0.922776 0.794544 0.555656
16 0.957782 0.310691 0.867964
17 0.754752 0.677774 0.768727
18 0.299684 0.960339 0.410006
19 0.248839 0.879107 0.836096
20 0.467836 0.987148 0.326727
21 0.290221 0.246893 0.938708
22 0.039028 0.762271 0.264923
23 0.529230 0.580089 0.549645
24 0.767586 0.812308 0.923355
25 0.571438 0.969695 0.084284
26 0.029311 0.521737 0.008922
27 0.046806 0.587439 0.665298
28 0.055849 0.986618 0.442269
29 0.168581 0.566728 0.279052
In [ ]:
newdf88.drop([1, 3, 5], axis=0)
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
2 0.711883 0.817792 0.893393 0.131012 0.011234
4 0.881278 0.522446 0.145804 0.110494 0.284493
6 0.545712 0.715389 0.213039 0.904846 0.935825
7 0.331696 0.045595 0.494815 0.394165 0.891205
8 0.342224 0.473011 0.576432 0.326216 0.390410
9 0.640362 0.289781 0.341323 0.891514 0.717012
10 0.552238 0.728340 0.286339 0.012671 0.464955
11 0.426008 0.759918 0.678907 0.580389 0.737262
12 0.221629 0.678070 0.814349 0.748124 0.333247
13 0.180518 0.801238 0.508825 0.255169 0.266111
14 0.590870 0.182144 0.248982 0.985712 0.619627
15 0.605472 0.922776 0.446200 0.794544 0.555656
16 0.196671 0.957782 0.193746 0.310691 0.867964
17 0.625365 0.754752 0.202185 0.677774 0.768727
18 0.933003 0.299684 0.867389 0.960339 0.410006
19 0.041841 0.248839 0.569793 0.879107 0.836096
20 0.134533 0.467836 0.733566 0.987148 0.326727
21 0.360502 0.290221 0.304575 0.246893 0.938708
22 0.079849 0.039028 0.363027 0.762271 0.264923
23 0.964659 0.529230 0.930264 0.580089 0.549645
24 0.318614 0.767586 0.390680 0.812308 0.923355
25 0.928536 0.571438 0.075288 0.969695 0.084284
26 0.301159 0.029311 0.908624 0.521737 0.008922
27 0.756558 0.046806 0.870727 0.587439 0.665298
28 0.674713 0.055849 0.672791 0.986618 0.442269
29 0.078427 0.168581 0.431960 0.566728 0.279052

Parmanant Change of dataFrame¶

In [ ]:
newdf88.drop([1, 3, 5], axis=0, inplace=True)
In [ ]:
newdf88.head() #.head() => use for to see only some 1st Rows
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
2 0.711883 0.817792 0.893393 0.131012 0.011234
4 0.881278 0.522446 0.145804 0.110494 0.284493
6 0.545712 0.715389 0.213039 0.904846 0.935825
7 0.331696 0.045595 0.494815 0.394165 0.891205

ReArrange the No. 1,2,3,4,5,....¶

In [ ]:
newdf88.reset_index()
Out[ ]:
index A B C D E
0 0 0.281601 0.381453 0.682943 0.561370 0.565630
1 2 0.711883 0.817792 0.893393 0.131012 0.011234
2 4 0.881278 0.522446 0.145804 0.110494 0.284493
3 6 0.545712 0.715389 0.213039 0.904846 0.935825
4 7 0.331696 0.045595 0.494815 0.394165 0.891205
5 8 0.342224 0.473011 0.576432 0.326216 0.390410
6 9 0.640362 0.289781 0.341323 0.891514 0.717012
7 10 0.552238 0.728340 0.286339 0.012671 0.464955
8 11 0.426008 0.759918 0.678907 0.580389 0.737262
9 12 0.221629 0.678070 0.814349 0.748124 0.333247
10 13 0.180518 0.801238 0.508825 0.255169 0.266111
11 14 0.590870 0.182144 0.248982 0.985712 0.619627
12 15 0.605472 0.922776 0.446200 0.794544 0.555656
13 16 0.196671 0.957782 0.193746 0.310691 0.867964
14 17 0.625365 0.754752 0.202185 0.677774 0.768727
15 18 0.933003 0.299684 0.867389 0.960339 0.410006
16 19 0.041841 0.248839 0.569793 0.879107 0.836096
17 20 0.134533 0.467836 0.733566 0.987148 0.326727
18 21 0.360502 0.290221 0.304575 0.246893 0.938708
19 22 0.079849 0.039028 0.363027 0.762271 0.264923
20 23 0.964659 0.529230 0.930264 0.580089 0.549645
21 24 0.318614 0.767586 0.390680 0.812308 0.923355
22 25 0.928536 0.571438 0.075288 0.969695 0.084284
23 26 0.301159 0.029311 0.908624 0.521737 0.008922
24 27 0.756558 0.046806 0.870727 0.587439 0.665298
25 28 0.674713 0.055849 0.672791 0.986618 0.442269
26 29 0.078427 0.168581 0.431960 0.566728 0.279052
In [ ]:
newdf88.reset_index(drop=True) # drop=True => Remove the Old index Column
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
1 0.711883 0.817792 0.893393 0.131012 0.011234
2 0.881278 0.522446 0.145804 0.110494 0.284493
3 0.545712 0.715389 0.213039 0.904846 0.935825
4 0.331696 0.045595 0.494815 0.394165 0.891205
5 0.342224 0.473011 0.576432 0.326216 0.390410
6 0.640362 0.289781 0.341323 0.891514 0.717012
7 0.552238 0.728340 0.286339 0.012671 0.464955
8 0.426008 0.759918 0.678907 0.580389 0.737262
9 0.221629 0.678070 0.814349 0.748124 0.333247
10 0.180518 0.801238 0.508825 0.255169 0.266111
11 0.590870 0.182144 0.248982 0.985712 0.619627
12 0.605472 0.922776 0.446200 0.794544 0.555656
13 0.196671 0.957782 0.193746 0.310691 0.867964
14 0.625365 0.754752 0.202185 0.677774 0.768727
15 0.933003 0.299684 0.867389 0.960339 0.410006
16 0.041841 0.248839 0.569793 0.879107 0.836096
17 0.134533 0.467836 0.733566 0.987148 0.326727
18 0.360502 0.290221 0.304575 0.246893 0.938708
19 0.079849 0.039028 0.363027 0.762271 0.264923
20 0.964659 0.529230 0.930264 0.580089 0.549645
21 0.318614 0.767586 0.390680 0.812308 0.923355
22 0.928536 0.571438 0.075288 0.969695 0.084284
23 0.301159 0.029311 0.908624 0.521737 0.008922
24 0.756558 0.046806 0.870727 0.587439 0.665298
25 0.674713 0.055849 0.672791 0.986618 0.442269
26 0.078427 0.168581 0.431960 0.566728 0.279052
In [ ]:
newdf88.reset_index(drop=True, inplace=True) # Permanent Change
newdf88.head()                               #.head() => use for to see only some 1st Row
Out[ ]:
A B C D E
0 0.281601 0.381453 0.682943 0.561370 0.565630
1 0.711883 0.817792 0.893393 0.131012 0.011234
2 0.881278 0.522446 0.145804 0.110494 0.284493
3 0.545712 0.715389 0.213039 0.904846 0.935825
4 0.331696 0.045595 0.494815 0.394165 0.891205