Welcome to Ed2Ti Blog.

My journey on IT is here.

A new challenge from Professor Iyad Koteich with Python and Pandas

The objective consists of organizing and joining (merge) three spreadsheets from Colleges in Montreal. All three spreadsheets are different and some work is necessary to normalize them.

  1. Define the base columns. (Best option is the Concordia spreadsheet.
  2. Insert College column.
  3. Rename the names of the DataFrame columns (if necessary)
  4. Delete some DataFrame columns (if necessary)
  5. Organize DataFrame columns from trebas
  6. Save the fina DataFrame on an xlsx file.

After that, just put your hands on it.

GitHub -> https://github.com/ed2ti/exercise03

[The Python CoDe]

# *************************** #
# College: Trebas Institute 
# Professor: Iyad Koteich
# Class: Edward
# Day: 2022-10-05
# *************************** #

#Importing Libres
import pandas as pd
from pandas import DataFrame 

###
### INFORMATIONS FROM Concordia ###
###

##Loading Data
concordia = pd.read_excel('Concordia.xlsx')

## Informing that this informations are form Concordia
concordia["College"] = 'Concordia'

#show Concordia Result
print(concordia.head())
print('n')

###
### INFORMATIONS FROM McGill ###
###

##Loading Data
mcgill    = pd.read_excel('McGill.xlsx')

## Informing that this informations are form McGill
mcgill["College"] = 'McGill'

## Organizing the DataFrame
mcgill.rename(columns={'id': 'Student_Code'}, inplace = True)
mcgill.rename(columns={'name': 'Full_Name'}, inplace = True)
mcgill.rename(columns={'Course': 'Program'}, inplace = True)
mcgill.rename(columns={'country': 'Nationality'}, inplace = True)
mcgill.drop(columns=['city'], inplace = True)

#show McGill Result
print(mcgill.head())
print('n')

###
### INFORMATIONS FROM Trebas ###
###

##Loading Data And Print The Colums
trebas    = pd.read_excel('TREBAS.xlsx')

## Informing that this informations are form TREBAS
trebas["College"] = 'TREBAS'

## Organizing the DataFrame
trebas.rename(columns={'Country': 'Nationality'}, inplace = True)
trebas.rename(columns={'Student_ID': 'Student_Code'}, inplace = True)
trebas["Full_Name"] = trebas["First_Name"] + " " + trebas["Last_Name"]
trebas.drop(columns=['City'], inplace = True)
trebas.drop(columns=['First_Name'], inplace = True)
trebas.drop(columns=['Last_Name'], inplace = True)

#Ajusting the sequence of the columns
trebas = trebas[["Student_Code","Full_Name","Nationality","Program", "College"]]

#show Trebas Result
print(trebas.head())
print('n')

final = pd.concat([trebas, concordia,mcgill], ignore_index=True, sort=False)
print(final)


# Exporting to excel (to_excel)
final.to_excel ('final.xlsx', index = True, header=True)