Python Fundamentals with Prof. Gerhard Trippen - IMI BIGDataAIHUB Technical Workshops

Prof. Gerhard Trippen delivering Workshop 2 in the 2022-2023 IMI BIGDataAIHUB’s Case Competition at the University of Toronto Mississauga. Topic: an introduction to Jupyter Notebooks and Python Basics.

            The second Technical Workshop of the 2022-2023 IMI BIGDataAIHUB Case Competition was presented on December 5 by Professor Gerhard Trippen. The workshop introduced Python basics to help case competition participants with their data analysis. The presentation can be found here.

 

What is Python?

            Python is a high-level, general-purpose programming language for problem-solving. It was created by Guido van Rossum and first released in 1991. See useful resources at: www.python.org

 

            Simple Python expressions can be run in a shell. The easiest way to start a Python shell is to open Jupyter QtConsole (IPython), which is part of Anaconda. Anaconda can be downloaded here: https://www.anaconda.com/download/ Alternatively, Google’s hosted Jupyter Notebook environment (Google Colab) runs in the browser without any installation and can be accessed here: https://colab.research.google.com/
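As a rough illustration of what "simple expressions" might look like when typed into such a shell or a notebook cell, here is a minimal sketch. The variable names are chosen for this example only and do not come from the workshop material.

```python
# Arithmetic follows the usual precedence rules: multiplication before addition.
total = 2 + 3 * 4

# "/" always gives a float; "//" gives the whole-number part of the division.
ratio = 7 / 2
quotient = 7 // 2

# Strings can be joined with "+".
greeting = "Hello, " + "Python"

print(total, ratio, quotient, greeting)
```

In an interactive shell, typing an expression on its own (for example `2 + 3 * 4`) shows the result directly, so the `print` call is only needed when running the lines as a script.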

 

What is Jupyter Notebook?

            Jupyter Notebook is a tool for developing and presenting data science projects in an interactive way. View a Jupyter Notebook tutorial here: https://www.dataquest.io/blog/jupyter-notebook-tutorial/   


Tips for using Python with Big Data:

  1. Remember that Python indexing starts at zero: the first character of a string, or the first item of a list, is at position 0
  2. Be mindful of extra spaces in datasets; spend time cleaning up the data!
  3. Avoid explicit loops as much as possible when working with big data; built-in and vectorized operations are usually faster
  4. Before hard coding a value, consider whether it can be kept flexible; you may need to rework it later if you expand the project
  5. An empty list can be helpful as a placeholder to which items are appended later
  6. Data can be sliced backwards by starting the count at the end with negative numbers
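The tips above can be sketched in a few lines of plain Python. This is only an illustrative example, not code from the workshop: the sample data and the helper `top_scores` are hypothetical, and it uses the standard library only. With pandas or NumPy, vectorized column operations would play the role that the built-in `sum` plays here.

```python
# Tips 1 and 6: indexing starts at zero, and negative numbers count from the end.
word = "Toronto"
first_char = word[0]        # first character
last_char = word[-1]        # last character
last_three = word[-3:]      # last three characters
reversed_word = word[::-1]  # whole string, read backwards

# Tip 2: stray spaces make otherwise equal values look different.
raw_names = ["  Alice", "Bob  ", " Carol "]  # made-up sample data
clean_names = [name.strip() for name in raw_names]

# Tip 3: prefer a built-in over a hand-written accumulation loop.
values = list(range(1, 101))
total = sum(values)

# Tip 4: a parameter keeps the cutoff flexible instead of hard coding it.
def top_scores(scores, n=3):
    """Return the n highest scores (hypothetical helper for illustration)."""
    return sorted(scores, reverse=True)[:n]

best = top_scores([72, 95, 88, 60, 91], n=2)

# Tip 5: start from an empty list and append to it later.
evens = []
for number in range(10):  # a short loop is fine on small data
    if number % 2 == 0:
        evens.append(number)

print(first_char, last_char, last_three, reversed_word)
print(clean_names, total, best, evens)
```

Changing `n` in the call to `top_scores` is enough to return a different number of results, which is the kind of flexibility tip 4 refers to.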