6. Object Oriented Programming

Before you start this activity

Be sure you are running the kernel needed for this lab (click on the Python (kernel-name) button next to the debugger icon in the upper right corner to check or change it).
Restart the kernel and clear the output of all cells: in the top menu, click “Kernel” and choose the option that restarts the kernel and clears the output of all cells.

1. Procedural Vs. Object Oriented Programming#

This is a Markdown cell, and when you execute it (Shift+Enter) no code is sent to the Python interpreter.

Python is a programming language that supports Object Oriented Programming (OOP), and we are going to start by looking at the difference between the procedural and the object oriented approach to summing numbers in a data structure known as an array, or more accurately, the NumPy array. We will generate an array of 8 random integers, and by setting a seed we can reproduce the same array every time; if we change the seed, the numbers in the array change. We will do this twice with different seeds, creating two different arrays assigned to the variables random_arrayA and random_arrayB. In the last step of the code we will print each array and its class type.

The next cell is a code cell, and when you execute it, the script will run. You will see the number 1 appear in the [ ]: to the left of the cell; this number tells you the order in which you have executed the cells. In a Jupyter notebook you execute one cell at a time and can do so in any order you choose, and the number reflects that order. When you restart the kernel and clear all output, all numbers disappear.

'''In this cell we import numpy with the alias np.  This gives us 
access to the resources within the numpy package. We first set a seed for the
numpy random number generator so that it provides the same random numbers, 
and then we create two arrays of 8 integers between 0 and 99.'''

import numpy as np

# Set the seed for reproducibility
np.random.seed(42)

# Create a random 1D NumPy array of 8 integers between 0 and 99 inclusive
random_arrayA = np.random.randint(0, 100, 8)

# Set a different seed so the second array differs but is still reproducible
np.random.seed(24)

# Create a random 1D NumPy array of 8 integers between 0 and 99 inclusive
random_arrayB = np.random.randint(0, 100, 8)

# Print the arrays and their type.
print(random_arrayA)
print(f"random_arrayA is an object of {type(random_arrayA)}.")
print(random_arrayB)
print(f"random_arrayB is an object of {type(random_arrayB)}.")

[51 92 14 71 60 20 82 86]
random_arrayA is an object of <class 'numpy.ndarray'>.
[34  3 64 87 17 17  1 79]
random_arrayB is an object of <class 'numpy.ndarray'>.

The output of the above cell is indicative of object oriented programming. The two arrays are different instances (objects) of the class numpy.ndarray that hold different numbers, just as two eagles may have different weights and wingspans but are still eagles.

Procedural Approach to Summing an Array#

In the next cell we are going to sum these numbers using a procedural approach. We set each running total equal to zero and then print out its value after we add the eight numbers. Each number in the array is identified by an index position; we start at position 0 and go up to position 7 (range(0, 8) stops just before 8), adding each number to the total on each pass through the loop. To display line numbers, click on the code cell below, place your cursor to the left of the code cell but right of the blue line, and type Shift+L (a capital L).

'''Each position of an array has an index number, and here we use a procedural approach to 
sum up the values at each index position.'''

# Initialize both running totals to 0.  Note, the equal sign is the assignment operator.
totalA = 0
totalB = 0
# Loop through each array by its index numbers (0 through 7) and add each number to its total
for index in range(0,8):
    totalA = totalA + random_arrayA[index]
for index in range(0,8):
    totalB = totalB + random_arrayB[index]

# Print each array and its sum
print(f"random_arrayA: {random_arrayA} with sum: {totalA}")
print(f"random_arrayB: {random_arrayB} with sum: {totalB}")

random_arrayA: [51 92 14 71 60 20 82 86] with sum: 476
random_arrayB: [34  3 64 87 17 17  1 79] with sum: 302
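The loop above hard-codes the length 8 in range(0, 8). A minimal sketch of the same procedural idea that does not depend on the length is given here. It uses a plain Python list holding the values printed for random_arrayA so that it runs without NumPy, and the names values, total_by_index and total_by_value are illustrative only. The same two loops should also work on a NumPy array.

```python
# Values copied from the printed output of random_arrayA above.
values = [51, 92, 14, 71, 60, 20, 82, 86]

# Option 1: loop over index positions, letting len() supply the upper bound.
total_by_index = 0
for index in range(len(values)):
    total_by_index += values[index]

# Option 2: loop over the elements directly, with no index at all.
total_by_value = 0
for value in values:
    total_by_value += value

print(total_by_index, total_by_value)  # 476 476
```

Either version keeps working if the number of elements changes, which the hard-coded 8 does not.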

Object Oriented Approach to Summing an Array#

A fish has gills and a bird has wings; these are attributes of fish and birds that allow them to live under water or fly in the air. Yet each bird and each fish is a unique instance of its class, and the same goes for Python objects. Both of the arrays have the attributes of the numpy.ndarray class, but they hold different numbers that sum to different values. Simply put, they are different arrays.

In the empty cell below, start to type random but stop at the n and press Tab, then choose one of the random arrays created in the above code block. Next, type a period after the array name and press Tab again. This will show you a list of NumPy array attributes and methods (functions that belong to the object), and if you scroll down you will see that one is called sum. NumPy actually provides sum in two forms, as a method of the array and as a standalone function in the numpy module, and we will look at both shortly.

# Here we are looking at sum as a method
array_sumA = random_arrayA.sum()
array_sumB = random_arrayB.sum()

# print the array and sum in an f-string.  This is a way of formatting 
# print statements and we will cover it later
print(f"random_arrayA: {random_arrayA} with sum: {array_sumA}")
print(f"random_arrayB: {random_arrayB} with sum: {array_sumB}")

random_arrayA: [51 92 14 71 60 20 82 86] with sum: 476
random_arrayB: [34  3 64 87 17 17  1 79] with sum: 302

Note the syntax above is array_name.sum(), where sum is a method of the class numpy.ndarray. It operates directly on the object (array_name) and is invoked using the attribute access operator (.).

Note the syntax below is np.sum(array_name), where sum is a standalone function from the NumPy module (np). The object (array_name) is passed as an argument to the function.

# here we are looking at sum as a function
array_sumAf = np.sum(random_arrayA)
array_sumBf = np.sum(random_arrayB)

print(f"random_arrayA: {random_arrayA} with sum: {array_sumAf}")
print(f"random_arrayB: {random_arrayB} with sum: {array_sumBf}")

random_arrayA: [51 92 14 71 60 20 82 86] with sum: 476
random_arrayB: [34  3 64 87 17 17  1 79] with sum: 302

In fact, we could put the array with its method call inside the print statement and reduce the above to one line. This print statement uses something called an f-string, which we will cover later.

# Here we are looking at sum as a method
print(f"random_arrayA: {random_arrayA} with sum: {random_arrayA.sum()}\nrandom_arrayB: {random_arrayB} with sum: {random_arrayB.sum()}")

random_arrayA: [51 92 14 71 60 20 82 86] with sum: 476
random_arrayB: [34  3 64 87 17 17  1 79] with sum: 302

# Here we are looking at sum as a function
print(f"random_arrayA: {random_arrayA} with sum: {np.sum(random_arrayA)} \nrandom_arrayB: {random_arrayB} with sum: {np.sum(random_arrayB)}")

random_arrayA: [51 92 14 71 60 20 82 86] with sum: 476 
random_arrayB: [34  3 64 87 17 17  1 79] with sum: 302

2. Overview of Object Oriented Programming (OOP)#

Objects are the core of Python’s OOP model. They are instances of classes, combining data (attributes) and behavior (methods) into a single entity. Objects are created from classes, which are like blueprints that define the structure and behavior of the objects. For example, a NumPy array is an object created from the numpy.ndarray class. Classes allow us to group related data and behavior together, making our code more organized and easier to work with.

In OOP, an attribute is a piece of information stored in an object, like the shape of an array (array.shape), while a method is a function associated with an object that performs an action on it, like summing the elements of an array (array.sum()). An instance is a specific object created from a class, such as a particular array or dictionary you define in your code. These characteristics make Python a flexible and intuitive language for solving scientific problems, as you can work with data and its associated actions in a structured way.
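To make the vocabulary above concrete, here is a minimal sketch built on the eagle analogy used earlier. The class Eagle, its attributes weight_kg and wingspan_m, and its method describe are all hypothetical names invented for this illustration; they play the same roles as numpy.ndarray, array.shape and array.sum().

```python
class Eagle:
    """Hypothetical class used only to illustrate the vocabulary above."""

    def __init__(self, weight_kg, wingspan_m):
        # Attributes: data stored on each individual instance.
        self.weight_kg = weight_kg
        self.wingspan_m = wingspan_m

    def describe(self):
        # Method: a function defined in the class that acts on one instance.
        return f"Eagle weighing {self.weight_kg} kg with a {self.wingspan_m} m wingspan"


# Two instances (objects) of the same class with different attribute values.
eagle_a = Eagle(4.5, 2.0)
eagle_b = Eagle(6.0, 2.3)

print(type(eagle_a))        # the class both objects were created from
print(eagle_a.weight_kg)    # attribute access: no parentheses
print(eagle_b.describe())   # method call: parentheses
print(eagle_a is eagle_b)   # False: they are distinct objects
```

Note that attributes are read without parentheses, while methods are called with them.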

The most common implementation of Python, called CPython, is written in C, which is a procedural language. Internally, CPython represents every Python object as a C structure built around PyObject, and these structures are what is placed on the memory heap.

TO BE CONTINUED

3. Memory and Python Objects (Classes and Functions)#

There are broadly three sources of Python objects:

Built-In Functions and Classes
Built-In Modules
External Modules that you need to install

Let’s quickly look at these, one at a time.

3.1. Built-In Functions and Classes#

Part of Python’s Standard Library
Always available and do not require importing
Loaded into RAM when you launch Python’s interpreter
Part of Python’s built-in namespace (the builtins module), which is searched after the global namespace, so they are available to any program by simply typing their names.
Frequently used functions like print() may even end up in the CPU cache for quicker execution, although that is decided by the hardware rather than by Python.

Built-In Functions#

Function	Description
`print()`	Displays output to the console.
`type()`	Returns the type (class) of an object.
`len()`	Returns the number of items in a container (e.g., list, string, dictionary).
`float()`	Converts a number or string to a floating-point number.
`int()`	Converts a number or string to an integer, truncating towards zero.
`str()`	Converts an object to a string representation.
`input()`	Reads a line of input from the console.
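The functions in the table can be tried without any import statement. The short sketch below uses made-up values (the variable names are illustrative) and leaves out input() so that it runs without waiting for the keyboard.

```python
# No import statements are needed for built-in functions.
reading = "3.75"              # a string, e.g. text typed by a user

print(type(reading))          # <class 'str'>
print(len(reading))           # 4 characters

as_float = float(reading)     # 3.75
as_int = int(as_float)        # 3 (truncates toward zero)
negative_int = int(-3.75)     # -3 (also toward zero, not down to -4)
back_to_text = str(as_int)    # '3'

print(as_float, as_int, negative_int, back_to_text)
```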

Built-In Classes#

Class	Description
`list`	An ordered collection of items (mutable).
`dict`	A collection of key-value pairs (mutable; preserves insertion order since Python 3.7).
`int`	An integer number (unlimited precision).
`float`	A floating-point number (decimal).
`str`	A sequence of Unicode characters (string).
`bool`	A boolean type with only two possible values: `True` or `False`.
`set`	An unordered collection of unique elements.
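As a quick illustration of the table, the sketch below creates one instance of each built-in class and prints its class name. The values and variable names are invented for the example.

```python
measurements = [2.5, 3.1, 2.5]              # list: ordered, mutable, allows duplicates
sample = {"name": "water", "temp_c": 25}    # dict: key-value pairs
count = 3                                   # int
mass = 0.25                                 # float
label = "trial 1"                           # str
is_valid = True                             # bool
unique_values = set(measurements)           # set: duplicates removed

for obj in (measurements, sample, count, mass, label, is_valid, unique_values):
    print(type(obj).__name__, obj)

measurements.append(4.0)    # lists can be changed in place
sample["ph"] = 7.0          # so can dictionaries
```

Each literal or constructor call produces an object, and type() reports which class it was created from.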

3.2. Built-In Modules#

Need to be imported to be available
Two basic types of import statements
1. import math - only the name math is added to the global namespace, not the module’s functions and classes. To use the sqrt function of that module you need to write math.sqrt().
2. from math import sqrt - now sqrt itself is part of the global namespace and you can use it without the math. prefix.

Here is a table of some common built-in modules:

Module	Description
`math`	Provides basic mathematical functions like `sqrt`, `sin`, and `cos`, as well as constants like `pi` and `e`.
`random`	Offers functionality for generating random numbers, shuffling sequences, and picking random items from lists.
`os`	Enables interaction with the operating system, such as working with files, directories, and environment variables.
`sys`	Gives access to system-specific parameters and functions, including command-line arguments and the Python runtime.
`time`	Provides time-related functions, like getting the current time, pausing execution (`sleep`), or measuring performance.
`datetime`	Supplies classes for manipulating dates and times, including formatting and arithmetic with time objects.
`re`	Implements regular expression operations for string searching, matching, and substitution.
`json`	Lets you read and write JSON (JavaScript Object Notation) data for easy data serialization and sharing.
`csv`	Simplifies reading and writing CSV (Comma-Separated Values) files.
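A brief sketch of three of the modules from the table working together is given here. The file name and the dictionary contents are made up for the illustration.

```python
import json
import math
import re

# math: constants and functions
circle_area = math.pi * 2.0 ** 2           # area of a circle with radius 2

# re: extract the first run of digits from a string
match = re.search(r"\d+", "sample_042.csv")
sample_number = int(match.group())         # 42

# json: convert a dictionary to text and back again
record = {"sample": sample_number, "area": round(circle_area, 2)}
text = json.dumps(record)
restored = json.loads(text)

print(text)
print(restored == record)                  # True
```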

import math
# uncomment the following line and it will give an error
#sqrt(4)

math.sqrt(4)

2.0

from math import sqrt
sqrt(4)

2.0

NOTE: Go back and run the cell that gave the error (3 cells above). Can you explain why it works now? (hint: look at the numbers next to the cell)

3.3. Externally Managed Packages and Modules#

One of the strengths of Python as open source software is that third parties can create their own customized libraries in the form of installable packages that extend the capabilities of your programs. These packages can contain modules with functions and classes that are built on other modules or on the core built-in libraries. Because there are dependencies across packages, it is important to install them in virtual environments, and that is why we are using Miniconda.

To use these packages we first have to use conda commands in the terminal to install the package, and once it is installed, we need to import it in our Python programs just as we import the built-in modules. It is very important to use conda to install these in the virtual environments. There are two major repositories for obtaining Python packages: the Python Package Index (PyPI) and Conda/Conda-Forge.

Conda-Forge#

Conda-Forge is a GitHub organization that, as of January 2025, has over 28.5K packages and 0.81 billion monthly downloads. Conda-Forge is focused primarily on data science and scientific computing and has libraries in multiple programming languages besides Python. The conda package management system is very robust in handling the dependencies across multiple packages, and this is the repository we will give priority to when installing externally managed packages.

Python Package Index#

The Python Package Index (PyPI) has over 600,000 Python projects as of January 2025 and is the go-to place for code for everything from sensors to data science packages. When you find a package you will often see the command pip install package-name, which installs the package from PyPI. In this class you are to first try to install packages using conda-forge, as it tends to be better at handling complex dependencies and integrates with your other packages that were installed with Conda. There is another type of environment called venv that is part of the Python standard library, and pip integrates well with it, but as our class is going to be mixing multiple packages for different projects we will try to stick to Conda. PyPI is still an excellent resource, and we may find times when a package we need can only be obtained through PyPI. In those cases, it will be prudent to create a new environment for that project, just to be on the safe side.
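Once a package has been installed, it can be useful to confirm from inside Python that the active environment actually contains it. The sketch below uses importlib.metadata from the standard library (Python 3.8 or later); the helper name installed_version is an assumption made for this example, and the second package name is a deliberately made-up placeholder. It relies on the package having recorded its metadata at install time, which is usually but not always the case.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "numpy" is the package used earlier in this notebook; the second name is
# a made-up placeholder that should not exist in any environment.
for name in ("numpy", "surely-not-a-real-package-xyz"):
    print(name, "->", installed_version(name))
```

A result of None suggests the package is missing from the environment whose kernel is currently running.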

4. CPython#

The most common implementation of Python is written in the C programming language and is called CPython. When data is stored on the memory heap, it is stored as a C structure built around a base structure called PyObject. This is a binary representation that the Python interpreter can process efficiently; alongside the value itself it contains additional metadata (data about the data), such as a reference count and a pointer to the object's type, plus further fields that depend on the type of object.

Common Built-In Python Types and Their CPython Representations#

Python Object Type	CPython Object Structure	Description
`int`	`PyLongObject`	Represents arbitrary precision integers.
`float`	`PyFloatObject`	Represents floating-point numbers.
`complex`	`PyComplexObject`	Represents complex numbers, storing real and imaginary parts as floats.
`str`	`PyUnicodeObject`	Represents Unicode strings, with metadata for encoding and actual character data.
`bytes`	`PyBytesObject`	Represents immutable sequences of raw bytes.
`list`	`PyListObject`	Represents lists, using a dynamic array of pointers to other objects.
`tuple`	`PyTupleObject`	Represents tuples, storing a fixed-size array of object references.
`dict`	`PyDictObject`	Represents dictionaries, implemented as hash tables.
`set`	`PySetObject`	Represents sets, using a hash table for unique elements.
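`bool`	`PyLongObject` (type object `PyBool_Type`)	Represents Boolean values (`True`, `False`). `bool` is a subclass of `int`, so both values are stored as integer objects rather than in a separate structure.
`None`	`PyObject` (`_Py_NoneStruct`)	Represents the singleton `None`. Only one instance exists, and it carries no data beyond the base object header.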
`function`	`PyFunctionObject`	Represents Python functions, including their bytecode and other metadata.
`module`	`PyModuleObject`	Represents Python modules, storing module-level data and functions.
`class` (type)	`PyTypeObject`	Represents Python classes. Stores information about the class itself (methods, attributes, etc.).
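Some of this is visible from Python itself. The sketch below is specific to CPython and uses sys.getsizeof, which reports the size of an object in bytes including its header; the exact numbers vary between Python versions and platforms, so none are stated here.

```python
import sys

# CPython-specific: sizes in bytes include the object header (reference
# count and a pointer to the type), not just the raw value.
for obj in (1, 1.0, "a", [], (), {}, None, True):
    print(f"{type(obj).__name__:>8}  {sys.getsizeof(obj):>4} bytes  {obj!r}")

# bool is a subclass of int, and None is a singleton.
print(issubclass(bool, int))    # True
print(True + True)              # 2
nothing = None
print(nothing is None)          # True
```

Even a small integer or float occupies more than the 8 bytes its raw value would need, which reflects the metadata described above.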

5. Python Functions that Reveal Object Information#

Function	Description
`type(obj)`	Returns the class (type) of an object (e.g., `<class 'int'>`).
`id(obj)`	Returns an integer identifier that is unique for the lifetime of the object; in CPython this is its memory address.
`dir(obj)`	Lists the attributes (including methods) associated with an object.
`help(obj)`	Displays help documentation or the docstring for an object in an interactive console.
`isinstance(obj, C)`	Checks if `obj` is an instance (or subclass instance) of the class `C`.
`issubclass(A, B)`	Checks if class `A` is a subclass of class `B`.
`callable(obj)`	Returns `True` if `obj` can be called like a function, `False` otherwise.
`hasattr(obj, name)`	Returns `True` if an attribute named `name` exists in `obj`, `False` otherwise.

6. Data Structures and Classes#

In this class we are not going to worry about CPython and are going to look at data structures from the more abstract perspective of Python, that is, not how they are stored on the computer but how they are represented in the programming language we are learning. We will start by developing an understanding of the built-in data structures and then use this as a framework for developing strategies to use AIs to understand data structures and types in scientific packages. There are essentially two fundamental types of data structures: primitive (atomic) and composite (container).

Primitive (atomic) data structures are those where the memory block holds the value itself in one location on the heap. These include integers, floats, strings, booleans and complex numbers.
Composite (container) data structures are those where the memory block holds pointers to other memory blocks. Examples are lists, dictionaries, tuples and arrays.

Primitive Data Structures#

Primitive or atomic data structures are ones where the binary encoding that a variable points to represents the data itself, in contrast to a container, which can hold other data structures.

Data Structure	Description
`int`	Represents whole numbers (positive, negative, or zero). Example: `42`, `-7`, `0`.
`float`	Represents decimal (floating-point) numbers. Example: `3.14`, `-0.001`, `2.0`.
`complex`	Represents numbers with real and imaginary parts. Example: `3+4j`.
`bool`	Represents truth values, either `True` or `False`.
`str`	Represents immutable sequences of Unicode characters (text data). Example: `"hello"`.
`bytes`	Represents immutable sequences of raw bytes. Example: `b'hello'`.
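The sketch below prints the class of each example value from the table and then shows what "immutable" means for a string. The variable names are illustrative only.

```python
samples = [42, 3.14, 3 + 4j, True, "hello", b"hello"]
for value in samples:
    print(f"{type(value).__name__:>8}: {value!r}")

# Strings and bytes are immutable: "changing" one builds a new object.
text = "hello"
shouted = text.upper()
print(text, shouted)            # hello HELLO

try:
    text[0] = "H"               # item assignment is not allowed on a str
except TypeError as error:
    print("TypeError:", error)
```

The original string is untouched after upper() is called, and the attempt to modify it in place raises a TypeError.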

Composite Data Structures#

Data Structure	Description
`list`	A collection of ordered, mutable items that can hold elements of different types.
`tuple`	An ordered, immutable sequence of items, which can also hold elements of different types.
`set`	An unordered, mutable collection of unique items; the items themselves must be hashable (for example numbers, strings or tuples).
`dict`	A collection of key-value pairs, where keys are unique and hashable, and values can be any object.
`frozenset`	An immutable version of a set, useful for storing unique items that should not be changed.

Acknowledgements#

This content was developed with assistance from Perplexity AI and ChatGPT. Multiple queries were made during Fall 2024 and Spring 2025.