Python oop basics

A class is a user-defined blueprint or prototype from which objects are created. Classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by its class) for modifying its state.
To understand the need for creating a class let’s consider an example, let’s say you wanted to track the number of dogs which may have different attributes like breed, age. If a list is used, the first element could be the dog’s breed while the second element could represent its age. Let’s suppose there are 100 different dogs, then how would you know which element is supposed to be which? What if you wanted to add other properties to these dogs? This lacks organization and it’s the exact need for classes.
Class creates a user-defined data structure, which holds its own data members and member functions, which can be accessed and used by creating an instance of that class. A class is like a blueprint for an object.
Some points on Python class:
Classes are created by keyword class.
Attributes are the variables that belong to class.
Attributes are always public and can be accessed using dot (.) operator. Eg.: Myclass.Myattribute


<strong>Class Definition Syntax:</strong> class ClassName:     
# Statement-1     
.     .     .     
# Statement-N 
<strong>Defining a class –</strong>

python class

Declaring Objects (Also called instantiating a class)
When an object of a class is created, the class is said to be instantiated. All the instances share the attributes and the behavior of the class. But the values of those attributes, i.e. the state are unique for each object. A single class may have any number of instances.
Example:

python declaring an object

Declaring an object


# Python program to
# demonstrate instantiating
# a class


classDog: 

# A simple class
# attribute
attr1 ="mamal"
attr2 ="dog"

# A sample method  
deffun(self): 
print("I'm a", self.attr1)
print("I'm a", self.attr2)

# Driver code
# Object instantiation
Rodger =Dog()

# Accessing class attributes
# and method through objects
print(Rodger.attr1)
Rodger.fun()

Output:
mamal I’m a mamal I’m a dog
In the above example, an object is created which is basically a dog named Rodger. This class only has two class attributes that tell us that Rodger is a dog and a mammal.
The self
Class methods must have an extra first parameter in method definition. We do not give a value for this parameter when we call the method, Python provides it.
If we have a method which takes no arguments, then we still have to have one argument.
This is similar to this pointer in C++ and this reference in Java.
When we call a method of this object as myobject.method(arg1, arg2), this is automatically converted by Python into MyClass.method(myobject, arg1, arg2) – this is all the special self is about.
__init__ method:
The __init__ method is similar to constructors in C++ and Java. Constructors are used to initialize the object’s state. Like methods, a constructor also contains a collection of statements(i.e. instructions) that are executed at the time of Object creation. It is run as soon as an object of a class is instantiated. The method is useful to do any initialization you want to do with your object.

# A Sample class with init method 
classPerson: 
# init method or constructor  
def__init__(self, name): 
self.name =name 
# Sample Method  
defsay_hi(self): 
print('Hello, my name is', self.name)
p =Person('Nikhil') 
p.say_hi() 

 
Output:
Hello, my name is Nikhil
Class and Instance Variables
Instance variables are for data unique to each instance and class variables are for attributes and methods shared by all instances of the class. Instance variables are variables whose value is assigned inside a constructor or method with self whereas class variables are variables whose value is assigned in the class.
Defining instance varibale using constructor.

# Python program to show that the variables with a value  
# assigned in the class declaration, are class variables and 
# variables inside methods and constructors are instance 
# variables. 

# Class for Dog 
classDog: 

# Class Variable 
animal ='dog'

# The init method or constructor 
def__init__(self, breed, color): 

# Instance Variable     
self.breed =breed
self.color =color        

# Objects of Dog class 
Rodger =Dog("Pug", "brown") 
Buzo =Dog("Bulldog", "black") 

print('Rodger details:')   
print('Rodger is a', Rodger.animal) 
print('Breed: ', Rodger.breed)
print('Color: ', Rodger.color)

print('\nBuzo details:')   
print('Buzo is a', Buzo.animal) 
print('Breed: ', Buzo.breed)
print('Color: ', Buzo.color)

# Class variables can be accessed using class 
# name also 
print("\nAccessing class variable using class name")
print(Dog.animal)        

Output:
Rodger details: Rodger is a dog Breed: Pug Color: brown Buzo details: Buzo is a dog Breed: Bulldog Color: black Accessing class variable using class name dog
Defining instance variable using the normal method.

# Python program to show that we can create  
# instance variables inside methods 

# Class for Dog 
classDog: 

# Class Variable 
animal ='dog'

# The init method or constructor 
def__init__(self, breed): 

# Instance Variable 
self.breed =breed             

# Adds an instance variable  
defsetColor(self, color): 
self.color =color 

# Retrieves instance variable     
defgetColor(self):     
returnself.color    

# Driver Code 
Rodger =Dog("pug") 
Rodger.setColor("brown") 
print(Rodger.getColor())  
Output:
brown

Python csv basics

Working with csv files in Python

csv3

This article explains how to load and parse a CSV file in Python.

First of all, what is a CSV ?
CSV (Comma Separated Values) is a simple file format used to store tabular data, such as a spreadsheet or database. A CSV file stores tabular data (numbers and text) in plain text. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format.

For working CSV files in python, there is an inbuilt module called csv.

Reading a CSV file

# importing csv module 
import csv 

# csv file name 
filename = "aapl.csv"

# initializing the titles and rows list 
fields = [] 
rows = [] 

# reading csv file 
with open(filename, 'r') as csvfile: 
	# creating a csv reader object 
	csvreader = csv.reader(csvfile) 
	
	# extracting field names through first row 
	fields = next(csvreader) 

	# extracting each data row one by one 
	for row in csvreader: 
		rows.append(row) 

	# get total number of rows 
	print("Total no. of rows: %d"%(csvreader.line_num)) 

# printing the field names 
print('Field names are:' + ', '.join(field for field in fields)) 

# printing first 5 rows 
print('\nFirst 5 rows are:\n') 
for row in rows[:5]: 
	# parsing each column of a row 
	for col in row: 
		print("%10s"%col), 
	print('\n') 

The output of above program looks like this:

csv1

The above example uses a CSV file aapl.csv which can be downloaded from here.
Run this program with the aapl.csv file in same directory.

Let us try to understand this piece of code.

  • with open(filename, ‘r’) as csvfile: csvreader = csv.reader(csvfile)Here, we first open the CSV file in READ mode. The file object is named as csvfile. The file object is converted to csv.reader object. We save the csv.reader object as csvreader.
  • fields = csvreader.next()csvreader is an iterable object. Hence, .next() method returns the current row and advances the iterator to the next row. Since the first row of our csv file contains the headers (or field names), we save them in a list called fields.
  • for row in csvreader: rows.append(row)Now, we iterate through remaining rows using a for loop. Each row is appended to a list called rows. If you try to print each row, one can find that row is nothing but a list containing all the field values.
  • print(“Total no. of rows: %d”%(csvreader.line_num))csvreader.line_num is nothing but a counter which returns the number of rows which have been iterated.

Writing to a CSV file

# importing the csv module 
import csv 

# field names 
fields = ['Name', 'Branch', 'Year', 'CGPA'] 

# data rows of csv file 
rows = [ ['Nikhil', 'COE', '2', '9.0'], 
		['Sanchit', 'COE', '2', '9.1'], 
		['Aditya', 'IT', '2', '9.3'], 
		['Sagar', 'SE', '1', '9.5'], 
		['Prateek', 'MCE', '3', '7.8'], 
		['Sahil', 'EP', '2', '9.1']] 

# name of csv file 
filename = "university_records.csv"

# writing to csv file 
with open(filename, 'w') as csvfile: 
	# creating a csv writer object 
	csvwriter = csv.writer(csvfile) 
	
	# writing the fields 
	csvwriter.writerow(fields) 
	
	# writing the data rows 
	csvwriter.writerows(rows)

Let us try to understand the above code in pieces.

  • fields and rows have been already defined. fields is a list containing all the field names. rows is a list of lists. Each row is a list containing the field values of that row.
  • with open(filename, ‘w’) as csvfile: csvwriter = csv.writer(csvfile)Here, we first open the CSV file in WRITE mode. The file object is named as csvfile. The file object is converted to csv.writer object. We save the csv.writer object as csvwriter.
  • csvwriter.writerow(fields)Now we use writerow method to write the first row which is nothing but the field names.
  • csvwriter.writerows(rows)We use writerows method to write multiple rows at once.

Writing a dictionary to a CSV file

# importing the csv module 
import csv 

# my data rows as dictionary objects 
mydict =[{'branch': 'COE', 'cgpa': '9.0', 'name': 'Nikhil', 'year': '2'}, 
		{'branch': 'COE', 'cgpa': '9.1', 'name': 'Sanchit', 'year': '2'}, 
		{'branch': 'IT', 'cgpa': '9.3', 'name': 'Aditya', 'year': '2'}, 
		{'branch': 'SE', 'cgpa': '9.5', 'name': 'Sagar', 'year': '1'}, 
		{'branch': 'MCE', 'cgpa': '7.8', 'name': 'Prateek', 'year': '3'}, 
		{'branch': 'EP', 'cgpa': '9.1', 'name': 'Sahil', 'year': '2'}] 

# field names 
fields = ['name', 'branch', 'year', 'cgpa'] 

# name of csv file 
filename = "university_records.csv"

# writing to csv file 
with open(filename, 'w') as csvfile: 
	# creating a csv dict writer object 
	writer = csv.DictWriter(csvfile, fieldnames = fields) 
	
	# writing headers (field names) 
	writer.writeheader() 
	
	# writing data rows 
	writer.writerows(mydict) 

In this example, we write a dictionary mydict to a CSV file.

  • with open(filename, ‘w’) as csvfile: writer = csv.DictWriter(csvfile, fieldnames = fields)Here, the file object (csvfile) is converted to a DictWriter object.
    Here, we specify the fieldnames as an argument.
  • writer.writeheader()writeheader method simply writes the first row of your csv file using the pre-specified fieldnames.
  • writer.writerows(mydict)writerows method simply writes all the rows but in each row, it writes only the values(not keys).

So, in the end, our CSV file looks like this:

csv2

Important Points:

  • In csv modules, an optional dialect parameter can be given which is used to define a set of parameters specific to a particular CSV format. By default, csv module uses excel dialect which makes them compatible with excel spreadsheets. You can define your own dialect using register_dialect method.
    Here is an example:
     csv.register_dialect(
    'mydialect',
    delimiter = ',',
    quotechar = '"',
    doublequote = True,
    skipinitialspace = True,
    lineterminator = '\r\n',
    quoting = csv.QUOTE_MINIMAL)

Now, while defining a csv.reader or csv.writer object, we can specify the dialect like
this:

csvreader = csv.reader(csvfile, dialect='mydialect')
  • Now, consider that a CSV file looks like this in plain-text:
    We notice that the delimiter is not a comma but a semi-colon. Also, the rows are separated by two newlines instead of one. In such cases, we can specify the delimiter and line terminator as follows:csvreader = csv.reader(csvfile, delimiter = ';', lineterminator = '\n\n')

Python json basics

Introduction of JSON in Python :
The full-form of JSON is JavaScript Object Notation. It means that a script (executable) file which is made of text in a programming language, is used to store and transfer the data. Python supports JSON through a built-in package called json. To use this feature, we import the json package in Python script. The text in JSON is done through quoted string which contains value in key-value mapping within { }. It is similar to the dictionary in Python. JSON shows an API similar to users of Standard Library marshal and pickle modules and Python natively supports JSON features.

Python program showing
 use of json package
 import json 
 {key:value mapping}
 a ={"name":"John", 
 "age":31, 
     "Salary":25000} 
 conversion to JSON done by dumps() function
 b = json.dumps(a) 
 printing the output
 print(b) 

Output:

{"age": 31, "Salary": 25000, "name": "John"}

As you can see, JSON supports primitive types, like strings and numbers, as well as nested lists, tuples and objects.

Python program showing that
 json support different primitive
 types
 import json 
 list conversion to Array
 print(json.dumps(['Welcome', "to", "GeeksforGeeks"])) 
 tuple conversion to Array
 print(json.dumps(("Welcome", "to", "GeeksforGeeks"))) 
 string conversion to String
 print(json.dumps("Hi")) 
 int conversion to Number
 print(json.dumps(123)) 
 float conversion to Number
 print(json.dumps(23.572)) 
 Boolean conversion to their respective values
 print(json.dumps(True)) 
 print(json.dumps(False)) 
 None value to null
 print(json.dumps(None)) 

Output:

["Welcome", "to", "GeeksforGeeks"]
["Welcome", "to", "GeeksforGeeks"]
"Hi"
123
23.572
true
false
null

Python django basics

Django is a Python-based web framework which allows you to quickly create web application without all of the installation or dependency problems that you normally will find with other frameworks.
When you’re building a website, you always need a similar set of components: a way to handle user authentication (signing up, signing in, signing out), a management panel for your website, forms, a way to upload files, etc. Django gives you ready-made components to use.

django-basics

Why Django?

  • Django is a rapid web development framework that can be used to develop fully fleshed web applications in a short period of time.
  • It’s very easy to switch database in Django framework.
  • It has built-in admin interface which makes easy to work with it.
  • Django is fully functional framework that requires nothing else.
  • It has thousands of additional packages available.
  • It is very scalable. For more visit When to Use Django? Comparison with other Development Stacks ?

Django architecture

Django is based on MVT (Model-View-Template) architecture. MVT is a software design pattern for developing a web application.

MVT Structure has the following three parts –

Model: Model is going to act as the interface of your data. It is responsible for maintaining data. It is the logical data structure behind the entire application and is represented by a database (generally relational databases such as MySql, Postgres).

Python pandas basics

Pandas is an open-source library that is made mainly for working with relational or labeled data both easily and intuitively. It provides various data structures and operations for manipulating numerical data and time series. This library is built on the top of the NumPy library. Pandas is fast and it has high-performance & productivity for users.

Table of Content

  • History
  • Advantages
  • Getting Started
    • Series
    • DataFrame
  • Why Pandas is used for Data Science

History

Pandas was initially developed by Wes McKinney in 2008 while he was working at AQR Capital Management. He convinced the AQR to allow him to open source the Pandas. Another AQR employee, Chang She, joined as the second major contributor to the library in 2012. Over the time many versions of pandas have been released. The latest version of the pandas is 1.0.1

Advantages

  • Fast and efficient for manipulating and analyzing data.
  • Data from different file objects can be loaded.
  • Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data
  • Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects
  • Data set merging and joining.
  • Flexible reshaping and pivoting of data sets
  • Provides time-series functionality.
  • Powerful group by functionality for performing split-apply-combine operations on data sets.

Getting Started

After the pandas has been installed into the system, you need to import the library. This module is generally imported as –

import pandas as pd

Here, pd is referred to as an alias to the Pandas. However, it is not necessary to import the library using alias, it just helps in writing less amount of code everytime a method or property is called.

Python numpy basics

Numpy is a general-purpose array-processing package. It provides a high-performance multidimensional array object, and tools for working with these arrays. It is the fundamental package for scientific computing with Python.
Besides its obvious scientific uses, Numpy can also be used as an efficient multi-dimensional container of generic data.

Arrays in Numpy

Array in Numpy is a table of elements (usually numbers), all of the same type, indexed by a tuple of positive integers. In Numpy, number of dimensions of the array is called rank of the array.A tuple of integers giving the size of the array along each dimension is known as shape of the array. An array class in Numpy is called as ndarray. Elements in Numpy arrays are accessed by using square brackets and can be initialized by using nested Python Lists.

Creating a Numpy Array
Arrays in Numpy can be created by multiple ways, with various number of Ranks, defining the size of the Array. Arrays can also be created with the use of various data types such as lists, tuples, etc. The type of the resultant array is deduced from the type of the elements in the sequences.
Note: Type of array can be explicitly defined while creating the array.

<code># Python program for</code>
<code># Creation of Arrays</code>
<code>import</code> <code>numpy as np</code>
 
<code># Creating a rank 1 Array</code>
<code>arr =</code> <code>np.array([1, 2, 3])</code>
<code>print("Array with Rank 1: \n",arr)</code>
 
<code># Creating a rank 2 Array</code>
<code>arr =</code> <code>np.array([[1, 2, 3],</code>
<code>[4, 5, 6]])</code>
<code>print("Array with Rank 2: \n", arr)</code>
 
<code># Creating an array from tuple</code>
<code>arr =</code> <code>np.array((1, 3, 2))</code>
<code>print("\nArray created using "</code>
<code>"passed tuple:\n", arr)</code>

Output:

Array with Rank 1: 
 [1 2 3]
Array with Rank 2: 
 [[1 2 3]
 [4 5 6]]

Array created using passed tuple:
 [1 3 2]

 
Accessing the array Index
In a numpy array, indexing or accessing the array index can be done in multiple ways. To print a range of an array, slicing is done. Slicing of an array is defining a range in a new array which is used to print a range of elements from the original array. Since, sliced array holds a range of elements of the original array, modifying content with the help of sliced array modifies the original array content.

<code># Python program to demonstrate</code>
<code># indexing in numpy array</code>
<code>import</code> <code>numpy as np</code>
 
<code># Initial Array</code>
<code>arr =</code> <code>np.array([[-1, 2, 0, 4],</code>
<code>[4, -0.5, 6, 0],</code>
<code>[2.6, 0, 7, 8],</code>
<code>[3, -7, 4, 2.0]])</code>
<code>print("Initial Array: ")</code>
<code>print(arr)</code>
 
<code># Printing a range of Array</code>
<code># with the use of slicing method</code>
<code>sliced_arr =</code> <code>arr[:2, ::2]</code>
<code>print</code> <code>("Array with first 2 rows and"</code>
<code>" alternate columns(0 and 2):\n", sliced_arr)</code>
 
<code># Printing elements at</code>
<code># specific Indices</code>
<code>Index_arr =</code> <code>arr[[1, 1, 0, 3], </code>
<code>[3, 2, 1, 0]]</code>
<code>print</code> <code>("\nElements at indices (1, 3), "</code>
<code>"(1, 2), (0, 1), (3, 0):\n", Index_arr)</code>

Output:

Initial Array: 
[[-1.   2.   0.   4. ]
 [ 4.  -0.5  6.   0. ]
 [ 2.6  0.   7.   8. ]
 [ 3.  -7.   4.   2. ]]
Array with first 2 rows and alternate columns(0 and 2):
 [[-1.  0.]
 [ 4.  6.]]

Elements at indices (1, 3), (1, 2), (0, 1), (3, 0):
 [ 0. 54.  2.  3.]

Python collections basics

Counter is a container included in the collections module. Now you all must be wondering what is a container. Don’t worry first let’s discuss about the container.

What is Container?

Containers are objects that hold objects. They provide a way to access the contained objects and iterate over them. Examples of built in containers are Tuple, list, and dictionary. Others are included in Collections module.

A Counter is a subclass of dict. Therefore it is an unordered collection where elements and their respective count are stored as a dictionary. This is equivalent to a bag or multiset of other languages.

Syntax :

class collections.Counter([iterable-or-mapping])

Initialization :
The constructor of counter can be called in any one of the following ways :

With sequence of itemsWith dictionary containing keys and countsWith keyword arguments mapping string names to counts.

OrderedDict in Python

An OrderedDict is a dictionary subclass that remembers the order that keys were first inserted. The only difference between dict() and OrderedDict() is that:

OrderedDict preserves the order in which the keys are inserted. A regular dict doesn’t track the insertion order, and iterating it gives the values in an arbitrary order. By contrast, the order the items are inserted is remembered by OrderedDict.

A Python program to demonstrate working of OrderedDict
 from collections import OrderedDict 
 print("This is a Dict:\n") 
 d = {} 
 d['a'] = 1
 d['b'] = 2
 d['c'] = 3
 d['d'] = 4
 for key, value in d.items(): 
     print(key, value) 
 print("\nThis is an Ordered Dict:\n") 
 od = OrderedDict() 
 od['a'] = 1
 od['b'] = 2
 od['c'] = 3
 od['d'] = 4
 for key, value in od.items(): 
     print(key, value) 

Output:

This is a Dict:
('a', 1)
('c', 3)
('b', 2)
('d', 4)

This is an Ordered Dict:
('a', 1)
('b', 2)
('c', 3)
('d', 4)

Important Points:

  1. Key value Change: If the value of a certain key is changed, the position of the key remains unchanged in OrderedDict.
A Python program to demonstrate working of key
 value change in OrderedDict
 from collections import OrderedDict 
 print("Before:\n") 
 od = OrderedDict() 
 od['a'] = 1
 od['b'] = 2
 od['c'] = 3
 od['d'] = 4
 for key, value in od.items(): 
     print(key, value) 
 print("\nAfter:\n") 
 od['c'] = 5
 for key, value in od.items(): 
     print(key, value) 
Before: ('a', 1) ('b', 2) ('c', 3) ('d', 4) 
After: ('a', 1) ('b', 2) ('c', 5) ('d', 4)

Other Considerations:

  • Ordered dict in Python version 2.7 consumes more memory than normal dict. This is due to the underlying Doubly Linked List implementation for keeping the order. In Python 2.7 Ordered Dict is not dict subclass, it’s a specialized container from collections module.
  • Starting from Python 3.7, insertion order of Python dictionaries is guaranteed.
  • Ordered Dict can be used as a stack with the help of popitem function. Try implementing LRU cache with Ordered Dict.

Python projects worth building

Learning Python becomes much easier when you build things that solve real problems. Tutorials help, but projects are what turn syntax into skill. The best beginner and intermediate projects are small enough to finish and useful enough to teach good habits.

Project Ideas That Are Worth Building

1. Log Analyzer

Read a log file, count error messages, group them by level, and export a short report. This teaches file I/O, loops, string processing, and simple data structures.

2. API Data Fetcher

Call a public API, store selected fields, and print a summary. This teaches HTTP requests, JSON parsing, and error handling.

import requests

response = requests.get("https://api.github.com/repos/python/cpython")
data = response.json()
print(data["stargazers_count"])

3. CSV Cleaner

Take messy CSV input, normalize column names, remove invalid rows, and write a cleaned file. This is very close to real engineering work.

4. Deployment Helper Script

Build a CLI tool that runs tests, tags a release, or validates environment variables before deployment.

What Good Projects Teach

  • how to structure code into functions,
  • how to handle bad input,
  • how to log useful information,
  • and how to make tools that others can actually use.

A Good Rule

Do not chase huge project ideas too early. A finished 150-line script that solves one real problem is often more valuable than a half-built web app that never works.

Final Thoughts

If you want to learn Python seriously, build small tools with clear outcomes. Practical projects create momentum, and momentum is what turns learning into engineering ability.

Python for engineers

Python is one of the most practical languages you can learn as an engineer. It is easy to read, productive for day-to-day work, and flexible enough for automation, backend services, machine learning, testing, and DevOps tasks.

Why Python Is So Useful

  • It has simple syntax, so you can focus on the problem instead of fighting the language.
  • It has a huge ecosystem for web, data, AI, testing, and scripting.
  • It is excellent for glue code between services, files, APIs, and command-line tools.

A Small Example

The following script reads a log file and counts how many lines contain the word ERROR.

from pathlib import Path

log_path = Path("app.log")
error_count = 0

for line in log_path.read_text().splitlines():
    if "ERROR" in line:
        error_count += 1

print(f"Found {error_count} error lines")

This is a good example of why Python is popular. The code is short, readable, and immediately useful.

Common Use Cases

1. Automation

Python is great for repetitive tasks such as renaming files, processing CSV data, calling REST APIs, or sending reports by email.

import requests

response = requests.get("https://api.github.com")
print(response.status_code)

2. Data Processing

You can parse text, clean records, and transform data before loading it into another system.

3. Backend Development

Frameworks like Flask, FastAPI, and Django make Python a strong option for internal tools and web services.

4. Machine Learning

Libraries such as NumPy, pandas, scikit-learn, PyTorch, and TensorFlow make Python a default choice in many AI workflows.

Tips for Writing Better Python

  • Use virtual environments to keep dependencies isolated.
  • Write small functions with clear names.
  • Handle exceptions only where you can act on them.
  • Use type hints when they improve readability.
  • Add tests for scripts that affect production systems.

Final Thoughts

If you want one language that helps with automation, prototyping, and real production work, Python is hard to beat. It is not only beginner-friendly. It is genuinely powerful for experienced engineers as well.

Perceptron on the iris dataset in python

The Iris dataset is one of the best beginner examples for understanding the perceptron. It is small, well known, and easy to visualize. That makes it a practical way to see how a linear classifier learns from real feature values rather than only from toy Boolean inputs.

In this article, we use the Iris dataset to train a perceptron in Python and explain what the result actually teaches. The goal is not only to show code. The goal is to understand why this dataset works well as a first perceptron example.

What you will learn

  • why the Iris dataset is a good starting point
  • how to prepare a binary classification task for the perceptron
  • how the training loop works on real feature data
  • what to expect from the result and where the model starts to struggle

Why the Iris dataset is useful here

Scikit-learn’s Iris example describes the dataset as 150 samples with four features across three iris species. For a beginner, that is perfect because the data is simple enough to inspect while still being real tabular classification data.

A common first step is to simplify the task into a binary classification problem. For example, you can choose two classes and focus on two features such as petal length and petal width. That keeps the geometry easy to visualize and matches the perceptron’s linear nature.

Preparing the data

from sklearn.datasets import load_iris
import numpy as np

iris = load_iris()
X = iris.data[:, [2, 3]]  # petal length and petal width
y = iris.target

# keep only two classes for a binary perceptron example
mask = y < 2
X = X[mask]
y = y[mask]

This transforms the classic three-class dataset into a much cleaner binary problem that a single perceptron can handle more naturally.

Training a perceptron

You can train either a scratch implementation or the scikit-learn version. The scratch route is best for intuition. The scikit-learn route is best when you want a fast verified baseline.

from sklearn.linear_model import Perceptron

model = Perceptron(max_iter=1000, tol=1e-3, random_state=42)
model.fit(X, y)
predictions = model.predict(X)

Scikit-learn’s documentation notes that its perceptron classifier is implemented as a wrapper around `SGDClassifier` with a perceptron loss and constant learning rate. That is useful context because it shows the historical model inside a modern linear-learning framework.

What the result means

If you choose two well-separated classes and helpful features, the perceptron often performs very well on this simplified Iris task. That result should not be read as “the perceptron solves general machine learning.” It should be read more carefully:

  • the problem has been simplified into a binary task
  • the selected features support a fairly clean separation
  • the perceptron succeeds because the geometry is favorable

This is exactly why Iris is a teaching dataset. It helps you see when a linear classifier is a good fit.

What to inspect during training

When working through this example, pay attention to:

  • which two classes you selected
  • which two features you used
  • whether the points look roughly linearly separable
  • how stable the predictions become after training

If you change the task to something less separable, the perceptron can struggle. That is not surprising. It is the same structural limitation discussed in Why perceptrons fail on xor.

Why this example is worth keeping on the site

The Iris article is a strong supporting piece in the Perceptron cluster because it connects theory to data. The pillar article Perceptron explained for beginners teaches the concept. This article shows the concept on a familiar dataset. Together, they make the topic much easier to trust and understand.

Common mistakes or limitations

  • using all three Iris classes and expecting a simple binary explanation
  • not checking whether the chosen features are linearly separable enough
  • treating a clean toy result as proof that the model is broadly strong
  • confusing dataset convenience with real-world robustness

Key takeaways

  • The Iris dataset is a strong beginner example for the perceptron because it is small and interpretable.
  • A binary subset with suitable features fits the perceptron especially well.
  • The example teaches when a linear classifier works, not that it works everywhere.

Next steps

References