Version Control Made Easy: Introduction to Git and GitHub

Version Control
python
Git
Github
Author

Sandeep N

Published

May 30, 2025

Tweet LinkedIn Email WhatsApp Reddit

Warning

🚧 Work in Progress!
Hey! I’m still working on this post. If you’re interested, keep an eye out for updates here — exciting stuff coming soon!!

You start with a file called script.py. Then it becomes script_final.py. Then script_final2.py. Eventually, you’re looking at script_final_final_really_final.py.

It gets messy fast.

A system that can remember what changed, when, why and by whom — without relying on awkward filenames or endless copies. That’s exactly where Version Control Systems (VCS) come in.

git{Source: Google Gemini, height=500px, width= 1000px }

Version Control System (VCS)

helps developers track changes in source code over time. It allows you to:

  • Revert to previous versions of code
  • Compare changes made over time
  • Identify who modified a file and when
  • Collaborate with multiple contributors without overwriting each other’s work
  • Experiment in branches without affecting the main codebase

Think of it as a magical time machine for your code. You can move back and forth between versions, experiment freely, and maintain a clear record of all changes.

There are two major types of VCS:

  1. Centralized Version Control Systems (CVCS)
  2. Distributed Version Control Systems (DVCS)

In this blog, I’m going to focus on DVCS specifically, on Git.

Note

Git is the most widely used Version Control System today. It’s trusted by individual developers, open source communities, and large enterprises alike making it an essential tool in modern software development.

What is Git?

Git is the most popular Distributed Version Control System (DVCS) used by developers across the world. It was created in 2005 by Linus Torvalds, the creator of the Linux kernel, to manage the Linux source code more efficiently.

Unlike centralized systems, Git allows every user to have their own full copy of the repository, including its entire history.

Git Terminology

high-level overview of Git’s core concepts:

  • Repository (repo): A directory that contains your project files and a .git folder, which tracks all changes.
  • Commit: A snapshot of your files at a certain point in time. Each commit has a unique ID and message describing the change.
  • Branch: A separate line of development. The default branch is usually called main or master.
  • Merge: Integrates changes from one branch into another.
  • Clone: Copy an entire repository from a remote server to your local machine.
  • Push / Pull: Push sends your local commits to a remote repository (like GitHub). Pull fetches and integrates changes from remote to local.

Git follows a three-stage architecture:

  1. Working Tree – Where you make changes.
  2. Staging Area (Index) – Where you prepare changes for the next commit.
  3. Repository (.git folder) – Where committed changes are stored permanently.

Why Use Git?

  • Speed: Most operations are local and incredibly fast.
  • Data Integrity: history of every change is securely stored
  • Branching and Merging: Branches are lightweight and encourage experimentation.
  • Widespread Adoption: Git is used in open source and enterprise environments alike. Learning Git is an essential skill for any developer or data scientist.

Download and Install

To start using Git, you’ll need to install it on your machine. Here are the official resources:

Download Git
Installation Guide

Once installed, you can open a terminal (or Git Bash on Windows) and run:

git --version

If it returns a version number without any error, Git is ready to use!

Step-by-Step Git Workflow

We are going to use the directory created in our last session (ml_project). please reffer Virtual environments in ML projects if you do not have it already. (You can also create a new directory if you want).

ml_project/
├── data/            # Raw or processed datasets
├── notebooks/       # Jupyter notebooks for EDA
├── src/             # Source code / scripts / modules
├── models/          # Saved model files
├── venv/            # Virtual environment
├── requirements.txt # Package list

Set Your Name and Email (one-time setup)

git config --global user.name "Your Full Name"
git config --global user.email "your.email@example.com"

This configures your name and email and appears in your commit history and helps identify who made each change. Use the same email as your GitHub account to link commits correctly.

Initialize Git in Your Project

Add a .gitkeep or empty .txt file inside each folder so Git can track them even if they’re empty initially:

touch ml_project/{data,notebooks,src,models,outputs,tests}/.gitkeep

Once we have created a project directory we can initialize git by

git init

This turns your folder into a Git repository by creating a hidden .git/ folder. This folder contains everything Git needs to track changes, including:

  • Commit history
  • Branches
  • Staging area (index)
  • Configuration settings

It tells Git: “Start tracking this project.” It enables all Git features like git add, git commit, and git push.

To view hidden .git directory you can use

ls -l .git
Caution

Only run git init once per project. If you delete the .git/ folder, you lose all version history.

right after initializing git we create two files which are important

echo "# My Project Title" > README.md
touch .gitignore

README.md is a Markdown file that typically serves as the front page of your project.

It should include: - What the project does - How to install/run it - Dependencies - Usage examples and other important project details

.gitignore tells Git which files or folders NOT to track or include in commits. Examples of what to ignore:

  • OS/system files (.DS_Store, Thumbs.db)
  • IDE/editor files (.vscode/, .idea/)
  • Logs or temporary files (.log, .tmp)
  • Dependency folders (node_modules/, venv/)
  • Sensitive data (.env, credentials.json)

In our example we are going to add “venv/” to .gitignore

Caution

Without a .gitignore, Git may track junk or private files that you never meant to upload.

Check the Current Git Status

This shows you: - New/untracked files - Changes that are staged or not staged - What’s ready to commit

git status

You’ll see something like:

““” On branch master

No commits yet

Untracked files: (use “git add …” to include in what will be committed)

.gitignore
README.md

nothing added to commit but untracked files present (use “git add” to track) ““”

Now, to track these files

Stage the File for Commit

git add README.md
git add .gitignore

Think of git add as preparing the file to be saved (but not saving yet)

Working Directory ──(git add)──▶ Staging Area

Commit the File

git commit -m "Add README and .giignore files"

The -m flag adds a commit message describing what you did. it is always a good practice to give a proper desciption of what changes you have done in this commit. git commit takes a snapshot of your staged changes and permanently stores it in the Git repository.

Why Is git commit Important?

  • It records your progress in small, logical steps
  • Makes it easier to undo mistakes, collaborate, and track who did what
  • Helps in code reviews and project management

Working Directory ──(git add)──▶ Staging Area ──(git commit)──▶ Git Repository

Check the Commit History

git log

git log displays the history of commits in your repository — from the most recent to the oldest.

It shows you: - Who made the commit (name and email) - When it was made - What the commit message was - A unique commit ID (SHA)

Wrapping Up

In this post, you’ve taken your very first steps with Git:

  1. You initialized a Git repository with git init
  2. Learned how to track changes with git add and git commit
  3. Explored how to inspect your project history with git log
  4. And got familiar with concepts like .gitignore and git status
  5. You now have a working Git project right on your local machine.

But what if you want to back up your code, access it from anywhere, or collaborate with others?

That’s where GitHub comes in!

What’s Next?

In the next blog, we’ll connect your local Git project to a remote repository on GitHub. You’ll learn:

  1. Why pushing to GitHub is useful
  2. How to create a GitHub repo
  3. How to connect it to your local repo
  4. How to push your code to the cloud!

These are the foundational steps every developer takes when beginning a new project, this is just the beginning.

Source

  1. Git Documentation
  2. Git Book

Git Cheatsheet

Connect with Me

  • GitHub
  • LinkedIn
  • Email
Desclaimer

I use AI tools to assist in writing and drafting some of the content on this blog. but all content is reviewed and edited by me for accuracy and clarity.

💬 Comments