git-rs

Git-RS: Educational Git Implementation 🦀

A minimal Git implementation in Rust designed for learning Git internals and understanding how version control systems work under the hood.

🎯 Project Goals

This project implements core Git functionality from scratch to understand:

⚠️ Important: Flexible Repository Structure

Git-rs supports two modes for different learning and testing needs:

🎓 Educational Mode (Default - Safe for Learning)

When you run git-rs init, it creates:

This allows you to:

🔄 Git Compatibility Mode (Advanced - Real Git Interoperability)

When you run git-rs --git-compat init, it creates:

This enables you to:

Examples:

# Safe learning mode (default)
git-rs init                    # Creates .git-rs/
git-rs add file.txt           # Uses .git-rs/git-rs-index

# Git compatibility mode  
git-rs --git-compat init      # Creates .git/
git-rs --git-compat add file.txt  # Uses .git/index
git status                    # Can use real Git to check!
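
Judging by the examples above, the two modes differ mainly in where repository data lives. A minimal sketch of that mapping, assuming a hypothetical `RepoMode` enum (the actual types and names inside git-rs may differ):

```rust
use std::path::{Path, PathBuf};

/// Hypothetical model of the two modes described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RepoMode {
    /// Default mode: data lives in `.git-rs/`.
    Educational,
    /// `--git-compat` mode: data lives in `.git/`.
    GitCompat,
}

impl RepoMode {
    /// Name of the repository directory for this mode.
    fn dir_name(self) -> &'static str {
        match self {
            RepoMode::Educational => ".git-rs",
            RepoMode::GitCompat => ".git",
        }
    }

    /// Name of the index file inside the repository directory.
    fn index_name(self) -> &'static str {
        match self {
            RepoMode::Educational => "git-rs-index",
            RepoMode::GitCompat => "index",
        }
    }

    /// Full path of the index file for a given working directory.
    fn index_path(self, work_dir: &Path) -> PathBuf {
        work_dir.join(self.dir_name()).join(self.index_name())
    }
}

fn main() {
    let work_dir = Path::new("project");
    println!("{}", RepoMode::Educational.index_path(work_dir).display());
    println!("{}", RepoMode::GitCompat.index_path(work_dir).display());
}
```

Keeping the mode in one small type means the rest of the code never has to hard-code `.git-rs` or `.git`.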

🏗️ Architecture (Domain-Driven Design)

This project follows DDD principles with clean separation of concerns:

src/
├── main.rs              # CLI entry point with clap
├── lib.rs               # Library exports and error handling
├── domain/              # 🧠 Core business logic
│   ├── repository.rs    # Repository aggregate root
│   ├── objects.rs       # Git objects (Blob, Tree, Commit)
│   ├── references.rs    # HEAD, branches, tags
│   └── index.rs         # Staging area model
├── infrastructure/      # 💾 Persistence layer
│   ├── object_store.rs  # File-based object database
│   ├── ref_store.rs     # Reference file management
│   └── index_store.rs   # Index file serialization
├── application/         # 🎯 Use cases (commands)
│   ├── init.rs          # ✅ Repository initialization
│   ├── add.rs           # ✅ File staging
│   ├── status.rs        # ✅ Working tree status
│   ├── commit.rs        # ✅ Commit creation
│   ├── diff.rs          # ✅ Content comparison
│   └── clone.rs         # ✅ Repository cloning
└── cli/                 # 🖥️ Command line interface
    └── commands.rs      # Command handlers and user interaction

Layer Responsibilities:

📊 Git Internals: Visual Guide

Repository Structure (.git-rs/)

.git-rs/
├── objects/              # Content-addressed object database
│   ├── 5a/
│   │   └── 1b2c3d...    # Blob object (file content)
│   ├── ab/
│   │   └── cd1234...    # Tree object (directory listing)
│   └── ef/
│       └── 567890...    # Commit object (snapshot + metadata)
├── refs/                 # Reference storage
│   ├── heads/           # Branch references
│   │   ├── main         # Contains: "5abc123def..."
│   │   └── feature-x    # Contains: "7def456ghi..."
│   └── tags/            # Tag references
├── HEAD                  # Current branch pointer
├── git-rs-index         # Staging area (JSON format)
├── config               # Repository configuration
└── description          # Repository description
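
To show how HEAD acts as a pointer, here is a small sketch that interprets the contents of a HEAD file. It assumes git-rs follows Git's own convention (`ref: refs/heads/<branch>` for a branch, or a bare 40-character hash for a detached HEAD); the `Head` enum and `parse_head` function are hypothetical names used only for illustration.

```rust
/// What HEAD can point at (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Head {
    /// Symbolic reference, e.g. "refs/heads/main".
    Branch(String),
    /// Detached HEAD that stores a commit hash directly.
    Detached(String),
}

/// Interpret the text stored in a HEAD file.
/// Assumes Git's format: "ref: <path>" or a 40-character hex hash.
fn parse_head(contents: &str) -> Option<Head> {
    let line = contents.trim();
    if let Some(target) = line.strip_prefix("ref: ") {
        return Some(Head::Branch(target.trim().to_string()));
    }
    let is_hash = line.len() == 40 && line.chars().all(|c| c.is_ascii_hexdigit());
    if is_hash {
        Some(Head::Detached(line.to_string()))
    } else {
        None
    }
}

fn main() {
    println!("{:?}", parse_head("ref: refs/heads/main\n"));
    println!("{:?}", parse_head("not a valid HEAD"));
}
```

Resolving a branch then takes one more file read: the branch file under `refs/heads/` holds the commit hash.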

Object Storage Model

Working Directory  →  Staging Area  →  Repository
     (files)           (git-rs-index)    (objects/)
        │                    │              │
        │── git add ────────▶              │
        │                    │── commit ───▶
        │◀────────── checkout ──────────────│

Hash-Based Object System

Object Content → SHA-1 Hash → Storage Path
"Hello World"  → a1b2c3...  → .git-rs/objects/a1/b2c3...

Object Format:
"<type> <size>\0<content>"
"blob 11\0Hello World"
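
The object format above can be sketched as a pair of helpers that add and strip the header. `encode_object` and `decode_object` are hypothetical names, not necessarily the functions git-rs defines. Compression is left out.

```rust
/// Build "<type> <size>\0<content>" for an object.
fn encode_object(kind: &str, content: &[u8]) -> Vec<u8> {
    let mut raw = format!("{} {}\0", kind, content.len()).into_bytes();
    raw.extend_from_slice(content);
    raw
}

/// Split a raw object back into (type, content).
/// Returns None if the header is malformed or the size does not match.
fn decode_object(raw: &[u8]) -> Option<(String, Vec<u8>)> {
    let nul = raw.iter().position(|&b| b == 0)?;
    let header = std::str::from_utf8(&raw[..nul]).ok()?;
    let (kind, size) = header.split_once(' ')?;
    let size: usize = size.parse().ok()?;
    let content = &raw[nul + 1..];
    if content.len() != size {
        return None;
    }
    Some((kind.to_string(), content.to_vec()))
}

fn main() {
    let raw = encode_object("blob", b"Hello World");
    println!("{} bytes including header", raw.len());
    println!("{:?}", decode_object(&raw).map(|(kind, _)| kind));
}
```

Checking the declared size on decode is a cheap way to catch truncated or corrupted objects.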

🔧 Implemented Commands

✅ git-rs init - Repository Initialization

What it does:

Educational Insights:

Example:

git-rs init
# Creates: .git-rs/{objects,refs/{heads,tags},HEAD,config,description}
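
A minimal sketch of creating that layout with the standard library. The initial contents of `HEAD`, `config`, and `description` are assumptions modeled on Git, and `init_layout` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Create the directory layout listed above inside `repo_dir`.
fn init_layout(repo_dir: &Path) -> io::Result<()> {
    fs::create_dir_all(repo_dir.join("objects"))?;
    fs::create_dir_all(repo_dir.join("refs").join("heads"))?;
    fs::create_dir_all(repo_dir.join("refs").join("tags"))?;
    // Assumption: HEAD starts as a symbolic ref to `main`, as in Git.
    fs::write(repo_dir.join("HEAD"), "ref: refs/heads/main\n")?;
    // Assumption: placeholder contents for the remaining files.
    fs::write(repo_dir.join("config"), "")?;
    fs::write(repo_dir.join("description"), "Unnamed repository\n")?;
    Ok(())
}

fn main() -> io::Result<()> {
    // Use a scratch directory so the demo never touches a real project.
    let repo_dir = std::env::temp_dir()
        .join(format!("git-rs-init-demo-{}", std::process::id()))
        .join(".git-rs");
    init_layout(&repo_dir)?;
    println!("initialized {}", repo_dir.display());
    Ok(())
}
```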

✅ git-rs add - File Staging

What it does:

Educational Insights:

Example:

git-rs add README.md src/
# Creates blob objects and updates git-rs-index

Internal Process:

  1. Read file content: "Hello World"
  2. Create blob: "blob 11\0Hello World"
  3. Calculate hash: SHA-1("blob 11\0Hello World") = 5ab2c3d...
  4. Store compressed object: .git-rs/objects/5a/b2c3d...
  5. Update index: {"README.md": {"hash": "5ab2c3d...", ...}}
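
Steps 2 to 4 can be sketched without any dependencies. The SHA-1 routine below is hand-written only to keep the example self-contained; a real implementation would more likely use a crate. It also skips the zlib compression from step 4. `hash_object` and `object_path` are hypothetical helper names.

```rust
/// Minimal SHA-1 (FIPS 180-4) using only the standard library.
fn sha1(data: &[u8]) -> [u8; 20] {
    let mut h: [u32; 5] = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0];

    // Padding: 0x80, zeros up to 56 mod 64, then the bit length as a big-endian u64.
    let mut msg = data.to_vec();
    msg.push(0x80);
    while msg.len() % 64 != 56 {
        msg.push(0);
    }
    msg.extend_from_slice(&((data.len() as u64) * 8).to_be_bytes());

    for chunk in msg.chunks(64) {
        // Expand the 16 input words into an 80-word message schedule.
        let mut w = [0u32; 80];
        for i in 0..16 {
            w[i] = u32::from_be_bytes([chunk[4 * i], chunk[4 * i + 1], chunk[4 * i + 2], chunk[4 * i + 3]]);
        }
        for i in 16..80 {
            w[i] = (w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]).rotate_left(1);
        }

        let [mut a, mut b, mut c, mut d, mut e] = h;
        for (i, &word) in w.iter().enumerate() {
            let (f, k): (u32, u32) = match i {
                0..=19 => ((b & c) | (!b & d), 0x5A827999),
                20..=39 => (b ^ c ^ d, 0x6ED9EBA1),
                40..=59 => ((b & c) | (b & d) | (c & d), 0x8F1BBCDC),
                _ => (b ^ c ^ d, 0xCA62C1D6),
            };
            let temp = a
                .rotate_left(5)
                .wrapping_add(f)
                .wrapping_add(e)
                .wrapping_add(k)
                .wrapping_add(word);
            e = d;
            d = c;
            c = b.rotate_left(30);
            b = a;
            a = temp;
        }
        h[0] = h[0].wrapping_add(a);
        h[1] = h[1].wrapping_add(b);
        h[2] = h[2].wrapping_add(c);
        h[3] = h[3].wrapping_add(d);
        h[4] = h[4].wrapping_add(e);
    }

    let mut out = [0u8; 20];
    for (i, word) in h.iter().enumerate() {
        out[4 * i..4 * i + 4].copy_from_slice(&word.to_be_bytes());
    }
    out
}

/// Lowercase hex encoding of a byte slice.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Hash "<type> <size>\0<content>" the way Git does.
fn hash_object(kind: &str, content: &[u8]) -> String {
    let mut raw = format!("{} {}\0", kind, content.len()).into_bytes();
    raw.extend_from_slice(content);
    to_hex(&sha1(&raw))
}

/// Storage path: first two hex characters form the directory name.
fn object_path(repo_dir: &str, hash: &str) -> String {
    format!("{}/objects/{}/{}", repo_dir, &hash[..2], &hash[2..])
}

fn main() {
    let hash = hash_object("blob", b"Hello World");
    println!("{}", hash);
    println!("{}", object_path(".git-rs", &hash));
}
```

Because the header is part of the hashed bytes, a blob and a tree with identical content would still get different hashes.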

✅ git-rs status - Working Tree Status

What it does:

Educational Insights:

Status Categories:

Changes to be committed:     # In index, different from HEAD
  new file:   README.md
  modified:   src/main.rs

Changes not staged:          # In working dir, different from index  
  modified:   README.md
  deleted:    old_file.txt

Untracked files:            # In working dir, not in index
  new_feature.rs
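
The three categories come from two comparisons: index against HEAD, and working directory against index. A rough sketch, assuming each side has already been reduced to a map from path to content hash (the `Snapshot` alias, `Status` struct, and `compute_status` function are hypothetical, and the hash strings are placeholders):

```rust
use std::collections::BTreeMap;

/// Path -> content hash (hypothetical simplification of each "tree").
type Snapshot = BTreeMap<String, String>;

#[derive(Debug, Default, PartialEq)]
struct Status {
    staged: Vec<String>,
    unstaged: Vec<String>,
    untracked: Vec<String>,
}

fn compute_status(head: &Snapshot, index: &Snapshot, work: &Snapshot) -> Status {
    let mut status = Status::default();

    // Changes to be committed: index differs from HEAD.
    for (path, hash) in index {
        match head.get(path) {
            None => status.staged.push(format!("new file: {}", path)),
            Some(h) if h != hash => status.staged.push(format!("modified: {}", path)),
            _ => {}
        }
    }
    for path in head.keys() {
        if !index.contains_key(path) {
            status.staged.push(format!("deleted: {}", path));
        }
    }

    // Changes not staged: working directory differs from index.
    for (path, hash) in index {
        match work.get(path) {
            None => status.unstaged.push(format!("deleted: {}", path)),
            Some(h) if h != hash => status.unstaged.push(format!("modified: {}", path)),
            _ => {}
        }
    }

    // Untracked: present in the working directory but not in the index.
    for path in work.keys() {
        if !index.contains_key(path) {
            status.untracked.push(path.clone());
        }
    }
    status
}

/// Small helper to build a snapshot from (path, hash) pairs.
fn snapshot(entries: &[(&str, &str)]) -> Snapshot {
    entries.iter().map(|(p, h)| (p.to_string(), h.to_string())).collect()
}

fn main() {
    let head = snapshot(&[("src/main.rs", "h1"), ("old_file.txt", "h2")]);
    let index = snapshot(&[("README.md", "r1"), ("src/main.rs", "h1b"), ("old_file.txt", "h2")]);
    let work = snapshot(&[("README.md", "r2"), ("src/main.rs", "h1b"), ("new_feature.rs", "n1")]);
    println!("{:#?}", compute_status(&head, &index, &work));
}
```

The same file can appear in two categories, as `README.md` does above: staged as new, then edited again in the working directory.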

✅ git-rs commit - Commit Creation

What it does:

Educational Insights:

Example:

git-rs commit -m "Initial implementation"
# Creates tree object, commit object, and updates branch ref

Internal Process:

  1. Load staged files from index: git-rs-index
  2. Create tree entries: {name: "README.md", mode: 100644, hash: "5ab2c3d..."}
  3. Store tree object: tree 42\0<tree-content> → 7def456ghi...
  4. Create commit object with:
    • Tree hash: 7def456ghi...
    • Parent commits (if any)
    • Author/committer signatures
    • Commit message
  5. Store commit object: commit 156\0<commit-content> → 9abc123def...
  6. Update branch reference: .git-rs/refs/heads/main → 9abc123def...
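
Steps 2 and 3 can be sketched as follows, using the tree layout that Git itself uses: each entry is `<mode> <name>\0` followed by the 20 raw hash bytes. Whether git-rs serializes trees exactly this way in educational mode is an assumption. `TreeEntry` and `encode_tree` are hypothetical names. Sorting here is by plain name, which ignores Git's special ordering rule for directories.

```rust
/// One tree entry (hypothetical struct for illustration).
struct TreeEntry {
    mode: String,
    name: String,
    hash_hex: String,
}

/// Convert a 40-character hex hash into 20 raw bytes.
fn hex_to_bytes(hex: &str) -> Option<Vec<u8>> {
    if hex.len() != 40 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}

/// Serialize entries as "tree <size>\0" followed by the entry records.
fn encode_tree(entries: &mut [TreeEntry]) -> Option<Vec<u8>> {
    // Simplification: sort by name only.
    entries.sort_by(|a, b| a.name.cmp(&b.name));

    let mut body = Vec::new();
    for entry in entries.iter() {
        body.extend_from_slice(format!("{} {}\0", entry.mode, entry.name).as_bytes());
        body.extend_from_slice(&hex_to_bytes(&entry.hash_hex)?);
    }

    let mut raw = format!("tree {}\0", body.len()).into_bytes();
    raw.extend_from_slice(&body);
    Some(raw)
}

fn main() {
    let mut entries = vec![TreeEntry {
        mode: "100644".to_string(),
        name: "README.md".to_string(),
        // Hash of the empty blob, used here only as a valid 40-character value.
        hash_hex: "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391".to_string(),
    }];
    let raw = encode_tree(&mut entries).expect("valid hash");
    println!("tree object is {} bytes", raw.len());
}
```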

Commit Object Format:

tree 7def456ghi789...
parent 1abc234def567... (if not root commit)
author John Doe <john@example.com> 1692000000 +0000
committer John Doe <john@example.com> 1692000000 +0000

Initial implementation
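
A short sketch that renders this text format. The `Signature` struct and `format_commit` function are hypothetical names. The hashes passed in `main` are the placeholders from the example above, not real object IDs.

```rust
/// Author or committer identity (hypothetical struct for illustration).
struct Signature {
    name: String,
    email: String,
    timestamp: i64,
    timezone: String,
}

impl Signature {
    /// Render as "Name <email> <unix-time> <tz>".
    fn render(&self) -> String {
        format!("{} <{}> {} {}", self.name, self.email, self.timestamp, self.timezone)
    }
}

/// Build the commit body; a root commit simply has no parent lines.
fn format_commit(
    tree: &str,
    parents: &[&str],
    author: &Signature,
    committer: &Signature,
    message: &str,
) -> String {
    let mut out = format!("tree {}\n", tree);
    for parent in parents {
        out.push_str(&format!("parent {}\n", parent));
    }
    out.push_str(&format!("author {}\n", author.render()));
    out.push_str(&format!("committer {}\n", committer.render()));
    // A blank line separates the headers from the message.
    out.push('\n');
    out.push_str(message);
    out.push('\n');
    out
}

fn main() {
    let who = Signature {
        name: "John Doe".to_string(),
        email: "john@example.com".to_string(),
        timestamp: 1692000000,
        timezone: "+0000".to_string(),
    };
    print!(
        "{}",
        format_commit("7def456ghi789...", &["1abc234def567..."], &who, &who, "Initial implementation")
    );
}
```

The stored object is this text with the usual `commit <size>\0` header in front.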

✅ git-rs diff - Content Comparison

What it does:

Educational Insights:

Example:

# Show unstaged changes
git-rs diff

# Show staged changes  
git-rs diff --cached

Output Format:

diff --git a/README.md b/README.md
index 1234567..abcdefg 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,4 @@
 # Git-RS
 
-Old line
+New line
+Added line
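
The `-` and `+` lines in this output can be produced by a line-based diff. The sketch below uses a longest-common-subsequence table, chosen for readability. It is not necessarily the algorithm git-rs uses, and it omits the file and hunk headers. `diff_lines` is a hypothetical function name.

```rust
/// Produce unified-style lines: " " for context, "-" for removed, "+" for added.
fn diff_lines(old: &[&str], new: &[&str]) -> Vec<String> {
    let (n, m) = (old.len(), new.len());

    // lcs[i][j] = length of the longest common subsequence of old[i..] and new[j..].
    let mut lcs = vec![vec![0usize; m + 1]; n + 1];
    for i in (0..n).rev() {
        for j in (0..m).rev() {
            lcs[i][j] = if old[i] == new[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk both inputs, preferring deletions before insertions on ties.
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < n && j < m {
        if old[i] == new[j] {
            out.push(format!(" {}", old[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(format!("-{}", old[i]));
            i += 1;
        } else {
            out.push(format!("+{}", new[j]));
            j += 1;
        }
    }
    for line in &old[i..] {
        out.push(format!("-{}", line));
    }
    for line in &new[j..] {
        out.push(format!("+{}", line));
    }
    out
}

fn main() {
    let old = ["# Git-RS", "", "Old line"];
    let new = ["# Git-RS", "", "New line", "Added line"];
    for line in diff_lines(&old, &new) {
        println!("{}", line);
    }
}
```

The table costs memory proportional to the product of the two line counts, which is fine for teaching but one reason production tools use other algorithms.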

✅ git-rs clone - Repository Cloning

Complete HTTP-based repository cloning with educational insights.

# Clone to directory with same name as repository
git-rs clone https://github.com/user/repo.git

# Clone to custom directory name
git-rs clone https://github.com/user/repo.git my-project

# Clone specific branch
git-rs clone --branch develop https://github.com/user/repo.git

Features:

🚧 Future Commands (Planned)

git-rs log - Commit History

Status: Not yet implemented (placeholder exists in CLI)

git-rs log           # Show all commit history
git-rs log -n 5      # Show last 5 commits
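
Since the command is not implemented yet, the following is only a sketch of one way history traversal could work: start at a commit and follow first-parent links, optionally stopping after `n` entries. The `Commit` struct, the `log` function, and the in-memory map are all hypothetical.

```rust
use std::collections::HashMap;

/// Minimal commit record (hypothetical; real commits carry more fields).
struct Commit {
    parents: Vec<String>,
    message: String,
}

/// Walk first-parent history from `start`, newest first.
fn log(commits: &HashMap<String, Commit>, start: &str, limit: Option<usize>) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = Some(start.to_string());
    while let Some(hash) = current {
        if limit.map_or(false, |n| out.len() >= n) {
            break;
        }
        let commit = match commits.get(&hash) {
            Some(commit) => commit,
            None => break, // Unknown hash: stop rather than fail.
        };
        out.push(format!("{} {}", hash, commit.message));
        current = commit.parents.first().cloned();
    }
    out
}

fn main() {
    let mut commits = HashMap::new();
    commits.insert("c1".to_string(), Commit { parents: vec![], message: "first".to_string() });
    commits.insert("c2".to_string(), Commit { parents: vec!["c1".to_string()], message: "second".to_string() });
    for line in log(&commits, "c2", Some(5)) {
        println!("{}", line);
    }
}
```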

Will implement:

🔄 Branch Operations

git-rs status
# Shows comprehensive file state analysis

🧮 Hash Calculation Deep Dive

Git uses SHA-1 content addressing for all objects:

// Object format: "<type> <size>\0<content>"
use sha1::{Digest, Sha1}; // assumes the `sha1` crate (0.10 API)

fn main() {
    let blob_content = b"Hello World";
    let header = format!("blob {}\0", blob_content.len());
    let full_content = [header.as_bytes(), &blob_content[..]].concat();
    let hash = format!("{:x}", Sha1::digest(&full_content)); // 40 hex characters
    println!("{}", hash);
}

Why this matters:

cargo test                    # Run all tests
cargo test --test integration # Integration tests only
cargo test domain::          # Domain layer tests

🚀 Usage Examples

Basic Workflow

# Initialize repository
git-rs init

# Add files to staging
git-rs add README.md src/

# Check status
git-rs status

# Create commit
git-rs commit -m "Initial implementation"

# View differences
git-rs diff
git-rs diff --cached

Educational Exploration

# Examine object database
find .git-rs/objects -type f
file .git-rs/objects/5a/b2c3d4...

# View staging area
cat .git-rs/git-rs-index | jq .

# Check references
cat .git-rs/HEAD
cat .git-rs/refs/heads/main

🎓 Learning Outcomes

After exploring this implementation, you’ll understand:

  1. Git’s Object Model: How blobs, trees, and commits form a directed acyclic graph
  2. Content Addressing: Why identical content produces identical hashes
  3. Three Trees: Working directory, index, and HEAD relationships
  4. Reference System: How branches and tags are just pointers to commits
  5. Staging Process: Why the index exists and how it enables powerful workflows
  6. File System Integration: How Git maps its abstract model to disk storage

🔍 Debugging and Introspection

Use these commands to explore git-rs internals:

# Object inspection
hexdump -C .git-rs/objects/5a/b2c3d4...

# Decompression (requires zlib tools)
zpipe -d < .git-rs/objects/5a/b2c3d4...

# Index inspection  
jq . .git-rs/git-rs-index

# Reference tracking
find .git-rs/refs -type f -exec echo {} \; -exec cat {} \;

📚 Educational Resources

Each command implementation includes:

📖 Documentation

📚 Core Documentation

🎯 Learning Paths

For Git Beginners:

  1. Start with this README for project overview
  2. Read Git Internals Explained to understand core concepts
  3. Try hands-on examples in Command Reference
  4. Explore Architecture Guide for implementation details

For Developers:

  1. Review Project Status for current state and roadmap
  2. Study Architecture Guide for system design
  3. Check API Documentation for code reference
  4. Follow contribution guidelines below

For Rust Learners:

  1. Examine Architecture Guide for Domain-Driven Design patterns
  2. Browse API Documentation for Rust idioms
  3. Look at GitHub Actions workflows in .github/workflows/ for CI/CD examples

🔍 Key Concepts You’ll Learn

🆘 Getting Help

🤝 Contributing

This is primarily an educational project, but contributions are welcome:

Development Setup

Before committing changes, ensure code quality with our formatting script:

# Run all formatting and checks
./scripts/format.sh

# Or manually run individual tools:
cargo fmt                    # Rust code formatting
markdownlint-cli2 --fix "**/*.md" "!target/**" "!node_modules/**"  # Markdown formatting
cargo clippy --all-targets --all-features -- -D warnings  # Linting
cargo test                   # Test suite

📖 References


Remember: By default, this implementation uses .git-rs/ directories to avoid conflicts with real Git repositories, making it safe to experiment with in existing projects. Only --git-compat mode writes to .git/.