Distributed File System (DFS) - A GFS-like Architecture

This project implements a distributed file system inspired by Google File System (GFS). It is structured into four labs, each building upon the previous one to achieve a scalable, fault-tolerant, and efficient distributed file system with additional support for distributed computation.

Project Overview

The project is divided into four labs:

Lab 1: Build an inode-based file system.
Lab 2: Extend the inode-based file system into a distributed file system.
Lab 3: Integrate the Raft consensus algorithm for strong consistency.
Lab 4: Implement a MapReduce-like distributed computation framework on top of the distributed file system.

Architecture

Lab 1: Inode-based File System

Implements a basic file system that uses inodes to track file metadata and the data blocks holding each file's content.
Provides foundational functionalities such as file creation, deletion, reading, and writing.
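As a rough illustration of the inode idea, the sketch below keeps a file's size and block list in an inode and stores the content in fixed-size blocks. It is written in Python purely for illustration. The names Inode, BlockStore, write_file, read_file, and the 16-byte BLOCK_SIZE are assumptions, not the lab's actual interfaces.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 16  # hypothetical block size, kept tiny for illustration


@dataclass
class Inode:
    """File metadata plus the block numbers that hold the content."""
    size: int = 0
    blocks: list = field(default_factory=list)  # block numbers, in file order


class BlockStore:
    """In-memory stand-in for a block device with a simple free list."""

    def __init__(self, num_blocks):
        self.data = [bytes(BLOCK_SIZE)] * num_blocks
        self.free = list(range(num_blocks))

    def alloc(self):
        if not self.free:
            raise OSError("no free blocks")
        return self.free.pop(0)

    def release(self, block_no):
        self.data[block_no] = bytes(BLOCK_SIZE)
        self.free.append(block_no)


def write_file(store, inode, content):
    """Replace the file content, using one block per BLOCK_SIZE bytes."""
    # Return the old blocks to the free list before allocating new ones.
    for block_no in inode.blocks:
        store.release(block_no)
    inode.blocks = []
    inode.size = 0
    for offset in range(0, len(content), BLOCK_SIZE):
        block_no = store.alloc()
        piece = content[offset:offset + BLOCK_SIZE]
        store.data[block_no] = piece.ljust(BLOCK_SIZE, b"\0")  # zero-pad
        inode.blocks.append(block_no)
    inode.size = len(content)


def read_file(store, inode):
    """Concatenate the blocks and trim the padding using the inode size."""
    raw = b"".join(store.data[block_no] for block_no in inode.blocks)
    return raw[:inode.size]
```

Deleting a file in this sketch would amount to releasing every block listed in the inode and then discarding the inode itself.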

Lab 2: Distributed File System

The distributed file system consists of three main components:

Client:
- Requests file metadata (e.g., chunk IDs) from the Metadata Server.
- Retrieves file content directly from the Chunk Servers.
Metadata Server:
- Stores file metadata, including file-to-chunk mappings.
- Uses Two-Phase Locking (2PL) to isolate concurrent metadata operations, so each one appears atomic to the others.
Chunk Servers:
- Store file chunks and ensure data consistency through versioning.
- Use Write-Ahead Logging (WAL) for crash recovery and durability.
- Trigger checkpoints to persist chunks to disk when the log reaches a threshold, preventing unbounded log growth.
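The logging and checkpointing behaviour of the chunk servers can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the project's implementation: the ChunkStore class, the JSON record format, the file names wal.log and chunks.json, and a threshold counted in log records are all hypothetical. It also assumes a crash never leaves a half-written log record.

```python
import json
import os


class ChunkStore:
    """Toy chunk store: log every write first, apply it second."""

    def __init__(self, directory, checkpoint_threshold=4):
        self.log_path = os.path.join(directory, "wal.log")
        self.snapshot_path = os.path.join(directory, "chunks.json")
        self.threshold = checkpoint_threshold  # hypothetical: max log records
        self.chunks = {}  # chunk_id (str) -> data (str)
        self.log_records = 0
        self._recover()

    def _recover(self):
        # Load the last checkpoint, then replay whatever the log still holds.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.chunks = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    record = json.loads(line)
                    self.chunks[record["chunk_id"]] = record["data"]
                    self.log_records += 1

    def write(self, chunk_id, data):
        # 1. Append the record to the log and force it to disk.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"chunk_id": chunk_id, "data": data}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.log_records += 1
        # 2. Only then apply the write to the in-memory state.
        self.chunks[chunk_id] = data
        # 3. Checkpoint once the log has reached the threshold.
        if self.log_records >= self.threshold:
            self.checkpoint()

    def checkpoint(self):
        # Write the full state to a temp file, swap it in atomically,
        # then truncate the log because the snapshot now covers it.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.chunks, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)
        open(self.log_path, "w").close()
        self.log_records = 0
```

Constructing a second ChunkStore on the same directory simulates a restart: it reloads the snapshot and replays the remaining log records.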

Lab 3: Integration of Raft Consensus Protocol

Leader Election: Ensures a single leader manages the metadata server cluster.
Log Replication: Replicates operations across the metadata server cluster for fault tolerance.
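Safety: Guarantees strong consistency and avoids split-brain scenarios by requiring a majority of servers for both elections and commits.
Checkpointing: Snapshots the current state and discards the log entries the snapshot already covers, preventing unbounded log growth.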
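To make the election and safety rules concrete, here is a minimal Python sketch of how a Raft server decides whether to grant a vote, following the rules in the Raft paper. The RaftNode class, its field names, and the wins_election helper are illustrative assumptions. The sketch omits networking, timers, and persistence.

```python
from dataclasses import dataclass, field


@dataclass
class RaftNode:
    node_id: int
    current_term: int = 0
    voted_for: object = None  # candidate id voted for in current_term
    log: list = field(default_factory=list)  # entries are (term, command)

    def last_log(self):
        """Return (last_index, last_term); index 0 means an empty log."""
        if not self.log:
            return 0, 0
        return len(self.log), self.log[-1][0]

    def handle_request_vote(self, term, candidate_id,
                            last_log_index, last_log_term):
        """Return (current_term, vote_granted) for a vote request."""
        # Reject candidates from an older term.
        if term < self.current_term:
            return self.current_term, False
        # A newer term resets any vote cast in the previous term.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # The candidate's log must be at least as up to date as ours:
        # compare the last terms first, then the last indices.
        my_index, my_term = self.last_log()
        up_to_date = (last_log_term, last_log_index) >= (my_term, my_index)
        # At most one vote per term, which prevents two leaders per term.
        if self.voted_for in (None, candidate_id) and up_to_date:
            self.voted_for = candidate_id
            return self.current_term, True
        return self.current_term, False


def wins_election(votes_granted, cluster_size):
    """A candidate needs a strict majority of the whole cluster."""
    return votes_granted > cluster_size // 2
```

Because each server votes at most once per term and a leader needs a strict majority, two candidates cannot both win the same term.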

Lab 4: Distributed Computation with MapReduce

Mapper: Processes input data and emits key-value pairs.
Reducer: Aggregates key-value pairs to produce final results.
Implements a word count task as an example of distributed computation.
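The word count task can be sketched in a few lines. The Python version below runs in a single process and only illustrates the mapper and reducer roles. The function names and the in-memory shuffle step are assumptions; the lab's framework would distribute this work across machines and read input from the file system.

```python
from collections import defaultdict


def map_word_count(text):
    """Mapper: emit (word, 1) for every whitespace-separated word."""
    return [(word.lower(), 1) for word in text.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_word_count(key, values):
    """Reducer: sum the counts emitted for one word."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of input strings."""
    pairs = []
    for doc in documents:
        pairs.extend(map_word_count(doc))
    grouped = shuffle(pairs)
    return dict(reduce_word_count(k, v) for k, v in grouped.items())
```

For example, word_count(["the quick fox", "The lazy dog the end"]) counts "the" three times and every other word once.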

Key Features

Scalability:
- Distributes file storage across multiple chunk servers.
- Supports large files by splitting them into chunks.
Fault Tolerance:
- Uses Raft for metadata server fault tolerance.
- Implements WAL and checkpointing for crash recovery.
Consistency:
- Versions each chunk so that stale or inconsistent copies can be detected.
- Uses Two-Phase Locking so that concurrent metadata operations appear atomic.
Distributed Computation:
- Provides a MapReduce-like framework for distributed data processing.
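The scalability and consistency features above can be illustrated together with a small Python sketch: a file is split into fixed-size chunks, the chunks are spread over servers, and a read is rejected when a server holds a stale version. The source does not describe the placement policy or the version protocol, so round-robin placement, the 4-byte CHUNK_SIZE, and the VersionedChunkServer class are all assumptions.

```python
CHUNK_SIZE = 4  # hypothetical chunk size in bytes, tiny for illustration


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut the file content into chunk_size pieces; the last may be short."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_chunks(num_chunks, servers):
    """Assign chunks to servers round-robin (an assumed policy)."""
    return [servers[i % len(servers)] for i in range(num_chunks)]


class VersionedChunkServer:
    """Stores each chunk together with the version it was written at."""

    def __init__(self):
        self.chunks = {}  # chunk_id -> (version, data)

    def put(self, chunk_id, version, data):
        self.chunks[chunk_id] = (version, data)

    def get(self, chunk_id, expected_version):
        """Return the data only if the stored version matches."""
        version, data = self.chunks[chunk_id]
        if version != expected_version:
            raise ValueError(
                f"chunk {chunk_id}: have version {version}, "
                f"expected {expected_version}"
            )
        return data
```

In this sketch the metadata server would hand the client the expected version along with the chunk ID, so a server that missed an update cannot serve outdated data unnoticed.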

Technologies Used

Programming Language: [Specify the language, e.g., Go, Java, etc.]
Consensus Protocol: Raft
Logging and Recovery: Write-Ahead Logging (WAL)
Distributed Computation: MapReduce-like framework
Concurrency Control: Two-Phase Locking (2PL)
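Two-Phase Locking, listed above for concurrency control, can be sketched as follows. This is a Python illustration of the strict variant, where every lock is held until the operation finishes. The TwoPhaseTransaction class, the rename helper, and the per-path lock table are hypothetical and not taken from the project.

```python
import threading
from collections import defaultdict


class TwoPhaseTransaction:
    """Acquire locks in a growing phase, release them all in one step."""

    def __init__(self, lock_table):
        self.lock_table = lock_table  # maps a name to a threading.Lock
        self.held = []
        self.shrinking = False

    def acquire(self, name):
        # Once any lock has been released, no new lock may be taken.
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot lock after releasing")
        if name in self.held:
            return
        self.lock_table[name].acquire()
        self.held.append(name)

    def release_all(self):
        # Shrinking phase: drop every lock held by this transaction.
        self.shrinking = True
        for name in reversed(self.held):
            self.lock_table[name].release()
        self.held.clear()


def rename(metadata, lock_table, src, dst):
    """Move a metadata entry from src to dst under 2PL."""
    txn = TwoPhaseTransaction(lock_table)
    try:
        # Locking names in sorted order avoids deadlock between renames.
        for name in sorted({src, dst}):
            txn.acquire(name)
        metadata[dst] = metadata.pop(src)
    finally:
        txn.release_all()
```

A lock table such as defaultdict(threading.Lock) creates a lock for each path on first use. No other transaction can lock either path until rename has released both, which is what makes the operation appear atomic.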

Good luck with this distributed file system project! 🚀

