This project is designed to retrieve data from an API, process it using Apache Kafka, and store the results in MinIO. It utilises Pandas for data manipulation and analysis. The setup employs Docker for containerisation, ensuring a consistent environment across different platforms.
The API used for this project is https://www.thecocktaildb.com/api.php
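As a rough illustration of the retrieval step, the sketch below builds a search URL and flattens the JSON response into a list of rows. It is not taken from main.py: the `/api/json/v1/1/search.php` path, the public test key `1`, the `drinks` response key, and the function names are all assumptions based on the public API page.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL using the public test key "1"; adjust if main.py differs.
BASE_URL = "https://www.thecocktaildb.com/api/json/v1/1"


def build_search_url(name: str) -> str:
    """Build the URL for a search-by-name request (hypothetical helper)."""
    query = urllib.parse.urlencode({"s": name})
    return f"{BASE_URL}/search.php?{query}"


def extract_drinks(payload: dict) -> list[dict]:
    """Return one dict per drink with None-valued fields dropped.

    The "drinks" key is assumed to be None when nothing matches,
    in which case an empty list is returned.
    """
    drinks = payload.get("drinks") or []
    return [{k: v for k, v in drink.items() if v is not None} for drink in drinks]


def fetch_drinks(name: str, timeout: float = 10.0) -> list[dict]:
    """Fetch and flatten the drinks matching `name` (performs a network call)."""
    with urllib.request.urlopen(build_search_url(name), timeout=timeout) as resp:
        return extract_drinks(json.load(resp))
```

The resulting list of dicts can be passed to `pandas.DataFrame(rows)` for the manipulation and analysis step.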
The tools used in this project are shown in the diagram below, which was created with draw.io:

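The Kafka stage of the pipeline could look something like the following minimal sketch. The README does not say which client library is used, so the kafka-python package, the topic name `cocktails`, and the broker address `localhost:9092` are assumptions chosen for illustration.

```python
import json


def encode_record(record: dict) -> bytes:
    """Serialise one record to UTF-8 JSON bytes for use as a message value."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def decode_record(raw: bytes) -> dict:
    """Inverse of encode_record, as a consumer would apply it."""
    return json.loads(raw.decode("utf-8"))


def publish(records, topic="cocktails", bootstrap_servers="localhost:9092"):
    """Send each record to a Kafka topic (hypothetical topic and broker)."""
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=encode_record,
    )
    for record in records:
        producer.send(topic, value=record)
    # Block until all buffered messages have been delivered.
    producer.flush()
    producer.close()
```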
API_retrieval/
│
├── data/
│ │
│ └── minio_data/ # Directory for MinIO data
│
├── docker-compose.yml # Docker Compose file for orchestrating services
│
├── main.py # Main script for data retrieval and processing
│
├── blob_minio.py # Module for handling MinIO interactions
│
├── run.ps1 # PowerShell script for Windows setup
│
└── run.sh # Bash script for macOS/Linux setup
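To give a feel for what a MinIO helper such as blob_minio.py might do, here is a minimal sketch that serialises rows to CSV and uploads them with the `minio` Python SDK. It is not the actual module: the function names, the bucket `cocktails`, the object name `drinks.csv`, and the default `minioadmin` credentials are assumptions.

```python
import csv
import io


def rows_to_csv_bytes(rows: list[dict]) -> bytes:
    """Serialise rows to CSV; the header is the union of keys in first-seen order."""
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue().encode("utf-8")


def upload_csv(rows, bucket="cocktails", object_name="drinks.csv",
               endpoint="localhost:9000",
               access_key="minioadmin", secret_key="minioadmin"):
    """Upload rows as a CSV object (hypothetical bucket, name, and credentials)."""
    # Imported lazily so rows_to_csv_bytes works without the minio package.
    from minio import Minio

    client = Minio(endpoint, access_key=access_key,
                   secret_key=secret_key, secure=False)
    if not client.bucket_exists(bucket):
        client.make_bucket(bucket)
    data = rows_to_csv_bytes(rows)
    client.put_object(bucket, object_name, io.BytesIO(data),
                      length=len(data), content_type="text/csv")
```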
- Docker
- Docker Compose
- Python 3.10 or higher (will be set up automatically)
- Clone this repository:
git clone https://github.com/wiljav/API2CSV.git
cd API2CSV
- Make the script executable (macOS/Linux only):
chmod +x run.sh
- Run the appropriate script for your operating system:
- On Windows:
.\run.ps1
- On macOS/Linux:
./run.sh
- Update the configuration in main.py to specify the API endpoint and MinIO bucket details.
- The main script will run automatically after executing the setup script.
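One way the configuration values mentioned above could be grouped is sketched below. The actual layout of main.py is not shown in this README, so the `Settings` class, the environment variable names, and the default values are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values main.py needs."""
    api_url: str
    minio_endpoint: str
    minio_bucket: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), with fallbacks."""
    env = os.environ if env is None else env
    return Settings(
        # Assumed default endpoint; replace with the one configured in main.py.
        api_url=env.get(
            "API_URL",
            "https://www.thecocktaildb.com/api/json/v1/1/search.php?s=margarita",
        ),
        minio_endpoint=env.get("MINIO_ENDPOINT", "localhost:9000"),
        minio_bucket=env.get("MINIO_BUCKET", "cocktails"),
    )
```

Reading from the environment keeps the values in one place and lets Docker Compose override them without editing the script.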
- Connection issues with MinIO: Ensure that MinIO is running and accessible at localhost:9000.
- Permission errors: If you encounter permission issues, check the ownership and permissions of the mounted directories.
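Both checks above can be scripted. The following is a small diagnostic sketch using only the standard library; the function names are hypothetical, and the path `data/minio_data` in the usage note is taken from the directory layout in this README.

```python
import os
import socket


def minio_reachable(host="localhost", port=9000, timeout=2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def dir_writable(path) -> bool:
    """Return True if path is an existing directory the current user can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)
```

For example, `minio_reachable()` checks the default address, and `dir_writable("data/minio_data")` checks the mounted data directory. A successful TCP connection only shows that something is listening on the port, not that it is MinIO.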
Feel free to open issues or submit pull requests for any enhancements or bug fixes.
This project is licensed under the MIT License - see the LICENSE file for details.