Dee-Bajaj/meter-ingestion-service

meter-ingestion-service

Backend service to ingest NEM12 CSV meter readings at scale.

What is the rationale for the technologies you have decided to use?

  • Kotlin + Spring Boot: I chose Kotlin for its concise syntax and modern features, and Spring Boot for its wide adoption in building REST APIs. This is also the stack I used on my last project, so it is the one I was most comfortable with.
  • Flyway: To manage database schema changes in a reliable and version-controlled way.
  • Docker Compose: To spin things up locally without fuss.
  • JPA (Hibernate): For seamless ORM support and easier data persistence.
  • JUnit5 + Mockito: For unit and integration testing.

What would you have done differently if you had more time?

  • Add more robust validation and schema-based parsing.
  • Add a UI or status endpoint for feedback.
  • Consider using a queue such as Kafka to handle large files. Parsed rows could be pushed to Kafka instead of being held in memory, so the queue acts as a buffer and ingestion happens asynchronously, separate from the parsing logic. This could improve API response time.
  • Consider automating the file upload with Airflow jobs (or something similar). An Airflow DAG, when triggered, could call the upload endpoint at a scheduled interval (for example, the next due date for readings).
  • Add Datadog for monitoring and alerting.
  • Add API documentation with Swagger.
  • I did not add security; I would definitely add it in a production-grade setup.
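
The queue idea above can be sketched without Kafka. The example below is only an illustration: it uses an in-process `ArrayBlockingQueue` as a stand-in for a Kafka topic, and the names `QueueItem`, `Row`, `EndOfFile`, and `ingestViaQueue` are hypothetical and not part of this service. It shows the parser thread pushing rows into a bounded buffer while a separate consumer thread persists them, so only `capacity` rows are held in memory at once.

```kotlin
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.BlockingQueue
import kotlin.concurrent.thread

// Hypothetical message types; a real Kafka setup would serialize rows instead.
sealed class QueueItem
data class Row(val line: String) : QueueItem()
object EndOfFile : QueueItem()

// Pushes parsed lines into a bounded queue while a consumer thread persists them.
// Returns the number of rows handed to `persist`.
fun ingestViaQueue(
    lines: Sequence<String>,
    capacity: Int = 100,
    persist: (String) -> Unit
): Int {
    val queue: BlockingQueue<QueueItem> = ArrayBlockingQueue(capacity)
    var persisted = 0

    // Consumer: drains the queue until it sees the end-of-file marker.
    val consumer = thread {
        while (true) {
            when (val item = queue.take()) {
                is Row -> {
                    persist(item.line)
                    persisted++
                }
                EndOfFile -> break
            }
        }
    }

    // Producer: put() blocks when the buffer is full, which bounds memory use.
    lines.forEach { queue.put(Row(it)) }
    queue.put(EndOfFile)

    // Wait for the consumer so the count is final before returning.
    consumer.join()
    return persisted
}
```

With a real broker the producer would return as soon as the rows are published, which is where the response-time gain would come from; this sketch waits for the consumer only so that the result is easy to inspect.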

What is the rationale for the design choices that you have made?

  • Modular design - I have tried to keep the design modular, with modules such as the parser and validator independent of each other, which makes reuse and testing easier.
  • Testing - I have tried to keep test cases as simple to read as possible so that they can be used as dev documentation as well.
  • Validation & Error Reporting - File structure issues cause the whole file to be rejected; in other cases the rejection is partial, and an error report is returned to make retries easy.
  • Batch upsert - To avoid overwhelming the DB and to reduce memory pressure, I added chunked saveAll logic that inserts records in batches.
  • Re-uploading - Rows that were already inserted are not inserted again when a file is re-uploaded.
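
As a rough illustration of the batch upsert and re-upload points, here is a minimal sketch. It is not the service's actual code: `MeterReading`, `InMemoryReadingStore`, `saveAllIfAbsent`, and `saveInChunks` are hypothetical names, the in-memory map stands in for a JPA repository, and treating `(nmi, timestamp)` as the unique key is an assumption about how duplicates are detected.

```kotlin
import java.time.LocalDateTime

// Hypothetical row type; the real entity in the service may differ.
data class MeterReading(val nmi: String, val timestamp: LocalDateTime, val consumption: Double)

// In-memory stand-in for a repository with a unique key on (nmi, timestamp).
class InMemoryReadingStore {
    private val rows = LinkedHashMap<Pair<String, LocalDateTime>, MeterReading>()

    // Counts how many batches were written, to make the chunking visible.
    var batchCalls = 0
        private set

    // Mimics a saveAll that skips rows whose key already exists.
    // Returns the number of rows actually inserted.
    fun saveAllIfAbsent(batch: List<MeterReading>): Int {
        batchCalls++
        var inserted = 0
        for (reading in batch) {
            if (rows.putIfAbsent(reading.nmi to reading.timestamp, reading) == null) inserted++
        }
        return inserted
    }

    fun size(): Int = rows.size
}

// Hands the store one fixed-size chunk at a time instead of the whole file.
fun saveInChunks(
    readings: Sequence<MeterReading>,
    store: InMemoryReadingStore,
    chunkSize: Int = 500
): Int = readings.chunked(chunkSize).sumOf { store.saveAllIfAbsent(it) }
```

Because the input is a `Sequence`, only one chunk is materialized at a time, and uploading the same readings a second time inserts nothing because every key is already present.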
