
Node.js CSV Ingest to MySQL - Billion-Row Use Case

Shahid Shaikh · Oct 7, 2017

Use case: I have a table storing file information such as path, size, etc. I need to read this table and grab the file paths, then read those CSV files in parallel and ingest them into MySQL in parallel.
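To make that first step concrete, here is a rough sketch of the table read. The mysql2 driver and the `files` table/column names are placeholders I am using for illustration, not my actual schema:

```js
const mysql = require('mysql2/promise');

// Hypothetical schema: a `files` table with `id`, `path`, and `status` columns.
async function getPendingFilePaths() {
  const pool = mysql.createPool({ host: 'localhost', user: 'app', database: 'ingest' });
  const [rows] = await pool.query(
    "SELECT id, path FROM files WHERE status = 'pending'"
  );
  await pool.end();
  return rows; // e.g. [{ id: 1, path: '/data/part-001.csv' }, ...]
}
```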

My design:

Here is what I have come up with:

  • A cron job reads the file-information table every 30 minutes.
  • For each file path it returns, open a read stream in parallel.
  • Push each parsed row, i.e. the file content, as a message into a message queue, say RabbitMQ (see the producer sketch after this list).
  • Attach multiple listeners, say 4, at the other end of the queue, each fetching 100 messages at a time, i.e. 400 messages in flight overall.
  • Perform the MySQL insertions in parallel and update the tables accordingly (see the consumer sketch below).
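For the producer side (the cron, stream, and publish steps), something like the sketch below is what I have in mind. node-cron, csv-parse, and amqplib are my library guesses, and the queue name is a placeholder:

```js
const fs = require('fs');
const cron = require('node-cron');
const amqp = require('amqplib');
const { parse } = require('csv-parse');

const QUEUE = 'csv_rows'; // queue name is a placeholder

// Stream one CSV and publish each parsed row as a queue message.
async function publishFile(channel, filePath) {
  const parser = fs.createReadStream(filePath).pipe(parse({ columns: true }));
  for await (const record of parser) {
    // persistent: true so the broker writes messages to disk.
    channel.sendToQueue(QUEUE, Buffer.from(JSON.stringify(record)), { persistent: true });
  }
}

// Stub: in the real pipeline this is the file-information-table query sketched above.
async function loadFilePaths() {
  return ['/data/part-001.csv', '/data/part-002.csv'];
}

// Every 30 minutes, pick up the pending files and stream them in parallel.
cron.schedule('*/30 * * * *', async () => {
  const conn = await amqp.connect('amqp://localhost');
  const channel = await conn.createChannel();
  await channel.assertQueue(QUEUE, { durable: true });
  const paths = await loadFilePaths();
  await Promise.all(paths.map((p) => publishFile(channel, p)));
  await conn.close();
});
```

At a billion rows I would probably also need to respect the boolean return value of `sendToQueue` for backpressure instead of publishing unconditionally.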

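And the consumer side (the listeners and the batched inserts) could look roughly like this; again, the library choices and the table/column names are only assumptions:

```js
const amqp = require('amqplib');
const mysql = require('mysql2/promise');

const QUEUE = 'csv_rows'; // same placeholder queue as the producer sketch
const BATCH_SIZE = 100;

async function startConsumer() {
  const pool = mysql.createPool({ host: 'localhost', user: 'app', database: 'ingest' });
  const conn = await amqp.connect('amqp://localhost');
  const channel = await conn.createChannel();
  await channel.assertQueue(QUEUE, { durable: true });
  await channel.prefetch(BATCH_SIZE); // at most 100 unacked messages per listener

  let batch = [];
  await channel.consume(QUEUE, async (msg) => {
    batch.push(msg);
    if (batch.length < BATCH_SIZE) return;

    const pending = batch;
    batch = [];
    // Column names are placeholders; map each message to a value tuple.
    const values = pending.map((m) => {
      const row = JSON.parse(m.content.toString());
      return [row.col1, row.col2];
    });
    // One multi-row INSERT instead of 100 single-row round trips.
    await pool.query('INSERT INTO target_table (col1, col2) VALUES ?', [values]);
    pending.forEach((m) => channel.ack(m));
  });
}

// Run 4 of these processes for the 4 listeners.
startConsumer().catch(console.error);
```

A real version would also flush a partial final batch (e.g. on a timer) and nack the batch if the INSERT fails; otherwise those messages stay unacknowledged forever.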
I need your suggestions and inputs; please correct me if I am doing this wrong!

Thanks in advance.
