Ticket: Question about optimizing DataGrab speed when importing CSVs

Status: Resolved
Add-on / Version: DataGrab 4.0.4
Severity:
EE Version: 6.3.4

Jason Roxz

Jul 12, 2022

Greetings friends.

Question: Is there a way to speed up DataGrab to perform more like a direct SQL write?

- Yesterday I ran my first CSV import of 7966 entries (the first entries ever added to this fresh EE install)
- Brand new EE install with only Pro and DataGrab installed
- Brand new dedicated server with 0 load
- Each entry with 20 plain text fields (including title, url_title)
- All fields under 50 characters and most with 3 or fewer characters
- Tested with a 200 entry batch size

Log excerpt:
===============
...
22:06:25 07/11/2022 Added 200 entries
22:06:25 07/11/2022 Clearing all cache
22:06:25 07/11/2022 Begin Importing [u7Nk3e]
...
22:09:53 07/11/2022 Added 200 entries
22:09:53 07/11/2022 Clearing all cache
22:09:53 07/11/2022 Begin Importing [bR4cp9]
...
22:10:22 07/11/2022 Added 200 entries
22:10:22 07/11/2022 Clearing all cache
22:10:22 07/11/2022 Begin Importing [y2KY6g]
...
22:13:33 07/11/2022 Added 200 entries
22:13:33 07/11/2022 Clearing all cache
22:13:33 07/11/2022 Begin Importing [D2f4pK]
...
22:16:48 07/11/2022 Added 200 entries
22:16:48 07/11/2022 Clearing all cache
22:16:49 07/11/2022 Begin Importing [wRt70b]
...
===============

In all, the import took nearly 12 minutes – far too long for my live data use case.

I need near-instant deletes/overwrites/updates for these 7500 to 8500 entries every 2 minutes to provide live data on my site, just like a regular SQL write.
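
To put the gap in numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (7966 entries, batches of 200, a 2-minute refresh window); treating "nearly 12 minutes" as exactly 720 seconds is an assumption made for round numbers, so the results are approximate.

```python
import math

# Figures taken from the post above.
entries = 7966            # entries in the CSV import
batch_size = 200          # DataGrab batch size used in the test
target_seconds = 2 * 60   # desired refresh interval (every 2 minutes)

# Assumption: "nearly 12 minutes" is rounded to exactly 12 minutes.
import_seconds = 12 * 60

batches = math.ceil(entries / batch_size)    # number of batches needed
observed_rate = entries / import_seconds     # entries per second achieved
required_rate = entries / target_seconds     # entries per second needed
speedup = import_seconds / target_seconds    # how much faster it must run

print(f"batches:       {batches}")
print(f"observed rate: {observed_rate:.1f} entries/s")
print(f"required rate: {required_rate:.1f} entries/s")
print(f"speedup:       {speedup:.0f}x")
```

Under those assumptions the import would need to run roughly six times faster just to finish inside the 2-minute window, before leaving any headroom.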

Am I spinning my wheels trying to make DataGrab do something it can’t? 
Should I be working out how to make EE work with an external database instead?

Thanks.

#1

BoldMinded (Brian)

Hi, Jason. To answer your question: no, there is no way to make DataGrab do direct SQL writes, and honestly I would be suspicious of any third-party module that tried to work this way. Bypassing the provided ORM/models would make such an add-on unstable and difficult to support.

https://docs.boldminded.com/datagrab/faqs#is-datagrab-4-is-slower-than-previous-versions

If you need near-instant database updates for 7000 entries every 2 minutes, I suggest looking into a more robust pub/sub delivery system such as RabbitMQ or Amazon SQS. I’m not even sure I’d trust raw SQL queries run in batches of 200 every 2 minutes for that kind of workload.
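
For readers unfamiliar with the pub/sub idea mentioned above, the following is a minimal in-process sketch in Python. It is not RabbitMQ or Amazon SQS code: the standard-library `queue.Queue` stands in for the broker, a plain dict stands in for the site's data store, and the function names and record fields (`publisher`, `subscriber`, `url_title`, `value`) are hypothetical. It only illustrates the general shape, where a producer publishes individual record changes and a consumer applies them as they arrive instead of re-importing a whole CSV.

```python
import queue
import threading


def publisher(updates, channel):
    """Publish each changed record to the channel, then signal completion."""
    for record in updates:
        channel.put(record)
    channel.put(None)  # sentinel: no more messages


def subscriber(channel, store):
    """Consume records as they arrive and apply them to the store."""
    while True:
        record = channel.get()
        if record is None:
            break
        # Upsert keyed on url_title; a later message overwrites an earlier one.
        store[record["url_title"]] = record


def run_demo():
    channel = queue.Queue()  # stand-in for a real message broker
    store = {}               # stand-in for the site's data store
    updates = [
        {"url_title": "entry-1", "title": "Entry 1", "value": "10"},
        {"url_title": "entry-2", "title": "Entry 2", "value": "20"},
        {"url_title": "entry-1", "title": "Entry 1", "value": "11"},
    ]
    worker = threading.Thread(target=subscriber, args=(channel, store))
    worker.start()
    publisher(updates, channel)
    worker.join()
    return store


if __name__ == "__main__":
    print(run_demo())
```

A real deployment would replace the queue with a broker client and the dict with whatever store the site reads from; those choices are outside what this thread covers.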

#2

Jason Roxz

Ok, rats. Thank you for the thorough reply.
