Ticket: Datagrab missing important CartThrob fields on import of orders

Status: Resolved
Add-on / Version: DataGrab 4.2.1
Severity:
EE Version: 7.2.0

Niki Wolf

Dec 16, 2022

It seems that DataGrab is leaving out several columns of data (which live in the exp_cartthrob_order_items table) that CartThrob needs for an order entry: price_plus_tax, site_id, entry_date, discount & item_number.

I honestly don’t know if that was an issue in the old DataGrab or not, as I hadn’t needed to do this type of import before. Only six fields are offered for mapping in that section, none of which are the missing fields, and two of them aren’t even in our actual ‘Store - Orders’ entries - those are ‘modifier’ and ‘size’. I left those blank since we do not use them (if they even still exist in CT), but it still tried to add them to the order-item line of the order entry.

The last two fields may be unique to our build, and as such may be stored in the ‘extra’ column in the exp_cartthrob_order_items table as some sort of string.

The primary issue here is that CT requires that those first-mentioned fields not be null, so we end up with SQL/PHP errors when trying to save the entry. I’m not sure what other unforeseen issues those missing fields could cause.

Additionally, it seems only a single line item is being imported per order entry. For example, several orders have more than one distinct item, but the import only picks up the _last_ one it sees.

The staging server has an import set up for testing purposes named ‘orders with items’; it has already pulled in all of the latest orders as of 12/13, but I am sure there have been more since that date. I can extract them from the production server and create a new CSV if necessary.

Thanks.

#1

BoldMinded (Brian)

Niki, I haven’t logged in yet but can you do me a favor before I do? Clone the existing import and configure it to use a smaller import file with only 2 or 3 entries to replicate the issue.

TBH I’m not really understanding what the Order Items field is and what it’s supposed to do. It looks like it does not have any configuration options in EE, and DataGrab is hard-coded to look for the title, quantity, price, size, and modifier values, so I don’t know where item_number, discount, etc. are coming from in your first screenshot.

It might be worth sharing this ticket with the CT support team (e.g. Eric) so maybe he can explain what is happening.

#2

Niki Wolf

I managed to hack the datagrab_cartthrob_order_items.php file to get the fields we need, with the exception of the ‘extra’ column - while it did pull in data, it wasn’t what was in the CSV’s cell for that column; it was more than what was there. But I can probably get around that with some manual legwork, copy/pasting the values from prod > staging for that column.

Basically, I took the “size” and “modifiers” parts, since I don’t need those, and replaced them with the fields that I need pulled. Since I technically only need this before we move the staging server to become prod, I am good with this workaround, and will put the file back to normal after I’m done with the import.
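
In rough sketch form, the change amounted to swapping the hard-coded keys - this is just an illustration from memory, not the actual code in the file, and the array shape here is my own simplification:

// Illustrative sketch only - not the real DataGrab source.
// Before: the fieldtype importer mapped these hard-coded keys:
$columns = array('title', 'quantity', 'price', 'size', 'modifier');

// After: 'size' and 'modifier' swapped for the columns CT requires:
$columns = array('title', 'quantity', 'price', 'price_plus_tax',
    'site_id', 'entry_date', 'discount', 'item_number');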

However, that still leaves the matter of the additional items being left out when there is more than one “row” of items on an order.

So, related to your request for a smaller file & cloned import: I’ll look for new orders placed since I pulled the file I am working with now, and target ones with multiple items per order, plus a few without.

Oddly, in my local testing there was ONE order that made a feeble attempt to pull in a second item, but all it managed to get, data-wise, was the row_order, the order entry number it was associated with, and its title. It left out literally everything else (price, quantity, etc.). I don’t know why that one seemed to make the attempt when none of the others with multiple items did.

Anyway, I’ll update again once I’ve got the new CSV and cloned import ready for you.

#3

Niki Wolf

Ok, there’s a new import there named ‘orders with items [Clone]’, and it has 22 items (23 lines incl. header). There are at least five orders that have multiple items - orders #2048, 2049, 2050, 2054 & 2056. The rest are single-item orders.

Thanks!

#4

Niki Wolf

Of course, I realize now why it is probably doing the single item per order: I told it to check ‘Title’ for duplicate entries, and the rows where there is more than one item ordered share the same titles… so it’s overwriting the items in the order based on that. I’m not entirely sure how this would be overcome. It’s not a typical duplicate-title matter where we would need a new URL title; we just need the extra item edited into the existing entry without deleting the ‘order item’ from the first round. Hmmm…
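
In effect it seems to be doing something like this (my own illustration of the behavior, not DataGrab’s actual code):

$entries = array();
foreach ($csv_rows as $row) {
    // Keyed by title, so a second item row with the same title
    // replaces the first instead of being added alongside it.
    $entries[$row['title']] = $row;
}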

#5

Niki Wolf

I wonder if the ‘row_order’ column (column “LO” if the CSV is opened in a spreadsheet viewer like Excel or Numbers, otherwise I think it is column 327 in my CSV) could be used for that?
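
Just sketching the idea (the key names here are hypothetical, not something DG supports today):

// Group rows by their order entry, keyed by row_order, so each
// order keeps all of its item rows instead of overwriting them.
$orders = array();
foreach ($csv_rows as $row) {
    $orders[$row['order_entry_id']][$row['row_order']] = $row;
}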

#6

BoldMinded (Brian)

I’d still like to have Eric consult on this ticket if possible since I’m not really aware of what the Order Items field is supposed to do or how it’s supposed to work. I suspect that the code in DataGrab is outdated and CT possibly changed/updated/enhanced how this field works.

#7

Niki Wolf

I asked Eric to weigh in. It sounds like I would need to use XML instead of CSV so that the multi-level data could be addressed in the proper way… however, there is an added issue: there is yet another channel that CT uses in conjunction with orders - Purchased Items - whose entries are also associated with the EE entry IDs of the orders, and as far as I am aware DG is only capable of applying data to one channel at a time. So for keeping the entry IDs in sync between channels, it’s looking like DG isn’t built for that, which is fine.
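
Something like this shape, if I understand it right - the element names are just my guess at an example, not DG’s actual expected schema:

<orders>
  <order>
    <order_id>2048</order_id>
    <items>
      <item>
        <title>Widget A</title>
        <quantity>1</quantity>
        <price>9.99</price>
      </item>
      <item>
        <title>Widget B</title>
        <quantity>2</quantity>
        <price>4.99</price>
      </item>
    </items>
  </order>
</orders>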

So we are going to try a different tactic with the upgrade and minimize what we need to import to just the products, which is really the primary focus in the end. I’ll run some tests on those to make sure we’re all squared away there.

Thanks for your time on this!

#8

Niki Wolf

Ok, I did some more testing this week, focusing on the big imports that are critical to the client’s business. They work, but with a twist that I don’t understand.

The first CSV has 3000 rows. Row 1 is the column headers, and then there are 2999 items to import.

At first I ran this command:

php system/ee/eecli.php import:run --id=2 --pass_key=1234 --limit=1500 --offset=0 --batch=no

But oddly, it would skip seven entries.

I changed the limit to 1507, and it gets them properly. I set the next half of the run to the same limit of 1507 (just in case), with an offset of 1501, and it gets the rest. The end result is a log in which, after removing the DG ‘begin importing’ and ‘hook’ lines, as well as the opening and closing ‘complete’ lines, I count 2999 entries.

I just wanted to pass this along before I close the ticket, in case it is something you want to address. I am not sure if it is the header row that messes it up initially - it seems to treat it as an entry to import - and I get this at the start of that log:

20:52:22 12/28/2022 Begin Importing [U_WebProductName]
20:52:22 12/28/2022 [U_WebProductName] is a new entry
20:52:22 12/28/2022 Import error with entry U_WebProductName with the status field: Invalid Selection
20:52:22 12/28/2022 Import error with entry U_WebProductName with the field_id_74 field: This field must contain only numbers.
20:52:22 12/28/2022 Import error with entry U_WebProductName with the field_id_76 field: Invalid Selection
20:52:22 12/28/2022 Import error with entry U_WebProductName with the field_id_119 field: Invalid Selection
20:52:22 12/28/2022 Import error with entry U_WebProductName with the field_id_168 field: Invalid Selection

That is precisely seven lines, which equals the seven skipped rows of items - whether that is a coincidence or the reason for what I am seeing, I am not sure. The second CSV I tested today had 1264 rows, none of which was a header row, and it imported all 1264 perfectly fine.

I think there is perhaps a setting on the CSV imports that is being ignored: the ‘skip’ option. I do not see a way to edit it, and on both of the imports it is set to ‘1’. I presume it is intended as some ‘autodetect’ feature that skips the first row of an import, but it is not doing so - nor would I want it to on the second, smaller CSV, since that one has no header row.

Meanwhile, my next test after I upgrade EE to its latest version will be setting the offset to 1 on the first command, to see if I can get the limit to stop precisely where I want it to - in this case 1499 (skip 1 header row, import 1499 rows), along the lines sketched below.
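
For the record, the plan is two runs shaped like this - same flags as the command above, with the second offset/limit chosen so that 1499 + 1500 covers all 2999 items (and as the follow-up below shows, the numbers still needed tweaking):

php system/ee/eecli.php import:run --id=2 --pass_key=1234 --limit=1499 --offset=1 --batch=no
php system/ee/eecli.php import:run --id=2 --pass_key=1234 --limit=1500 --offset=1500 --batch=no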

#9

Niki Wolf

A quick follow-up.

Using the offset of 1 does help skip the header row on the first run. However, it is still skipping entries somehow, and only ends up updating 1493 entries instead of 1499 - so it is missing six. When I check the log, the last line where it stopped importing is line 1494 (that is, 1493 entries imported after the offset of 1 that removes the header row).

Next I set a limit of 1506 to see what line it stops at. It stops at 1500, which is the target row.

It’s a strange bug but I can work with it by tweaking the offset and limit numbers for the time being.

#10

BoldMinded (Brian)

I think in this case you might have to tweak the import numbers, because if you’re running the import 3 different times and it’s consistently skipping 6 entries, then that skip value of 1 would account for why those are skipped. The 2nd and 3rd times the import is run, it doesn’t know what was previously run, so it can’t automatically know not to skip the first row; it defaults to skipping the first row because it assumes it’s the header row each time you run it.

#11

BoldMinded (Brian)

Comment has been marked private.
