Well, my question is more theoretical, as I do not have code to share for troubleshooting.
To explain a bit: we have set up an ETL to pull all the data we have in JIRA (about 60k rows).
The problem is that when we run the ETL day after day there are large differences; last time it was about 10k records. This mostly affects the dates derived from the status history (in-development date, under-consideration date, ready-to-ship date, etc.).
Sometimes for some issues we get the correct dates, the next day they are "null", and the day after they are correct again, but we are never able to capture the whole picture.
Is it possibly because we make around 60k API calls, one per issue, to fetch each issue's history? Or is the problem somewhere else?
One more question: can we make an export of 60k rows at once? If yes, how? If not, what is the alternative?
Thanks for your input!
My guess would be that your ETL is not stable: either it, or your network, is not completing all of the calls being made. 60,000 calls in a single run is a massive load, and you are simply going to get errors when you hammer a system like that. There's no question that making that many calls to get the data is a poor design. If you're doing a once-a-day task, you should be making a small number of batched calls a day, not 60,000.
I would also ask why you are doing this. What is the question you are trying to answer with this ETL?
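For what it's worth, Jira's search endpoint can return the full changelog for a whole page of issues in one call via `expand=changelog`, so 60k issues come down in a few hundred paginated requests instead of 60,000 per-issue calls. Here is a minimal sketch, assuming the REST v2 `/rest/api/2/search` endpoint; the base URL is a placeholder and authentication is omitted:

```python
import json
import urllib.parse
import urllib.request

# Placeholder base URL -- replace with your own Jira instance.
JIRA_URL = "https://your-company.atlassian.net"

def fetch_page(jql, start_at, page_size=100):
    """Fetch one page of issues, with the status history included
    via expand=changelog -- one HTTP call per page, not per issue."""
    params = urllib.parse.urlencode({
        "jql": jql,
        "startAt": start_at,
        "maxResults": page_size,
        "expand": "changelog",
    })
    req = urllib.request.Request(f"{JIRA_URL}/rest/api/2/search?{params}")
    with urllib.request.urlopen(req) as resp:  # add auth headers as needed
        return json.load(resp)

def all_issues(jql, page_size=100, fetch=fetch_page):
    """Walk startAt/maxResults pages until the reported total is exhausted."""
    start_at = 0
    while True:
        page = fetch(jql, start_at, page_size)
        yield from page["issues"]
        start_at += len(page["issues"])
        if start_at >= page["total"] or not page["issues"]:
            break
```

There is no single-request export of 60k rows; pagination like this (or a proper backup/export from the admin side) is the usual alternative. Note that Jira caps `maxResults` per request, so the loop is what actually gets you the whole set.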