BI tool such as Talend compared to JIRA import utility for data import ?

can we use BI tool such as Talend for data import instead of JIRA import utility(CSV import)

we are migrating data from clearDDTS to JIRA, where data exported from DDTS is in the form of CSV file, and the volume of records is 1 lacs apx , so is it recomended to use CSV import utlity or using any BI tool such as Talend is preferable ? please advice..

6 answers

2 votes

Yes, sorry, there's quite a lot there!

1. You need to replicate what Jira would do if you were using the importer. For each issue you raise, there is a unique ID in the database. Jira has a table saying "next numer is xxxxxx" and for each issue, it grabs a number and increments the counter so the next issue has a new number.

So... if your counter is currently (random number here for examples only) 1024, and you've got 100 issues to import, you need to number the issues 1025, 1026, 1027... 1124, and then set the counter to 1125 (or higher)

As well as the issue counter, you also need to increase the project Key counter. If you are importing these 100 issues into project ABC and your current issue list goes up to ABC-42, then you need to change the project key counter to ABC-142 to make sure the next issue created is higher than your imported issue numbers. Again, you'll need to number your issues based on the highest existing issue key too.

2. Possibly. Each issue has TWO keys as I've hinted above. One key is virtually invisible to the user and is simply a number, stored in the "ID" column in Jiraissues and it does not change. It's also the key you need for reading other issue related tables. The other is the more human project key of format <project-sequence> and that can change if you move an issue from one project to another.

3. Probably. If you are ONLY inserting really simple issue data, then you only need to touch jiraissue, the sequence table, and the os_workflow tables. If you want to import component, version, watcher or voter data on an issue, then you need to add lines to nodeassociation. If you want to import custom field data, then you need to add lines to customfieldvalue. Each table you touch requires you to increment counters in the sequence table.

So, in short, yes. If you want to insert Jira data into the database with an ETL or even raw SQL, you MUST insert data into a LOT of tables. It's absolutelycritical that you get every table right, and update the sequence data appropriately too.

Or the really short version - forget it. Use the importers. Please, please, please, use the importers.

Thanks a lot Nic , now i have much better understanding,

However just need coulple of clarification based on your explanation,i.e

I checked the schema of my Test set up but could not find any table called "sequence" ,

I guess when you are saying "Jira has a table saying "next numer is xxxxxx" ,

You are referring to "sequence table" ??

Also i could not find any table having cloumn "Project couneter" ?/

Regards,

Satyakam

I think Nic is referring to the table SEQUENCE_VALUE_ITEM.

but "SEQUENCE_VALUE_ITEM" table is not linked to any other table ,not even to jiraissue table ?>

And as per Nic, what understood is the sequence table should have entry like "next numer is xxxxxx" ..

Yes, it is not linked. You have to go through it by names e.g. for the sequence used in Table CWD_USER look for SEQ_NAME = User

Thanks Udo - I can never remember the name of that table, just that it's got sequence in the name.

You need to understand the counters/sequences in that table in great detail, because you will make a right mess if you get even one of them wrong.

As I said before, PLEASE, use the importers, not SQL

Nic, It's for sure, we are not going with ETL any more :),

But just for my understanding with Schema,

Is "Sequence_name=Action" is used for jira-Issue ID calculation ?

so, when you are saying "Jira has a table saying "next numer is xxxxxx" ,

in that case you are referring to "SEQUENCE_VALUE_ITEM" table , right ??

Ahh, good.

No, "sequence_name = action" is probably for the ID of the next comment. I can't remember which entry is for issue id. But yes, that's the table that keeps the counters in it (there's quite a lot of counters - fields, custom field values, comments, workflows, schemes, etc).

Seq_name ,seq_id
"OSWorkflowEntry",40600
"OAuthConsumer",10100
"ListenerConfig",10100
"OSCurrentStep",40600
"OSUser",10100
"OSGroup",10100
"OSMembership",10100
"Issue",40600
"CustomFieldValue",28800
"NotificationScheme",10100
"FieldLayoutItem",10300
"PortalPage",10100
"PortletConfiguration",10100
"GadgetUserPreference",10100
"FieldLayoutScheme",10100
"FieldLayoutSchemeEntity",10100
"FieldScreenScheme",10100
"FieldScreenSchemeItem",10200
"FieldScreen",10100
"IssueTypeScreenScheme",10100
"ProjectRole",10100
"IssueTypeScreenSchemeEntity",10200
"Notification",10200
"SharePermissions",10100
"Group",10010
"ExternalEntity",10
"UpgradeHistory",10300
"Avatar",10200
"SchemePermissions",10300
"UpgradeVersionHistory",10100
"IssueLinkType",10300
"IssueLink",10100
"Label",10100
"OSPropertyEntry",11300
"ServiceConfig",10800
"IssueSecurityScheme",10100
"FieldConfigSchemeIssueType",10600
"OptionConfiguration",10300
"FieldConfiguration",10600
"FieldConfigScheme",10600
"GenericConfiguration",10200
"FieldScreenTab",10200
"Project",10300
"ProjectRoleActor",10400
"ConfigurationContext",10600
"FileAttachment",10200
"ChangeGroup",10400
"ChangeItem",10400
"PluginVersion",10400
"ApplicationUser",10200
"User",10110
"UserAttribute",310
"Membership",10110
"Component",10100
"Workflow",10100
"WorkflowScheme",10100
"Action",10100
"WorkflowSchemeEntity",10100
"UserHistoryItem",10500
"FieldScreenLayoutItem",10600
"DraftWorkflow",10100
"CustomField",10500
"FieldLayout",10200

Hi Nic, this is details list of "SEQUENCE_VALUE_ITEM" table, r

Rquest you if u can identify which one corresponds to issue_ID & which one is for "Project issue counter "?

I would go for "Issue"

"Issue" it is. You can find the mapping of a table and entity name (which is used in the SEQUENCE_VALUE_ITEM table) in the entitymodel.xml under atlassian-jira/WEB-INF/classes/entitydefs folder.

Yes, issue for the issue counter.

The project counters are in the project table though.

This is exactly why you should use the importers - you don't need these details if you use them, and they're going to do a better, faster, safer job than you having to work all of this stuff out for yourself.

I recommend the csv importer, because using a ETL-tool has a couple of challenges.

  • you need to know the database scheme of your target in detail
  • you need to know how IDs in your target tables are generated (I guess it is handled by the application and not by the database)

All this is already handled by the csv importer, and I have not heard of a limit (even if I don't know what "1 lacs apx" means).

Cheers,

Udo

one additional point, when i have huge number of records to import, do u feel using ETL is error free/quicker ?

Thanks Brand,

(1 lac= 100000), hwever can u please give little explanatin on your second point ?

satya

You're welcome Satya.

The problem is when you insert values in a table you need to set the ID. I have not seen an after insert trigger which does that, so my guess is the generation of the ids are handled by JIRA itself. So you can't simply call a sequence while inserting records in a table. JIRA might still want to use Ids that you have inserted already.

Udo

so u r saying importing through ETL tool is not possible, because of above issue ??

I'm not saying it is impossible but difficult since I don't know how the IDs are generated and handled.

Thanks Brand for ur help,

can u please answer my previous question whether using ETL is having adventage like Error free/Time effective comapred to JIRA import utility ??

Since I have not used an ETL tool (for importing issues into JIRA), I can't tell you if there are any advantages.

1 votes

I would avoid an ETL, because you really do need to know the database in a LOT of detail to be able to import data.

To run you through a single issue import in a generic way, you will need to:

  • Stop Jira and get a full backup. (NEVER never never write to an active Jira database, and if you insist on writing to a shut-down one ALWAYS get a backup first)
  • Import the core data into jiraissues (reporter, assignee, summary, key, project etc)
  • Update the project issue counter to make sure the next issue to be created falls after the new one
  • Increment the issue-id counter in the sequence table (apologies, forgotton the exact name of this table)
  • For each custom field, look up names, match the id and insert one or many lines into customfieldvalue
  • Increment the ID counter for customfieldvalue in the sequence table
  • For versions, watchers, components, voters, insert lines into nodeassociation
  • insert records into the OS_workflow tables to get the workflow to be valid for the issue
  • Restart Jira and immediately re-index it.

Note that for an absolutely minimal import you have to hit four tables, and not just write, you need to lookup several pieces of data dynamically so you can get your data consistent.

I'm pretty sure I've forgotton quite a lot of other things you need to do as well - there are definitely tables missing from that.

First of all many thanks Nic for your detailed reply,

1- From your first reply "You need to work out next free value from existing data" i could not understand exactly what we need to do here ?

suppose we are importing thousand of records in one attempt then how feasible is the solution ?>

2- From your 2nd reply i understood, we need to update all these tables(mentioned by you) after each import..

Note: is issue-id counter different from ID of jiraissues table ?

Thanks,

Satyakam

First of all many thanks Nic for your detailed reply,

1- From your first reply "You need to work out next free value from existing data" i could not understand exactly what we need to do here ?

suppose we are importing thousand of records in one attempt then how feasible is the solution ?>

2- From your 2nd reply i understood, we need to update all these tables(mentioned by you) after each import..

is issue-id counter different from ID of jiraissues table ?

3- "For each custom field, look up names, match the id and insert one or many lines into customfieldvalue" ,

"For versions, watchers, components, voters, insert lines into nodeassociation"

Are you saying here that we manually need to insert lines for these two tables ?

Note: Are you saying all these steps need to be done to import one single issue ??

Thanks,Satyakam

Hi Satyakam,

To give you a simple answer to your question : Using ETL does only have disadvantages compared to using the build in JIRA import utility. So don't do it.

Udo clearly explained why.

Best regards,

Peter

Yeah correct, but i still not having a clear understanding regarding the second issue mentioned by Udo , i.e. problem when inserting a value to a table we need to set the ID"

does something related while mapping the field of source CSV to the corresponding column of the target JIRA table" ??

curious to know ...

Example: when using csv importer you fill table JIRAISSUE the field ID (PK) gets a value somehow by the application. When using ETL Tool you would need to get a source for this Field (it is not nullable), so how you want you fill it?

No. You will break your Jira. You need to work out the next free value from existing data. For every issue.

Udo, can we ignore the ID column while mapping the fields to JIRA column through ETL ??

You cannot, as Udo points out it's not nullable, so you can't ignore it.

No, same answer as Peter and Nic.

Satyakam,

Why do you keep on pushing this ETL solution when 3 people already explained you in detail that this is not the way to go?

Both Nic and Udo know what they are talking about, when they suggest not to do it with an ETL I would listen.

Best regards,

Peter

Satyakam,

Then I wish you good luck on your endeavour.

Maybe this documentation might help you too :

https://developer.atlassian.com/display/JIRADEV/Database+Schema

Best regards,

Peter

Peter,

Definitely i would go with the suggestion of not going with ETL,

However my questions are now more related to know the JIRA schema better ..

Satyakam

Thanks everybody, especially Nic/Brand for providing such a good understanding with respect to JIRA schema.

Suggest an answer

Log in or Join to answer
Community showcase
Sarah Schuster
Posted Jan 29, 2018 in Jira

What are common themes you've seen across successful & failed Jira Software implementations?

Hey everyone! My name is Sarah Schuster, and I'm a Customer Success Manager in Atlassian specializing in Jira Software Cloud. Over the next few weeks I will be posting discussion topics (8 total) to ...

3,104 views 13 19
Join discussion

Atlassian User Groups

Connect with like-minded Atlassian users at free events near you!

Find a group

Connect with like-minded Atlassian users at free events near you!

Find my local user group

Unfortunately there are no AUG chapters near you at the moment.

Start an AUG

You're one step closer to meeting fellow Atlassian users at your local meet up. Learn more about AUGs

Groups near you
Atlassian Team Tour

Join us on the Team Tour

We're bringing product updates and pro tips on teamwork to ten cities around the world.

Save your spot