BI tool such as Talend compared to JIRA import utility for data import ?

satyakam panigrahi October 30, 2013

can we use BI tool such as Talend for data import instead of JIRA import utility(CSV import)

we are migrating data from clearDDTS to JIRA, where data exported from DDTS is in the form of CSV file, and the volume of records is 1 lacs apx , so is it recomended to use CSV import utlity or using any BI tool such as Talend is preferable ? please advice..

6 answers

2 votes
Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
October 30, 2013

Yes, sorry, there's quite a lot there!

1. You need to replicate what Jira would do if you were using the importer. For each issue you raise, there is a unique ID in the database. Jira has a table saying "next numer is xxxxxx" and for each issue, it grabs a number and increments the counter so the next issue has a new number.

So... if your counter is currently (random number here for examples only) 1024, and you've got 100 issues to import, you need to number the issues 1025, 1026, 1027... 1124, and then set the counter to 1125 (or higher)

As well as the issue counter, you also need to increase the project Key counter. If you are importing these 100 issues into project ABC and your current issue list goes up to ABC-42, then you need to change the project key counter to ABC-142 to make sure the next issue created is higher than your imported issue numbers. Again, you'll need to number your issues based on the highest existing issue key too.

2. Possibly. Each issue has TWO keys as I've hinted above. One key is virtually invisible to the user and is simply a number, stored in the "ID" column in Jiraissues and it does not change. It's also the key you need for reading other issue related tables. The other is the more human project key of format <project-sequence> and that can change if you move an issue from one project to another.

3. Probably. If you are ONLY inserting really simple issue data, then you only need to touch jiraissue, the sequence table, and the os_workflow tables. If you want to import component, version, watcher or voter data on an issue, then you need to add lines to nodeassociation. If you want to import custom field data, then you need to add lines to customfieldvalue. Each table you touch requires you to increment counters in the sequence table.

So, in short, yes. If you want to insert Jira data into the database with an ETL or even raw SQL, you MUST insert data into a LOT of tables. It's absolutelycritical that you get every table right, and update the sequence data appropriately too.

Or the really short version - forget it. Use the importers. Please, please, please, use the importers.

satyakam panigrahi November 4, 2013

Thanks a lot Nic , now i have much better understanding,

However just need coulple of clarification based on your explanation,i.e

I checked the schema of my Test set up but could not find any table called "sequence" ,

I guess when you are saying "Jira has a table saying "next numer is xxxxxx" ,

You are referring to "sequence table" ??

Also i could not find any table having cloumn "Project couneter" ?/

Regards,

Satyakam

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
November 4, 2013

I think Nic is referring to the table SEQUENCE_VALUE_ITEM.

satyakam panigrahi November 4, 2013

but "SEQUENCE_VALUE_ITEM" table is not linked to any other table ,not even to jiraissue table ?>

And as per Nic, what understood is the sequence table should have entry like "next numer is xxxxxx" ..

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
November 4, 2013

Yes, it is not linked. You have to go through it by names e.g. for the sequence used in Table CWD_USER look for SEQ_NAME = User

Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
November 4, 2013

Thanks Udo - I can never remember the name of that table, just that it's got sequence in the name.

You need to understand the counters/sequences in that table in great detail, because you will make a right mess if you get even one of them wrong.

As I said before, PLEASE, use the importers, not SQL

satyakam panigrahi November 4, 2013

Nic, It's for sure, we are not going with ETL any more :),

But just for my understanding with Schema,

Is "Sequence_name=Action" is used for jira-Issue ID calculation ?

so, when you are saying "Jira has a table saying "next numer is xxxxxx" ,

in that case you are referring to "SEQUENCE_VALUE_ITEM" table , right ??

Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
November 4, 2013

Ahh, good.

No, "sequence_name = action" is probably for the ID of the next comment. I can't remember which entry is for issue id. But yes, that's the table that keeps the counters in it (there's quite a lot of counters - fields, custom field values, comments, workflows, schemes, etc).

satyakam panigrahi November 4, 2013

Seq_name ,seq_id
"OSWorkflowEntry",40600
"OAuthConsumer",10100
"ListenerConfig",10100
"OSCurrentStep",40600
"OSUser",10100
"OSGroup",10100
"OSMembership",10100
"Issue",40600
"CustomFieldValue",28800
"NotificationScheme",10100
"FieldLayoutItem",10300
"PortalPage",10100
"PortletConfiguration",10100
"GadgetUserPreference",10100
"FieldLayoutScheme",10100
"FieldLayoutSchemeEntity",10100
"FieldScreenScheme",10100
"FieldScreenSchemeItem",10200
"FieldScreen",10100
"IssueTypeScreenScheme",10100
"ProjectRole",10100
"IssueTypeScreenSchemeEntity",10200
"Notification",10200
"SharePermissions",10100
"Group",10010
"ExternalEntity",10
"UpgradeHistory",10300
"Avatar",10200
"SchemePermissions",10300
"UpgradeVersionHistory",10100
"IssueLinkType",10300
"IssueLink",10100
"Label",10100
"OSPropertyEntry",11300
"ServiceConfig",10800
"IssueSecurityScheme",10100
"FieldConfigSchemeIssueType",10600
"OptionConfiguration",10300
"FieldConfiguration",10600
"FieldConfigScheme",10600
"GenericConfiguration",10200
"FieldScreenTab",10200
"Project",10300
"ProjectRoleActor",10400
"ConfigurationContext",10600
"FileAttachment",10200
"ChangeGroup",10400
"ChangeItem",10400
"PluginVersion",10400
"ApplicationUser",10200
"User",10110
"UserAttribute",310
"Membership",10110
"Component",10100
"Workflow",10100
"WorkflowScheme",10100
"Action",10100
"WorkflowSchemeEntity",10100
"UserHistoryItem",10500
"FieldScreenLayoutItem",10600
"DraftWorkflow",10100
"CustomField",10500
"FieldLayout",10200

Hi Nic, this is details list of "SEQUENCE_VALUE_ITEM" table, r

Rquest you if u can identify which one corresponds to issue_ID & which one is for "Project issue counter "?

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
November 4, 2013

I would go for "Issue"

Jobin Kuruvilla [Adaptavist]
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
November 4, 2013

"Issue" it is. You can find the mapping of a table and entity name (which is used in the SEQUENCE_VALUE_ITEM table) in the entitymodel.xml under atlassian-jira/WEB-INF/classes/entitydefs folder.

Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
November 4, 2013

Yes, issue for the issue counter.

The project counters are in the project table though.

This is exactly why you should use the importers - you don't need these details if you use them, and they're going to do a better, faster, safer job than you having to work all of this stuff out for yourself.

1 vote
Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
October 30, 2013

I would avoid an ETL, because you really do need to know the database in a LOT of detail to be able to import data.

To run you through a single issue import in a generic way, you will need to:

  • Stop Jira and get a full backup. (NEVER never never write to an active Jira database, and if you insist on writing to a shut-down one ALWAYS get a backup first)
  • Import the core data into jiraissues (reporter, assignee, summary, key, project etc)
  • Update the project issue counter to make sure the next issue to be created falls after the new one
  • Increment the issue-id counter in the sequence table (apologies, forgotton the exact name of this table)
  • For each custom field, look up names, match the id and insert one or many lines into customfieldvalue
  • Increment the ID counter for customfieldvalue in the sequence table
  • For versions, watchers, components, voters, insert lines into nodeassociation
  • insert records into the OS_workflow tables to get the workflow to be valid for the issue
  • Restart Jira and immediately re-index it.

Note that for an absolutely minimal import you have to hit four tables, and not just write, you need to lookup several pieces of data dynamically so you can get your data consistent.

I'm pretty sure I've forgotton quite a lot of other things you need to do as well - there are definitely tables missing from that.

RAMANIGANTH_VELUR October 30, 2013

First of all many thanks Nic for your detailed reply,

1- From your first reply "You need to work out next free value from existing data" i could not understand exactly what we need to do here ?

suppose we are importing thousand of records in one attempt then how feasible is the solution ?>

2- From your 2nd reply i understood, we need to update all these tables(mentioned by you) after each import..

Note: is issue-id counter different from ID of jiraissues table ?

Thanks,

Satyakam

satyakam panigrahi October 30, 2013

First of all many thanks Nic for your detailed reply,

1- From your first reply "You need to work out next free value from existing data" i could not understand exactly what we need to do here ?

suppose we are importing thousand of records in one attempt then how feasible is the solution ?>

2- From your 2nd reply i understood, we need to update all these tables(mentioned by you) after each import..

is issue-id counter different from ID of jiraissues table ?

3- "For each custom field, look up names, match the id and insert one or many lines into customfieldvalue" ,

"For versions, watchers, components, voters, insert lines into nodeassociation"

Are you saying here that we manually need to insert lines for these two tables ?

Note: Are you saying all these steps need to be done to import one single issue ??

Thanks,Satyakam

1 vote
Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
October 30, 2013

I recommend the csv importer, because using a ETL-tool has a couple of challenges.

  • you need to know the database scheme of your target in detail
  • you need to know how IDs in your target tables are generated (I guess it is handled by the application and not by the database)

All this is already handled by the csv importer, and I have not heard of a limit (even if I don't know what "1 lacs apx" means).

Cheers,

Udo

satyakam panigrahi October 30, 2013

one additional point, when i have huge number of records to import, do u feel using ETL is error free/quicker ?

satyakam panigrahi October 30, 2013

Thanks Brand,

(1 lac= 100000), hwever can u please give little explanatin on your second point ?

satya

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
October 30, 2013

You're welcome Satya.

The problem is when you insert values in a table you need to set the ID. I have not seen an after insert trigger which does that, so my guess is the generation of the ids are handled by JIRA itself. So you can't simply call a sequence while inserting records in a table. JIRA might still want to use Ids that you have inserted already.

Udo

satyakam panigrahi October 30, 2013

so u r saying importing through ETL tool is not possible, because of above issue ??

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
October 30, 2013

I'm not saying it is impossible but difficult since I don't know how the IDs are generated and handled.

satyakam panigrahi October 30, 2013

Thanks Brand for ur help,

can u please answer my previous question whether using ETL is having adventage like Error free/Time effective comapred to JIRA import utility ??

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
October 30, 2013

Since I have not used an ETL tool (for importing issues into JIRA), I can't tell you if there are any advantages.

0 votes
satyakam panigrahi November 4, 2013

Thanks everybody, especially Nic/Brand for providing such a good understanding with respect to JIRA schema.

0 votes
Peter Van de Voorde
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
November 4, 2013

Satyakam,

Why do you keep on pushing this ETL solution when 3 people already explained you in detail that this is not the way to go?

Both Nic and Udo know what they are talking about, when they suggest not to do it with an ETL I would listen.

Best regards,

Peter

Peter Van de Voorde
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
November 4, 2013

Satyakam,

Then I wish you good luck on your endeavour.

Maybe this documentation might help you too :

https://developer.atlassian.com/display/JIRADEV/Database+Schema

Best regards,

Peter

satyakam panigrahi November 4, 2013

Peter,

Definitely i would go with the suggestion of not going with ETL,

However my questions are now more related to know the JIRA schema better ..

Satyakam

0 votes
Peter Van de Voorde
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
October 30, 2013

Hi Satyakam,

To give you a simple answer to your question : Using ETL does only have disadvantages compared to using the build in JIRA import utility. So don't do it.

Udo clearly explained why.

Best regards,

Peter

satyakam panigrahi October 30, 2013

Yeah correct, but i still not having a clear understanding regarding the second issue mentioned by Udo , i.e. problem when inserting a value to a table we need to set the ID"

does something related while mapping the field of source CSV to the corresponding column of the target JIRA table" ??

curious to know ...

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
October 30, 2013

Example: when using csv importer you fill table JIRAISSUE the field ID (PK) gets a value somehow by the application. When using ETL Tool you would need to get a source for this Field (it is not nullable), so how you want you fill it?

Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
October 30, 2013

No. You will break your Jira. You need to work out the next free value from existing data. For every issue.

satyakam panigrahi October 30, 2013

Udo, can we ignore the ID column while mapping the fields to JIRA column through ETL ??

Peter Van de Voorde
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
October 30, 2013

You cannot, as Udo points out it's not nullable, so you can't ignore it.

Udo Brand
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
October 30, 2013

No, same answer as Peter and Nic.

Suggest an answer

Log in or Sign up to answer