
I am getting duplicate repo list while working with get repolist API

Hi,

 

I was working with the get repolist API when it suddenly started returning duplicate repositories. Here is the scenario: I have 100+ repositories in my account, and since I set pagelen to 100, the first request returns 100 repositories. When I then follow the next URL from that response, the second page contains repositories that were already returned on the first page.

Can someone please advise? I am using this URL: https://bitbucket.org/!api/2.0/repositories?sort=-updated_on&access_token=<TOKEN>&role=admin&pagelen=100

1 answer

0 votes
Daniil Penkin Atlassian Team Monday

Hello @Deepak Govindram Kumbhar,

Thanks for reaching out.

You're sorting the repositories by the updated_on property, and I believe what you observed was caused by updates to some repositories, which moved them to the top of the list you're fetching (onto its first page). That pushed all the other repositories down the list, so some of them drifted onto later pages and reappeared in your results.

In fact, with naive pagination you can get duplicates or misses no matter which property you sort on. For instance, imagine that you sort by an immutable value like the repository UUID. Now, if a new repository is created while you're traversing pages, and it happens to be assigned a UUID that places it on one of the pages you've already fetched, it'll "push" all repositories with a larger UUID forward, and you'll get a duplicate on the next page you fetch. A miss happens in the opposite case: if a repo on an already fetched page is deleted, every later repository shifts back by one position, so the first item of the next page slides onto a page you've already passed.
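To make the drift concrete, here is a small offline simulation of offset-style paging. It is only a sketch: the repository names, `fetch_page`, and `demo_drift` are made up for illustration and do not call the Bitbucket API.

```python
def fetch_page(items, page, pagelen):
    """Return one page of an offset-paginated listing (page numbers start at 1)."""
    start = (page - 1) * pagelen
    return items[start:start + pagelen]


def demo_drift(pagelen=3):
    # Hypothetical listing, already sorted by the sort key.
    repos = ["a", "b", "c", "d", "e", "f"]
    first = fetch_page(repos, 1, pagelen)

    # Case 1: a repo that sorts before page 1 is created between requests.
    after_create = sorted(repos + ["0-new"])
    page_two = fetch_page(after_create, 2, pagelen)
    duplicates = [r for r in page_two if r in first]

    # Case 2: a repo from page 1 is deleted between requests instead.
    after_delete = [r for r in repos if r != "a"]
    page_two = fetch_page(after_delete, 2, pagelen)
    seen = set(first) | set(page_two)
    missed = [r for r in after_delete if r not in seen]

    return duplicates, missed
```

In the first case the last item of page one shows up again on page two; in the second, the item that should have opened page two is never returned.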

Unfortunately, there's no cursor API available for this endpoint. However, you can still make page traversal somewhat consistent, i.e. avoid duplicates and misses. Instead of following page-based pagination, sort by UUID and use a BBQL query like this:

uuid > "{uuid_of_the_last_item_on_the_last_fetched_page}"

So as an example, here's how I fetch the first page (I also limited the payload to just UUIDs and names using the fields parameter):

https://api.bitbucket.org/2.0/repositories/atlassian?fields=values.uuid,values.name&sort=uuid

The last item on that page has UUID {046f666c-d011-41f5-b70a-7480aa02798e}, so to fetch the next page I add a query like the one shown above (it's URL-encoded, so it's not easy to read):

https://api.bitbucket.org/2.0/repositories/atlassian?fields=values.uuid,values.name&sort=uuid&q=uuid%20%3E%20%22%7B046f666c-d011-41f5-b70a-7480aa02798e%7D%22
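The two requests above generalize to a loop. Below is a minimal Python sketch of that traversal, not an official client: `build_url`, `http_get_json`, and `iter_repos` are hypothetical helper names, authentication is left out, and it assumes the response is JSON with a `values` list whose items carry `uuid` (braces included), as in the URLs above.

```python
import json
import urllib.parse
import urllib.request

# Workspace URL taken from the example above; swap in your own.
BASE = "https://api.bitbucket.org/2.0/repositories/atlassian"


def build_url(last_uuid=None, base=BASE):
    """Build the listing URL, filtering on uuid after the first page."""
    params = {"fields": "values.uuid,values.name", "sort": "uuid"}
    if last_uuid is not None:
        # last_uuid is used as returned by the API, braces included.
        params["q"] = f'uuid > "{last_uuid}"'
    query = urllib.parse.urlencode(params, safe=",", quote_via=urllib.parse.quote)
    return f"{base}?{query}"


def http_get_json(url):
    """Plain unauthenticated GET; add credentials as your setup requires."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_repos(fetch=http_get_json, base=BASE):
    """Yield repositories page by page, keyed on the last UUID seen."""
    last_uuid = None
    while True:
        values = fetch(build_url(last_uuid, base)).get("values", [])
        if not values:
            return
        yield from values
        last_uuid = values[-1]["uuid"]
```

Because every request asks only for UUIDs greater than the last one seen, a repository created or deleted behind the current position cannot shift the items still ahead. The `fetch` parameter exists so the loop can be exercised without network access.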

Hope this helps. Let me know if you have any questions.

Cheers,
Daniil 

Hi Daniil,

Thanks for your reply, but in this case I will lose my updated_on sorting. Any suggestions for this?

Daniil Penkin Atlassian Team Monday

Not really. Pagination for this particular endpoint doesn't work like a cursor. Changes are visible immediately, which means that while traversing you might miss repositories that have just been updated (because they jumped from a future page to the first page) and get duplicates (because those repositories shifted all the others down).

Any suggestions for this?

Well, this depends on what you're trying to use the fetched data for. For instance, if the goal is to index all repositories, I'd fetch them in the way I described above and then sort by updated date on the client side. What's your use case?
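As a rough illustration of the client-side sort, assuming the full list has already been collected with the UUID-based traversal and that `values.updated_on` was added to the fields parameter: `sort_by_updated_desc` is a hypothetical helper, and it assumes the timestamps are ISO 8601 strings in a form that `datetime.fromisoformat` accepts.

```python
from datetime import datetime


def sort_by_updated_desc(repos):
    """Return repos ordered newest first, mirroring sort=-updated_on."""
    return sorted(
        repos,
        key=lambda repo: datetime.fromisoformat(repo["updated_on"]),
        reverse=True,
    )
```

Sorting after the traversal has finished means the order reflects a single snapshot of what was fetched, rather than a listing that shifts between page requests.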

Even when there are no new commits while we are fetching the repository list, we still get duplicates on the next page.
