How to find the intersection between groups?

To find out which users are in both group A and group B, I can resolve the members of A and check whether each one is a member of B. This works fine until we have groups with thousands of users; then the system nearly grinds to a halt.

So is there already a convenient way built into the user/group management to get all the users who are in group A AND B?

One thing that could help here would be to know which of the groups is the smaller one. Though it would still be bad if I have two very large groups.

Any help on this is appreciated.

EDIT:

Okay, I found the query alternative, as it was mentioned here: https://answers.atlassian.com/questions/67/how-do-i-get-the-list-of-all-users

I can build a query like this one:

PartialEntityQueryWithRestriction<User> query = QueryBuilder
    .queryFor(User.class, EntityDescriptor.user())
    .with(Combine.allOf(
        Restriction.on(GroupTermKeys.NAME).exactlyMatching(group1.getName()),
        Restriction.on(GroupTermKeys.NAME).exactlyMatching(group2.getName())));

The only problem is that I now need an EntityQuery to do the search with Crowd. I could add a returningAtMost(int) at the end, but what if I really want all matching users? Do I have to repeat the query until fewer users are returned than the int I passed? Does this still perform well? And what would be a feasible limit?

5 answers

1 accepted

2 votes

Answer accepted

Apart from using a different query, which I am not aware of, this is a place where an optimized algorithm can help a lot.

It sounds like you are looping through all the users in group A, and comparing each one with all the users in group B. If there are 'a' users in group A and 'b' users in group B this means you have a*b comparisons to do, which is quite large if you have a large number of users in either of the groups.

Instead of doing this, you can use a simple counting algorithm (in the style of a bucket sort). First create a HashMap<User, Integer> userCount. Traverse the first group A and increase the count for each user you find (if they are not in the map, add them with value 1, otherwise increase the value by 1). Then traverse the second group B and increase the counts in the same way. Finally, traverse the entry set of the map: any entry with a count of two is in the intersection. This assumes each user appears at most once per group.

This method can be extended to finding the intersection of many groups: repeat the process for each group, then look for entries whose count equals the number of groups you have.
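
To make the counting idea concrete, here is a minimal sketch. It assumes users are identified by their user names as plain strings (so it does not depend on how User implements equals/hashCode); the class and method names are made up for illustration and are not part of any Atlassian API.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Crowd or any Atlassian library.
final class GroupCounting {
    private GroupCounting() {}

    /** Returns the user names that occur in every one of the given groups. */
    static Set<String> inAllGroups(List<? extends Collection<String>> groups) {
        Map<String, Integer> userCount = new HashMap<>();
        for (Collection<String> group : groups) {
            // De-duplicate first so a name counts at most once per group.
            for (String name : new HashSet<>(group)) {
                userCount.merge(name, 1, Integer::sum);
            }
        }
        // A name is in the intersection if it was counted once for each group.
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, Integer> entry : userCount.entrySet()) {
            if (entry.getValue() == groups.size()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Each member of each group is visited once, so the work grows with the total number of memberships rather than with the product of the group sizes.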

For the case of only two groups you can improve performance further: when running through group B, skip the counting and simply check which users are already in the hash map.

The performance boost you will notice from this method comes from the quick insertion and lookup that a hash map provides. Looking up or inserting an entry in a hash map takes constant time on average, so with the optimal method for the two-group case, finding the intersection of groups A and B takes around a+b operations.
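
For the two-group case, a rough sketch could look like the following. Again it assumes users are represented by their names as strings, and the class and method names are hypothetical. Hashing the smaller of the two groups is my own addition to keep memory use down; the number of operations stays around a+b either way.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Crowd or any Atlassian library.
final class TwoGroupIntersection {
    private TwoGroupIntersection() {}

    /** Returns the user names present in both groups. */
    static Set<String> intersect(Collection<String> groupA, Collection<String> groupB) {
        // Put the smaller group into the hash set, then scan the larger one.
        Collection<String> smaller = groupA.size() <= groupB.size() ? groupA : groupB;
        Collection<String> larger = (smaller == groupA) ? groupB : groupA;

        Set<String> seen = new HashSet<>(smaller);
        Set<String> result = new HashSet<>();
        for (String name : larger) {
            // Average constant-time lookup replaces the inner loop over the other group.
            if (seen.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

If you are happy to modify a copy of one group in place, new HashSet<>(groupA) followed by retainAll(groupB) expresses the same intersection with the standard collections API; whether it is as fast depends on what kind of collection groupB is.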