In mathematics, the intersection (denoted as ∩) of two sets A and B is the set that contains all elements of A that also belong to B.
I needed the javascript array equivalent, so that given an array A = [“a”,”b”,”c”, “d”] and B = [“b”, “d”, “e”], getIntersect(A, B) = [“b”, “d”]
1 2 3 4 5 6 7 8 9 10 11 12 | function getIntersect(arr1, arr2) { var temp = []; for(var i = 0; i < arr1.length; i++){ for(var k = 0; k < arr2.length; k++){ if(arr1[i] == arr2[k]){ temp.push( arr1[i]); break; } } } return temp; } |
Update:
In the comments, Jeffery suggests a version using a javascript hash table (also called an associative array or a map), which would be much faster especially if arr2 was much shorter than arr1. I will do some benchmarking with some live data (from the app I extracted this from) to see how badly my version does. (See below : he sunk me big time! use his version.)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | function getIntersect(arr1, arr2) { var r = [], o = {}, l = arr2.length, i, v; for (i = 0; i < l; i++) { o[arr2[i]] = true; } l = arr1.length; for (i = 0; i < l; i++) { v = arr1[i]; if (v in o) { r.push(v); } } return r; } |
He is also too kind to point out one inefficient / bad practice in the code above : that I should have stored the lengths of the arrays in a variable, rather than recalculating it again and again. Nevermind …
And elsewhere, Pete pointed out the Underscore, a utility-belt library for JavaScript that provides a lot of the functional programming support, and its _.intersect function .
Further update:
Now with benchmark results and less smoke up my ass! (repeat after me : “test your assumptions before making an ass of yourself (again)”, now repeat 100 times).
I created the (very) simple benchmarking file that would actual work, a html file with embedded javascript with has the 2 functions for testing and 2 arrays : one with elements 4646 and one with 1103 elements.
The file is http://www.falsepositives.com/files/test_array_interect_file.html. There is also a version which iterates the test 100 tests, and shows the average result in a table as below.
I also rewrote the test from using firefox’s console.log and console.time functions to using document.write and some raw javascript to log the time it took for a given function to run so I could run this in different browsers. (more about this later)
So the raw results where that Jeffery’s version blew my version out of the water! in FF 3.3 mine ran ~ 85 to 90 ms and his ~1 to 2 ms !! So creating the hash and searching it are really really optimized.
The second surprise was that my suggested optimization of my version (“storing the lengths of the arrays in a variable”) added several ms of run time! I think this is because a) the lengths are already calculated and stored as a property of each array, and b) their is some overhead of storing the 2 length values as local variables. Since object properties are really hashs, and we have just learned that hashs are really really fast the result make some sense. Storing a value instead of recalculating it is still a “really good idea”, just not this time.
Having rewritten the test in more browser generic fashion here are the results for different browsers (times are in milliseconds) :
getIntersect of a1 , a2 | getIntersect of a2 , a1 | getIntersect JT of a1 , a2 | getIntersect JT of a2 , a1 | |
---|---|---|---|---|
FireFox 3.5.5 | 89.27 | 74.52 | 1.19 | 1.44 |
Safari 4.0.3 | 52.32 | 43.60 | 3.11 | 2.97 |
Safari on iPdod Touch (3.1.2) | 2049.00 | 1712.00 | 48.00 | 39.00 |
Chrome 4.0.223 | 63.91 | 53.00 | 0.82 | 1.45 |
IE 7.0.5730 | 3656.00 | 3068.60 | 9.60 | 9.40 |
Your mileage will vary. The absolute number are not as interesting as the relative values.
I have also run my test against Underscore, and its _.intersect function for some interesting results:
FireFox | 232.75 |
Safari | 137.22 |
Chrome | 42.26 |
IE | 1953 |
So not as fast as even my humble version, but keep in mind that the Underscore _.intersect function handls any number of arrays to intersect, and has 50 additional utility functions.
So what can we learn from this?
Hashes are very fast; Jeffery is smart; IE stinks; Ian needs to test his ASSumptions!
Here’s how I would do it (assuming both arrays do not contain nulls or undefines):
2
3
4
5
6
7
8
9
10
11
12
13
14
var r = [], o = {}, l = arr2.length, i, v;
for (i = 0; i < l; i++) {
o[arr2[i]] = true;
}
l = arr1.length;
for (i = 0; i < l; i++) {
v = arr1[i];
if (v in o) {
r.push(v);
}
}
return r;
}
I posted a stackoverflow question to try and expand this function to accept any number of arrays. Also tried to get a working example of JSON and Jquery usage.
http://stackoverflow.com/questions/9039395/convert-getintersectarr1-arr2-javascript-function-to-any-number-of-arguments